Programming — This forum is for all programming questions.
The question does not have to be directly related to Linux, and any language is fair game.
Hi,
A lot of people think that C (or C++) is faster than Python. I agree, but I don't think that's the case with NumPy; I believe NumPy is faster than C, at least in some cases.
Is there another explanation?
Or where can I find a document on the subject?
Thanks a lot
Regards
NumPy implements vectorization for arrays, if I'm not wrong. Anyway, here is an example. Let's look at the following case:
Here is the result on my laptop i3:
Labs$ python3 tempsExe.py 50000
sum with Python: 1250025000 and NumPy 1250025000
time used Python Sum: 37.28 sec
time used Numpy Sum: 1.85 sec
Labs$ ./tt 50000
CPU time :7.521730
The value : 1250025000
--------------------------------------------
This is the Python 3 program (the definitions of func1 and func2 were missing from the post; minimal versions consistent with the printed results are filled in below):
import timeit as it
import numpy as np
import sys

try:
    n = int(sys.argv[1])
except (IndexError, ValueError):
    print("needs integer as argument"); exit()

def func1():  # pure-Python sum of 1..n
    return sum(range(1, n + 1))

def func2():  # NumPy vectorized sum of 1..n
    return np.sum(np.arange(1, n + 1))

print(f"sum with Python: {func1()} and NumPy {func2()} ")
tm1 = it.timeit(stmt=func1, number=n)
print(f"time used Python Sum: {round(tm1, 2)} sec")
tm2 = it.timeit(stmt=func2, number=n)
print(f"time used Numpy Sum: {round(tm2, 2)} sec")
And here is the C program:
#include <time.h>
#include <stdio.h>
#include <stdlib.h>

long func1(int n){
    long r = 0;
    for (int i = 1; i <= n; i++) r += i;
    return r;
}

int main(int argc, char* argv[]){
    clock_t c0, c1;
    long v = 0;
    int n;
    if (argc < 2) {
        printf("Please give an argument\n");
        return -1;
    }
    n = atoi(argv[1]);
    c0 = clock();
    for (int j = 0; j < n; j++) v = func1(n);
    c1 = clock();
    printf("\tCPU time : %.2f sec", (float)(c1 - c0) / CLOCKS_PER_SEC);
    printf("\n\tThe value : %ld\n", v);
    return 0;
}
I suppose NumPy is C under the hood, so the actual question is: can one algorithm/implementation be faster than another? Especially if it uses CPU features (SIMD) the other doesn't.
So you're comparing the output of a Python timing function, which is in seconds, against the output of a system library call (clock()), which is in clock ticks, probably on a CPU running above 1 GHz.
Further, as NevemTeve points out, it's about the "efficiency of an algorithm".
You achieved that performance in C with function-call overhead, "maybe", because the call may have been optimized out by the compiler.
Right now your tests seem to show that the C program is faster, by a factor greater than 10,000.
1. It almost certainly doesn't matter which is faster. (What might matter is if what's currently used is too slow, but that's a different question.)
2. If it did, simply iterating 50,000 times is rarely an accurate test for real-world performance. Even more so when you're using completely different timing methods!
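At minimum, the two Python variants should be measured with the same timer and reported per call (the C side would need the same treatment on its end). A minimal sketch, with a hypothetical py_sum standing in for the pure-Python loop:

```python
import timeit

def py_sum(n=50000):
    # Same work as the C func1: sum of 1..n with an explicit loop.
    r = 0
    for i in range(1, n + 1):
        r += i
    return r

reps = 100
total = timeit.timeit(py_sum, number=reps)
print(f"per-call: {total / reps * 1e3:.3f} ms, value: {py_sum()}")
```

Reporting per-call time rather than a grand total makes the numbers comparable even when the two benchmarks use different repetition counts.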
Python is a language interpreter that was written in C/C++. There is an overhead associated with interpretation, which is considered to be insignificant. When performing "intense" operations like those in NumPy, it actually invokes binary libraries written directly in C/C++. Some of these libraries are extremely clever. The code that you are invoking was not written in Python, but directly in executable machine code. Python simply offers a convenient way to get to it.
Actually I have to agree with all the replies, so this is just another view, a different aspect:
NumPy is a Python module which defines special data types and uses functions written in C specifically for those types of data. That means it can be almost as fast as the same math routines on pure C data structures.
In some cases NumPy can definitely be slower than the regular/built-in implementation.
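One easy way to see that overhead, assuming a typical CPython/NumPy build: on a single scalar, NumPy's generic dispatch machinery costs more than the specialized built-in, so math.sqrt usually beats np.sqrt (the exact timings will vary by machine):

```python
import math
import timeit
import numpy as np

# Both compute the same value, but np.sqrt pays per-call dispatch
# overhead that only amortizes over whole arrays, not scalars.
t_math = timeit.timeit("math.sqrt(2.0)", globals=globals(), number=100_000)
t_np = timeit.timeit("np.sqrt(2.0)", globals=globals(), number=100_000)
print(f"math.sqrt: {t_math:.4f}s  np.sqrt: {t_np:.4f}s")
```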
Besides optimizing out the function call, if the compiler used register-only arithmetic for the calculations, it would stand to be faster still. Things like that can be specified using C definitions. Since I've never used Python for anything performance-critical or tried to write it to work in a minimal form, I can't say much about how fast it "could" be. Reading between the lines, and understanding my obvious bias, I'd never try to use Python for calculations where performance is required. I probably would if someone had written something that accomplished a commonly calculated thing and it was readily available in my environment. But if at some point it became glacial in operation, I'd use C instead.
The languages which have made huge investments in compiling to machine code are always going to win: C, C++, Rust, Common Lisp, FORTRAN. Languages compiled to intermediate forms will be slower: Java, Perl, Python. Last will be purely interpreted languages like Bash.
Pragmatically speaking, this also comes down to the so-called "80/20 rule: 80% of the time is spent in 20% of the code." Therefore, time spent within "the [Python ...] interpreter" is of no consequence. Attention has been lavished on the binary modules, because this is the only thing that really matters.
The speed increase of NumPy over C is almost certainly due to vectorization. You could vectorize the C code as well.
However... an algorithm improvement could reduce the execution time to nearly zero: use two loops, the first looping by some modulo (with the sum pre-computed for that modulo), and then the second looping by one. With a modulo of 50,000, the first loop would iterate once and the second not at all.
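The two-loop idea above can be sketched as follows (a hypothetical blocked_sum, not code from the thread): the sum of any block of m consecutive integers starting after k is m*k plus the pre-computed sum of 1..m, so whole blocks are added in one step and only the remainder is looped one at a time.

```python
def blocked_sum(n, m):
    """Sum 1..n: first loop by blocks of size m, then by one."""
    base = m * (m + 1) // 2          # pre-computed sum of 1..m
    total = 0
    k = 0
    while k + m <= n:                # first loop: whole blocks at a time
        total += m * k + base        # sum of k+1 .. k+m
        k += m
    while k < n:                     # second loop: leftover elements
        k += 1
        total += k
    return total

# With m = n = 50000, the first loop iterates once and the second not at all.
print(blocked_sum(50000, 50000))  # → 1250025000
```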
Ed
Quote:
However... an algorithm improvement could reduce the execution time to nearly zero: use two loops, the first looping by some modulo (with the sum pre-computed for that modulo), and then the second looping by one. With a modulo of 50,000, the first loop would iterate once and the second not at all.
If we're going to allow algorithm improvements, there's a closed form for the sum of a series. Look, Python is more than 100× faster than numpy!
Code:
python lq-4175708558-is-numpy-faster-than-c.py 50000
sum with Python: 1250025000
time used Python Sum: 0.01 sec
Code:
#!/usr/bin/python
import timeit as it
import sys

try:
    n = int(sys.argv[1])
except (IndexError, ValueError):
    print("needs integer as argument"); exit()

def func3(): return (n * (n + 1)) // 2

print(f"sum with Python: {func3()} ")
tm1 = it.timeit(stmt=func3, number=n)
print(f"time used Python Sum: {round(tm1, 2)} sec")
You're not doing matrix math in your tests, but I'll make the following point anyway, since numpy is often used for that.
Whether your matrices are row-major or column-major actually has a large effect on performance. That's just a consequence of the way CPUs work. A prebuilt linear-algebra library (including numpy) will be optimized for that, but it's more difficult to optimize for in C than it is in some other languages, such as Fortran. That matters if you're rolling your own.
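The layout effect is easy to observe from NumPy itself: in a C-order (row-major) array, a row slice is contiguous in memory while a column slice is strided, and on typical machines summing the contiguous slice is faster purely because of cache behavior (timings vary; the contiguity flags do not):

```python
import timeit
import numpy as np

a = np.zeros((2000, 2000))       # NumPy defaults to C (row-major) order

row = a[0, :]    # contiguous: adjacent elements are adjacent in memory
col = a[:, 0]    # strided: one element per 2000-element row

print(row.flags['C_CONTIGUOUS'], col.flags['C_CONTIGUOUS'])  # → True False

t_row = timeit.timeit(lambda: row.sum(), number=10_000)
t_col = timeit.timeit(lambda: col.sum(), number=10_000)
print(f"row: {t_row:.4f}s  col: {t_col:.4f}s")
```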
Also:
Quote:
Originally Posted by ntubski
If we're going to allow algorithm improvements, there's a closed form for sum of a series.