## Benchmarking JavaScript for computational purposes

Since JavaScript is currently the simplest way to run code on almost any client computer in the world, I thought it would be an interesting exercise to find out how fast it has become and whether it is a viable basis for a distributed computing platform.

Several sources around the web suggest that a lot of work has gone into making JavaScript as fast as possible. I found this wonderful snippet, whose author claimed:

> On my four year old Atom processor running Firefox, the Javascript code takes 990 milliseconds to sieve 78498 primes between zero and one million. The equivalent C code (shown below) once compiled and executed on the same machine takes 98 milliseconds.

Interesting! Let’s see how it goes on my even older laptop, a five-year-old machine with an i5-2520M CPU @ 2.50GHz.

I opened a Firefox tab and pasted the following into the console (press F12 to open it):

```javascript
var N = 10000000;

var main = function() {
  var i, j, cnt;
  var composite = new Array(N + 1);
  var started = new Date();
  for (cnt = 0, i = 2; i <= N; i++) {
    if (!composite[i]) {
      // i is prime
      cnt++;
      // sieve out all multiples of i
      for (j = i * i; j <= N; j += i) {
        composite[j] = 1;
      }
    }
  }
  var stopped = new Date();
  var elapsed = stopped - started;
  console.log("There are " + cnt + " primes between zero and " + N);
  console.log("Sieving " + cnt + " primes took " + elapsed + " milliseconds.");
};
main();
```

These were the results:

```
There are 664579 primes between zero and 10000000
Sieving 664579 primes took 8172 milliseconds.
```

Let’s compare it with the equivalent C code:

```c
#include <sys/time.h>
#include <stdio.h>
#define N 10000000

int main() {
    struct timeval start, stop;
    long unsigned i, j, cnt, elapsed;
    static int composite[N + 1];
    gettimeofday(&start, NULL);
    for (cnt = 0, i = 2; i <= N; i++) {
        if (!composite[i]) {
            // i is prime
            cnt++;
            // sieve out all multiples of i
            for (j = i * i; j <= N; j += i) {
                composite[j] = 1;
            }
        }
    }
    gettimeofday(&stop, NULL);
    elapsed = (stop.tv_sec - start.tv_sec) * 1000000 +
              (stop.tv_usec - start.tv_usec);
    elapsed = (elapsed + 500) / 1000;
    printf("There are %lu primes between zero and %lu\n", cnt, (long unsigned) N);
    printf("Sieving %lu primes took %lu milliseconds.\n", cnt, elapsed);
    return 0;
}
```

```
$ gcc benchmark.c -o bench
$ ./bench
There are 664579 primes between zero and 10000000
Sieving 664579 primes took 279 milliseconds.
```

With `-O2` optimization we get:

```
$ gcc -O2 benchmark.c -o bench
$ ./bench
There are 664579 primes between zero and 10000000
Sieving 664579 primes took 171 milliseconds.
```

OK, so JavaScript is about 29 times slower than the unoptimized C build and about 48 times slower than the `-O2` build. That isn’t so good.
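One variation that might be worth trying before giving up on the sieve is a typed array. The JavaScript version above keeps its flags in a plain `Array`, while the C code uses a flat block of integers; a `Uint8Array` is the closest JavaScript equivalent. The following is only a sketch under that assumption: the function name `countPrimes` is made up for this example, it was not timed on the laptop used here, and no speed-up is claimed.

```javascript
// Sieve of Eratosthenes using a typed array for the composite flags.
// countPrimes is a hypothetical helper name, not part of the original benchmark.
function countPrimes(n) {
  // A Uint8Array is zero-initialised, so every entry starts as "not composite".
  var composite = new Uint8Array(n + 1);
  var cnt = 0;
  for (var i = 2; i <= n; i++) {
    if (!composite[i]) {
      cnt++; // i is prime
      // sieve out all multiples of i, starting from i*i
      for (var j = i * i; j <= n; j += i) {
        composite[j] = 1;
      }
    }
  }
  return cnt;
}

var started = Date.now();
var primes = countPrimes(10000000);
var elapsed = Date.now() - started;
console.log("There are " + primes + " primes between zero and 10000000");
console.log("Sieving took " + elapsed + " milliseconds.");
```

The algorithm is unchanged, so the prime count should match the runs above; only the storage for the flags differs.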

But of course this could be because we picked the worst possible example. Let’s try MD5 instead: we will generate 10 million MD5 hashes and measure how long that takes.

Let’s find some MD5 implementations for JavaScript and see how they do.

The first one I took was from here; I pasted the library into a Firefox console and then pasted this piece of code:
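The measurement itself is just a counted loop around the hash call with a timestamp on each side. A minimal sketch of that loop as a reusable helper follows. `timeHashes` and `toyHash` are names invented for this example, and `toyHash` is a trivial stand-in (not MD5) so that the snippet runs without any library loaded; in a real test the `md5` function of the library under test would take its place.

```javascript
// Time n calls of hashFn on the decimal strings "0", "1", ..., n-1.
// timeHashes is a hypothetical helper, not part of any MD5 library.
function timeHashes(hashFn, n) {
  var started = Date.now();
  for (var cnt = 0; cnt < n; cnt++) {
    hashFn(cnt.toString());
  }
  var elapsedMs = Date.now() - started;
  return { count: n, elapsedMs: elapsedMs };
}

// Stand-in for a real hash function: a simple 32-bit string checksum, NOT MD5.
function toyHash(str) {
  var h = 0;
  for (var i = 0; i < str.length; i++) {
    h = (h * 31 + str.charCodeAt(i)) | 0;
  }
  return h;
}

var result = timeHashes(toyHash, 100000);
console.log("got " + result.count + " hashes in " + result.elapsedMs + " milliseconds.");
```

Note that this helper stops at `cnt < n`, while the loops in this post use `cnt <= N` and therefore hash one extra value, which is why their output reads 10000001.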

```javascript
// md5() is provided by the library pasted into the console beforehand
var iteratemd5 = function() {
  var N = 10000000;
  var cnt;
  var started = new Date();
  for (cnt = 0; cnt <= N; cnt++) {
    md5(cnt.toString());
  }
  var stopped = new Date();
  var elapsed = stopped - started;
  console.log("got " + cnt + " md5 in " + elapsed + " milliseconds.");
};
iteratemd5();
```

which gave me

`got 10000001 md5 in 49466 milliseconds.`

Hmm, that doesn’t look so fast. Let’s compare it with a quick Python snippet:

```python
# Python 2 syntax: print statement, and hashlib.md5() is given a str directly
import time
import hashlib

N = 10000000
x = 0
start = time.time()
while x < N:
    pp = hashlib.md5(str(x)).hexdigest()
    x += 1
stop = time.time()
print "got {} md5 in {} ms".format(x, (stop - start) * 1000)
```

with this result:

`got 10000001 md5 in 12322 milliseconds.`

OK, so Python is about 4 times faster than this JavaScript MD5 implementation. Maybe that implementation is not well optimized, so let’s try another.

I got this one from blueimp. Since it’s a client-side library, I thought it best to write a simple HTML page and load it in the browser. This is the HTML:

```html
<html>
<script src="https://blueimp.github.io/JavaScript-MD5/js/md5.js"></script>
<body>
<div id="mytext"></div>
<script>
  var N = 10000000;
  var cnt;
  var started = new Date();
  for (cnt = 0; cnt <= N; cnt++) {
    md5(cnt.toString());
  }
  var stopped = new Date();
  var elapsed = stopped - started;
  var fieldNameElement = document.getElementById('mytext');
  fieldNameElement.innerHTML = "got " + cnt + " md5 in " + elapsed + " milliseconds.";
</script>
</body>
</html>
```

A quick HTTP server can be started by running `python -m SimpleHTTPServer` in the directory containing that page (with Python 3 the equivalent is `python3 -m http.server`).

These are the results:

`got 10000001 md5 in 40150 milliseconds.`

Let’s try one more time: according to Stack Overflow, “spark-md5” is the fastest one.

Again this is a client-side library, so we create an HTML page:

```html
<html>
<script src="//cdn.rawgit.com/satazor/SparkMD5/master/spark-md5.min.js"></script>
<body>
<div id="mytext"></div>
<script>
  var N = 10000000;
  var cnt;
  var started = new Date();
  for (cnt = 0; cnt <= N; cnt++) {
    var hexHash = SparkMD5.hash('Hi there');
  }
  var stopped = new Date();
  var elapsed = stopped - started;
  var fieldNameElement = document.getElementById('mytext');
  fieldNameElement.innerHTML = "got " + cnt + " md5 in " + elapsed + " milliseconds.";
</script>
</body>
</html>
```

This doesn’t look fast at all. What would happen if we compared it with a C implementation?

Since I didn’t want to spend the rest of the day writing a C program for MD5, I took one available on Stack Overflow and changed it a bit for my needs.

This one basically hashes the string “hello” 10000000 times using the OpenSSL library. There are many reasons why this code is far from good, but let’s try it anyway:

```c
#include <stdio.h>
#include <sys/time.h>
#include <stdlib.h>
#include <string.h>
#if defined(__APPLE__)
#  define COMMON_DIGEST_FOR_OPENSSL
#  include <CommonCrypto/CommonDigest.h>
#  define SHA1 CC_SHA1
#else
#  include <openssl/md5.h>
#endif

/* Returns a newly allocated 32-character hex string; the caller must free it. */
char *str2md5(const char *str, int length) {
    int n;
    MD5_CTX c;
    unsigned char digest[16];
    char *out = (char*)malloc(33);
    MD5_Init(&c);
    /* feed the input in chunks of at most 512 bytes */
    while (length > 0) {
        if (length > 512) {
            MD5_Update(&c, str, 512);
        } else {
            MD5_Update(&c, str, length);
        }
        length -= 512;
        str += 512;
    }
    MD5_Final(digest, &c);
    for (n = 0; n < 16; ++n) {
        snprintf(&(out[n*2]), 16*2, "%02x", (unsigned int)digest[n]);
    }
    return out;
}

int main(int argc, char **argv) {
    unsigned long c, elapsed;
    int N = 10000000;
    struct timeval start, stop;
    char *output;
    gettimeofday(&start, NULL);
    /* NOTE: every iteration allocates a new buffer and only the last one is freed */
    for (c = 0; c < N; c++) {
        output = str2md5("hello", strlen("hello"));
    }
    gettimeofday(&stop, NULL);
    free(output);
    elapsed = (stop.tv_sec - start.tv_sec)*1000000 + (stop.tv_usec - start.tv_usec);
    elapsed = (elapsed + 500) / 1000;
    printf("Calculating %lu md5 took %lu milliseconds.\n", c, elapsed);
    return 0;
}
```

It is compiled with the following command (you need the OpenSSL development libraries installed):

`gcc md5string.c -o md5string -lcrypto -lssl`

and gives a surprising result:

`Calculating 10000000 md5 took 16714 milliseconds.`

Big surprise: this one is actually slower than the Python version. How come? Two reasons: first, Python’s hashlib module is well known for being optimized; second, as said earlier, this code is far from good.

With some simple changes we can do this:

```c
#include <stdio.h>
#include <sys/time.h>
#include <string.h>
#include <openssl/md5.h>

int main() {
    unsigned char cmd5[MD5_DIGEST_LENGTH];
    MD5_CTX mdContext;
    char data[9];              /* room for up to 8 digits plus the terminator */
    struct timeval start, stop;
    unsigned long c, elapsed;
    int N = 10000000;
    gettimeofday(&start, NULL);
    for (c = 0; c < N; c++) {
        /* hash the decimal representation of the counter, no heap allocation */
        snprintf(data, sizeof data, "%lu", c);
        MD5_Init(&mdContext);
        MD5_Update(&mdContext, data, strlen(data));
        MD5_Final(cmd5, &mdContext);
    }
    gettimeofday(&stop, NULL);
    elapsed = (stop.tv_sec - start.tv_sec)*1000000 + (stop.tv_usec - start.tv_usec);
    elapsed = (elapsed + 500) / 1000;
    printf("Calculating %lu md5 took %lu milliseconds.\n", c, elapsed);

    return 0;
}
```

that gives us

`Calculating 10000000 md5 took 2557 milliseconds.`

### Conclusions

In conclusion, we have tried a few comparisons between JavaScript and C (and Python). Since my JavaScript is really not that good, I suppose there is code out there that could achieve better results. Meanwhile:

JavaScript MD5 hash: the best run took 40150 milliseconds for 10 million hashes, which is around 250k hashes per second.

Python MD5 hash: the best run took 12322 milliseconds for 10 million hashes, which is around 810k hashes per second.

C MD5 hash: the best run took 2557 milliseconds for 10 million hashes, which is around 3910k hashes per second.
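The per-second figures above are just the hash count divided by the elapsed time in seconds. As a quick check of that arithmetic, here is a small sketch; `hashesPerSecond` and `clientsNeeded` are names made up for this example, and the inputs are the timings measured in this post.

```javascript
// Throughput in hashes per second from a hash count and an elapsed time in ms.
function hashesPerSecond(count, elapsedMs) {
  return count / (elapsedMs / 1000);
}

// How many slower clients are needed to match one faster machine,
// given the time each one takes for the same amount of work.
function clientsNeeded(slowMs, fastMs) {
  return Math.ceil(slowMs / fastMs);
}

var N = 10000000;
var timings = { JavaScript: 40150, Python: 12322, C: 2557 }; // ms, from the runs above
for (var name in timings) {
  var kps = Math.round(hashesPerSecond(N, timings[name]) / 1000);
  console.log(name + ": about " + kps + "k hashes per second");
}
console.log("JavaScript clients to match one C process: " +
            clientsNeeded(timings.JavaScript, timings.C));
```

The printed values differ by a thousand or two from the rounded figures in the text, which is only a matter of rounding. The same `clientsNeeded` calculation can be applied to the sieve timings (8172 ms against 171 ms) to compare the two workloads.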

We can work around JavaScript’s speed bottleneck by running on many clients at the same time, and a web page can reach far more clients than a C application, even one built on a friendlier framework like BOINC. So are these good numbers? Or is the complexity of coordinating somewhere between 16 clients (for MD5) and 48 clients (for the sieve) to match the speed of one single computer still too high to make this a viable solution for distributed computing?

Time will tell us.