Monday, November 17, 2008

Fortran vs C++ vs Java

Here are the latest results from the labs. I had a few surprises myself. I am not trying to make fun (although I already am), but Visual C++ was a shame. It is amazing how many different scenarios you can get out of this kind of lab testing. Here is what I did:

1) Visual C++ against Java - Windows/Form-based application
Implementing the exact same algorithm in a Windows-based application, the Visual C++ code was more than twice as slow as Java. I implemented the exact same user interface in both environments, but I guess that Visual Studio adds so much controlling code that the performance becomes extremely poor. Java (120 sec) vs. C++ (260 sec)

2) Visual C++ against Java - Command Line
The command line user interface unleashes the real power of the C++ language. The game turns completely in favor of C++: the C++ code becomes almost twice as fast as the equivalent Java implementation. Java (95 sec) vs. C++ (52 sec)

3) C++ against Java against FORTRAN (yes, I did learn FORTRAN :-) )
Here there is another catch. I was not very surprised with my initial results. I translated my C++ code to FORTRAN and it was running ten (REALLY 10) times slower than C++. Then I realized that there is an entire community that loves FORTRAN, and that this result could never make sense.

The catch I was telling you about is in the memory allocation for arrays. This algorithm manipulates very large arrays in order to reach the result. This is something that I had never paid attention to before, but it made a huge difference.

Even though arrays are considered multi-dimensional entities, we need to remember that computer memory is, in fact, one-dimensional. Arrays are allocated as a single string of memory slots. This means that when you access scattered positions in the array, there is a pointer moving back and forth along that string. If the next position is nearby, the read is really fast, but if you have to read from a part that has been swapped to disk, it can get veeerrrryy slow.

This peculiarity was exactly what was causing the slowness in my code. The way FORTRAN lays out that memory string is different from the way C++ does it. It looks like this:

In C++ the array is stored row by row (the second index changes fastest): a[0,0] a[0,1] a[1,0] a[1,1]
In FORTRAN the array is stored column by column (the first index changes fastest): a[0,0] a[1,0] a[0,1] a[1,1]
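
To make the two layouts concrete, here is a minimal C++ sketch. It is not the lab code: the function names (rowMajorOffset, columnMajorOffset, memoryOrder) are made up for illustration, and it keeps the zero-based a[i,j] notation used above. It only computes where each element would sit in the one-dimensional memory string under each convention.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Slot of element (row, col) when a rows x cols array is stored row by row,
// which is the C/C++ convention.
std::size_t rowMajorOffset(std::size_t row, std::size_t col,
                           std::size_t rows, std::size_t cols) {
    assert(row < rows && col < cols);
    return row * cols + col;
}

// Slot of the same element when the array is stored column by column,
// which is the FORTRAN convention.
std::size_t columnMajorOffset(std::size_t row, std::size_t col,
                              std::size_t rows, std::size_t cols) {
    assert(row < rows && col < cols);
    return col * rows + row;
}

// Hypothetical helper: lists the elements in the order they sit in memory.
std::string memoryOrder(std::size_t rows, std::size_t cols, bool columnMajor) {
    std::string out;
    for (std::size_t slot = 0; slot < rows * cols; ++slot) {
        // Invert the offset formula to find which element lives in this slot.
        std::size_t row = columnMajor ? slot % rows : slot / cols;
        std::size_t col = columnMajor ? slot / rows : slot % cols;
        if (!out.empty()) out += ' ';
        out += "a[" + std::to_string(row) + "," + std::to_string(col) + "]";
    }
    return out;
}
```

Calling memoryOrder(2, 2, false) gives the C++ sequence and memoryOrder(2, 2, true) gives the FORTRAN one. Moving along a row costs one slot per step in the first layout and a whole column's worth of slots in the second.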

Since my translated program still walked the array in the order that suits the C++ layout, in FORTRAN it was making the system go back and forth along the string just to read neighboring elements. After modifying my program to turn my rows into columns and vice versa, the power of FORTRAN finally showed up.
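
The effect of loop order can be sketched in C++ as well. This is a hypothetical stand-in for the real program, assuming a rows x cols array kept in a flat std::vector and stored row by row; the function names are invented for this example. Both functions return the same sum, and only the order in which they touch memory differs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Visits the array in the same order it is stored (row by row), so each
// read is one slot after the previous one.
double sumRowsOuter(const std::vector<double>& a,
                    std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    double total = 0.0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += a[r * cols + c];
    return total;
}

// Same arithmetic with the loops swapped: consecutive reads are now
// `cols` slots apart, so the walk jumps back and forth across the array.
double sumColumnsOuter(const std::vector<double>& a,
                       std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    double total = 0.0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += a[r * cols + c];
    return total;
}
```

For a column-by-column layout like FORTRAN's the roles flip, and the loop over the first index is the one that belongs on the inside. How large the gap turns out to be depends on the array size and the machine, so no timings are claimed here.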

This was my surprise. After this very simple change, the FORTRAN code ran even faster than the C++ code. Java (95 sec), C++ (52 sec), FORTRAN (50 sec).

As soon as I find somewhere to host the code, I will make the different versions available for download.

-Luciano
