Monday, November 17, 2008
1) Visual C++ against Java - Windows Forms-based application
Implementing the exact same algorithm in a Windows Forms-based application, the Visual C++ code ran more than twice as slow as the Java version. I implemented the exact same user interface in both environments, but I guess that Visual Studio adds so much controlling code that performance becomes extremely poor. Java (120sec) x C++ (260sec)
2) Visual C++ against Java - Command Line
The command-line interface unleashes the real power of the C++ language. The game turns completely in favor of C++; the C++ code runs almost twice as fast as the equivalent Java implementation. Java (95sec) x C++ (52sec)
3) C++ against Java against FORTRAN (yes, I did learn FORTRAN :-) )
Here, there is another catch. I was not very surprised by my initial results at first. I translated my C++ code to FORTRAN and it ran ten (REALLY, 10) times slower than C++. Then I realized that there is an entire community that loves FORTRAN, and that this result could not possibly be right.
The catch I was telling you about is in the memory allocation for arrays. This algorithm manipulates very large arrays in order to reach the result. This is something I had never paid attention to before, but it made a huge difference.
Even though arrays are treated as multi-dimensional entities, we need to remember that computer memory is, in fact, one-dimensional. Arrays are allocated as a single string of memory slots. This means that when you access random positions of the array, there is a pointer moving back and forth along that string. If the position is near, the read is really fast, but if you have to read from a part that has been swapped to disk, it can get veeerrrryy slow.
This particularity was exactly what was causing the slowness in my code. The way FORTRAN lays out that memory string is different from C. It works like this:
In C++ the array is stored row by row (row-major): a[0,0] a[0,1] a[1,0] a[1,1]
In FORTRAN the array is stored column by column (column-major): a[0,0] a[1,0] a[0,1] a[1,1]
Since my program was iterating over the second dimension first, the FORTRAN version kept jumping back and forth along the string just to read the very first elements. After modifying my software to turn my rows into columns and vice-versa, the FORTRAN power kicked in.
This was my surprise. After this very simple change, the FORTRAN code started running even faster than the C++ code. Java (95sec) x C++ (52sec) x FORTRAN (50sec).
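The traversal-order effect described above is easy to reproduce. Here is a small Java sketch (my own illustration, not the course code) showing the two loop orders over the same matrix. Java, like C, stores each row contiguously, so sweeping the last index in the inner loop stays cache-friendly, while sweeping the first index in the inner loop jumps between rows on every step:

```java
public class LoopOrder {
    static final int N = 2000;

    // Row-major friendly: the inner loop walks contiguous memory.
    static double sumRowMajor(double[][] a) {
        double s = 0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += a[i][j];
        return s;
    }

    // Column-first: every inner step jumps to a different row (cache misses).
    static double sumColumnFirst(double[][] a) {
        double s = 0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += a[i][j];
        return s;
    }

    public static void main(String[] args) {
        double[][] a = new double[N][N];
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                a[i][j] = 1.0;
        long t0 = System.nanoTime();
        double s1 = sumRowMajor(a);
        long t1 = System.nanoTime();
        double s2 = sumColumnFirst(a);
        long t2 = System.nanoTime();
        System.out.println("row-major:    " + (t1 - t0) / 1e6 + " ms, sum=" + s1);
        System.out.println("column-first: " + (t2 - t1) / 1e6 + " ms, sum=" + s2);
    }
}
```

Both methods compute the same sum; only the access pattern differs, and on a large enough matrix the column-first version is noticeably slower. In FORTRAN the situation is mirrored: the *first* index is the contiguous one.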
As soon as I find somewhere to host the code I will make the several versions available for download.
Tuesday, November 4, 2008
This time I'm going a little slower on the posts because I have just gotten into one of those crazy phases at work. As usual, there are not enough hours in the day :-)
After implementing the initial algorithm in Java, I was asked to implement the same algorithm in C and FORTRAN. The objective of those implementations is to compare the speed of the algorithm across the different languages.
This algorithm is very CPU intensive. I have noticed that it does not benefit much from parallel processing because it is basically accumulating operations. I could not see any way to make it parallel because each cycle depends on values generated by the previous one.
At this point, my expectation is that Java and C will have about the same processing speed. I have run a few comparisons between these languages in the past (good old undergraduate days :-) ).
As one might know, Java compiles to bytecode that the JVM then compiles to native code at runtime (the JIT compiler). Based on previous experience, the only speed issue I have ever found with Java was when the Garbage Collector got triggered in the middle of some processing. After keeping the Garbage Collector from running during the processing and ensuring that enough memory would be available, the speed was pretty much the same as the C/C++ code.
I have never been a big FORTRAN fan, but it seems that it is still widely used by the academic community. I will implement (learn) this in FORTRAN and compare the results as well.
Anyway, the thing I found really interesting during the algorithm discussions was when we talked about other methods to solve the Radiative Transfer Equation. We got into a discussion about mathematical methods versus simulation methods. The mathematical methods try to solve the equation by applying techniques to the equation itself. The simulation methods use equations to create a simulated test-field through which simulated photons actually cross the simulated liquid.
I do not think I will have the time to implement this other set of methods during the course, but I do believe that it would be very nice to tell you a little bit about how it works.
In general, the simulation methods use computers to simulate the path of each photon across the liquid. They use equations to define a set of behaviors that represent the same situations the photon would find in the real liquid. Those behaviors are linked together as if they were a tunnel. The photon enters that tunnel, it is transformed by one equation, and the result (whatever remains of the photon) is used as input to the next equation, and so on.
For each equation in that test-tunnel, there are random events that might affect different photons in different ways. Each photon passing through the test-tunnel records a new system history.
To calculate the final result of the equation, a statistical analysis is executed to consolidate the thousands/millions of recorded histories. This way, the system reaches an acceptable level of accuracy for solving the equation.
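To make the "history" idea concrete, here is a toy 1-D photon walk in Java. This is my own illustrative sketch, not the method used in the course; the slab optical thickness TAU and the scattering probability ALBEDO are values I picked just for the demo. Each photon travels an exponentially distributed free path, and at each interaction it is either absorbed or re-scattered into a random direction; each call to history() records one outcome:

```java
import java.util.Random;

public class PhotonWalk {
    static final double TAU = 2.0;     // slab optical thickness (assumed value)
    static final double ALBEDO = 0.8;  // scattering probability (assumed value)

    // One photon history. Returns +1 if transmitted through the slab,
    // -1 if reflected back out, 0 if absorbed inside.
    static int history(Random rng) {
        double x = 0.0;   // position in optical depth
        double mu = 1.0;  // direction cosine; photon enters heading straight in
        while (true) {
            // exponential free path; 1 - nextDouble() avoids log(0)
            x += mu * -Math.log(1.0 - rng.nextDouble());
            if (x >= TAU) return 1;                   // left the far side
            if (x < 0) return -1;                     // came back out the top
            if (rng.nextDouble() > ALBEDO) return 0;  // absorbed
            mu = 2.0 * rng.nextDouble() - 1.0;        // isotropic re-scatter
        }
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int n = 200000, t = 0, r = 0, a = 0;
        for (int i = 0; i < n; i++) {
            int h = history(rng);
            if (h == 1) t++;
            else if (h == -1) r++;
            else a++;
        }
        // the statistical consolidation step: fractions over all histories
        System.out.printf("transmitted=%.3f reflected=%.3f absorbed=%.3f%n",
                (double) t / n, (double) r / n, (double) a / n);
    }
}
```

The final printf is exactly the consolidation step described above: individual histories are random, but the fractions stabilize as the number of simulated photons grows.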
I have always been a big fan of simulations. You do not need to guess much about which method I'd prefer :)
Just a side note for anyone interested in working in this field. I heard that people use pseudo-random algorithms to simulate the events that interfere with the system. As one may know, pseudo-random generators are not really random. They offer decent randomness, but they are not truly random. I wonder if it would make sense to try optimizing the histories generated by the system by running a pattern analysis on the stream of pseudo-random numbers.
I'm not even close to being an expert on the simulation methods, so I apologize in advance in case I'm saying something completely wrong. I mean, the pseudo-random numbers could easily be pre-generated by using the same generator seed. If the numbers were used as actual values to add/sub/mul/div in the equations, this would not work. However, if the pseudo-random numbers were used to make decisions, those decisions could be predicted by analyzing the pseudo-random stream. By doing this, lots of processing could be saved by knowing in advance, for example, at what stage the photon would be lost.
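The pre-generation point is easy to verify: two java.util.Random instances created with the same seed produce exactly the same stream, so every "random" decision of a run can be replayed (or analyzed) in advance. A quick check:

```java
import java.util.Random;

public class SeedDemo {
    public static void main(String[] args) {
        Random a = new Random(12345);
        Random b = new Random(12345);
        // Same seed, same algorithm: the two streams are identical.
        for (int i = 0; i < 10; i++) {
            double x = a.nextDouble();
            double y = b.nextDouble();
            System.out.println(x + " == " + y + " ? " + (x == y));
        }
    }
}
```

This determinism is exactly what would make the pattern-analysis idea above conceivable: the whole decision stream is fixed once the seed is chosen.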
As my professor said, understanding these methods would require a PhD of its own. Based on the little I know about them, I believe that makes sense. That is where I'd focus my research if I were to engage in this subject.
In the next post, I hope to have more news about the comparison of the algorithm implementations in the several languages.
Friday, October 10, 2008
Method Sn - Discrete Ordinates with Finite Differences for solving the Radiative Transfer Equation
There are a few methods that can be used to solve RTEs. The Sn Method is supposed to provide a fair balance between computational requirements and accuracy.
From the mathematical perspective, I do not feel confident enough to provide a very detailed explanation. In this case, I will tell you what I know from the computational perspective.
This method uses two special math operations to solve the equation. As one may see in the original equation, there are two components that are particularly hard to implement on a computer: a derivative and an integral.
The derivative is replaced by a finite difference; calculating it for each of the regions gives us the derivative in each region.
The integral is much more complicated. For the integral, we use a numerical quadrature: we replace the single equation by a set of coupled differential equations. In this case, the Gaussian Quadrature is used.
The reason why we have several equations instead of just one is that the original equation accumulated the intensity of all photons coming from all angles into a single value. With the quadrature approach, there is a weight representing each angle. Each new equation represents the photon intensity for one evaluated angle. It is possible to use quadratures of lower or higher degree: the higher the degree of the quadrature, the higher the accuracy. The degree of the quadrature is the number of different angles being evaluated for the equation.
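The core trick of a quadrature (independently of the RTE) is replacing an integral by a weighted sum over a few fixed points. Here is a minimal Java illustration of my own, using the standard 2-point Gauss-Legendre rule, whose nodes are at plus/minus 1/sqrt(3) with unit weights; it integrates any polynomial up to degree 3 over [-1, 1] exactly:

```java
import java.util.function.DoubleUnaryOperator;

public class GaussQuad {
    // Standard 2-point Gauss-Legendre nodes and weights on [-1, 1].
    static final double[] NODES = { -1.0 / Math.sqrt(3.0), 1.0 / Math.sqrt(3.0) };
    static final double[] WEIGHTS = { 1.0, 1.0 };

    // The "integral" becomes a weighted sum of f evaluated at the nodes.
    static double integrate(DoubleUnaryOperator f) {
        double sum = 0.0;
        for (int i = 0; i < NODES.length; i++)
            sum += WEIGHTS[i] * f.applyAsDouble(NODES[i]);
        return sum;
    }

    public static void main(String[] args) {
        // Integral of x^2 over [-1, 1] is 2/3; the rule reproduces it.
        System.out.println(integrate(x -> x * x));
        // Integral of x^3 over [-1, 1] is 0 (odd function).
        System.out.println(integrate(x -> x * x * x));
    }
}
```

In the Sn method the same idea is applied to the angular integral: each node is an evaluated angle, each weight is that angle's contribution, and using more nodes (a higher-degree rule) buys more accuracy.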
After simplifying the general equation by removing the derivative and the integral components, one will get to these two versions of the same equation:
After getting to these equations, the general algorithm for solving this problem is:
----- Variables -------
precision <- Desired precision
I <- holds the RTE results for each angle at each interface - N (angles) lines by (# interfaces) columns
H <- previous iteration of I
---- Initialization -------
set first column of I with the external source
set last column of I with the Internal Source
set second column of I with 0.5 (*)
------ Body ----------
do
    Copy all values from I to H
    for ( each layer interface )
        for ( each angle on gaussian quadrature table )
            if ( angle cosine > 0 )
                calculate I for interface [k] from interface [k-1] - equation (1)
            else
                calculate I for interface [k] from interface [k+1] - equation (2)
while ( |I - H| > precision )
At this point, even if you do not have a clue about the math but can understand algorithms, you should be asking yourself: "OK! In equation (2), I'm calculating the value for interface [k] based on interface [k+1]. How is that possible if interface [k+1] has never been calculated?"
This is a fair question. That is why we have to get back to the initialization of the second column (*) with 0.5 values. This method runs several iterations in order to reach the desired results. For the very first iteration, calculating the very first column, you are right: there is no value in column number 2 yet. We solve this by guessing a number for the second position, any value in the interval ]0, 1] (any number between 0 and 1, except 0). This way, the algorithm is able to run and populate the entire matrix. After this initial guess, the matrix is entirely populated and the next iteration uses the values from the previous one to keep going.
The stop condition is reached when the difference between the values of the current iteration and those of the previous iteration falls within the desired precision.
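The skeleton of that stop condition looks like this in Java. This is just my sketch of the iteration scheme, not the actual solver: the "sweep" inside the loop is a placeholder (it relaxes one value toward the average of its neighbors), where the real algorithm would apply equations (1) and (2) for every interface and angle:

```java
public class SourceIteration {
    // Largest absolute difference between two matrices of the same shape.
    static double maxDiff(double[][] a, double[][] b) {
        double d = 0.0;
        for (int r = 0; r < a.length; r++)
            for (int c = 0; c < a[r].length; c++)
                d = Math.max(d, Math.abs(a[r][c] - b[r][c]));
        return d;
    }

    // Repeat the sweep until two successive iterations of I agree to
    // within the desired precision.
    static double[][] iterate(double[][] I, double precision) {
        double[][] H = new double[I.length][I[0].length];
        do {
            for (int r = 0; r < I.length; r++)       // copy I into H
                System.arraycopy(I[r], 0, H[r], 0, I[r].length);
            // placeholder "sweep" (the real solver updates every interface
            // and angle here using equations (1) and (2))
            I[0][1] = 0.5 * (I[0][0] + I[0][2]);
        } while (maxDiff(I, H) > precision);
        return I;
    }

    public static void main(String[] args) {
        // columns: external source, initial guess 0.5, internal source
        double[][] I = { { 1.0, 0.5, 0.0 } };
        iterate(I, 1e-9);
        System.out.println("converged, I[0][1] = " + I[0][1]);
    }
}
```

The important structural point is the do-while: the previous matrix H is kept only to measure how much the current sweep changed things, which is exactly the role it plays in the pseudocode above.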
At this point, I'm comfortable with the algorithm itself. The convergence to a single result after many iterations is the part that is still bugging me. In the next class, I will spend a lot of time with the professor trying to understand this behavior, because to me it still looks like magic. I have not yet found a good reason why the values converge to the result we seek.
I will post again when I have more information.
Thursday, September 25, 2008
As I said before, I have just started taking PhD classes at INPE. I'm the only student in this cycle (3rd/2008) attending the Computational Optics course. I have decided to go for Applied Computing as the area for my PhD. I was a little tired of doing research only in Informatics. I'm glad to say that I found an area that I know absolutely nothing about :-).
In these classes, we will study a single equation. Yeap, I also thought it would be easy when the professor mentioned it. The trick is that the equation looks like this:
The Radiative Transfer Equation. It is going to be my baby for the next three months. I'm not going to even try explaining it to you. However, its purpose is really nice.
In general, this equation allows scientists to evaluate the interactions of a flow of particles as they move through space and pass from one medium to another. In my particular case, I'm learning how to understand the interactions that take place when a flow of photons (light) hits the ocean.
Behind the algorithm, there is a very noble cause. By understanding the amount of light that reaches the ocean and the amount of light that returns (is scattered back) from the ocean, scientists can estimate the characteristics of the water in that particular area. For example, it would be possible to evaluate the amount of chlorophyll dissolved in the water, which would allow additional calculations to identify how much life (fish, etc.) is present/supported in that area. This would be a nice feature for Google Earth, wouldn't it? Checking out the fish populations on the move from one spot of the Earth to another, how cool would that be?
The "real" scientists, not students like me, use the full equation. However, I'm glad to say that we have defined several simplifications that will allow even an IT guy like me to understand (or at least try to understand) how to solve the problem. Here are some of the assumed simplifications:
- The flow of photons does not vary in time. It is the same as assuming that the Sun always has the same intensity of light, no matter the time of day.
- We are not evaluating a particular spot of the ocean. We assume an infinite geometry: all photons emitted by the light source will always reach our ocean, no matter where.
- Our light source is isotropic, meaning that photons are emitted equally in all directions. There is no preference (different orientation) for any single spot.
- The photons reach a single surface. In this case, the equation does not consider the flow of photons going through clouds before they reach the ocean, for example.
- There is also a set of frequencies (light colors) being studied. In our case, we consider all frequencies of the visible spectrum (red to violet) as a single band. Using the average values for the entire band also makes the equation simpler. We trade accuracy for simplicity on this one.
- We also consider that there is no internal source of light. In other words, we assume that no luminous fish were swimming in our ocean during the satellite scanning.
There are a few other simplifications that I cannot really remember now. They make the equation "much" simpler than its base form. Here is its simplified version:
Well, after the initial 30 seconds of cursing yourself because you have no clue what this means, I should tell you that there are a few methods that have already been tested and can solve this equation. Here are the ones we will implement as/if time permits:
the Sn Method and the Monte Carlo Method.
At this point, I'm still working on understanding the problem. I will post something about its resolution as soon as I get there.
-Luciano - Don't worry, I'm completely freaked out too :-)
Monday, September 8, 2008
Last week, most of my work was related to getting my lab infrastructure ready for new projects. I had a few performance problems with the free VMware Server and was looking for other options.
I was deploying an Interactive Voice Response unit on top of a VMware Server. My server had dual Xeon processors with 8 GB RAM and about 500 GB of disk. The system used to work just fine with several CTI/CRM suites. However, it seems that my previous real-time applications were not as "real-time" as the IVR is.
It was really annoying to hear the IVR scripts playing that slowly. I could hear the prompts when I dialed in with the phone, but it was like a kid saying the words rrreeeaaaallllyyyy sslllloooowwww. This is very likely due to the problem I mentioned in a previous post: even with 8 processors available, only one of them was being used by each of my Virtual Machines.
The good thing is that I was checking out a candidate list of supported platforms for the next releases of our products and saw that VMware ESXi is one of the options. Even better, VMware has just released a free (limited) version of this software. I was able to download it and start playing with it. Here are some findings from the tests executed last week with VMware ESXi 3.5:
Lab Environment: IBM x3650, 16 GB RAM, 600 GB HD.
Installation: The ESXi software is distributed as a bootable CD. It is a sort of Linux that installs on top of the bare hardware. I was not sure if it would work with my equipment since it was brand new to me; however, I had absolutely no problem making it work. I barely had to enter any information during the installation process. Even things like IP addresses had to be configured after the entire system was already deployed. The default installation was enough to bring the entire system up.
Initial Configuration: The very first thing to do with ESXi is to get to its web-based interface. The system is initially configured with DHCP. From the console, you should be able to see the IP address and change it to whatever fits your network. With the new IP defined (or even the old one if you'd rather use your DHCP), just point your web browser at it. From there you will be able to download the "VMware Infrastructure Client". Just install this on your client machine and you will be ready to go.
The GUI is very nice. It lets you take a look at all your system resources. I have not tried anything advanced yet. It is useful for checking all the system components and - of course - playing with your virtual machines.
Importing Virtual Machines: This is not as easy as it sounds, and it is an area where I think ESXi still needs lots of improvement. VMware has released a VMware Converter. The concept is great, but I could not make it work for Linux guests. Converter provides a wizard that asks where the source machine is and where it is supposed to be transferred to. I was very impressed initially because it even offered to convert directly from my VMware Server format to the ESXi format. I just selected the source, entered the IP address and credentials of my new ESXi server, and the magic started. To be fair, I have to say that my Nuance Server, which was running on a Windows box, got converted and imported to ESXi perfectly. It was really like magic; never did anything easier. The nightmare started when I tried transferring my RedHat 5 system from VMware Server to ESXi. I wasted a few hours but nothing worked. Lucky me, it was fairly quick to create a new ESXi-based virtual machine and install RedHat directly into it. Bottom line: Converter needs improvements for converting Linux boxes. I also tried copying the virtual HDs (vmdk) from one side to the other, but had no luck making those virtual machines work either.
Storage Devices: ESXi works with its own file system (vmfs3). Based on conversations with some friends, it seems that managing the file systems on previous releases was really painful. On the 3.5 release, I have to say I see no problems at all. The GUI really does a great job of creating folders, moving virtual machines from one place to another, etc. The main issue (or not, depending on your expectations) I found was that ESXi does not accept/recognize any kind of USB storage. I was hoping I could plug in an additional external HD for my virtual machines, but it did not work. The way ESXi manages disks is via the GUI: you have to add a new "Storage" device, which can be an iSCSI/SCSI HD or a Network File System. This may be a little problematic if you are not used to managing UNIX servers.
In my case, I used this as an excuse to install my first Windows 2008 Server. Since I had to find some way to expose my Windows shared folder as NFS, I decided to install Windows 2008 in order to use the "Services for UNIX" package. This package is available for download from Microsoft for Win2003/2008 Server. I had Win2003 Server 64-bit and the package is not available for that platform. Since I'd have to go back to a 32-bit platform anyway, I decided to go to Win2008 to have some fun.
I installed the Windows 2008 Data Center - Without Hyper-V edition. It installs just fine (it is Windows, right? :-)). The standard GUI looks a lot like Windows Vista, but without all the cool colors. In this new version of Windows, you will notice that not all packages and services are installed right away. The main change I saw was that there are tens of optional items to install after the initial deployment. For the "Services for UNIX", you just have to add this "Function" from the File Server tree. In order to have the UNIX IDs synchronized with the Windows users, you will need to have Active Directory installed as well. If you did everything on the Windows side correctly, all you have to do is create a Windows NFS share and create a Storage Device on the ESXi side.
From that point on, the ESXi server can access whatever files you have on your NFS share. This is particularly important if you want to access a common repository of CD/DVD images. For CD/DVD images, ESXi-based virtual machines can only see files on a "Client Device" (**) (DVD-ROM on your client computer), a Host Device (DVD-ROM on your host computer), or a Data Store ISO Image (CD/DVD images inside a valid Storage Device).
(**) The "Client Device" mode is supposed to allow you to select a local ISO image as well. However, it did not work for my client running on a Vista environment, connecting to an ISO image on a shared Windows folder. If someone knows how to make it work, I'd be more than happy to hear it.
Final Comments: I'm not sure if this is important for most of you, but it is for sure important to me. ESXi 3.5 cannot use IDE hard drives. Because I use a few simulators that run on top of old Linux distributions, this prevented me from migrating all my servers to ESXi. Unfortunately, my PBX simulator could not be installed on ESXi because it does not have SCSI drivers. I still have to keep my VMware Server running for these old virtual machines.
I think this is all I can recall from this week. I will post additional information in case I find anything new.
Wednesday, August 27, 2008
Here is some info I put together about a major event going on in the Brazilian Telecom market.
The Brazilian Government has received thousands of complaints from customers over the last several years against Customer Service Centers in several different service areas. It has consolidated the most common complaints and put together a set of rules that companies will have to comply with in order to avoid penalties.
On July 31st, 2008, the Brazilian Government officially published a new set of rules on how Customer Service Centers (Contact Centers) must deal with their customers. The new rules are supposed to make customer interactions a lot more effective. Things like "All Customer Service representatives must be enabled to immediately cancel any service as per the customer request", "the customer cannot be transferred more than once after he/she reaches the first live agent", "the customer must be offered the option of talking to a live agent in the very first interaction with an Automatic Response System", "All Customer Service Centers must be enabled with TDD (Telecommunications Device for the Deaf)", "All interaction costs must be charged to the Enterprise and not to the customer", etc.
From the customer perspective I'm really happy - I'm one of those thousands of guys who have complained about the endless waiting time in my cable company's phone queues while being brain-washed by advertisements!
Unfortunately, I could not find any English version of the regulations. The original document can be found at "Decreto No 6523" from the Republic Presidency website.
Here is a quick summary with my own free (non-official, of course) translation:
Art 1 and 2 – These rules are intended to regulate all Contact Center activities regarding customer assistance. The rules defined here do not apply to active telemarketing.
Art 3 – All costs must be charged to the CC.
Art 4 – The IVR must always offer, in the very first script, the options of transferring to a live agent, filing complaints, and Service Cancelation
- Live assistance must be available from every IVR script
- The CC is never allowed to hang up on a customer
- No identification is mandatory before transferring to a live agent
- Max wait time in queues is still to be defined
Art 5 – CC must be open for business 24x7
Art 6 – Hearing impairing access is mandatory for all CCs
Art 7 – CC phone number must be readily available in all Enterprise documents
Art 8 - CC must be polite / honest / etc
Art 9 – Agents must have proper training to do their job
Art 10 – Customers can be transferred only once, in case the first agent cannot complete their request.
- Call transfers must be completed in up to 60 seconds
- For service cancelation and complaints, no transfer is allowed. All agents must be able to handle such requests.
- CTI apps must allow access to customer request history
Art 11 – Customer data must be kept private
Art 12 – It is not allowed to ask for the customer's needs again after they were stated in the first conversation with the live agent.
Art 13 – CTI apps must be in place to make sure the service quality matches the customer needs
Art 14 – It is not allowed to play marketing messages during call wait unless it is previously accepted by the customer.
Art 15 – A protocol number must be generated for the customer for each request, and that number will be used by the customer in follow-up contacts
- All contacts must be recorded and kept for at least 90 days. The customer may request access to a recording based on its protocol number.
- The recording must be kept available for verification purposes for at least two years
Art 16 – Customers have the right to receive, within 72 hours, by regular mail or electronic media, the entire history of their request.
Art 17 – Information requests must be answered immediately. Issues must be fixed within 5 days counting from the day they are registered with the CC.
Art 18 – CC must be able to promptly answer any Cancelation request made by the customer.
Art 19 – There will be punishments in case these rules are not followed.
Art 20 – The regulatory agencies will provide clarification in case any of these items is not clear enough.
Art 21 – The customer rights defined here do not change the other customer rights already defined by other laws and regulations.
Art 22 – The current set of rules is public and will be effective on December 1st 2008.
Tuesday, August 26, 2008
Virtualization is the capability of running multiple virtual computers inside the same "real" computer. I used to think of this as BIOS emulation. However, the technology has evolved so much that it now emulates the BIOS, video drivers, sound card, NIC, etc. It makes my life a lot easier and my lab a lot smaller :-).
These days, I have done some quick testing with VirtualBox. It seems a very interesting option for the future. It works on Windows Vista and several other platforms. Unfortunately, it still misses some of the functionality found in commercial software such as VMware.
I was very impressed with VirtualBox's performance. Even though each virtual PC supports only a single processor, the response time is really fast. I particularly enjoy the fact that it can load the exact same HD image files as VMware (which makes the transition very smooth). What I did not like was that each virtual network card in a guest operating system gets mapped to a virtual network card on the host PC. In my case, it is not unusual for me to run 4~6 machines on the same physical box, out of 10~15 machines created, and each server requires two network cards. With simple math, we can see that VirtualBox would require at least 30 virtual network cards installed on my host server (a little too much for me, since VMware requires only three, no matter how many virtual machines you have created). Of course, one might say that I could reuse some of the virtual NICs for the machines that are not running, but I really do not want to keep changing my virtual machine settings every time I need to start one. I also require multiple processors; this is another feature I cannot afford to lose.
With VMware Free server, I do have pretty much everything I need. Multiple processors (only 2 in fact), fairly good processing power, etc.
There is one thing I really need to mention regarding virtualization, and it is the thing that bothers me the most. It seems to be true for all (or at least all the free) virtualization platforms: no matter how many processors you have in your system, each virtual computer only uses a single processor at a time. I mean, if you have two quad-core processors, your host operating system will identify eight processors total. The virtual machine runs as a single operating system process, which means that this process keeps switching from processor to processor on your host PC. Sometimes it will occupy 100% of its processor while the other 7 processors do nearly nothing. It is true that if you have several machines running at the same time, you see better overall CPU utilization: with two or three machines, you will see two or three processors running close to 100%. Of course, the other 5 or 6 processors will keep doing nothing. I do not think this is an easy issue to address, because all virtualization platforms seem to have the same limitation. However, I think that the one which finds a way to really deliver the full host potential to the guest PCs will become the main choice for all businesses on the market (at least, it would become the favorite in my lab).
At this time, I will keep working with VMWare Free Server. However, I really encourage the Virtual Box developers to keep up with their great work because I really think that they are on the right path.
Monday, August 25, 2008
You will realize that I run several activities at the same time - maybe that is why I cannot keep track of everything by heart. By now, I think it is fair to mention the subjects I'm working on right now, so I can provide some more context about whatever findings I decide to record here.
Project #1 - At this point I'm working in the Contact Center area. We are developing a Business Intelligence type of tool about which I obviously cannot provide many details, because it is part of the company's intellectual property. Since I cannot really provide in-depth details about the product itself, you can expect to hear (or read :-) ) a lot about software virtualization, high-performance tricks, the Linux operating system and, maybe, some database tricks.
Project #2 - Contact Center self-service - This is a particular cool one. I'm new into this as well, just started about one month ago and I love the possibilities. Since we are using several open standards, I'd expect to write a lot about Java Language, Voice XML, Call Control XML, Text to Speech, Voice Recognition, etc. Lots of fun to come from this end.
Project #3 - Space Science - I'm not even sure if I should be mentioning this here. Honestly, I have not even started learning anything about it. Next week, on Sep 1st, I'm planning on going to INPE (National Institute for Space Research in Brazil) to start taking classes for my PhD. However, I do not have the time to attend their regular PhD program. My plan is to attend as many classes as possible as an "external" student (as we call the guys who attend a few classes without being officially part of the program), and when they finally kick me out I will join the official program and try to use the credits I got as an "external" student so I will not be required to attend all the classes. Well, even though I have already gotten some not-so-friendly messages from there about doing this, I'm not sure how far I'm going to get. I'll keep you posted :)
Well, other than these three main items, I'm always looking for an excuse to do something different. Sometimes you will read things about my patents, the jokes from the sales-enablement activities, etc.
Hopefully, I will not give up on this blog any time soon, and even more hopefully, I will find interesting things to post.