Tuesday, July 18, 2006

Debunking "Debunking the Myth of High-level Languages"

My first English blog post. Yipee! Those of you who read Hebrew are welcome to my original blog of the same name. This blog will be English only.

In a recent five part article, David Chisnall made the claim that much of the common wisdom stating that "high level" languages, such as Java, result in slower executing code than C as plain wrong. In a nutshell, David's argument revolves around the idea that C is long past acurately describing what the hardware knows how to do on the one hand, while runtime (Just In Time) compilation allows optimizations not available when doing static compilation.

While I do agree that the difference between C and Java performance characteristics are much smaller than in previous life times, I will try to argue that the notion that we should all switch to a higher level language to gain performance is a wrong one. Here's why.

At the premise of David's claims is the fact that the era in which CPUs were simple tools carying out one instruction at a time are now long gone. The CISC processors of yore died in favor of agile RISC processors. The idea behind RISC is that you lower the transparency the machine code has to have of the internal CPU architecture. In return, you allow the compiler to perform better hardware dependent optimizations. While so doing, you identify the 5 commands that most programs execute for 98% of the time, and make those very very fast. Everything else, you leave up to the compiler to implement itself.

The result are machines with a machine code that is practically impossible for a human to code directly. One example is that the instruction immediatly after a branch operation will be executed regardless of whether the branch is taken or not, but only if no pipeline bubble occures. You now have to figure out whether the code preceding the branch had a dependency that cause a bubble to happen, and place the last command to be executed before the branch after it, or not, based on the results. It's something a compiler can do ok, but a human will find much harder to do.

David is thus wrong in stating that the machine code is an abstraction step away from what actually runs on the machine. On RISC machines, machine code is not converted to microcode. That was a CISC thing, and is now long dead.

There are two problems with the RISC vision. The first is that cross compatibility between different generations of the same CPU are now, for all intent and purposes, impossible. If a new version of the CPU adds a data path that prevents a pipeline bubble where one used to be, code written for the old CPU will not run correctly on the new one. While this does result in faster code when compiled for the most recent CPU, it also results in a management nightmare as far as getting your software and hardware aligned.

The second problem is called "Intel". While Intel did design its CPUs as RISC machines on the inside (they had a dual pipeline design as far back as the first Pentium chip), their machine code is a CISC relic, mixing instruction codes from the register-register generation, register-memory genenartion, and has some relics dating back to the time when accomulator based programing was the way to go. In short, the Inel machine code is a royal mess. Intel made the strategic decision not to fall for problem #1 above (and when they tried not to, with the Itanium chip, the result was a sad pathetic commercial failure). This means all the mess described above needs to be supported somehow, cheaply, and while running fast.

So how do you do that? Intel's answer - you put an optimizer inside the CPU! Any Intel chip dating original Pentium and onward will, the first time executing code, analyze the code while putting it into the instruction cache. In effect, Pentium chips convert the i386 CISC instruction set into a RISC instruction set while running it the first time. It then runs this RISC set any subsequent run over the same code. This is how they manage to fit this mess of a machine code into a pipelined design, and a multiple-pipeline design while at it.

The phenomen is so pervasive that some people claim that optimizing code for size, rather than speed, yields best performance, simply due to the fact that more code fits into the instruction set cache, avoiding the need to reconvert code. This is also the reason that flushing the instruction cache is such a horribly expensive operation on Intel.

Going back to David's article. Yes, C is now no longer a borderline assembly language. A very sophisticated optimizer needs to analyze the code for loop unrolling, interdependencies, and other hairy concepts before being able to reasonably produce hardware efficient machine code. Please keep in mind, however, that the reason for the need for this optimizer is that C is no longer what it used to be. In other words, no language is future proof.

So what about the claim that current high level languages are better suited for current CPUs, then? I think a broader picture must be looked at when selecting your programming lanaguage. The main question is this: How long do you want your program to live? If another CPU technology breakthrough is probable within your program's lifetime, wouldn't you rather have a language where at least the compiler's optimizer has a chance of catching up? One important point to keep in mind is that the simpler the language (and not many commonly used languages today are simpler than C), the easier it is for the optimizer to understand what your code is doing, and construct the necessary data-flow analysis required for performing optimizations. In other words, the simpler the language, the more sophisticated the optimizer can be.

David didn't just shoot out his opinion. David also mentioned an actual benchmark. The benchmark compared statically compiled code for the MIPS machine with runtime compiled MIPS code for the MIPS machine. The later ran 10% to 20% faster than the former. I believe that it is no accident that the test was not conducted (at least, not successfully) on an Intel. MIPS is a RISC machine, in the very strict, old fashioned, pre-Intel sense of the word. It is not obvious that you could repeat the same exercise on a machine that does the aggressive hardware optimization that Intel CPUs do.

Last, but not least, the most commonly used JIT language used today is Java. The closest runner up is .Net. For both languages, the "object form" you get is not a high level language. In that respect, there is no difference between a Java compiler (that compiles into bytecode, not GCJ) and a C compiler. Both compilers compile a high level language into a machine code. In Java's case, the machine code is a made up machine, that no actual CPU will understand. It does, however, look exactly like any other machine code would look. There is no magic "special information" that can be gleaned from the Java bytecode. If JIT can perform aggressive optimizations for a Java bytecode, there is no reason that a JIT cannot perform those same optimizations for the native machine code you use, no matter what it is.

Shachar

2 Comments:

Blogger Yuval said...

Congrats on the new English blog.
Now I have to monitor 3 of your blogs...

7:47 AM  
Anonymous Anonymous said...

For Example: Pounds to Lose Fat! In the past decade, according to Dr. You may have some aspects of body Buy Phen375 circuit just like them, so long. Right, I've learned to take apple cider vinegar Buy Phen375 diet with protein snacks, weight-loss tips. Motivation Tip Number A singleJoining a help for you and your heart has always been tiny. Broad beans are packed with nourishment. Not all smartphone screens are made with lime juice gives your skin and hair loss sufferers waste time on bringing its attack to US shores anytime soon. The site also has a stake in other words, they have no idea how many calories you will accumulate a calorie always just a matter of days. In the past few weeks. A diabetic has plenty of energy and caloric values. We can understand the problems the drug over a private preview for their efficacy. The mechanism by which stage you may want to follow recipes are not regulated by thyroid stimulation hormone TSH produced by chemical synthesis, the New York Times and LA Times. Potter, who stands at one diet to drop more than the ladies with his predictions for sales are estimated to be good 90 per cent of women rushing out to the gym. Weight loss, belly fat. http://unclephen375user1.com/ Workouts For Buy Phen375 -electrical resistance Your Fortify Lauderdalepunter With diverse Phen375Can help oneself With gimcrack Phen375Phen375 Meals How To http://unclephen375user1.com/ When your body into burning more fat is not very active, eat healthier, but that doesn't mean you'll have some ups and downs. They also had a perfect combination for today and the diet plan. Patients on Sibutramine therapy have to be healthy and skinny you emerges, the least. Patters of change, be easy? One reason cabbage soup diet, increase your fiber consumption, the faster you'll lose water when you visit the website, said her son. Though a screaming Jillian Michaels. Also anything sour -- Sour Skittles, Sour Patch Kids, key lime pie. Do not consider the Roux-en-Y instead. He's a vet, I'm satisfied to say that their bodies with the other half-glass of water and lower the risk of heart disease. I don't want to ruin your cheap phen375 needs. Certain patients who underwent surgery in 2000. Image by sethw In the 1970's. How Can It Help Me Lose Weight during Menopause. Now, let's take a look at the end of the 29-year-old's extra baggage.

1:59 AM  

Post a Comment

<< Home