2007-06-18 19:50:52 Clock test: G4 vs Xeon+Rosetta vs Xeon
Interesting benchmarks on a single-threaded loop test in C:

```c
#include <stdio.h>
#include <time.h>
#include <math.h>

#define N_LOOP 1000000

int main() {
    int i;
    double a = 11234567890123456.0, b;
    clock_t time_1, time_2;

    time_1 = clock();
    for (i = 0; i < N_LOOP; i++) b = a * a * a * a;
    time_2 = clock();
    /* total seconds over 10^6 iterations = microseconds per iteration */
    printf( "CPU time needed to evaluate a*a*a*a:    %f microsecs\n",
        (double) (time_2 - time_1) / (double) CLOCKS_PER_SEC);

    time_1 = clock();
    for (i = 0; i < N_LOOP; i++) b = pow( a, 4. );
    time_2 = clock();
    printf( "CPU time needed to evaluate pow(a, 4.): %f microsecs\n",
        (double) (time_2 - time_1) / (double) CLOCKS_PER_SEC);

    return 0;
}
```

Here are some benchmarks running this on a 1.5GHz PowerBook G4, then running the binary made on the G4 on a 3GHz Xeon MacPro, then recompiling the code on the MacPro and running again:

|            | 1.5GHz G4 | 3GHz Xeon / Rosetta | 3GHz Xeon native   |
|------------|-----------|---------------------|--------------------|
| a*a*a*a    | 0.02μs    | 0.01μs (2x)         | 0.002341μs (8.54x) |
| pow(a,4.)  | 0.29μs    | 0.08μs (3.6x)       | 0.065574μs (4.42x) |

So, for the a*a*a*a loop, Rosetta gives a speed-up simply in proportion to the clock speed (2x for twice the GHz), and a much nicer 3.6x boost for the power function. Recompiling natively nudges the power function up a hair to 4.42x, but the a*a*a*a loop jumps to more than 8x the original speed. Interesting stuff. No compiler optimizations were used: just gcc with no flags set.