2007-06-18 19:50:52
Clock test: G4 vs Xeon+Rosetta vs Xeon
Interesting benchmarks on a single-threaded loop test in C:
#include <stdio.h>
#include <time.h>
#include <math.h>
#define N_LOOP 1000000

int main() {
    int i;
    double a = 11234567890123456.0, b;
    clock_t time_1, time_2;

    /* With N_LOOP = 1000000, the total time in seconds is numerically
       equal to the time per iteration in microseconds. */
    time_1 = clock();
    for (i=0; i<N_LOOP; i++) b = a * a * a * a;
    time_2 = clock();
    printf( "CPU time needed to evaluate a*a*a*a: %f microsecs\n",
            (double) (time_2 - time_1) / (double) CLOCKS_PER_SEC);

    time_1 = clock();
    for (i=0; i<N_LOOP; i++) b = pow( a, 4. );
    time_2 = clock();
    printf( "CPU time needed to evaluate pow(a, 4.): %f microsecs\n",
            (double) (time_2 - time_1) / (double) CLOCKS_PER_SEC);

    return 0;
}
Here are some benchmarks from running this on a 1.5GHz PowerBook G4, then running the binary built on the G4 under Rosetta on a 3GHz Xeon Mac Pro, then recompiling the code on the Mac Pro and running it again:
| Test      | 1.5GHz G4 | 3GHz Xeon / Rosetta | 3GHz Xeon native   |
|-----------|-----------|---------------------|--------------------|
| a*a*a*a   | 0.02μs    | 0.01μs (2x)         | 0.002341μs (8.54x) |
| pow(a,4.) | 0.29μs    | 0.08μs (3.6x)       | 0.065574μs (4.42x) |
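The speed-ups in parentheses are just the G4 time divided by the other machine's time, and the per-iteration figures fall out of the loop count: with one million iterations, total seconds and microseconds per iteration are the same number. Here is a minimal sketch of that arithmetic. The helper names `per_iteration_us` and `speedup` are made up for illustration and are not part of the original program.

```c
#include <time.h>

/* Hypothetical helper: convert a clock() tick difference into
   microseconds per loop iteration. */
double per_iteration_us(clock_t ticks, long n_loop)
{
    double total_seconds = (double) ticks / (double) CLOCKS_PER_SEC;
    return total_seconds * 1e6 / (double) n_loop;
}

/* Hypothetical helper: how many times faster "other" is than "baseline". */
double speedup(double baseline_us, double other_us)
{
    return baseline_us / other_us;
}
```

For example, `speedup(0.29, 0.065574)` reproduces the 4.42x in the table, and `speedup(0.02, 0.002341)` the 8.54x.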
So Rosetta gives a speed-up in proportion to the clock speed (2x) for the a*a*a*a loop, and a much nicer 3.6x boost for the power function. Recompiling nudges the power function up a hair to 4.42x, but the a*a*a*a loop jumps to more than 8x the original speed. Interesting stuff. No compiler optimizations were used: just gcc with no flags set.
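Since the numbers above come from unoptimized builds, anyone repeating the test with optimization turned on should be aware that the compiler is free to drop a loop whose result is never used, or to compute a*a*a*a once at compile time. The sketch below is one possible way to guard against that, not what the original test did: it routes the input and the result through `volatile` variables. The names `input`, `sink`, `time_multiply`, `time_pow` and the `BENCH_MAIN` macro are all assumptions made for this example.

```c
#include <math.h>
#include <stdio.h>
#include <time.h>

#define N_LOOP 1000000

/* volatile forces a real load of the input and a real store of the
   result on every iteration, so the loops survive optimization. */
static volatile double input = 11234567890123456.0;
static volatile double sink;

/* Time N_LOOP evaluations of a*a*a*a; returns elapsed CPU seconds. */
double time_multiply(void)
{
    int i;
    clock_t start = clock();
    for (i = 0; i < N_LOOP; i++) {
        double a = input;
        sink = a * a * a * a;
    }
    return (double) (clock() - start) / (double) CLOCKS_PER_SEC;
}

/* The pow() variant and the driver need libm, so they are compiled only
   on request:  cc -O2 -DBENCH_MAIN bench.c -lm  */
#ifdef BENCH_MAIN
/* Time N_LOOP evaluations of pow(a, 4.); returns elapsed CPU seconds. */
double time_pow(void)
{
    int i;
    clock_t start = clock();
    for (i = 0; i < N_LOOP; i++) {
        double a = input;
        sink = pow(a, 4.);
    }
    return (double) (clock() - start) / (double) CLOCKS_PER_SEC;
}

int main(void)
{
    /* Seconds for 1000000 iterations == microseconds per iteration. */
    printf("a*a*a*a:    %f microsecs\n", time_multiply());
    printf("pow(a, 4.): %f microsecs\n", time_pow());
    return 0;
}
#endif
```

The volatile accesses add a little memory traffic of their own, so timings from this version are not directly comparable with the figures in the table.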