2007-06-18 19:50:52 Clock test: G4 vs Xeon+Rosetta vs Xeon
Interesting benchmarks on a single-threaded loop test in C:
    #include <stdio.h>
    #include <time.h>
    #include <math.h>
    #define NLOOP 1000000

    int main() {
        int i;
        double a = 11234567890123456.0, b;
        clock_t time1, time2;

        /* Total seconds for NLOOP = 10^6 passes is numerically the same
           as microseconds per pass, hence the "microsecs" label. */
        time1 = clock();
        for (i=0; i<NLOOP; i++) b = a * a * a * a;
        time2 = clock();
        printf( "CPU time needed to evaluate a*a*a*a: %f microsecs\n",
                (double) (time2 - time1) / (double) CLOCKS_PER_SEC);

        time1 = clock();
        for (i=0; i<NLOOP; i++) b = pow( a, 4. );
        time2 = clock();
        printf( "CPU time needed to evaluate pow(a, 4.): %f microsecs\n",
                (double) (time2 - time1) / (double) CLOCKS_PER_SEC);

        return 0;
    }
Here are the results from running this on a 1.5GHz PowerBook G4, then running the binary built on the G4 on a 3GHz Xeon Mac Pro, then recompiling the code on the Mac Pro and running it again:
| | 1.5GHz G4 | 3GHz Xeon / Rosetta | 3GHz Xeon native |
|---|---|---|---|
| a*a*a*a | 0.02μs | 0.01μs (2x) | 0.002341μs (8.54x) |
| pow(a,4.) | 0.29μs | 0.08μs (3.6x) | 0.065574μs (4.42x) |
So Rosetta gives a speed-up in proportion to the clock speed (2x) for the a*a*a*a loop, and a much nicer 3.6x boost for the power function. Recompiling nudges the power function up a hair to 4.42x, but the a*a*a*a loop jumps to more than 8x the original speed. Interesting stuff. No compiler optimizations were used, just gcc with no flags set.