Intel/AMD Floating-Point Speed Test

Have you ever wondered how fast your computer's Intel or AMD processor really is? Intel and AMD quote processor speeds in gigahertz, but that figure is only the clock speed. Most operations take several clock cycles, so the clock speed alone is not a true indication of how fast your computer can do useful work.

We use an Intel Core i7 machine to do scientific calculations, and for this kind of work, the relevant measure of speed is how many millions of floating-point operations the processor can perform per second.

We created a simple test program, written in C and Intel assembler, which actually performs a set of floating-point calculations and reports the speed directly. The program is designed to run under Linux, since that is the operating system we use.

The program tests the major Intel/AMD floating-point opcodes in a fetch-calculate-store cycle that emulates the kind of work a real scientific application would perform. Besides addition, multiplication and division, the program tests several special functions: square root, cosine, two-argument inverse tangent, base-2 logarithm (reported as y.log2(x)), binary exponential, and the FSINCOS opcode, which calculates both sine and cosine in a single instruction.
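To make the fetch-calculate-store cycle concrete, here is a minimal sketch of what one timed test could look like for addition. It is not taken from the program's source: the names now_seconds and time_additions, the use of clock_gettime, and the volatile output array (used here so the compiler keeps every store) are all assumptions made for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* exposes clock_gettime under strict C modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: current monotonic time in seconds. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/*
 * Hypothetical sketch of one timed test: fetch a[i] and b[i],
 * add them, and store the sum in c[i], repeating the whole pass
 * over the arrays niters times.
 * Returns the elapsed wall-clock time in seconds.
 */
double time_additions(const double *a, const double *b,
                      volatile double *c,
                      size_t nsize, size_t niters)
{
    assert(a != NULL && b != NULL && c != NULL);

    double start = now_seconds();
    for (size_t iter = 0; iter < niters; iter++) {
        for (size_t i = 0; i < nsize; i++) {
            c[i] = a[i] + b[i];   /* fetch, calculate, store */
        }
    }
    return now_seconds() - start;
}
```

A caller would allocate three arrays of nsize doubles, fill the two inputs with random numbers, and divide nsize times niters by the returned time to get an operations-per-second figure.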

You can obtain the source code from the GitHub repository. To build the program, use make:

make a_speedtest

If you run the program without any arguments, it displays usage information. To run the speed tests, you must specify the size of the arrays and the number of iterations. For example:

./a_speedtest -nsize 1000000 -niters 100

The output will look something like this:

EXECUTING FLOATING POINT SPEED TESTS
Allocating data arrays ... done.
Setting input arrays to random number ... done.

Running ALL speed tests.

It took 0.054 seconds to perform 100.000 million additions.
That corresponds to 1843.148 million ops/second.

It took 0.015 seconds to perform 100.000 million multiplications.
That corresponds to 6609.822 million ops/second.

It took 0.033 seconds to perform 100.000 million divisions.
That corresponds to 3068.049 million ops/second.

It took 2.534 seconds to perform 100.000 million cosines.
That corresponds to 39.467 million ops/second.

It took 0.048 seconds to perform 100.000 million square roots.
That corresponds to 2078.484 million ops/second.

It took 2.892 seconds to perform 100.000 million arc-tangents.
That corresponds to 34.583 million ops/second.

It took 1.178 seconds to perform 100.000 million y.log2(x) operations.
That corresponds to 84.898 million ops/second.

It took 2.719 seconds to perform 100.000 million combined sine/cosines.
That corresponds to 36.776 million ops/second.

It took 1.201 seconds to perform 100.000 million binary exponentials.
That corresponds to 83.246 million ops/second.

This example was run on a 2.5 GHz Intel Core i7-11700 processor. Your mileage may vary.
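The figures in the output can be related back to the command line and to the clock speed with a little arithmetic. The 100.000 million operations per test is the array size times the iteration count (1000000 x 100), and dividing the nominal clock rate by the reported rate gives a rough cycles-per-operation estimate. The sketch below is illustrative only: the function names are made up, and it assumes the processor ran at its nominal 2.5 GHz throughout, which may not hold if the clock boosts under load.

```c
#include <assert.h>
#include <stdio.h>

/* Millions of operations per second for a run of total_ops operations. */
double mops_per_second(double total_ops, double seconds)
{
    assert(seconds > 0.0);
    return total_ops / seconds / 1e6;
}

/* Rough cycles per operation, assuming a fixed nominal clock in MHz. */
double cycles_per_op(double clock_mhz, double mops)
{
    assert(mops > 0.0);
    return clock_mhz / mops;
}

/* Print derived figures for the example run shown above. */
void print_example(void)
{
    double total_ops = 1000000.0 * 100.0;   /* -nsize times -niters */

    printf("total: %.3f million operations\n", total_ops / 1e6);
    /* Rates below are copied from the example output. */
    printf("addition: %.2f cycles/op\n", cycles_per_op(2500.0, 1843.148));
    printf("cosine:   %.2f cycles/op\n", cycles_per_op(2500.0, 39.467));
}
```

Under that assumption, an addition costs a little over one cycle while a cosine costs about 63 cycles. Recomputing a rate from the printed time gives a slightly different number than the program reports (for example, 100 million additions in 0.054 seconds is about 1852 million ops/second, not 1843.148), presumably because the printed time is rounded to three decimal places.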