Falsehoods Programmers Believe About Floating Point Numbers

This is a list of falsehoods programmers tend to believe about floating point numbers, specifically the IEEE-754 floating point numbers used ubiquitously today.

When using floating point, it is easy to write programs that seem to compute the right answer but are actually hiding subtle bugs. In serious applications, numerical computing quickly gets complicated and requires the consideration of many factors, such as the accumulation of error, numerical stability, and how the numbers flow through the program. Knowing some floating point quirks provides a good foundation for when your math starts to look off.

While this list does not go into the details of correctly using floating point numbers, it does enumerate a number of assumptions often made by programmers.

All of these assumptions are wrong

Floating point arithmetic is exact
Floating point arithmetic is always inexact
The properties of arithmetic (commutativity, associativity, distributivity, inverse) hold
The error in floating point math always tends to average itself out
Floating point math is precise enough for programs which manage money
A list of numbers can be summed in any order without affecting the result
A list of numbers can be multiplied in any order without affecting the result
Floating point can’t be used for integer math
Floating point numbers are either 64 or 32 bits
Floating point numbers have 2^n bits
If two floating point numbers have different bits, they are not equal
If two floating point numbers have the same bits, they are equal
The reciprocal of two equal numbers is also equal
There is only one way to encode NaN
Floating point functions supported by the CPU are computed as accurately as possible
Arithmetic operations execute in a constant amount of time
Addition/multiplication operations execute in a constant amount of time
Floating point math is always executed on specialized hardware
Exceptions in floating point math always throw
Floating point math always rounds the same way
Programs built with the same compiler brand will produce the exact same results
Programs built with the same compiler version will produce the exact same results
Debug and release mode give identical results
CPUs with the same instruction set produce the exact same results executing floating point instructions
32 bit and 64 bit versions of the same program running on the same machine will produce the same results
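
A few of the assumptions above can be checked directly. The following is a minimal sketch in Python, assuming a platform where Python's float is an IEEE-754 binary64 value (true of essentially all current CPython builds). The helper name bits and the variable names are made up for this illustration, and the script only covers a handful of items from the list; it says nothing about the compiler, CPU, or timing related ones.

```python
import math
import struct


def bits(x: float) -> int:
    """Return the 64-bit pattern of a float as an unsigned integer (hypothetical helper)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


# "Floating point arithmetic is exact":
# 0.1, 0.2 and 0.3 have no exact binary representation.
sum_is_exact = (0.1 + 0.2 == 0.3)

# "Floating point arithmetic is always inexact":
# 0.5, 0.25 and 0.75 are all exactly representable, and so is their sum.
binary_fractions_exact = (0.5 + 0.25 == 0.75)

# "The properties of arithmetic (associativity, ...) hold":
# regrouping the same three additions changes the rounded result.
left_grouped = (0.1 + 0.2) + 0.3
right_grouped = 0.1 + (0.2 + 0.3)

# "A list of numbers can be summed in any order":
# the 1.0 is lost when it is added to 1e16 first.
# (Explicit additions are used because the built-in sum() may compensate for error.)
big_first = (1e16 + 1.0) - 1e16
cancel_first = (1e16 - 1e16) + 1.0

# "A list of numbers can be multiplied in any order":
# one ordering overflows to infinity, the other stays finite.
overflowing = (1e200 * 1e200) * 1e-200
finite = 1e200 * (1e200 * 1e-200)

# "Floating point can't be used for integer math":
# integers are exact up to 2**53; beyond that, gaps appear.
small_integers_exact = (3.0 * 7.0 == 21.0)
gap_after_2_53 = (2.0 ** 53 + 1.0 == 2.0 ** 53)

# "If two floating point numbers have different bits, they are not equal":
# +0.0 and -0.0 compare equal but differ in the sign bit.
zeros_equal = (0.0 == -0.0)
zeros_same_bits = (bits(0.0) == bits(-0.0))

# "If two floating point numbers have the same bits, they are equal":
# a NaN is not equal to anything, including itself.
nan = float("nan")
nan_equals_itself = (nan == nan)

# "There is only one way to encode NaN":
# any pattern with an all-ones exponent and a nonzero fraction is a NaN.
nan_a = struct.unpack("<d", struct.pack("<Q", 0x7FF8000000000000))[0]
nan_b = struct.unpack("<d", struct.pack("<Q", 0x7FF8000000000001))[0]

if __name__ == "__main__":
    print("0.1 + 0.2 == 0.3:", sum_is_exact)
    print("0.5 + 0.25 == 0.75:", binary_fractions_exact)
    print("grouping matters:", left_grouped != right_grouped)
    print("summation order:", big_first, "vs", cancel_first)
    print("multiplication order:", overflowing, "vs", finite)
    print("2**53 + 1 == 2**53:", gap_after_2_53)
    print("0.0 == -0.0:", zeros_equal, "same bits:", zeros_same_bits)
    print("nan == nan:", nan_equals_itself)
    print("both NaN:", math.isnan(nan_a), math.isnan(nan_b))
```

Each comparison is computed at module level so the values can be inspected or asserted on individually. The summation and multiplication cases use explicit expressions to make the order of operations visible.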

Ethan Shea

Further Reading