Are these two floating-point computation results similar enough?
This is the most challenging question to answer when implementing and validating numerical algorithms. Various methods are found in existing testing frameworks or are devised by developers themselves. But if floating-point arithmetic is tricky, floating-point comparison is trickier still.
In the rest of this section, we assume the reader is familiar with the basic notions of floating-point computation and its related problems. If not, we strongly recommend a close read of Goldberg's paper on the subject, or a simplified version of it.
The first thing people learn (and often they learn it the hard way) is that strict equality between floating-point numbers is usually meaningless or very hard to achieve. Once this fact sinks in, people often fall back on a simple absolute difference with an arbitrary threshold. While this method looks sound, it is easy to fool and may lead to false positives. The proper way to compare non-zero, finite floating-point numbers is to use the Unit in the Last Place (ULP) metric.
Let us define ε – the machine epsilon – as being the smallest positive floating-point number such that 1+ε is different from 1. Usually, ε is equal to \(2^{-52}\) for double-precision numbers and \(2^{-23}\) for single-precision numbers. In fact, 1+ε and 1 differ by exactly one bit in the least significant position: the unit in the last place (ULP). For IEEE 754 numbers, the gap between a floating-point number \(x\) and its immediate successor is \(2^E \times \epsilon\), where \(E\) is the exponent of \(x\).
The ULP distance (or ulpdist) is a way to compare floating-point numbers by estimating the number of significant bits by which their representations differ. The IEEE 754 specification – followed by most modern floating-point hardware – requires that the result of an elementary arithmetic operation (addition, subtraction, multiplication, division, and square root) be within 0.5 ULP of the mathematically exact result. Achieving 0.5–1 ULP accuracy for harder cases such as the transcendental functions is what a proper numerical library should aim for.
The complete algorithm can be expressed in standard C++ in the following way:
Put another way, one can estimate the ulpdist between two floating-point numbers as the number of representable floating-point values between them. This estimation leads to the following properties:
\( \begin{align} ulpdist(N,N+N \times ε) & = 0.5 \\ ulpdist(N,N+N \times \frac{ε}{2}) & = 0 \\ \end{align} \)
Note that if a double is compared to the double representation of its single-precision conversion (except for values exactly representable in single precision), their ulpdist will be around \(2^{26.5}\) (roughly \(10^8\)). For example: ulpdist( double(float(M_PI)), double(M_PI) ) == 9.84293e+07
What should you do, then, when writing a unit test that handles floating-point numbers? You basically have three cases:
Take extreme care not to overestimate the significance of ULP measures. Some classical algorithms may produce results hundreds of ULPs away from the exact answer and yet remain meaningful.