Are the results of these two floating-point computations similar enough?
This is perhaps the most difficult question to answer when implementing and validating numerical algorithms. Various methods can be found in existing testing frameworks or are used ad hoc by developers, but if floating-point arithmetic can be tricky, floating-point comparisons are even trickier.
In the rest of this section, we take for granted that the reader is familiar with the basic notions of floating-point computation and its related problems. If not, we strongly recommend taking a deep look at Goldberg's paper on the subject, or at a simplified version of it.
The first thing people learn (and often the hard way) is that strict equality for floating-point numbers is usually meaningless or very hard to achieve. Once this fact is accepted, people often fall back on a simple absolute difference with an arbitrary threshold. While this method looks sound, it is easy to fool and may lead to false positives. The proper way to compare non-zero, valid floating-point numbers is to use the Unit in the Last Place metric.
Let us define ε, the machine epsilon, as the smallest positive floating-point number such that 1+ε is different from 1. ε is equal to \(2^{-52}\) for double precision numbers and \(2^{-23}\) for single precision numbers. In fact, 1+ε and 1 differ by exactly one bit in the least significant digit: one unit in the last place (ULP). For IEEE 754 numbers, the gap between a floating-point number \(x\) and its immediate successor is \(2^{E} \times \epsilon\), where \(E\) is the exponent of \(x\).
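As a quick sanity check of these definitions, the following standalone snippet (an illustration of ours, not part of the library) prints ε and verifies the \(2^{E} \times \epsilon\) gap for \(x = 8\):

```cpp
#include <cstdio>
#include <cmath>
#include <limits>

int main()
{
  double eps = std::numeric_limits<double>::epsilon();

  std::printf("%.17g\n", eps);              // 2.2204460492503131e-16, i.e. 2^-52
  std::printf("%d\n", 1.0 + eps != 1.0);    // 1: 1+eps is the successor of 1
  std::printf("%d\n", 1.0 + eps/2 == 1.0);  // 1: 1+eps/2 rounds back to 1

  // The gap between x and its successor is 2^E * eps.
  // For x = 8 = 1.0 * 2^3, that gap is 2^3 * 2^-52 = 2^-49.
  double x   = 8.0;
  double gap = std::nextafter(x, INFINITY) - x;
  std::printf("%d\n", gap == std::ldexp(eps, 3));  // 1
}
```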
The ULP distance (or `ulpdist`) is a way to compare floating-point numbers by estimating the number of significant bits between their respective representations. The IEEE 754 specification (followed by most modern floating-point hardware) requires that the result of an elementary arithmetic operation (addition, subtraction, multiplication, division, and square root) be within 0.5 ULP of the mathematically exact result. Achieving 0.5 to 1 ULP for computationally complex functions such as the transcendental functions is what a proper numerical library should aim for.
The full algorithm can be expressed in standard C++. The following is a minimal sketch of such an implementation (the code actually shipped by the library may differ, e.g. in its handling of special values):
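```cpp
#include <cmath>
#include <limits>
#include <algorithm>

// Sketch of an ULP distance: both numbers are rescaled to a common
// exponent, and the difference of their mantissas is expressed as a
// multiple of the machine epsilon.
template<typename T>
T ulpdist(T a, T b)
{
  if(a == b) return T(0);   // also covers +0 == -0

  int e1 = 0, e2 = 0;
  T   m1 = std::frexp(a, &e1);  // a == m1 * 2^e1, with |m1| in [0.5, 1)
  T   m2 = std::frexp(b, &e2);

  // Rescale with respect to the largest exponent before subtracting
  int expo = std::max(e1, e2);

  T diff = (e1 == e2) ? std::abs(m1 - m2)
                      : std::abs(std::ldexp(a, -expo) - std::ldexp(b, -expo));

  // Express the mantissa difference as a number of ULPs
  return diff / std::numeric_limits<T>::epsilon();
}
```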
Put another way, one can estimate the `ulpdist` between two floating-point numbers as the number of representable floating-point values between them. This estimation leads to the following properties:
\( \begin{align} ulpdist(N, N \times (1+\epsilon)) & = 0.5 \\ ulpdist(N, N + N \times \frac{\epsilon}{2}) & = 0 \\ \end{align} \)
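These properties can be checked directly with the `ulpdist` sketch above (here for \(N\) a power of two):

```cpp
#include <cstdio>
#include <limits>

// Assumes the ulpdist() sketch defined earlier is in scope.
int main()
{
  double N   = 1.0;
  double eps = std::numeric_limits<double>::epsilon();

  std::printf("%g\n", ulpdist(N, N * (1 + eps)));    // 0.5
  std::printf("%g\n", ulpdist(N, N + N * (eps/2)));  // 0: 1+eps/2 rounds back to 1
}
```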
Note that if a `double` is compared to the double-precision representation of its single-precision conversion (there are exceptions, such as fully representable reals), their `ulpdist` will be around \(2^{26.5}\) (or roughly \(10^8\)). For example: `ulpdist( double(float(M_PI)), double(M_PI) ) == 9.84293e+07`
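This magnitude is easy to reproduce with the sketch above (again, an illustration of ours):

```cpp
#include <cmath>
#include <cstdio>

// Assumes the ulpdist() sketch defined earlier is in scope.
int main()
{
  // float(M_PI) keeps only 24 bits of mantissa; widened back to double,
  // it sits about 2^26.5 representable values away from double(M_PI).
  std::printf("%g\n", ulpdist(static_cast<double>(static_cast<float>(M_PI)),
                              M_PI));  // ~9.84293e+07
}
```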
What should you do, then, when writing a unit test that handles floating-point numbers? You basically have three cases:
- The values you compare must be equal by design. In this case, use [**`TTS_EQUAL`**](reference.html#tts_equal) to clearly state your intent. One such case is, for example, a function that constructs a floating-point value bitwise.
- The values you compare are the results of an undetermined number of other floating-point operations. In this case, use [**`TTS_ULP_EQUAL`**](reference.html#tts_ulp_equal) and try to estimate the maximum number of ULPs your implementation should produce. This can be done either by a strict analysis of the function's behavior or by some guesswork.
- The values you compare are the results of an undetermined number of other floating-point operations but stay within a predictable absolute range of error, independent of the architecture and of the magnitude of the inputs. In this case, use [**`TTS_RELATIVE_EQUAL`**](reference.html#tts_relative_equal). A sketch of all three macros in action follows this list.
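For illustration, here is a hypothetical test exercising the three macros. The tested expressions and tolerances are ours, and we assume the TTS 2.x entry-point convention (`TTS_MAIN`); see the reference section for the exact signatures:

```cpp
#define TTS_MAIN
#include <tts/tts.hpp>
#include <cmath>

TTS_CASE("Floating-point comparison strategies")
{
  // Equal by design: a value constructed bitwise matches its literal
  TTS_EQUAL(std::ldexp(1.0, -1), 0.5);

  // Chain of floating-point operations: allow a small ULP budget
  TTS_ULP_EQUAL(std::exp(std::log(2.0)), 2.0, 2.0);

  // Predictable error range: accept a 0.01% relative error
  TTS_RELATIVE_EQUAL(std::sqrt(2.0) * std::sqrt(2.0), 2.0, 0.01);
};
```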
Take extreme care not to overestimate the significance of ULP measurements. Some classical algorithms may end up hundreds of ULPs away from the exact result yet still be perfectly meaningful.