Inconsistent Printing Of Floats. Why Does It Work Sometimes?
Solution 1:
You are printing numpy.float64 objects, not the Python built-in float type, which uses David Gay's dtoa algorithm.
As of version 1.14, numpy uses the dragon4 algorithm to print floating point values, tuned to approach the same output as the David Gay algorithm used for the Python float type:
Numpy scalars use the dragon4 algorithm in "unique" mode (see below) for str/repr, in a way that tries to match python float output.
The numpy.format_float_positional() function documents this in a bit more detail:
unique : boolean, optional
If True, use a digit-generation strategy which gives the shortest representation which uniquely identifies the floating-point number from other values of the same type, by judicious rounding. If precision was omitted, print out all necessary digits, otherwise digit generation is cut off after precision digits and the remaining value is rounded.
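As a small sketch of the "unique" mode described above, the snippet below calls numpy.format_float_positional() on two sums. It assumes numpy 1.14 or later is installed; the variable names a and b are only for illustration.

```python
import numpy as np

# Two sums of numpy.float64 scalars (names are illustrative only).
a = np.float64(0.1) + np.float64(0.1)
b = np.float64(0.1) + np.float64(0.2)

# unique=True is the default; it is spelled out here for clarity.
print(np.format_float_positional(a, unique=True))  # 0.2
print(np.format_float_positional(b, unique=True))  # 0.30000000000000004

# str() of a numpy scalar should agree with repr() of the equivalent Python float.
print(str(b) == repr(float(b)))  # True
```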
So 0.2 can be uniquely represented by printing just 0.2, but the next value in the series (0.30000000000000004) can't; you have to include the extra digits to uniquely represent the exact value.
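To see why one value prints short and the other does not, it can help to look at the exact binary values. This is a minimal sketch using plain Python floats (which hold the same 64 bits as numpy.float64) and the standard decimal module; the sums 0.1 + 0.1 and 0.1 + 0.2 are assumed stand-ins for the values in the series.

```python
from decimal import Decimal

a = 0.1 + 0.1  # lands exactly on the float nearest to 0.2
b = 0.1 + 0.2  # lands one step above the float nearest to 0.3

print(repr(a))  # 0.2
print(repr(b))  # 0.30000000000000004

# Decimal(float) converts without rounding, so it shows the stored value.
print(Decimal(b))    # 0.3000000000000000444089209850062616169452667236328125
print(Decimal(0.3))  # 0.299999999999999988897769753748434595763683319091796875

# "0.3" already names a different float, so b needs all 17 digits.
print(b == 0.3)  # False
```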
The how of this is actually quite involved; you can read a full report on it in the Printing Floating-Point Numbers series by Ryan Juckett, a gameplay engineer on Bungie's Destiny.
But basically, the code producing the string has to find the shortest representation among all the decimal numbers clustered around the float value that would not be interpreted as the next or preceding representable float.
The illustration for this comes from The Shortest Decimal String That Round-Trips: Examples by Rick Regan, which covers some other cases as well. Numbers in blue are possible float64 values; numbers in green are possible decimal representations. Note the grey half-way point markers: any representation that falls between the two half-way points around a float value is fair game, as all of those representations would produce the same value.
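The half-way points and the search for a shortest string can be sketched with the standard library. This is a naive illustration, not how dtoa or dragon4 work internally: it simply tries 1 to 17 significant digits until the string parses back to the same float, and it may disagree with repr() in rare edge cases near powers of two. The helper names shortest_round_trip and round_trip_interval are made up for this example, and math.nextafter requires Python 3.9 or later.

```python
import math
from fractions import Fraction


def shortest_round_trip(x: float) -> str:
    """Return the first of the 1..17 digit renderings that parses back to x."""
    for digits in range(1, 18):
        candidate = format(x, f".{digits}g")
        if float(candidate) == x:
            return candidate
    raise ValueError("expected a non-NaN float")


def round_trip_interval(x: float) -> tuple[Fraction, Fraction]:
    """Exact half-way points between x and its two neighbouring floats."""
    below = Fraction(math.nextafter(x, -math.inf))
    above = Fraction(math.nextafter(x, math.inf))
    exact = Fraction(x)
    return (below + exact) / 2, (exact + above) / 2


x = 0.1 + 0.2
low, high = round_trip_interval(x)

print(shortest_round_trip(0.2))  # 0.2
print(shortest_round_trip(x))    # 0.30000000000000004

# "0.3" falls outside the half-way points around x; the 17-digit string is inside.
print(low < Fraction("0.3") < high)                  # False
print(low < Fraction("0.30000000000000004") < high)  # True
```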
The goal of both the David Gay and Dragon4 algorithms is to find the shortest decimal string output that would produce the exact same float value again. From the Python 3.1 What's New section on the David Gay approach:
Python now uses David Gay’s algorithm for finding the shortest floating point representation that doesn’t change its value. This should help mitigate some of the confusion surrounding binary floating point numbers.
The significance is easily seen with a number like 1.1 which does not have an exact equivalent in binary floating point. Since there is no exact equivalent, an expression like float('1.1') evaluates to the nearest representable value which is 0x1.199999999999ap+0 in hex or 1.100000000000000088817841970012523233890533447265625 in decimal. That nearest value was and still is used in subsequent floating point calculations.
What is new is how the number gets displayed. Formerly, Python used a simple approach. The value of repr(1.1) was computed as format(1.1, '.17g') which evaluated to '1.1000000000000001'. The advantage of using 17 digits was that it relied on IEEE-754 guarantees to assure that eval(repr(1.1)) would round-trip exactly to its original value. The disadvantage is that many people found the output to be confusing (mistaking intrinsic limitations of binary floating point representation as being a problem with Python itself).
The new algorithm for repr(1.1) is smarter and returns '1.1'. Effectively, it searches all equivalent string representations (ones that get stored with the same underlying float value) and returns the shortest representation.
The new algorithm tends to emit cleaner representations when possible, but it does not change the underlying values. So, it is still the case that 1.1 + 2.2 != 3.3 even though the representations may suggest otherwise.
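The figures quoted above can be checked directly in Python 3.1 or later. This is just a quick sketch using the built-in float methods and the standard decimal module; the variable name x is only for illustration.

```python
from decimal import Decimal

x = float("1.1")

print(x.hex())             # 0x1.199999999999ap+0
print(Decimal(x))          # 1.100000000000000088817841970012523233890533447265625
print(format(x, ".17g"))   # 1.1000000000000001  (the old repr)
print(repr(x))             # 1.1                 (the shortest round-trip string)

# Both strings name the same underlying float.
print(float("1.1000000000000001") == float("1.1"))  # True

# The stored values are unchanged, so the arithmetic still shows the error.
print(1.1 + 2.2 == 3.3)  # False
print(repr(1.1 + 2.2))   # 3.3000000000000003
```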