Examples of floating point problems

2023-01-13 08:59:21

Hello! I've been thinking about writing a zine about how things are represented on computers in bytes, so I was thinking about floating point.

I've heard a million times about the dangers of floating point arithmetic, like:

  • addition isn't associative (x + (y + z) is different from (x + y) + z)
  • if you add very big values to very small values, you can get inaccurate results (the small numbers get lost!)
  • you can't represent very large integers as floating point numbers
  • NaN/infinity values can propagate and cause chaos
  • there are two zeros (+0 and -0), and they're not represented the same way
  • denormal/subnormal values are weird
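A couple of these dangers are easy to reproduce in a Python REPL. Here's a quick sketch of the first two (nothing here is specific to Python – any language with 64-bit floats behaves the same way):

```python
# addition isn't associative: the grouping changes the last bit of the result
x, y, z = 0.1, 0.2, 0.3
print((x + y) + z)  # 0.6000000000000001
print(x + (y + z))  # 0.6

# adding a very big value to a very small one loses the small value entirely:
# the gap between consecutive 64-bit floats near 1e16 is 2, so the 1.0 vanishes
print(1e16 + 1.0 == 1e16)  # True
```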

But I find all of this a little abstract on its own, and I really wanted some
specific examples of floating point bugs in real-world programs.

So I asked on Mastodon for
examples of how floating point has gone wrong for them in real programs, and as
always folks delivered! Here are a bunch of examples. I've also written some
example programs for some of them to see exactly what happens. Here's a table of contents:

how does floating point work?
floating point isn’t “bad” or random
example 1: the odometer that stopped
example 2: tweet IDs in Javascript
example 3: a variance calculation gone wrong
example 4: different languages sometimes do the same floating point calculation differently
example 5: the deep space kraken
example 6: the inaccurate timestamp
example 7: splitting a page into columns
example 8: collision checking

None of these 8 examples talk about NaNs or +0/-0 or infinity values or
subnormals, but it's not because those things don't cause problems – it's just
that I got tired of writing at some point :).

Also I've probably made some mistakes in this post.

how does floating point work?

I'm not going to write a long explanation of how floating point works in this post, but here's a comic I wrote a few years ago that talks about the basics:

floating point isn't “bad” or random

I don't want you to read this post and conclude that floating point is bad.
It's an amazing tool for doing numerical calculations. So many smart people
have done so much work to make numerical calculations on computers efficient and
accurate! Two points about how all of this isn't floating point's fault:

  • Doing numerical computations on a computer inherently involves
    some approximation and rounding, especially if you want to do it
    efficiently. You can't always store an arbitrary amount of precision for
    every single number you're working with.
  • Floating point is standardized (IEEE 754), so operations like addition on
    floating point numbers are deterministic – my understanding is that 0.1 +
    0.2 will always give you the exact same result (0.30000000000000004), even
    across different architectures. It might not be the result you expected,
    but it's actually very predictable.
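For instance, here's that determinism in a Python REPL – the result is surprising, but it's the same surprising result every time, on any IEEE 754 machine:

```python
result = 0.1 + 0.2
print(result)         # 0.30000000000000004
print(result == 0.3)  # False: 0.3 itself rounds to a slightly different double
```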

My goal for this post is just to explain what kind of problems can come up with
floating point numbers and why they happen, so that you know when to be
careful with them, and when they're not appropriate.

Now let's get into the examples.

example 1: the odometer that stopped

One person said that they were working on an odometer that was repeatedly
adding small amounts to a 32-bit float to measure distance travelled, and
things went very wrong.

To make this concrete, let's say that we're adding numbers to the odometer 1cm
at a time. What does it look like after 10,000 kilometers?

Here's a C program that simulates that:

#include <stdio.h>
int main() {
    float meters = 0;
    int iterations = 100000000;
    for (int i = 0; i < iterations; i++) {
        meters += 0.01;
    }
    printf("Expected: %f km\n", 0.01 * iterations / 1000);
    printf("Got: %f km\n", meters / 1000);
}

and here's the output:

Expected: 10000.000000 km
Got: 262.144012 km

This is VERY bad – it's not a small error, 262km is a LOT less than 10,000km. What went wrong?

what went wrong: gaps between floating point numbers get big

The problem in this case is that, for 32-bit floats, 262144.0 + 0.01 = 262144.0.
So it's not just that the number is inaccurate, it'll actually never increase
at all! If we travelled another 10,000 kilometers, the odometer would still be
stuck at 262144 meters (aka 262.144km).

Why is this happening? Well, floating point numbers get farther apart as they get bigger. In this example, for 32-bit floats, here are 3 consecutive floating point numbers:

  • 262144.0
  • 262144.03125
  • 262144.0625

I got these numbers by going to https://float.exposed/0x48800000 and incrementing the 'significand' number a couple of times.

So, there are no 32-bit floating point numbers between 262144.0 and 262144.03125. Why is that a problem?

The problem is that 262144.03125 is about 262144.0 + 0.03. So when we try to
add 0.01 to 262144.0, it doesn't make sense to round up to the next number. So
the sum just stays at 262144.0.

Also, it's not a coincidence that 262144 is a power of 2 (it's 2^18). The gaps
between floating point numbers change at every power of 2, and at 2^18 the gap
between 32-bit floats is 0.03125, up from 0.016ish just below.
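You can check this stuck-odometer behaviour directly with numpy's 32-bit floats (a sketch – it assumes you have numpy installed):

```python
import numpy as np

# adding 1cm to 262144.0 meters does nothing at 32-bit precision
odometer = np.float32(262144.0)
print(odometer + np.float32(0.01) == odometer)  # True

# the next representable 32-bit float above 262144.0 is 0.03125 away
next_float = np.nextafter(np.float32(262144.0), np.float32(np.inf))
print(float(next_float))  # 262144.03125
```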

one way to solve this: use a double

Using a 64-bit float fixes this – if we replace float with double in the above C program, everything works a lot better. Here's the output:

Expected: 10000.000000 km
Got: 9999.999825 km

There are still some small inaccuracies here – we're off by about 17 centimeters.
Whether this matters or not depends on the context: being slightly off could very
well be disastrous if we were doing a precision space maneuver or something, but
it's probably fine for an odometer.

Another way to improve this would be to increment the odometer in bigger chunks
– instead of adding 1cm at a time, maybe we could update it less frequently,
like every 50cm.

If we use a double and increment by 50cm instead of 1cm, we get the exactly
correct answer:

Expected: 10000.000000 km
Got: 10000.000000 km

A third way to solve this could be to use an integer: maybe we decide that
the smallest unit we care about is 0.1mm, and then measure everything as
integer multiples of 0.1mm. I've never built an odometer so I can't say what
the best approach is.
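Here's a sketch of what that integer approach might look like (the class and its names are hypothetical, not taken from a real odometer):

```python
class Odometer:
    """Hypothetical odometer that counts integer ticks of 0.1mm."""
    TICKS_PER_KM = 10_000_000  # 1 km = 10,000,000 ticks of 0.1mm

    def __init__(self):
        self.ticks = 0

    def advance_cm(self, cm):
        self.ticks += cm * 100  # 1cm = 100 ticks; integer addition never rounds

    def kilometers(self):
        return self.ticks / self.TICKS_PER_KM  # convert only for display

odo = Odometer()
for _ in range(1_000_000):  # one million 1cm increments
    odo.advance_cm(1)
print(odo.kilometers())  # 10.0, exactly
```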

example 2: tweet IDs in Javascript

Javascript only has floating point numbers – it doesn't have an integer type.
The biggest integer you can exactly represent as a 64-bit floating point number is
2^53.
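The 2^53 limit is easy to see in Python, which uses the same 64-bit floats:

```python
# 2**53 + 1 is the first integer that a 64-bit float can't represent
print(float(2**53) == float(2**53 + 1))  # True: the +1 gets rounded away
print(2**53 == 2**53 + 1)                # False: Python ints are exact
```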

But tweet IDs are big numbers, bigger than 2^53. The Twitter API now returns
them as both integers and strings, so that in Javascript you can just use the
string ID (like "1612850010110005250"), but if you tried to use the integer
version in JS, things would go very wrong.

You can check this yourself by taking a tweet ID and putting it in the
Javascript console, like this:

>> 1612850010110005250 
   1612850010110005200

Notice that 1612850010110005200 is NOT the same number as 1612850010110005250!! It's 50 less!

This particular issue doesn't happen in Python (or any other language with an
integer type that I know of), because Python has integers. Here's what happens if we enter the same number in a Python REPL:

In [3]: 1612850010110005250
Out[3]: 1612850010110005250

Same number, just as you'd expect.

example 2.1: the corrupted JSON data

This is a small variant of the “tweet IDs in Javascript” issue, but even if
you're not actually writing Javascript code, numbers in JSON are still often
treated as if they were floats. This mostly makes sense to me because JSON has
“Javascript” in the name, so it seems reasonable to decode the values the way
Javascript would.

For example, if we pass some JSON through jq, we see the exact same issue:
the number 1612850010110005250 gets turned into 1612850010110005200.

$ echo '{"id": 1612850010110005250}' | jq '.'
{
  "id": 1612850010110005200
}

But it's not consistent across all JSON libraries: Python's json module will decode 1612850010110005250 as the correct integer.
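Here's that difference in Python – json parses integer literals as exact Python ints, while going through a 64-bit float (the way jq does) corrupts the ID:

```python
import json

doc = '{"id": 1612850010110005250}'
parsed = json.loads(doc)
print(parsed["id"])  # 1612850010110005250: the ID survives intact

# but round-tripping the ID through a 64-bit float corrupts it
print(int(float(parsed["id"])) == parsed["id"])  # False
```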

A few people mentioned issues with sending floats in JSON, whether they
were trying to send a large integer (like a pointer address) in JSON and
it got corrupted, or sending smaller floating point values back and forth
repeatedly and having the value slowly diverge over time.

example 3: a variance calculation gone wrong

Let's say you're doing some statistics, and you want to calculate the variance
of many numbers. Maybe more numbers than you can easily fit in memory, so you
want to do it in a single pass.

There's a simple (but bad!!!) algorithm you can use to calculate the variance in a single pass,
from this blog post. Here's some Python code:

import numpy as np

def calculate_bad_variance(nums):
    sum_of_squares = 0
    sum_of_nums = 0
    N = len(nums)
    for num in nums:
        sum_of_squares += num**2
        sum_of_nums += num
    mean = sum_of_nums / N
    variance = (sum_of_squares - N * mean**2) / N

    print(f"Real variance: {np.var(nums)}")
    print(f"Bad variance: {variance}")

First, let's use this bad algorithm to calculate the variance of 5 small numbers. Everything looks pretty good:

In [2]: calculate_bad_variance([2, 7, 3, 12, 9])
Real variance: 13.84
Bad variance: 13.840000000000003 <- pretty close!

Now, let's try the same thing with 100,000 large numbers that are very close together (distributed between 100000000 and 100000000.06):

In [7]: calculate_bad_variance(np.random.uniform(100000000, 100000000.06, 100000))
Real variance: 0.00029959105209321173
Bad variance: -138.93632 <- OH NO

This is extremely bad: not only is the bad variance way off, it's NEGATIVE! (the variance is never supposed to be negative, it's always zero or more)

what went wrong: catastrophic cancellation

What’s going right here is much like our odometer quantity downside: the
sum_of_squares quantity will get extraordinarily large (about 10^21 or 2^69), and at that time, the
hole between consecutive floating level numbers can also be very large – it’s 2**46.
So we simply lose all precision in our calculations.

The time period for this downside is “catastrophic cancellation” – we’re subtracting
two very giant floating level numbers which might be very shut collectively, and that
simply isn’t going to provide the proper reply.

The blog post I mentioned before
talks about a better algorithm people use to compute variance, called
Welford's algorithm, which doesn't have the catastrophic cancellation issue.
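Here's a minimal sketch of Welford's algorithm: instead of subtracting two huge sums at the end, it maintains a running mean and a running sum of squared deviations, so no two huge numbers ever get subtracted:

```python
def welford_variance(nums):
    """One-pass population variance using Welford's algorithm."""
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    n = 0
    for x in nums:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # delta uses the old mean, (x - mean) the new one
    return m2 / n

print(welford_variance([2, 7, 3, 12, 9]))  # ≈ 13.84
```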

And of course, the solution for most people is to just use a scientific
computing library like Numpy to calculate variance instead of trying to do it
yourself 🙂

example 4: different languages sometimes do the same floating point calculation differently

A bunch of people mentioned that different platforms will do the same
calculation in different ways. One way this shows up in practice is – maybe
you have some frontend code and some backend code that do the exact same
floating point calculation. But it's done slightly differently in Javascript
and in PHP, so your users end up seeing discrepancies and getting confused.

In principle you might think that different implementations should work the
same way because of the IEEE 754 standard for floating point, but here are a
couple of caveats that were mentioned:

  • math operations in libc (like sin/log) behave differently in different
    implementations, so code using glibc could give you different results than
    code using musl
  • some x86 instructions can use 80-bit precision for some double operations
    internally instead of 64-bit precision. Here's a GitHub issue talking about
    that

I'm not very sure about these points and I don't have concrete examples I can reproduce.

example 5: the deep space kraken

Kerbal Space Program is a space simulation game, and it used to have a bug
called the Deep Space Kraken where, when
you moved very fast, your ship would start getting destroyed due to floating point issues. This is similar to the other problems we've talked about involving big floating point numbers (like the variance problem), but I wanted to mention it because:

  1. it has a funny name
  2. it seems like a very common bug in video games / astrophysics / simulations in general – if you have points that are very far from the origin, your math can get messed up

Another example of this is the Far Lands in Minecraft.

example 6: the inaccurate timestamp

I promise this is the last example of “very large floating point numbers can ruin your day”.
But! Just one more! Let's imagine that we try to represent the current Unix epoch in nanoseconds
(about 1673580409000000000) as a 64-bit floating point number.

This is no good! 1673580409000000000 is about 2^60 (crucially, bigger than 2^53), and the next 64-bit float after it is 1673580409000000256.
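A quick check in Python: at this magnitude the gap between consecutive 64-bit floats is 256, so differences of up to a hundred nanoseconds or so just disappear:

```python
t = 1673580409000000000  # a Unix timestamp in nanoseconds
print(float(t) == float(t + 100))  # True: the 100ns difference is rounded away

# this is why Python's nanosecond clock returns an integer, not a float
import time
print(type(time.time_ns()))  # <class 'int'>
```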

So this would be a great way to end up with inaccuracies in your time math. Of
course, time libraries actually represent times as integers, so this isn't
usually a problem. (there's always still the year 2038 problem, but that's not
related to floats)

In general, the lesson here is that it's better to use integers if you can.

example 7: splitting a page into columns

Now that we've talked about problems with big floating point numbers, let's do
a problem with small floating point numbers.

Let's say you have a page width and a column width, and you want to work out:

  1. how many columns fit on the page
  2. how much space is left over

You might reasonably try floor(page_width / column_width) for the first
question and page_width % column_width for the second question. After all,
that would work just fine with integers!

In [5]: math.ground(13.716 / 4.572)
Out[5]: 3

In [6]: 13.716 % 4.572
Out[6]: 4.571999999999999

This is wrong! The amount of space left is 0!

A better way to calculate the amount of space left might have been
13.716 - 3 * 4.572, which gives us a very small negative number.

I think the lesson here is to never calculate the same thing in 2 different ways with floats.
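To make that concrete, here's a sketch that derives both answers from the same division, instead of computing the remainder a second, different way:

```python
import math

page_width = 13.716
column_width = 4.572

n_columns = math.floor(page_width / column_width)
leftover = page_width - n_columns * column_width  # reuse n_columns, don't use %

print(n_columns)  # 3
print(leftover)   # a tiny number very close to 0 (here, slightly negative)
```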

This is a very basic example, but I can kind of see how this could create all
sorts of problems if I was doing page layout with floating point numbers, or
doing CAD drawings.

example 8: collision checking

Here's a very silly Python program that starts a variable at 1000 and
decrements it until it collides with 0. You can imagine that this is part of a
pong game or something, and that a is a ball that's supposed to collide with
a wall.

a = 1000
while a != 0:
    a -= 0.001

You might expect this program to terminate. But it doesn't! a is never exactly 0;
instead it goes from 1.673494676862619e-08 straight to -0.0009999832650532314.

The lesson here is that instead of checking for float equality, usually you
want to check whether two numbers differ by some very small amount. Or here
we could just write while a > 0.
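Here's what both fixes look like (a sketch, using Python's math.isclose for the tolerance comparison):

```python
import math

# fix 1: use an inequality instead of exact equality for the collision check
a = 1000
while a > 0:
    a -= 0.001
print(a)  # a tiny negative number: the loop actually terminates

# fix 2: when you really do need "equality", compare within a tolerance
print(math.isclose(0.1 + 0.2, 0.3))  # True
```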

that's all for now

I didn't even get to NaNs (there are so many of them!) or infinity or +0 / -0 or subnormals, but we've
already hit 2000 words and I'm going to just publish this.

I might write another followup post later – that Mastodon thread has literally
15,000 words of floating point problems in it, there's a lot of material! Or I
might not, who knows 🙂
