Why does the integer representation of a floating point number offer a piecewise linear approximation to the logarithm?

I'll restrict my answer to the question, “Why does the integer representation of a floating point number offer a piecewise linear approximation to the logarithm?”. The fast inverse square root is a fascinating topic, but I don't know enough about it to talk about it.
But it's actually quite easy to answer the question in the title. First I'll start with the “TL;DR” summary:
This approximation works because when a number is an exact power of the base, its logarithm is (trivially) exactly its exponent when expressed in exponential notation. Floating-point notations like IEEE-754 are base-2 exponential notations, and the exponent occupies the most-significant bits of the format, excepting only the sign bit, which is 0 for positive numbers and so can be ignored. And all of the lower-precision bits (that is, below the exponent) form a perfectly linear progression, notionally from 000 to 999, due in part to the fact that IEEE-754 formats use normalized significands with a “hidden” leading 1 bit.
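Before the long version, here's a quick Python sketch of my own (not part of the original question) showing the whole effect at once; the scaling constants 2^23 and 127 are derived later in this answer:

```python
import math
import struct

def float_bits(x: float) -> int:
    """Reinterpret the raw bits of a 32-bit float as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

# Scale by 2^23 and subtract the exponent bias of 127 (both explained below).
for x in [0.5, 1.0, 7.0, 16.0, 25.25, 1000.0]:
    approx = float_bits(x) / 2**23 - 127
    print(f"{x:8}  approx {approx:9.6f}  true {math.log2(x):9.6f}")
```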
The longer explanation begins with a review: What do we mean by a logarithm?
Well, it's the inverse of the exponential or power function.
Exponentials say things like, ten to the second power
(10^2) is 100. So the (base 10) logarithm function
asks things like, what power do we have to raise 10 to to get the
number 100? The answer, of course, is 2.
You can also ask things like, “What power do we have to raise 10 to to get the
number 50?” That's obviously harder, because 50 isn't an even
power of 10. There's some very cool math behind this which I'm
not going to try to explain, but raising things to fractional
powers is defined in such a way that 10^1.69897 is 50.
So the log (base 10) of 50 is 1.69897.
And then you don't have to limit yourself to powers of 10, either.
Mathematicians like to use powers of the special number e
(2.71828), and of course computer programmers love (just love) to
use powers of 2. What power do we have to raise 2 to in order to
get the number 32? Every computer geek knows it's 5. So the log
(base 2) of 32 is 5.
And then there's scientific notation. Let's take the number 12345.
One way of representing that in scientific notation is 1.2345
× 10^4. And it turns out that gives us a hint as
to what the base-10 logarithm of 12345 is. It must be
greater than or equal to 4, and less than 5. (In fact it's about 4.09149.)
This works as long as we pick scientific representations that are
normalized in a certain way, with the significand part
(popularly called the “mantissa”) chosen to be
between 1 and 10. (That is, we wouldn't get an estimate of
the logarithm in this way if we had looked at the otherwise-equivalent
notations 0.12345 × 10^5 or 12.345 × 10^3.)
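This is easy to check in Python (a quick sketch of mine, not from the original answer):

```python
import math

x = 12345.0
print(f"{x:e}")                    # 1.234500e+04, the normalized form
print(math.floor(math.log10(x)))   # 4, matching the exponent
print(math.log10(x))               # about 4.09149
```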
So now let's turn to computer floating-point representations.
These typically use, and IEEE-754 definitely uses, base-2
(binary) exponential representations, that is, with normalized
significand values multiplied by some power of 2. For example,
the number 1.125 is represented as 1.125 × 2^0,
the number 25.25 is represented as 1.578125 × 2^4,
and the number 0.1875 is represented as 1.5 × 2^-3.
And, just as for base 10, these exponents give us a rough,
integer-only approximation of the base-2 logarithm, again,
as long as we use normalized significands, in this case between 1 and 2.
So log2(25.25) is going to be somewhere between 4 and 5.
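Python's standard library exposes this decomposition directly, though math.frexp normalizes the significand to [0.5, 1) rather than [1, 2), so its exponent reads one higher (a small sketch of my own):

```python
import math

x = 25.25
m, e = math.frexp(x)    # x == m * 2**e, with 0.5 <= m < 1
print(m, e)             # 0.7890625 5
print(m * 2, e - 1)     # 1.578125 4, the normalized-to-[1,2) form
print(math.log2(x))     # about 4.658, indeed between 4 and 5
```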
It's also time for me to foreshadow
a super-special secret fact implied by the fact that
we're restricting ourselves to normalized significand values.
I said they were going to be “between 1 and 2”, but more
precisely, they will range from 1.00000 to 1.99999.
But notice that no matter what the significand value is,
for base 2 the first digit is always going to be 1.
In a minute we'll see what we can do with that.
The IEEE-754 binary floating-point representations all consist of
a sign bit, followed by some number of exponent bits, followed
by some number of significand bits. For example, for our number
25.25, we'll have a 0 bit representing a positive sign,
followed by some representation of the exponent 4, followed by
some representation of the significand 1.578125.
But now we're getting a glimmer of how the raw-bits
interpretation of a floating-point number might have something to
do with a logarithm. For positive numbers we don't have to worry
about the sign bit (since it's 0), and the next thing we see is
going to be the exponent, and we've already seen that the
exponent is a rough (integer-only) approximation of the
logarithm. So all that's left to ask is, how are we going to
interpolate between the power-of-two values that have exact
base-2 logarithms
(log2(8) = 3, log2(16) = 4,
log2(32) = 5, etc.)
and the fractional ones in between?
If I gave you evenly-spaced numbers like 100, 200, 300 and asked
you to fill in the numbers in between, you'd have little trouble:
you'd replace the 00 part by 01, 02, 03, …, up to 99.
And if you knew anything about binary numbers,
and I asked you to do the same for evenly-spaced binary numbers like
00100000
01000000
01100000
10000000
you could do roughly the same thing, replacing the 00000
part by consecutive binary numbers 00001, 00010, 00011, and so on.
Remarkably enough, we're about to see that something exactly
like this goes on when we encode significand values in IEEE-754
floating-point numbers.
Let's start looking at an actual IEEE-754
floating-point format. We'll use single-precision float32, to
keep the numbers halfway reasonable. This format uses 1 bit for
the sign (S), 8 bits for the exponent (e), and 23 bits for the
significand (s):
S e e e e e e e e s s s s s s s s s s s s s s s s s s s s s s s
Let's start figuring out what the representations of the numbers
16.000 = 2^4 and 32.000 = 2^5 would look like.
16 is 1.0 × 2^4. So the significand is 1.0, and the
exponent is 4. How will these be stored? I'll
consider the significand first. Remember the observation
that the first digit (the one to the left of the decimal point)
is always 1? IEEE-754 takes the position that this always-1 leading bit doesn't need to be stored at all. The
actual bits stored for a significand of 1.0 will be all 0's.
The exponent will be some representation of 4. So I'll
represent the number 16.000 like this:
0 4 4 4 4 4 4 4 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
These “4”s are approximate; I still haven't gotten around to
explaining how the exponent is encoded.
The number 32.000 is similar; actually, it's identical, except
for the exponent:
0 5 5 5 5 5 5 5 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
But the key point I'm building up to is that the numbers in
between, those greater than 16.0 but less than 32.0, will all
be variations on
0 4 4 4 4 4 4 4 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
except with the 0's replaced by all the consecutive 23-bit binary
numbers (all 2^23 or 8388608 of them), from 000…001 up to 111…111.
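You can watch that happen with a short loop; here's a sketch of my own that prints just the 23 stored significand bits for a few evenly spaced values between 16 and 32:

```python
import struct

# The 23 stored significand bits ramp linearly from all 0's toward all 1's
# as the value moves from 16.0 up toward (but not reaching) 32.0.
for x in (16.0, 20.0, 24.0, 28.0, 31.999998):
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    print(f"{x:10}  {bits & 0x7FFFFF:023b}")
```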
So now we have almost all the pieces in place to see how
“the
integer representation of a floating point number gives a
piecewise linear approximation to the logarithm”. We've shown
that the binary representations of the numbers between 16 and 32
form a nice, linear ramp from a number that begins with something
like 4, to a number that begins with something like 5.
I've been ducking the question of how the exponents are encoded.
This is done using a bias scheme. For binary32, the bit pattern
stored is the actual exponent value, plus 127. So our exponent
value of 4 will be stored as a bit pattern of 4+127=131, and 5
will be stored as 132. So the actual encodings of 16.0 and
32.0 will be
0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
where I've filled in the binary values of the encoded exponents 131 and 132, namely
10000011 and 10000100. And now we can see the actual, raw values
01000001100000000000000000000000
and
01000010000000000000000000000000,
or 0x41800000 and 0x42000000, or 1098907648 and 1107296256.
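If you want to verify those raw values mechanically, here's a minimal Python sketch of my own that packs each float and pulls the three fields back apart:

```python
import struct

for x in (16.0, 32.0):
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                # top bit
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent field
    significand = bits & 0x7FFFFF    # 23 stored significand bits
    print(f"{x}: raw {bits:#010x} = {bits}, sign {sign}, "
          f"exponent {exponent}, significand {significand}")
```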
These numbers don't have anything obvious to do with the base-2
logarithms of 16 and 32 (namely, 4 and 5), but they're actually
very close. They're offset by the effect of the bias, and
they're completely wrong in magnitude. But that's not surprising:
by taking 32-bit quantities and treating them as integers, we're
getting big numbers (which are indeed integers), but the actual
values for these logarithms should be 4.0 and 5.0, and we'd like
the possibility of a fractional part. So if we're going to treat
these big integers as some rendition of the logarithms of some
other, smaller numbers, it's going to involve something very much
like a fixed-point representation.
Deriving the details of that fixed-point representation is
easy: the scaling factor is obviously 2^23 =
8388608, and there's also an offset, corresponding to the bias
built into the binary32 format. That offset will be 1065353216
(127 × 2^23)
if applied before scaling, or 127 afterwards.
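Written out as code, under exactly the assumptions just derived (the function names are my own):

```python
def log2_from_bits(bits: int) -> float:
    """Read a positive binary32 bit pattern as a fixed-point approximate log2."""
    return bits / 2**23 - 127            # scale by 2^23, then remove the bias

def log2_from_bits_v2(bits: int) -> float:
    """Same thing, with the offset applied before scaling."""
    return (bits - 1065353216) / 2**23   # 1065353216 == 127 * 2**23
```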
Let's see if this works. We've come up with integer values of
1098907648 and 1107296256 corresponding to 16.0 and 32.0. So:
(1098907648 - 1065353216) / 8388608 = 4
(1107296256 - 1065353216) / 8388608 = 5
or the other way:
1098907648 / 8388608 - 127 = 4
1107296256 / 8388608 - 127 = 5
So it's working!
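And it keeps working, in the approximate sense, for the numbers in between; here's a quick check of my own on 25.25:

```python
import math
import struct

bits = struct.unpack(">I", struct.pack(">f", 25.25))[0]
print(bits / 2**23 - 127)    # 4.578125, the piecewise linear estimate
print(math.log2(25.25))      # about 4.658, the true value
```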
It's said that a picture is worth 1,000 words, and it's past time to do some of this explaining that way. Here's a graph which, not coincidentally, looks an awful lot like the one you posted from Mitchell 1961:
The red curve is the actual base-2 log function. The black dots are the places where log2(x) is an integer. These are the points where the raw bits of a binary32 float value, interpreted as an integer and suitably scaled, are exactly equal to log2. And then in blue we have straight line segments directly connecting those points.
To recap: if we take the raw bits of binary32 float values corresponding to exact powers of 2, we get:
| X | binary32 (hex) | binary32 (dec) | log2 |
|---|---|---|---|
| 0.25 | 3e800000 | 1048576000 | -2 |
| 0.5 | 3f000000 | 1056964608 | -1 |
| 1 | 3f800000 | 1065353216 | 0 |
| 2 | 40000000 | 1073741824 | 1 |
| 4 | 40800000 | 1082130432 | 2 |
| 8 | 41000000 | 1090519040 | 3 |
| 16 | 41800000 | 1098907648 | 4 |
| 32 | 42000000 | 1107296256 | 5 |
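That table is straightforward to regenerate (a sketch of mine):

```python
import math
import struct

for x in (0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0):
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    print(f"{x:6}  {bits:08x}  {bits:10d}  {int(math.log2(x)):3d}")
```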
An exact power of 2 has a significand of 1.0 (that is, these numbers are all 1.0 × 2^N), meaning that the significand field is encoded in IEEE-754 as all 0's (thanks to the “hidden 1 bit” property).
And since they're positive, their sign bit is 0 also, and the only part that's not 0 is the exponent. And since for exact powers the exponent is the logarithm, for these numbers, the integer interpretation of the raw bit pattern is always exactly the logarithm, after shifting and scaling. (I presume that some aspect of that “shifting and scaling” is where the magic number 0x5f37642f comes in.)
And then, for all the numbers that aren't exact powers of 2, and for which the logarithm isn't an integer, the way the significand values are laid out means that the raw values (as shown by the blue lines in the plot above) form a perfect linear interpolation between the exact powers. Therefore, in (almost) all cases, the raw bit patterns do indeed end up being a “piecewise linear approximation to the logarithm”. (I said “almost” because this result doesn't hold for the subnormals, or for the special values Inf and NaN.)
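How good is that approximation? The maximum gap between the blue chords and the red curve is easy to measure numerically (a sketch of my own; the figure comes out to about 0.086):

```python
import math
import struct

def log2_approx(x: float) -> float:
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits / 2**23 - 127

# The log curve is concave, so each straight chord sits below it; scanning
# one segment suffices, since every segment has the same error profile.
worst = max(math.log2(1 + i / 100000) - log2_approx(1 + i / 100000)
            for i in range(1, 100000))
print(worst)   # about 0.086
```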
The other interpretation of the question “Why” is, “Why did the designers of IEEE-754 set things up this way?” I don't believe it was deliberate, that is, I don't believe it was a requirement that the raw bit pattern, interpreted as an integer, correspond to a piecewise linear approximation of the base-2 logarithm. However, given these three related facts:
- the definition of an exponential notation basically is a piecewise approximation of a logarithm
- IEEE-754 did have a goal of keeping positive floating-point values monotonically increasing per their raw bit values (spot-checked in the sketch below)
- IEEE-754 hides the high-order “1” bit of the significand; all significands are of the form 1.xxx but only the xxx part is stored as such
the “piecewise linear approximation” result falls out rather neatly.
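That second fact, the monotonic ordering, is easy to spot-check (one last sketch of my own):

```python
import random
import struct

def float_bits(x: float) -> int:
    return struct.unpack(">I", struct.pack(">f", x))[0]

# For positive floats, sorting by numeric value and sorting by raw bits agree.
xs = sorted(random.uniform(1e-6, 1e6) for _ in range(10000))
assert [float_bits(x) for x in xs] == sorted(float_bits(x) for x in xs)
print("raw-bit order matches numeric order")
```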