Some possible reasons for 8-bit bytes
I’ve been working on a zine about how computers represent things in binary, and one question I’ve gotten a few times is – why does the x86 architecture use 8-bit bytes? Why not some other size?
With any question like this, I think there are two options:
- It’s a historical accident, and another size (like 4 or 6 or 16 bits) would work just as well
- 8 bits is objectively the Best Option for some reason, and even if history had played out differently we’d still use 8-bit bytes
- some combination of 1 & 2
I’m not super into computer history (I like using computers a lot more than I like reading about them), but I’m always curious whether there’s an essential reason for why a computer thing is the way it is today, or whether it’s mostly a historical accident. So we’re going to talk about some computer history.
As an example of a historical accident: DNS has a class field which has 5 possible values (“internet”, “chaos”, “hesiod”, “none”, and “any”). To me that’s a clear example of a historical accident – I can’t imagine that we’d define the class field the same way if we could redesign DNS today without worrying about backwards compatibility. I’m not sure if we’d use a class field at all!
There are no definitive answers in this post, but I asked on Mastodon and here are some potential reasons I found for the 8-bit byte. I think the answer is some combination of these reasons.
what’s the difference between a byte and a word?
First, this post talks about “bytes” and “words” a lot. What’s the difference between a byte and a word? My understanding is:
- the byte size is the smallest unit you can address. For example, in a program on my machine 0x20aa87c68 might be the address of one byte, and then 0x20aa87c69 is the address of the next byte. (There’s a tiny sketch of what this means right after this list.)
- the word size is some multiple of the byte size. I’ve been confused about this for years, and the Wikipedia definition is incredibly vague (“a word is the natural unit of data used by a particular processor design”). Apparently on x86 the word size is 16 bits, even though the registers are 64 bits.
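Here’s a tiny sketch of what “smallest unit you can address” means, using Python’s ctypes to look at the addresses of two adjacent bytes (the exact addresses will of course be different on your machine):

```python
import ctypes

buf = ctypes.create_string_buffer(b"hi")   # a few bytes sitting next to each other in memory
first_byte = ctypes.addressof(buf)         # address of the first byte
second_byte = first_byte + 1               # the next byte is exactly one address later
print(hex(first_byte), hex(second_byte))
```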
Now let’s talk about some possible reasons we use 8-bit bytes!
reason 1: to fit the English alphabet in 1 byte
This Wikipedia article says that the IBM System/360 introduced the 8-bit byte in 1964.
Here’s a video interview with Fred Brooks (who managed the project) talking about why. I’ve transcribed some of it here:
… the six bit bytes [are] really better for scientific computing and the 8-bit byte ones are really better for commercial computing and each can be made to work for the other.
[….]
So it came down to an executive decision and I decided for the 8-bit byte, Jerry’s proposal.
My most important technical decision in my IBM career was to go with the 8-bit byte for the 360.
And on the basis of I believe character processing was going to become important as opposed to decimal digits.
It makes sense that an 8-bit byte would be better for text processing: 2^6 is 64, so 6 bits wouldn’t be enough for lowercase letters, uppercase letters, and symbols.
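Just counting it out makes this concrete:

```python
lowercase, uppercase, digits = 26, 26, 10
# 62 slots already, before any punctuation, space, or control codes – so 2**6 = 64 isn't enough
print(lowercase + uppercase + digits)
```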
To go along with the 8-bit byte, System/360 also introduced the EBCDIC encoding, which is an 8-bit character encoding.
It looks like the next important machine in 8-bit-byte history was the Intel 8008, which was built to be used in a computer terminal (the Datapoint 2200). Terminals need to be able to represent letters as well as terminal control codes, so it makes sense for them to use an 8-bit byte.
This Datapoint 2200 manual from the Computer History Museum says on page 7 that the Datapoint 2200 supported ASCII (7 bit) and EBCDIC (8 bit).
why was the 6-bit byte better for scientific computing?
I was curious about the comment that the 6-bit byte would be better for scientific computing. Here’s a quote from this interview with Gene Amdahl:
I wanted to make it 24 and 48 instead of 32 and 64, on the basis that this would have given me a more rational floating point system, because in floating point, with the 32-bit word, you had to keep the exponent to just 8 bits for exponent sign, and to make that reasonable in terms of the numeric range it could span, you had to shift by 4 bits instead of by a single bit. And so it caused you to lose some of the information more rapidly than you would with binary shifting
I don’t understand this comment at all – why does the exponent have to be 8 bits if you use a 32-bit word size? Why couldn’t you use 9 bits or 10 bits if you wanted? But it’s all I could find in a quick search.
why did mainframes use 36 bits?
Also related to the 6-bit byte: a lot of mainframes used a 36-bit word size. Why? Someone pointed out that there’s a great explanation in the Wikipedia article on 36-bit computing:
Prior to the introduction of computers, the state of the art in precision scientific and engineering calculation was the ten-digit, electrically powered, mechanical calculator… These calculators had a column of keys for each digit, and operators were trained to use all their fingers when entering numbers, so while some specialized calculators had more columns, ten was a practical limit.
Early binary computers aimed at the same market therefore often used a 36-bit word length. This was long enough to represent positive and negative integers to an accuracy of ten decimal digits (35 bits would have been the minimum)
So this 36 bit thing seems to be based on the fact that log_2(20000000000) is 34.2. Huh.
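As a quick sanity check on that arithmetic (the 2 in front is there because you want both positive and negative values):

```python
import math

# ten decimal digits, positive and negative: roughly 2 * 10^10 distinct values
print(math.log2(2 * 10**10))  # ≈ 34.2, so 35 bits is the minimum
```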
My guess is that the reason for this is that in the 50s, computers were extremely expensive. So if you wanted your computer to support ten decimal digits, you’d design it so that it had exactly enough bits to do that, and no more.
Today computers are way faster and cheaper, so if you want to represent ten decimal digits for some reason you can just use 64 bits – wasting a little bit of space is usually no big deal.
reason 2: to work well with binary-coded decimal
In the 60s, there was a popular integer encoding called binary-coded decimal (or BCD for short) that encoded every decimal digit in 4 bits.
For example, if you wanted to encode the number 1234, in BCD that would be something like:
0001 0010 0011 0100
So if you want to be able to easily work with binary-coded decimal, your byte size should be a multiple of 4 bits, like 8 bits!
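Here’s a small Python sketch of that encoding (the to_bcd function is just something I made up for illustration, not a standard library thing):

```python
def to_bcd(n: int) -> str:
    # encode each decimal digit of n in its own 4-bit group
    return " ".join(format(int(digit), "04b") for digit in str(n))

print(to_bcd(1234))  # 0001 0010 0011 0100
```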
why was BCD popular?
This integer representation seemed really weird to me – why not just use binary, which is a much more efficient way to store integers? Efficiency was really important in early computers!
My best guess about why is that early computers didn’t have displays the same way we do now, so the contents of a byte were mapped directly to on/off lights.
Here’s an image from Wikipedia of an IBM 650 with some lights on its display (CC BY-SA 3.0):
So if you want people to be able to relatively easily read off a decimal number from its binary representation, this makes a lot more sense. I think BCD is obsolete today because we have displays and our computers can convert numbers represented in binary to decimal for us and display them.
Also, I wonder if BCD is where the term “nibble” for 4 bits comes from – in the context of BCD, you end up referring to half bytes a lot (because every digit is 4 bits). So it makes sense to have a word for “4 bits”, and people called 4 bits a nibble. Today “nibble” feels like an archaic term to me though – I’ve definitely never used it except as a fun fact (it’s such a fun word!). The Wikipedia article on nibbles supports this theory:
The nibble is used to describe the amount of memory used to store a digit of a number stored in packed decimal format (BCD) within an IBM mainframe.
Another reason someone mentioned for BCD was financial calculations. Today if you want to store a dollar amount, you’ll typically just use an integer number of cents, and then divide by 100 if you want the dollar part. This is no big deal, division is fast. But apparently in the 70s dividing an integer represented in binary by 100 was very slow, so it was worth it to redesign how you represent your integers to avoid having to divide by 100.
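Just to make the “integer number of cents” approach concrete, here’s roughly what that looks like:

```python
price_in_cents = 1999
dollars, cents = divmod(price_in_cents, 100)  # one cheap division on today's hardware
print(f"${dollars}.{cents:02d}")              # $19.99
```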
Okay, enough about BCD.
reason 3: 8 is a power of 2?
A bunch of people said it’s important for a CPU’s byte size to be a power of 2. I can’t figure out whether this is true or not though, and I wasn’t satisfied with the explanation “computers use binary so powers of 2 are good”. That seems very plausible but I wanted to dig deeper.
And historically there have definitely been lots of machines that used byte sizes that weren’t powers of 2, for example (from this retro computing stack exchange thread):
- Cyber 180 mainframes used 6-bit bytes
- the Univac 1100 / 2200 series used a 36-bit word size
- the PDP-8 was a 12-bit machine
Some reasons I heard for why powers of 2 are good that I haven’t understood yet:
- every bit in a word needs a bus, and you want the number of buses to be a power of 2 (why?)
- a lot of circuit logic is amenable to divide-and-conquer techniques (I think I need an example to understand this)
Reasons that made more sense to me:
- it makes it easier to design clock dividers that can measure “8 bits were sent on this wire” based on halving – you can put 3 halving clock dividers in series. Graham Sutherland told me about this and made this really cool simulator of clock dividers showing what these clock dividers look like. That site (Falstad) also has a bunch of other example circuits and it seems like a really cool way to make circuit simulators.
- if you have an instruction that zeroes out a specific bit in a byte, then if your byte size is 8 (2^3), you can use just 3 bits of your instruction to indicate which bit. x86 doesn’t seem to do this, but the Z80’s bit testing instructions do.
- someone mentioned that some processors use carry-lookahead adders, and they work in groups of 4 bits. From some quick Googling it seems like there’s a wide variety of adder circuits out there though.
- bitmaps: your computer’s memory is organized into pages (usually of size 2^n). It needs to keep track of whether or not every page is free. Operating systems use a bitmap to do this, where each bit corresponds to a page and is 0 or 1 depending on whether the page is free. If you had a 9-bit byte, you would need to divide by 9 to find the page you’re looking for in the bitmap. Dividing by 9 is slower than dividing by 8, because dividing by powers of 2 is always the fastest thing. (There’s a little sketch of this indexing right after the list.)
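Here’s a rough sketch of that bitmap indexing (my own illustration, not code from any particular operating system), assuming a 1 bit means “free”:

```python
def page_is_free(bitmap: bytes, page: int) -> bool:
    # with 8-bit bytes, finding the right byte and bit is just a shift and a mask
    byte_index = page >> 3   # same as page // 8
    bit_index = page & 0x7   # same as page % 8
    return (bitmap[byte_index] >> bit_index) & 1 == 1

free_pages = bytes([0b00000101])     # pages 0 and 2 are free
print(page_is_free(free_pages, 2))   # True
```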
I probably mangled some of these explanations pretty badly: I’m pretty far out of my comfort zone here. Let’s move on.
reason 4: small byte sizes are good
You might be wondering – well, if 8-bit bytes were better than 4-bit bytes, why not keep increasing the byte size? We could have 16-bit bytes!
A couple of reasons to keep byte sizes small:
- It’s a waste of space – a byte is the minimum unit you can address, and if your computer is storing a lot of ASCII text (which only needs 7 bits), it would be a pretty big waste to dedicate 12 or 16 bits to each character when you could use 8 bits instead.
- As bytes get bigger, your CPU needs to get more complex. For example you need one bus line per bit. So I guess simpler is better.
My understanding of CPU architecture is extremely shaky so I’ll leave it at that. The “it’s a waste of space” reason feels pretty compelling to me though.
reason 5: compatibility
The Intel 8008 (from 1972) was the precursor to the 8080 (from 1974), which was the precursor to the 8086 (from 1976) – the first x86 processor. It seems like the 8080 and the 8086 were really popular and that’s where we get our modern x86 computers.
I think there’s an “if it ain’t broke don’t fix it” thing going on here – I assume that 8-bit bytes were working well, so Intel saw no need to change the design. If you keep the same 8-bit byte, then you can reuse more of your instruction set.
that’s all!
It seems to me like the main reasons for the 8-bit byte are:
- a lot of early computer companies were American, and the most commonly used language in the US is English
- those people wanted computers to be good at text processing
- smaller byte sizes are in general better
- 7 bits is the smallest size you can fit all the English characters + punctuation in
- 8 is a better number than 7 (because it’s a power of 2)
- once you have popular 8-bit computers that are working well, you want to keep the same design for compatibility
Someone pointed out that page 65 of this book from 1962 talking about IBM’s reasons to choose an 8-bit byte basically says the same thing:
- Its full capacity of 256 characters was considered to be sufficient for the great majority of applications.
- Within the limits of this capacity, a single character is represented by a single byte, so that the length of any particular record is not dependent on the coincidence of characters in that record.
- 8-bit bytes are reasonably economical of storage space
- For purely numerical work, a decimal digit can be represented by only 4 bits, and two such 4-bit bytes can be packed in an 8-bit byte. Although such packing of numerical data is not essential, it is a common practice in order to increase speed and storage efficiency. Strictly speaking, 4-bit bytes belong to a different code, but the simplicity of the 4-and-8-bit scheme, as compared with a combination 4-and-6-bit scheme, for example, leads to simpler machine design and cleaner addressing logic.
- Byte sizes of 4 and 8 bits, being powers of 2, permit the computer designer to take advantage of powerful features of binary addressing and indexing to the bit level (see Chaps. 4 and 5).
Overall this makes me feel like an 8-bit byte is a pretty natural choice if you’re designing a binary computer in an English-speaking country.