Two interesting XOR circuits inside the Intel 386 processor

Intel’s 386 processor (1985) was an important advance in the x86 architecture, not only moving to a 32-bit processor but also switching to a CMOS implementation.
I’ve been reverse-engineering parts of the 386 chip and came across two interesting and completely different
circuits that the 386 uses to implement an XOR gate: one uses standard-cell logic while the other uses pass-transistor logic.
In this article, I take a look at these circuits.
The die of the 386.
The die photo above shows the two metal layers of the 386 die. The polysilicon and silicon layers underneath
are mostly hidden by the metal.
The black dots around the edges are the bond wires connecting the die to the external pins.
The 386 is a complicated chip with 285,000 transistor sites. I’ve labeled the main functional blocks.
The datapath in the lower left does the actual computations, controlled by the microcode ROM in the lower right.
Despite the complexity of the 386, if you zoom in enough, you can see individual XOR gates.
The red rectangle at the top (below) is a shift register for the chip’s self-test. Zooming in again shows the silicon for an XOR gate
implemented with pass transistors.
The purple outlines show active silicon regions, while the stripes are transistor gates.
The yellow rectangle zooms in on part of the standard-cell logic that controls the prefetch queue.
The closeup shows the silicon for an XOR gate implemented with two logic gates.
Counting the stripes shows that the first XOR gate is implemented with 8 transistors while the second uses 10 transistors. I’ll explain below how these transistors are connected to form the XOR gates.
The die of the 386, zooming in on two XOR gates.
A short introduction to CMOS
CMOS circuits are used in almost all modern processors.
These circuits are built from two types of transistors: NMOS and PMOS.
These transistors can be viewed as switches between the source and drain, controlled by the gate.
A high voltage on the gate of an NMOS transistor turns the transistor on, while a low voltage on the gate of
a PMOS transistor turns the transistor on.
An NMOS transistor is good at pulling the output low, while a PMOS transistor is good at pulling the output high.
Thus, NMOS and PMOS transistors are opposites in many ways; they are complementary, which is the “C” in CMOS.
Structure of a MOS transistor. Although the transistor’s name represents the Metal-Oxide-Semiconductor layers, modern MOS transistors typically use polysilicon instead of metal for the gate.
In a CMOS circuit, the NMOS and PMOS transistors work together, with the NMOS transistors pulling the output low
as needed while the PMOS transistors pull the output high.
By arranging the transistors in different ways, different logic gates can be created.
The diagram below shows a NAND gate constructed from two PMOS transistors (top) and two NMOS transistors (bottom).
If both inputs are high, the NMOS transistors turn on and pull the output low. But if either input is low, a PMOS transistor
will pull the output high. Thus, the circuit below implements a NAND gate.
A NAND gate implemented in CMOS.
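The switch view of the NAND gate can be sketched in a few lines of Python. This is only a simplified model that treats each transistor as an ideal switch; the function names are made up for illustration, and the series/parallel arrangement is the usual one for a CMOS NAND (NMOS in series to ground, PMOS in parallel to power).

```python
def nmos_on(gate: int) -> bool:
    """An NMOS transistor conducts when its gate is high."""
    return gate == 1


def pmos_on(gate: int) -> bool:
    """A PMOS transistor conducts when its gate is low."""
    return gate == 0


def cmos_nand(a: int, b: int) -> int:
    # Pull-down network: two NMOS transistors in series to ground,
    # so both must be on to pull the output low.
    pull_down = nmos_on(a) and nmos_on(b)
    # Pull-up network: two PMOS transistors in parallel to power,
    # so either one being on pulls the output high.
    pull_up = pmos_on(a) or pmos_on(b)
    # Exactly one of the two networks should conduct for any input.
    assert pull_up != pull_down
    return 1 if pull_up else 0


nand_table = {(a, b): cmos_nand(a, b) for a in (0, 1) for b in (0, 1)}
for inputs, output in nand_table.items():
    print(inputs, "->", output)
```

The assertion inside `cmos_nand` captures the complementary behavior: for every input combination, one network conducts and the other does not.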
Notice that NMOS and PMOS transistors have an inherent inversion: a high input produces a low (for NMOS) or a low input produces a high (for PMOS).
Thus, it is straightforward to produce logic circuits such as an inverter, NAND gate, NOR gate,
or an AND-OR-INVERT gate.
However, producing an XOR (exclusive-or) gate doesn’t work with this approach:
an XOR gate produces a 1 if either input is high, but not both.1
The XNOR (exclusive-NOR) gate, the complement of XOR, also has this problem.
As a result, chips often have creative implementations of XOR gates.
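One rough way to see the problem is to check how each function reacts when a single input is raised from 0 to 1. Because of the inherent inversion, raising an input of a single inverting gate can only leave the output unchanged or make it fall. The sketch below is my own illustration of that property, not something taken from the chip; the helper name `never_rises` is hypothetical.

```python
from itertools import product


def never_rises(fn, n_inputs=2):
    """True if raising any single input from 0 to 1 never raises the output."""
    for bits in product((0, 1), repeat=n_inputs):
        for i in range(n_inputs):
            if bits[i] == 0:
                raised = bits[:i] + (1,) + bits[i + 1:]
                if fn(*raised) > fn(*bits):
                    return False
    return True


gates = {
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
    "XOR": lambda a, b: a ^ b,
    "XNOR": lambda a, b: 1 - (a ^ b),
}

results = {name: never_rises(fn) for name, fn in gates.items()}
for name, ok in results.items():
    print(name, "fits a single inverting gate" if ok else "does not fit")
```

NAND and NOR pass the check, while XOR and XNOR both fail it, since each has a case where raising an input raises the output.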
The standard-cell two-gate XOR circuit
Parts of the 386 were implemented with standard-cell logic.
The idea of standard-cell logic is to build circuitry out of standardized building blocks that can be wired
by a computer program.
In earlier processors such as the 8086, each transistor was carefully positioned by hand to create a chip layout
that was as dense as possible.
This was a tedious, error-prone process since the transistors were fit together like puzzle pieces.
Standard-cell logic is more like building with LEGO. Each gate is implemented as a standardized block and the blocks are arranged
in rows, as shown below.
The space between the rows holds the wiring that connects the blocks.
Some rows of standard-cell logic in the 386 processor. This is part of the segment descriptor control circuitry.
The advantage of standard-cell logic is that it is much faster to create a design since the process can be
automated.
The engineer described the circuit in terms of the logic gates and their connections.
A computer algorithm placed the blocks so related blocks are near each other.
An algorithm then routed the circuit, creating the wiring between the blocks.
These “place and route” algorithms are challenging because they must solve an extremely difficult optimization problem:
determining the best locations for the blocks and how to pack the wiring as densely as possible.
At the time, the algorithm took a day on a powerful IBM mainframe to compute the layout.
Nonetheless, the automated process was much faster than manual layout, cutting weeks off the development
time for the 386.
The downside is that the automated layout is less dense than manually optimized layout, with a lot more wasted
space.
(As you can see in the photo above, the density is low in the wiring channels.)
For this reason, the 386 used manual layout for circuits where a dense layout was important, such as the datapath.
In the 386, the standard-cell XOR gate is built by combining a NOR gate with an AND-NOR gate as shown below.2
(Although AND-NOR looks complicated, it is implemented as a single gate in CMOS.)
You can verify that if both inputs are 0, the NOR gate forces the output low, while if both inputs are 1, the AND gate forces the output low, providing the XOR functionality.
Schematic of an XOR circuit.
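The verification described above can be done mechanically. The following is a minimal gate-level sketch, assuming the AND-NOR gate computes NOT((A AND B) OR C) with C fed from the first NOR gate; the function names are illustrative.

```python
def nor(a: int, b: int) -> int:
    """Two-input NOR gate."""
    return 1 - (a | b)


def and_nor(a: int, b: int, c: int) -> int:
    """AND-NOR gate: the NOR of (a AND b) with c."""
    return 1 - ((a & b) | c)


def xor_standard_cell(x: int, y: int) -> int:
    # The first NOR gate detects the case where both inputs are 0.
    both_zero = nor(x, y)
    # The AND part detects both inputs being 1; either case forces a 0.
    return and_nor(x, y, both_zero)


xor_table = {(x, y): xor_standard_cell(x, y) for x in (0, 1) for y in (0, 1)}
for inputs, output in xor_table.items():
    print(inputs, "->", output)
```

The output is 1 only for the two mixed input cases, matching XOR.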
The photo below shows the layout of this XOR gate as a standard cell.
I’ve removed the metal and polysilicon layers to show the underlying silicon. The outlined regions are the
active silicon, with PMOS above and NMOS below. The stripes are the transistor gates, normally covered by polysilicon wires.
Notice that neighboring transistors are connected by shared silicon; there is no demarcation between the source
of one transistor and the drain of the next.
The silicon implementing the XOR standard cell. This image is rotated 180° from the layout on the die to put PMOS on top.
The schematic below corresponds to the silicon above. Transistors a, b, c, and d implement the first
NOR gate. Transistors g, h, i, and j implement the AND part of the AND-NOR gate.
Transistors e and f
implement the NOR input of the AND-NOR gate, fed from the first NOR gate.
The standard cell library is designed so all the cells are the same height
with a power rail at the top and a ground rail at the bottom. This allows the cells to “snap together” in rows.
The wiring inside the cell is implemented in polysilicon and the lower metal layer (M1), while the wiring between
cells uses the upper metal layer (M2) for vertical connections and lower metal (M1) for horizontal connections.
This strategy allows vertical wires to pass over the cells without interfering with the cell’s wiring.
Transistor layout in the XOR standard cell.
One important factor in a chip such as the 386 is optimizing the sizes of transistors.
If a transistor is too small, it will take too much time to switch its output line, reducing performance.
But if a transistor is too large, it will waste power as well as slowing down the circuit that is driving it.
Thus, the standard-cell library for the 386 includes several XOR gates of various sizes.
The diagram below shows a considerably larger XOR standard cell. The cell is the same height as the previous XOR
(as required by the standard cell layout), but it is much wider and the transistors inside the cell are taller.
Moreover, the PMOS side uses pairs
of transistors to double the current capacity. (NMOS has better performance than PMOS so doesn’t require
doubling of the transistors.) Thus, there are 10 PMOS transistors and 5 NMOS transistors in this XOR cell.
A large XOR standard cell. This cell is also rotated from the die layout.
The pass transistor circuit
Some parts of the 386 implement XOR gates completely differently, using pass transistor logic.
The idea of pass transistor logic is to use transistors as switches that pass inputs through to the output,
rather than using transistors as switches to pull the output high or low.
The pass transistor XOR circuit uses 8 transistors, compared with 10 for the previous circuit.3
The die photo below shows a pass-transistor XOR circuit, highlighted in red.
Note that the surrounding circuitry is irregular and much more tightly packed than the standard-cell circuitry.
This circuit was laid out manually, producing an optimized layout compared to standard cells.
It has four PMOS transistors at the top and four NMOS transistors at the bottom.
The pass-transistor XOR circuit on the die. The green regions are oxide that was not completely removed, causing thin-film interference.
The schematic below shows the heart of the circuit, computing the exclusive-NOR (XNOR) of X and Y with four pass transistors.
To understand the circuit, consider the four input cases for X and Y.
If X and Y are both 0, PMOS transistor a will turn on (because Y is low), passing 1
to the XNOR output.
(X̄ is the complemented value of the X input.)
If X and Y are both 1, PMOS transistor b will turn on (because X̄ is low), passing 1.
If X and Y are 1 and 0 respectively, NMOS transistor c will turn on (because X is high), passing 0.
If X and Y are 0 and 1 respectively, NMOS transistor d will turn on (because Y is high), passing 0.
Thus, the four transistors implement the XNOR function, with a 1 output if both inputs are the same.
Partial implementation of XNOR with four pass transistors.
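The four cases can be checked with a small switch-level model. This is a sketch under assumptions: the gate signals for transistors a through d follow the description above, but which signal each transistor passes is my inference from the four cases, not something read off the schematic. The model also ignores signal strength and treats every "on" transistor as an ideal switch.

```python
def pass_transistor_xnor(x: int, y: int) -> int:
    x_bar = 1 - x  # complemented X, produced by an inverter in the real circuit

    # Each entry is (transistor is on?, value it passes when on).
    # The passed signals are assumed; they are chosen to match the four cases.
    transistors = {
        "a": (y == 0, x_bar),  # PMOS, on when Y is low
        "b": (x_bar == 0, y),  # PMOS, on when complemented X is low
        "c": (x == 1, y),      # NMOS, on when X is high
        "d": (y == 1, x),      # NMOS, on when Y is high
    }

    # Collect every value driven onto the output node.
    driven = {value for on, value in transistors.values() if on}
    # The output must be driven, and all drivers must agree.
    assert len(driven) == 1
    return driven.pop()


xnor_table = {(x, y): pass_transistor_xnor(x, y) for x in (0, 1) for y in (0, 1)}
for inputs, output in xnor_table.items():
    print(inputs, "->", output)
```

The assertion checks that for every input at least one transistor drives the output and that no two transistors fight each other, which is what a working pass-transistor network needs.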
Making an XOR gate out of this requires two additional inverters.
The first inverter produces X̄ from X.
The second inverter generates the XOR output by inverting the XNOR output.
The output inverter also has the important function of
buffering the output since the pass transistor output is weaker than the inputs.
Since each inverter takes two transistors, the complete XOR circuit uses 8 transistors.
The schematic below shows the full circuit. The i1 transistors implement the input inverter and the i2
transistors implement the output inverter.
The layout of this schematic matches the earlier die photo.5
Implementation of XOR with eight transistors using pass-transistor logic.
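Putting the pieces together, the complete circuit and its transistor count can be sketched as follows. The pass-transistor core is simplified here to a selection controlled by Y (transistor a when Y is low, transistor d when Y is high), which is an assumption that matches the cases described earlier rather than a full model of all four transistors.

```python
def inverter(value: int) -> int:
    """A CMOS inverter: one PMOS and one NMOS transistor."""
    return 1 - value


def xnor_core(x: int, x_bar: int, y: int) -> int:
    # Simplified pass-transistor core (assumed signal routing):
    # Y low passes the complemented X, Y high passes X.
    return x_bar if y == 0 else x


def pass_transistor_xor(x: int, y: int) -> int:
    x_bar = inverter(x)             # input inverter (i1)
    xnor = xnor_core(x, x_bar, y)   # four pass transistors
    return inverter(xnor)           # output inverter (i2), also buffers the output


# Transistor counts as given in the text.
pass_xor_transistors = {"pass transistors": 4, "inverter i1": 2, "inverter i2": 2}
pass_xor_total = sum(pass_xor_transistors.values())

xor_table = {(x, y): pass_transistor_xor(x, y) for x in (0, 1) for y in (0, 1)}
print(xor_table)
print("total transistors:", pass_xor_total)
```

The total of 8 matches the count from the die photo, two fewer than the 10 transistors in the standard-cell version.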
Conclusions
An XOR gate may seem like a trivial circuit, but there is more going on than you might expect.
I think it is interesting that there isn’t a single solution for implementing XOR; even inside a
single chip, multiple approaches can be used.
(If you’re interested in XOR circuits, I also looked at the XOR circuit in the Z80.)
It’s also reassuring to see that even for a complex chip such as the 386, the circuitry can be broken down into
logic gates and then understood at the transistor level.
I plan to write more about the 386, so
follow me on Twitter @kenshirriff or RSS for updates.
I’m also on Mastodon occasionally as @[email protected].