Dave’s Hacks: Inside the ALU of the armv1
I really enjoyed reading Ken Shirriff’s blog posts about reverse engineering the 8085 (e.g. Inside the ALU of the 8085 microprocessor), and immediately thought of his articles when I saw that the folks over at visual6502.org had announced the release of the mask-level details and a full simulation of the very first ARM chip, the armv1.
With that in mind I embarked on my own attempt to reverse-engineer parts of the armv1. Some background knowledge of the processor’s architecture is helpful, and googling for “ARM Architecture Reference Manual” will lead you to very detailed descriptions of the more modern versions of the processor. Just by looking at the mask and knowing a little about the processor’s architecture it’s possible to make some good guesses at what some of the blocks are.
The barrel shifter is very obvious when you go to the interactive visual6502.org website, zoom in on that portion of the chip and see the diagonal lines. Also, from the architecture description, we know every data-processing (ALU, or arithmetic logic unit) instruction selects 3 registers: one destination register (where the ALU result goes), and a register for each of the two operands, Operand 1 and Operand 2. It’s therefore a reasonable guess that the register file has 3 sets of register selection logic, which is confirmed by the three layers of gates of very similar pattern immediately above the register file.
From the architecture description we know that the ALU is controlled by 16 opcodes:
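For reference, the 16 data-processing opcodes can be written out as a small Python sketch. The mnemonics and 4-bit encodings are taken from the ARM architecture documentation rather than from the die itself, and the names `AluOp` and `ARITHMETIC` are just labels chosen for this illustration.

```python
from enum import IntEnum


class AluOp(IntEnum):
    """The 16 data-processing opcodes and their 4-bit encodings."""
    AND = 0b0000  # logical AND
    EOR = 0b0001  # exclusive OR
    SUB = 0b0010  # subtract
    RSB = 0b0011  # reverse subtract
    ADD = 0b0100  # add
    ADC = 0b0101  # add with carry
    SBC = 0b0110  # subtract with carry
    RSC = 0b0111  # reverse subtract with carry
    TST = 0b1000  # AND, flags only
    TEQ = 0b1001  # EOR, flags only
    CMP = 0b1010  # SUB, flags only (compare)
    CMN = 0b1011  # ADD, flags only (compare negated)
    ORR = 0b1100  # logical OR
    MOV = 0b1101  # move Operand 2
    BIC = 0b1110  # bit clear
    MVN = 0b1111  # move inverted Operand 2


# Opcodes that go through the adder and therefore need the carry chain;
# everything else is a purely bitwise (logical) operation.
ARITHMETIC = {AluOp.SUB, AluOp.RSB, AluOp.ADD, AluOp.ADC,
              AluOp.SBC, AluOp.RSC, AluOp.CMP, AluOp.CMN}

for op in AluOp:
    kind = "arithmetic" if op in ARITHMETIC else "logical"
    print(f"{op.value:04b}  {op.name:<3}  {kind}")
```

Splitting the opcodes into arithmetic and logical groups is handy later on, because only the arithmetic group needs the carry chain enabled.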
So my first step was to make sure I’d found the right area for the ALU. From the architecture description I know that the two inputs to the ALU are:
- Operand 1: the content of register n, as selected by the Rn field of the instruction.
- Operand 2: the output of the barrel shifter (most operations select a shift of 0).
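To make the register selection concrete, here is a minimal Python sketch that pulls the ALU-related fields out of a 32-bit data-processing instruction word. The bit positions follow the standard ARM data-processing encoding as documented in the architecture manuals; the function name `decode_data_processing` and the dictionary keys are hypothetical, picked only for this example.

```python
def decode_data_processing(instr: int) -> dict:
    """Extract the fields that steer the register file and the ALU."""
    return {
        "opcode": (instr >> 21) & 0xF,         # bits 24..21: ALU operation
        "rn": (instr >> 16) & 0xF,             # bits 19..16: Operand 1 register
        "rd": (instr >> 12) & 0xF,             # bits 15..12: destination register
        "rm": instr & 0xF,                     # bits 3..0: register feeding the
                                               # barrel shifter (register form only)
        "immediate": bool((instr >> 25) & 1),  # bit 25: Operand 2 is an immediate
    }


# ADD r0, r1, r2 encoded with the "always" condition
fields = decode_data_processing(0xE0810002)
print(fields)
```

For this instruction the opcode field is 4 (add), Rn is 1, Rd is 0 and Rm is 2, which matches the three register selections described above.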
I therefore started by reverse-engineering the barrel shifter and identifying its output. By following that output I knew it would lead to the ALU.
The portion of the die associated with a single bit slice of the ALU is here:
An example of the translation of transistors into a gate (which corresponds to the upper left circuit of the ALU) is as follows:
The full ALU circuit contains 70+ transistors for each of the 32 bits, or over 2,200 transistors in total.
This diagram corresponds to a single bit in the ALU, so it is replicated 32 times to form the full ALU. On the physical silicon these are stacked one on top of the other, although physically the circuit is swapped left for right, as the inputs to the ALU enter from the right-hand side and exit on the left-hand side.
In the schematic above the control signals (7500, 2370, etc.; these are their net numbers) are shown entering the circuit from above and below; on the physical silicon all these control signals originate from above the ALU.
The eagle-eyed will also notice that the Carry propagation and Zero calculation circuits alternate slightly between each bit, with b0, b2, etc. identical, and b1, b3, etc. identical. The end result is the same, but the reason for the difference is to keep the execution path as fast as possible by eliminating an inverter per bit; note that the Carry Out and Zero Out signals are the opposite polarity to the inputs.
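The alternating-polarity trick can be illustrated with a small Python model. This is a generic sketch of the technique, not a transcription of the armv1 gates: it assumes even bits take a true-polarity carry in and produce an inverted carry out, while odd bits do the reverse, so each stage is a single inverting gate with no extra inverter in the carry path. The function names are hypothetical.

```python
def carry_even(a: int, b: int, cin: int) -> int:
    """Even bit: true carry in, INVERTED carry out (AND-OR-invert gate)."""
    return 1 - ((a & b) | ((a | b) & cin))


def carry_odd(a: int, b: int, ncin: int) -> int:
    """Odd bit: inverted carry in, TRUE carry out (the dual OR-AND-invert gate)."""
    nand_ab = 1 - (a & b)
    nor_ab = 1 - (a | b)
    return 1 - (nand_ab & (nor_ab | ncin))


def add32(x: int, y: int, carry_in: int = 0):
    """32-bit ripple-carry add built from the two alternating cells."""
    result = 0
    carry = carry_in  # true polarity going into bit 0
    for i in range(32):
        a, b = (x >> i) & 1, (y >> i) & 1
        if i % 2 == 0:
            s = a ^ b ^ carry               # carry is true polarity here
            carry = carry_even(a, b, carry)  # leaves this bit inverted
        else:
            s = a ^ b ^ (1 - carry)          # undo the inversion for the sum
            carry = carry_odd(a, b, carry)   # leaves this bit true again
        result |= s << i
    # Bit 31 is an odd bit, so the final carry is back in true polarity.
    return result, carry


print(add32(0xFFFFFFFF, 1))
```

Because the polarity flips at every bit, two adjacent cells together behave exactly like two ordinary carry stages, just with one fewer gate delay per bit. The Zero calculation alternates in a similar way on the real chip but is not modelled here.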
The 16 different ALU operations are selected by the appropriate setting of the control signals, as shown in the table below.
The schematic and the table above give a huge amount of detail! However, it can be broken down into smaller, more digestible pieces.
First, note that 2370, 2371, 7484, and 7485 all have the same setting for every opcode; they, and the associated FET isolation circuitry, can be ignored (their purpose is for another discussion).
Second, note that 7393 is only high when it’s an arithmetic operation; its purpose is to enable/disable the Carry chain.
Third, control signal 7500, and the 3 gates at the top left of the schematic, determine whether Operand 1 is inverted on entry to the rest of the ALU (note that the input signal is already inverted, so the ‘normal’ setting is for it to be 1 to invert it again).
Fourth, control signals 7489 and 7499 select whether Op 2, or its inverted version, is passed through the upper path to point (A) marked on the schematic.
Similarly, control signals 7487 and 7488 select whether Op 2, or its inverted version, is passed through the lower path to generate signal (B) on the schematic.
The instructions that require a subtraction, sub (subtract), rsb (reverse subtract), and cmp (compare), do so by converting the relevant operand into its two’s-complement form: all of its bits are inverted, and 1 is added by feeding a ‘1’ into bit 0 of the carry chain (7326). The ALU then performs an add operation.
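The subtract-by-adding idea is easy to check in a few lines of Python. This is a purely functional sketch: `add_with_carry`, `sub` and `rsb` are hypothetical helper names, and a 32-bit mask stands in for the fixed width of the hardware.

```python
MASK = 0xFFFFFFFF  # keep values to 32 bits, like the hardware


def add_with_carry(a: int, b: int, carry_in: int):
    """The one operation the adder really performs: a + b + carry_in."""
    total = (a & MASK) + (b & MASK) + carry_in
    return total & MASK, total >> 32  # (32-bit result, carry out)


def sub(op1: int, op2: int):
    """op1 - op2, computed as op1 + NOT(op2) + 1."""
    return add_with_carry(op1, ~op2 & MASK, 1)


def rsb(op1: int, op2: int):
    """op2 - op1, computed as NOT(op1) + op2 + 1 (Operand 1 is inverted instead)."""
    return add_with_carry(~op1 & MASK, op2, 1)


print(sub(10, 3))  # (7, 1)
print(rsb(10, 3))  # (4294967289, 0), i.e. -7 as a 32-bit value
```

A compare is the same computation as `sub` with the result thrown away and only the flags kept. Note that the carry out is 1 when no borrow occurred, which is a direct consequence of implementing subtraction this way.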
The various logic operations (and, or, exclusive or, and bit clear) are chosen by selecting the appropriate polarity of each input operand and choosing the right combination of 7489, 7499 and 7487, 7488. For example, note that the only difference in the control signals between the ‘and’ opcode and the ‘bic’ (bit clear) opcode is that the values on 7489 and 7499 are swapped, causing the inverted form of Operand 2 to be fed into the upper calculation path. Meanwhile both 7487 and 7488 are forced high, causing signal (B) to be low regardless of the input.
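Functionally, the relationship between ‘and’ and ‘bic’ comes down to a polarity choice on Operand 2 in front of a shared datapath. The following Python sketch models only that behaviour, not the actual gates or net numbers; `logic_op` and its `invert_op2` flag are hypothetical names for this illustration.

```python
MASK = 0xFFFFFFFF  # 32-bit datapath


def logic_op(name: str, op1: int, op2: int) -> int:
    """Behavioural model of the four logical operations named in the text."""
    # 'bic' is simply 'and' with the inverted form of Operand 2 selected.
    invert_op2 = (name == "bic")
    a = op1 & MASK
    b = (~op2 & MASK) if invert_op2 else (op2 & MASK)

    if name in ("and", "bic"):
        return a & b
    if name == "orr":
        return a | b
    if name == "eor":
        return a ^ b
    raise ValueError(f"not a logical operation: {name}")


for name in ("and", "bic", "orr", "eor"):
    print(f"{name}: {logic_op(name, 0b1100, 0b1010):04b}")
```

With the inputs 1100 and 1010, ‘and’ gives 1000 while ‘bic’ gives 0100: the same AND path is used, only the polarity of Operand 2 differs.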
The table below shows, for all of the opcodes, some of the intermediate results and the outputs for one combination of input bits.
So how are the control signals generated? They’re created by PLA-1, but how that part of the circuit works is for another day.