8bit
So you have finished your Ben Eater breadboard computer build? And now you want to learn a little more, add a little more power, and write somewhat more ambitious programs? Here are 10 improvements to the original design that are relatively easy to do, require only a few more ICs, and will all fit on the original 2x7 breadboards.
I suggest making these improvements in the order they are presented here. The description of each improvement assumes that the previous improvements have been implemented.
This document got started because I decided to source most of my components from a local electronics shop for my build. This meant the exact ICs that Ben Eater used were not always available and I sometimes had to use different ICs, notably:
Initially, I made the build as close to Ben Eater's as I could get it. However, the RAM module had to be re-designed because the GM76C28A-10 RAM IC uses combined input/output data lines, where the original design calls for separate input and output data lines.
After getting the original design (mostly) working I started planning some enhancements with Joost Vromen and Werner van Ipenburg to leverage the extra address pins on my replacement ICs. Some of them have already been implemented, others are on the to-do list. An assembler and simulator for the final machine as envisioned in this document can be found here: https://github.com/wmvanvliet/8bit/tree/final
Some upgrades were inspired by these Reddit threads:
and this Arduino project:
Required components: Some hook-up wire
The machine uses EEPROMs for the control logic, containing microcode. The flexibility created by using microcode instead of a hard-wired logic circuit is a huge boost to the system as a tool for exploration of CPU design, allowing us to add new features fast (for inspiration, see the rest of this document).
However, EEPROMs do not change state instantaneously. Whenever the address lines on the two microcode EEPROM ICs change, we have a period (150ns in my case) where the EEPROM outputs are undefined. This means that during this time, the control lines that are connected to these outputs are bouncing around.
To deal with this, the general design of the machine is to set the control lines on the down-flank of the clock, and have the system "listen" to them only on the up-flank. Except we don't. Since the outputs of the instruction and flags registers are hooked up directly to the address pins of the EEPROMs, the control lines will change as soon as these registers change, which is on the up-flank of the clock. This is not good, as the control lines are bouncing around while the different modules are actively listening to them.
When multiple modules start writing to the bus simultaneously, you get bus fighting and a spike in power consumption. Add in the poor power distribution across the breadboards, and this spike wreaks havoc on the RST, HLT, and OI lines. With better power management, the spike may be handled well enough to not cause problems, but it would be nice to prevent bus fighting altogether.
The solution is to take the 3-8 decoder used to reset the microstep counter and place it on the control line board instead (or use a new one, but in improvement 3 we will remove the existing 3-8 decoder anyway).
Hook up the last 3 outputs of the microcode EEPROM, controlling the 3 least-significant bits of the control word, to the 3 inputs of the decoder. The first output of the decoder is low when all 3 inputs are low, and we keep this one unconnected for the "no output control lines are active" condition. We use the other outputs of the decoder to drive the RO, IO, CO, AO and EO lines. The existing hex-inverter ICs are used to invert the signal when necessary, so the proper signal is sent over the control line and to the indicator LED. The decoder makes sure only one output line is pulled low at any time, so no more bus fighting!
Of course, we must update our microcode to know about the new way of controlling the output control lines:
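As a minimal sketch of what such an encoding could look like: the output lines become a 3-bit selector in the low bits of the control word, while the input lines keep one bit each. The bit positions and the assignment of decoder outputs to RO, IO, CO, AO and EO below are assumptions for illustration, not the values used in the actual build.

```python
# Assumed layout: the 3 least-significant bits of the control word feed the
# 3-to-8 decoder. Which decoder output drives which line is hypothetical.
OUT_NONE = 0b000  # decoder output 0 is unconnected: nothing drives the bus
OUT_RO = 0b001
OUT_IO = 0b010
OUT_CO = 0b011
OUT_AO = 0b100
OUT_EO = 0b101
OUT_MASK = 0b111

# Input-style control lines keep one bit each (positions are assumptions too).
MI = 1 << 3
RI = 1 << 4
II = 1 << 5
AI = 1 << 6


def control_word(output=OUT_NONE, *inputs):
    """Combine one output selector with any number of one-bit input lines."""
    word = output & OUT_MASK
    for line in inputs:
        word |= line
    return word


def active_output(word):
    """Return the decoder output index selected by a control word."""
    return word & OUT_MASK


fetch_step = control_word(OUT_CO, MI)  # program counter out, address register in
```

Because the selector is a single number rather than five independent bits, a control word cannot name two output lines at once, which mirrors what the decoder enforces in hardware.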
Required components: one 74LS273 8-bit register IC, some hook-up wire
We might have prevented bus-fighting during the bouncing of the control signals, but the instruction and flags registers are still violating the basic design of only changing control lines on the down-flank of the clock. We can address this with an extra 74LS273 8-bit register. There is plenty of space for it on the breadboard below the instruction register.
Instead of connecting the instruction register directly to the EEPROM address pins, we connect it to 4 of the inputs of the 74LS273 8-bit register. The flags register is hooked up to an additional 2 inputs of this register. Finally, the outputs of the 74LS273 are hooked up to the EEPROM address pins. The 74LS273 has no output-enable pin, so its outputs are always active; its active-low clear pin is tied high and its clock pin is tied to the inverted clock signal.
Now, the EEPROM address pins only ever change on the down-flank of the clock and the system should be much more stable!
Required components: Some hook-up wire
Every instruction currently takes a full 5 microsteps. However, we have freed up two EEPROM outputs, and the decoder also still has two free outputs. Let's put them to good use! We can use the final output of the decoder as a control line (SR), so we can reset the microstep counter from the microcode.
In terms of hardware, my first thought was to connect the SR control line to the same 74LS00 NAND gate input where the original reset line had been. However, this causes a problem, as the control lines bounce every time the EEPROM address changes, causing random resets.
A better strategy is to connect the SR control line to the "input enable" (load) pin of the 74LS161 4-bit counter (see /u/Positive_Pie6876's Reddit post). A reset is performed by making the counter IC read from its 4 input pins, all of which we tie to ground. Crucially, the 74LS161 will only perform the read on a clock pulse, in this case the down-flank of the clock (remember that the clock pin is connected to the inverted clock signal).
To make the physical reset button also reset the microstep counter, you can connect the "inverted reset" signal that was connected to pin 1 of the 74LS00 directly to the reset pin of the 74LS161.
For example, a NOP instruction will be:
taking only 2 microsteps. Note that setting the SR signal high produces a reset on the next down flank of the clock.
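The effect of SR on instruction length can be sketched with a small model of the microstep counter. The bit values and the two-step NOP layout below are assumptions chosen to match the description (fetch in two steps, SR raised in the second), not a copy of the real microcode.

```python
# Hypothetical control-line bits; only SR matters for this sketch.
MI, CO, RO, II, CE, SR = (1 << n for n in range(6))

# Assumed NOP microcode: the usual two fetch steps, with SR raised in the last one.
NOP = [CO | MI, RO | II | CE | SR]


def run_instruction(microcode, max_steps=5):
    """Step through the microcode and return how many microsteps were used.

    The counter reloads zero on the clock edge that follows a step with SR
    set, so the step that raises SR still executes in full.
    """
    step = 0
    used = 0
    while step < max_steps:
        word = microcode[step] if step < len(microcode) else 0
        used += 1
        if word & SR:
            break  # counter is loaded with 0 on the next down-flank
        step += 1
    return used
```

Calling `run_instruction(NOP)` uses two microsteps, while the same fetch without SR runs through all five.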
Required components: An Arduino Nano board (or clone), some hook-up wire
When the system is in a stable configuration, we can start thinking about some more ambitious upgrades.
The modified SAP-1 design by Ben Eater is genius in that it resembles a plateau point where, in order to add more functionality in a meaningful way, a big jump in terms of hardware is needed. The main limitation is the need to program the machine through DIP switches. These switches are the simplest way to provide input to the machine, so designing the system around them makes sense. There is no need for more than 16 bytes of RAM, because entering larger programs using the switches becomes tiresome. There is no need for additional instructions, because the kinds of programs you can write using 16 bytes of RAM will not fundamentally change with a more elaborate instruction set. Hence, in order to meaningfully add more functionality to the machine, we first need to overcome the need for programming through the DIP switches.
I think that the easiest way to achieve this is to place an Arduino Nano on the board. This is kind of a cheat, as it means embedding a more powerful CPU into our breadboard CPU. A better alternative may be to attach an SD-card reader instead and use an EEPROM to bit-bang the SPI protocol, but I don't know how feasible this is. So, I'm fine with the Arduino hack for now.
The idea is to make the Arduino disable the EEPROMs upon power-on by setting their CE pin high, and take control over some control lines. Hook up the following pins:
Make the Arduino listen to the clock (pin A3), write to the bus, and use the MI and RI lines to store the value on the bus in the memory address register and RAM. After the program has finished loading, disconnect from the bus and control lines (set the pins to INPUT to put them in high-impedance mode), bring the EEPROM CE pin low and the SR line high.
Required components:
Admittedly, this modification is not really easy and is quite a drastic change in terms of hardware, but in my opinion completely worth it. While our memory address and program counter registers only have 4 bits, our data bus is 8 bits, so in theory we could use 8-bit addressing to get access to a whopping 256 bytes of memory. The expansion can be achieved by first doubling up the memory address DIP switch, the 74LS157 line multiplexer, 74LS173 memory address register and 74LS161 program counter. Next, the RAM chip needs to be upgraded to something larger. In all likelihood, the upgraded RAM chip uses the same data lines for both input and output. The circuit must be re-arranged to take this into account, and that is the most complex change we'll have to make.
There is an excellent guide by /u/MironV that provides detailed instructions on how to do it. However, one downside of this design is that it removes the LEDs showing the contents of the RAM. This was a no-go for me, so I created my own design that uses an additional 74LS245 chip, but keeps the LEDs intact:
Using 8-bit memory addresses also means we have to change the microcode. The great thing about 4-bit addresses was that we could pack a 4-bit instruction and a 4-bit address together in the 8-bit instruction register. But no more. An instruction involving a memory address will now take up 2 bytes of memory and the microcode will have to increase the program counter twice during its execution. For example, here is the modified LDA instruction:
Note how we read from RAM three times: once to load the instruction, once more to load the parameter of the instruction, and finally once to load the contents of the requested memory address into the A-register. We've lost some speed, but gained a lot of memory in return!
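The three RAM reads can be made explicit with a step-by-step sketch. The control-line names follow the original design plus the SR line introduced earlier; the exact step layout is an assumption based on the description above, not a copy of the real microcode.

```python
# Each inner list is one microstep; names are control lines asserted in that step.
# The layout is an assumed reconstruction from the prose description.
LDA = [
    ["CO", "MI"],        # address of the opcode -> memory address register
    ["RO", "II", "CE"],  # RAM read 1: opcode -> instruction register, PC + 1
    ["CO", "MI"],        # address of the parameter -> memory address register
    ["RO", "MI", "CE"],  # RAM read 2: parameter (an address) -> MAR, PC + 1
    ["RO", "AI", "SR"],  # RAM read 3: addressed byte -> A register, reset counter
]

ram_reads = sum("RO" in step for step in LDA)
pc_increments = sum("CE" in step for step in LDA)
```

Counting the steps confirms the trade-off: three RAM reads and two program counter increments for a single two-byte instruction.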
With 256 bytes of RAM, you can write programs such as one that computes a square root.
Required components: 4-bit DIP switch, hook-up wire
After all of this, we still have three control lines free (two EEPROM outputs and one decoder output). Let's call one of the free EEPROM outputs the "Segment Select" (SS) line and hook it up to the RAM IC as an extra address line. With 9 address lines, we have 512 addressable bytes in theory, but our bus is only 8 bits wide, so addressing will have to work a little differently.
Addresses 0–255 (SS line low) are dedicated to the programming code and will be read-only (the code segment). Any instructions that read from or write to a memory address will have the SS line high and thus operate on addresses 256–511 (the data segment). Any jump instructions will keep the SS line low and thus jump to addresses in the 0–255 range, which should only contain programming code. Downside: no more self-modifying code. Upside: even more memory!
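The mapping from an 8-bit bus address and the SS line to the 9-bit RAM address can be written down directly. The function name is hypothetical and serves only to illustrate the arithmetic.

```python
def physical_address(bus_address, ss):
    """Map an 8-bit bus address plus the SS line onto the 9-bit RAM address."""
    if not 0 <= bus_address <= 0xFF:
        raise ValueError("bus address must fit in 8 bits")
    # SS acts as the ninth (most significant) address bit.
    return (int(bool(ss)) << 8) | bus_address
```

With SS low the result stays in the code segment (0 to 255); with SS high the same bus address lands in the data segment (256 to 511).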
To use the 9th address line during programming, you can hook up a DIP switch to the 74LS157 multiplexer on the programming board. If you've followed my schematics of the RAM upgrade, there should be a multiplexer free on the IC.
With 512 bytes of RAM, we can write more ambitious programs. To structure them, it sure would be nice to have subroutines!
To call a subroutine means to jump to the start of the subroutine (easy) and, at the end of the subroutine, to return to the place where we called the subroutine from (hard).
Since a subroutine can be called from different places, the return address cannot be hard-coded.
Instead, whenever we call a subroutine, we need to store the return address somewhere.
From a hardware perspective, storing the return address at memory location 256 (address 0 with the SS line high) is most convenient, since we can force a 0 into the memory address register by having it read from the bus with nothing currently writing to the bus.
Figuring out what the return address should be is harder.
At the end of the subroutine, the ret instruction should jump to the instruction after the original call instruction, or we will have an infinite loop.
Here is an example program that calls a subroutine to display the value 42:
In assembly language:
Compiled into binary code:
When we initiate the call instruction, we are at memory address 00000000.
The address we want to jump to, the start of the subroutine, is given as a parameter to the call instruction and placed at address 00000001 (remember that our memory addresses are now 8 bits, so we cannot pack them alongside the opcode anymore).
The program continues (with a hlt instruction) at address 00000010, so this should be the return address.
Conveniently, the program counter will contain the right return address if we keep incrementing it as we read the call instruction and its parameter.
Inconveniently, performing a jump will overwrite the program counter.
So, the order in which things need to happen is:
If you try to compose the microcode for this in your head, you might notice the problem: we need to temporarily store the parameter somewhere while we are writing the return address to the RAM.
The B-register is the perfect place for this.
After all, we are already using it to temporarily hold values during addition and subtraction.
We just need to hook up the BO control line in order to get a value out of it without using the ALU.
Luckily, we still have a line free on our decoder.
And with that, we can write the microcode for call and ret:
We can only store a single return address, so we cannot perform a "nested" call, but even with this limitation, it's a convenient thing to have when writing larger programs.
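The ordering problem described above can be illustrated with a toy model of just the parts involved. The class, its attribute names and the opcode values in the usage note are hypothetical; only the sequence of events follows the text.

```python
class Machine:
    """Toy model of the registers and memory involved in call and ret."""

    def __init__(self, code):
        self.code = list(code)  # code segment (SS low)
        self.data = [0] * 256   # data segment (SS high)
        self.pc = 0             # program counter
        self.temp = 0           # the hardware B register

    def call(self):
        self.pc += 1                    # step past the opcode
        self.temp = self.code[self.pc]  # park the target address in B
        self.pc += 1                    # PC now holds the return address
        self.data[0] = self.pc          # store it at address 0 with SS high
        self.pc = self.temp             # jump, reading the target back via BO

    def ret(self):
        self.pc = self.data[0]          # jump back to the stored return address
```

For a program laid out as call at address 0, its parameter 3 at address 1 and hlt at address 2, for example `Machine([0xC0, 3, 0xFF, 0x00, 0xC1])` with made-up opcode bytes, `call()` leaves the program counter at 3 and the return address 2 in the data segment, and `ret()` brings the program counter back to 2. A second call before the ret would overwrite that single slot, which is why nesting does not work.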
At this point, opcodes are still 4 bits: 4 lines drawn from the instruction register, through a 74LS273 buffer register, to the EEPROM address pins. But the EEPROMs I used have 13 address lines, which means we could use all 8 bits of the instruction register:
Now all 8 bits of the instruction register are hooked up to the inputs of the 74LS273 register, filling it up completely. So we need another 74LS173 as a buffer for the flags. Plenty of room on the board for that. And with that, we have 256 possible opcodes! Let's put them to good use.
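One way to see why 13 address lines are enough: 8 opcode bits, 3 microstep bits and 2 flag bits add up to exactly 13. The sketch below packs them into an address; which pin carries which signal is an assumption made for illustration, and the real wiring may differ.

```python
def eeprom_address(opcode, step, carry, zero):
    """Pack opcode, microstep and flags into a 13-bit EEPROM address.

    Assumed layout (hypothetical): bit 12 = zero flag, bit 11 = carry flag,
    bits 10-3 = opcode, bits 2-0 = microstep.
    """
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= step <= 7:
        raise ValueError("microstep must fit in 3 bits")
    return (int(bool(zero)) << 12) | (int(bool(carry)) << 11) | (opcode << 3) | step
```

Generating the microcode image then amounts to looping over every opcode, step and flag combination and writing the matching control word at the address this function returns.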
Our memory expansion has come at a terrible cost: many instructions now take up two whole bytes (gasp!) and take more microsteps to execute. We can use our abundance of opcodes to offset this somewhat by designing "virtual registers".
With our low clock speeds, reading from RAM is just as fast as reading from the A or B register.
This means that we can use the RAM to simulate additional registers at little extra cost.
Since memory address 256 (0 with SS line high) is already reserved for the return address, let's assign addresses 257-263 (1-7 with SS high) to the "virtual" registers b, c, d, e, f, g and h.
The hardware B-register is never used as a general purpose register and is hereby renamed the "temp" register.
The hardware A-register will remain the a register and will be nicknamed the "accumulator".
To read from and write to a virtual register, we need the microcode to be able to write its memory address to the bus somehow.
Luckily, we still have the final 4 bits of the instruction register hooked up to the bus from way back in the original design.
Let's remove one line and just keep the final 3 bits attached to the bus, matching our 8 registers.
Now, all we have to do is make sure that the opcode for any instruction using the virtual "b" register ends with 001, any instruction using the virtual "c" register ends with 010, and so forth, so setting the IO line high will put the correct address on the bus.
Here are the opcodes for loading the contents of a memory address into a register:
You can also look at this as a single LD command with opcode 00001 with the register as a 3-bit parameter packed alongside the opcode.
So now our instruction set will sometimes have parameters packed alongside the opcode, sometimes parameters on the following memory address, and sometimes both.
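Packing a register into the low 3 bits of an opcode can be sketched as follows. The 5-bit LD prefix 00001 and the endings 001 for b and 010 for c come from the text; treating a as 000 and the helper names are assumptions.

```python
REGISTERS = "abcdefgh"  # index = 3-bit code; a (assumed 000) is the hardware accumulator
LD_PREFIX = 0b00001     # 5-bit opcode prefix from the text


def encode(prefix, register):
    """Pack a 5-bit opcode prefix and a register name into one opcode byte."""
    return (prefix << 3) | REGISTERS.index(register)


def decode(opcode):
    """Split an opcode byte into its 5-bit prefix and register name."""
    return opcode >> 3, REGISTERS[opcode & 0b111]
```

Under these assumptions, `encode(LD_PREFIX, "e")` gives 0b00001100, and the low three bits (100) are exactly what the IO line puts on the bus as the RAM address of virtual register e.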
For example, here is the command for loading the value 42 into virtual register e:
Here is the corresponding microcode to execute it:
(notice how we use the hardware B "temp" register as a temporary storage location again)
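The data flow of such a load can be modeled in a few lines. The function name, the opcode value used in the example call and the control lines named in the comments are assumptions; the sketch only shows the role of the temp register and of the low 3 opcode bits.

```python
def ld_virtual_immediate(opcode, parameter, data_segment):
    """Model the effect of loading an immediate value into a virtual register."""
    temp = parameter & 0xFF   # the parameter byte waits in the temp (B) register
    address = opcode & 0b111  # IO puts the low 3 opcode bits on the bus for the MAR
    if address == 0:
        raise ValueError("register a is the hardware accumulator, not a RAM slot")
    data_segment[address] = temp  # temp register out, RAM in, with SS high
    return data_segment


# Hypothetical opcode whose last three bits are 100, selecting register e.
data = ld_virtual_immediate(0b00010100, 42, [0] * 256)
```

After the call, slot 4 of the data segment (RAM address 260) holds 42. The temp register is needed because the bus has to carry the register address to the memory address register before it can carry the value itself.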
Use the last available EEPROM output to have a separate control line for the XOR gates in the ALU, and the carry-in to the adders.
Using this, you can create ADC and SBC instructions, which perform "add with carry" and "subtract with carry".
Now you can add and subtract numbers larger than 8 bits by processing them byte-by-byte.
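Byte-by-byte addition with a carry can be sketched as below. The function names are hypothetical; `add8` stands in for one pass through the 8-bit ALU, first without and then with the carry from the previous byte.

```python
def add8(a, b, carry_in=0):
    """Add two bytes plus a carry; return (8-bit result, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add16(x, y):
    """Add two 16-bit numbers using two 8-bit additions."""
    lo, carry = add8(x & 0xFF, y & 0xFF)     # plain add on the low bytes
    hi, carry = add8(x >> 8, y >> 8, carry)  # add with carry on the high bytes
    return (hi << 8) | lo, carry
```

For example, `add16(0x01FF, 0x0001)` returns `(0x0200, 0)`: the low bytes overflow and the carry ripples into the high byte.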
Even when using the final 3 bits as a parameter for some instructions, we have room for a lot of opcodes. Given our restricted memory, it makes sense to implement a large number of them, where each opcode can do a lot of work (a CISC design). A good way forward is to try and implement an orthogonal instruction set, meaning that all instructions can deal with any type of parameter, whether that be a register, direct value or memory address.
My instruction set is modeled after that of the Z80:
where # and ## can be a register, a memory address or an immediate value.
For example, here are all the possible variations of the ld instruction:
Note the difference between my_label and [my_label].
The former indicates the memory address of a label, which translates into an immediate value as a parameter to the opcode.
The latter means the contents of the RAM at the memory address of the label, which translates into an address as a parameter to the opcode.
Here are the jp variations: