Sunday, September 23, 2012

It's a RACE!

I've been switching back and forth between the hardware and Verilog versions of the 4004 CPU, based mostly on mood. The Verilog version had been moving ahead so quickly I was thinking I'd have it working before I got the Instruction Pointer board fully populated and tested, but then I ran into the inevitable bugs in my translation from schematic to Verilog.

Suddenly I have an Instruction Pointer board which needs testing, with nothing to properly test it. When my plans for the evening were rescheduled, I got some time to focus on the Verilog version.

Most of the problems I've run into have come either from misreading the schematic or from dropping a negation in the constant polarity reversals. Let me explain the latter: the logic circuits used in the 4004 are all either NOR or NAND gates. This means that the output is the negation (i.e. "NOT") of the basic logic. When they want to OR two signals the output comes out inverted (if either A or B is 1, the output is 0, else it's 1). The same thing happens with a NAND (if A and B are both 1, the output is 0, else it's 1). Every time I trace a signal through a series of logic circuits I have to account for this reversal. If I do a verbatim translation to Verilog I usually get it right, but if I need to do a bit of creative interpretation the chance of getting it wrong goes up.
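To make the polarity bookkeeping concrete, here is a minimal Python sketch of the two gate types and of what it takes to get a plain OR back out of a NOR. The function names (`nor`, `nand`, `or_via_nor`) are my own for illustration and are not taken from the 4004 schematic or from my Verilog source.

```python
def nor(a, b):
    """NOR gate: output is 0 if either input is 1, else 1."""
    return int(not (a or b))


def nand(a, b):
    """NAND gate: output is 0 only if both inputs are 1, else 1."""
    return int(not (a and b))


def or_via_nor(a, b):
    """A true OR needs a second inversion after the NOR."""
    n = nor(a, b)
    return nor(n, n)  # a NOR with both inputs tied together acts as an inverter


# Print the truth table: a, b, NOR, NAND, OR-from-NOR
for a in (0, 1):
    for b in (0, 1):
        print(a, b, nor(a, b), nand(a, b), or_via_nor(a, b))
```

Forgetting that second inversion while tracing a signal through several stages is exactly the kind of slip I keep making.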

Finding such mistakes can be a challenge, but this is a geek's version of a good detective mystery. Every possible clue must be considered; some clues are misleading, but others lead you closer and closer to catching the culprit. It's something I'm pretty good at, and it's fun when there isn't a deadline hanging over my head.

A little while ago I was very excited, thinking I had the system working. The simulator had successfully fetched and executed the first two instructions, but then the simulator screen full of nice, green signal traces (valid states, either 0 or 1) turned to a cascade of red (invalid or unknown states, neither 0 nor 1, represented as 'X'). The first clue was that the first two instructions are both NOPs (No-Operation), which are coded as '0000'. The third instruction, which was being fetched when things blew up, was a JUN (Jump Unconditional), coded as '0100'. For 20 nanoseconds, one internal clock cycle, the data bus showed '0100', but then it changed to '0X00'. That started a cascade of red 'X' states as one signal after another became unknown. It took about an hour (including time spent petting the cat while pondering the problem) to track it back to a missing inversion in the generation of the OPR-IB signal; this caused a second source to attempt to drive the data bus to '0000', and the conflict caused the failure.
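For anyone wondering why only one bit went bad, here is a rough Python sketch of how a simulator resolves a wire with more than one driver. It is a simplification that assumes all drivers have equal strength and uses 'Z' for a source that isn't driving; the names `resolve_bit` and `resolve_bus` are hypothetical, not anything from the simulator or my Verilog.

```python
def resolve_bit(drivers):
    """Resolve one wire: 'Z' means not driving; disagreeing drivers give 'X'."""
    active = {d for d in drivers if d != 'Z'}
    if not active:
        return 'Z'           # nobody is driving the wire
    if len(active) == 1:
        return active.pop()  # all active drivers agree
    return 'X'               # conflict: the value is unknown


def resolve_bus(*sources):
    """Resolve a multi-bit bus, one bit position at a time."""
    return ''.join(resolve_bit(bits) for bits in zip(*sources))


# Only the intended source drives the bus: the JUN opcode comes through.
print(resolve_bus('0100', 'ZZZZ'))  # 0100

# A second source wrongly drives '0000': only the disagreeing bit goes unknown.
print(resolve_bus('0100', '0000'))  # 0X00
```

The three bits where both sources agree on 0 stay valid, which is why the bus read '0X00' rather than 'XXXX'.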

I'm not claiming I've fixed all the bugs, but the simulator just stepped through a sequence of JMS and BBL (call and return) instructions, which tests a whole lot of logic in the Instruction Decode and Instruction Pointer groups. This is enough to be able to test the Instruction Pointer board pretty thoroughly.

So the race is on!
