Thursday, July 5, 2012

Wherein the fundamental flaw is ultimately expressed

July 4th is a US national holiday, and in between celebrating and catching up on my sleep I worked a bit on the Instruction Pointer board layout. I worked out a compact layout for the counters used to manage the refresh and effective address selection, and did the layout for all four bits. This brings the IP board to 60% completion, as measured in placed and routed components, although this doesn't count the capacitors I added to the layout.

Which brings me to a flaw in the 4004's design.

Since the 4004 design is based on dynamic logic -- that is, it depends on storing a charge on the input capacitance of MOSFETs -- the state must be refreshed periodically or bad things happen. In most of the design this happens as a natural part of system operation. For example, the Instruction Register shown on Intel's block diagram doesn't really exist; it's just the charges stored on the inputs of the pairs of inverters that drive the decode logic. These "registers" are updated every time a new instruction is fetched, so they don't get corrupted due to charge leakage.

Other sections of the design must hold their values for extended periods of time, like the Instruction Pointer and Scratchpad Register arrays. These DRAM arrays have dedicated logic that makes sure they get refreshed periodically even if they aren't explicitly written by software.

One of the few "oopses" in the design is in the IP Effective Address counter. The first bit of this two-bit counter works just fine. Transmission gates T0371 and T0424 are driven by the CLK1 signal, so the charges on the gates of T0378 and T0420 are refreshed at least every 2 µs. However, the same is not true for the second bit: when bit 0 is high, transmission gates T0284 and T0353 are off, and they will remain off as long as bit 0 remains high. The only reason bit 0 will go low is the execution of a subroutine call or return instruction. Thus, if the software goes too long without executing one of these instructions, the charges on the gates of T0294 and T0350 can leak away, changing the state of bit 1. This changes the row of the IP array used as the instruction pointer, causing the program to jump erratically.
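To make the failure mode concrete, here is a toy Python model of the counter. It is only a sketch under stated assumptions, not a transistor-level simulation: it assumes a call increments the two-bit counter and a return decrements it, it measures time in abstract "cycles", and the hold_cycles limit is a made-up number rather than a measured leakage figure. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EffectiveAddressCounter:
    """Toy model of the two-bit IP effective-address counter."""

    value: int = 0           # 0..3, selects a row of the IP array
    bit1_age: int = 0        # cycles since bit 1's storage nodes were refreshed
    hold_cycles: int = 1000  # hypothetical: cycles a charge stays trustworthy

    def step(self, op: str = "other") -> None:
        """Advance one cycle; op is 'call', 'return', or anything else."""
        if op == "call":
            self.value = (self.value + 1) & 3   # assumed: call counts up
        elif op == "return":
            self.value = (self.value - 1) & 3   # assumed: return counts down

        # Bit 1 is only refreshed while bit 0 is low, because its
        # transmission gates are off whenever bit 0 is high.
        if self.value & 1 == 0:
            self.bit1_age = 0
        else:
            self.bit1_age += 1

    @property
    def bit1_at_risk(self) -> bool:
        """True once bit 1 has gone unrefreshed past the assumed hold time."""
        return self.bit1_age > self.hold_cycles


def run(ops, hold_cycles: int = 1000) -> EffectiveAddressCounter:
    """Feed a sequence of operations through a fresh counter."""
    counter = EffectiveAddressCounter(hold_cycles=hold_cycles)
    for op in ops:
        counter.step(op)
    return counter


if __name__ == "__main__":
    # One call, then a long stretch with no calls or returns: bit 0 stays high.
    stuck = run(["call"] + ["other"] * 2000)
    print("long run inside one subroutine, at risk:", stuck.bit1_at_risk)

    # Same amount of work, but with a return part-way through.
    safe = run(["call"] + ["other"] * 500 + ["return"] + ["other"] * 1500)
    print("returns to top level, at risk:", safe.bit1_at_risk)
```

In this model the only thing that matters is how long bit 0 stays high, which matches the description above: code that sits one (or three) levels deep in subroutines without calling or returning is the case that lets bit 1 decay.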

I can't claim to have discovered this myself. I vaguely recall hearing that some microprocessor required occasional subroutine calls to work around a hardware defect, but that was decades ago and I don't recall knowing it was the 4004. Lajos Kintli pointed out this design flaw when I contacted him regarding my project. I'll have to ask him where he heard about it, just to avoid leaving that bit of trivia dangling.

With this flaw in the design in mind, it becomes a bit clearer why the test code the 4004 Analyzer runs by default starts by performing a sequence of three subroutine calls followed by three subroutine returns.

Since it's looking like I'll have board space to spare, I may lay out pads for capacitors between the gate and source of T0294 and T0350. My breadboard experiments suggest that the components I'm using will hold a "valid" charge for a shockingly long time -- several seconds, as I recall -- so I may not need these, but again it's easier to lay out pads now than to try to add them in later.

1 comment:

  1. Thanks for the analysis. I had heard about this bug, but can't remember where. I always thought that the 3 JMS and 3 return instructions were just to initialize the 12x4 address register to a known state, assuming that the POC (reset) signal doesn't clear the DRAMs (though I haven't checked).
