Thursday, March 31, 2016

Thou shalt check up

One of the commandments that any engineer worth his salt follows is this:
Thou shalt not assume. Thou shalt check up and make damned sure.
In July 2012 I was trying to determine how many transistors of each type I should buy. At the time a BSS83 cost 27.6¢/ea when ordered in quantity, while the FDV301 cost only 4.7¢/ea. I'd discovered the FDV301 worked well in a grounded-source configuration but the gate protection diode could be a problem in other configurations. To help with the decision I created a spreadsheet. It breaks the 1749 FETs needed into three categories:
  • 1041 with the source connected to GND
  • 105 with the drain connected to VDD
  • 603 fitting neither of the above two configurations
Clearly those in the first category were a good fit for the FDV301. The drain-to-VDD configuration wasn't suitable for the FDV301 because pulling the gate to ground could forward-bias the protection diode, pulling the source to ground too. But that configuration would guarantee that the body diode remained reverse-biased, making another FET like the DMN26 a candidate.

When I went back to this spreadsheet this morning to see how many extra BSS83s I had on hand I got a real scare. I'd settled on using the FDV301 only in the grounded-source configuration and the BSS83 for the rest. That meant I needed 1041 FDV301s and 708 BSS83s.

I only purchased 700 BSS83s. And there are no more to be had.

I have a few short strips of tape holding extra BSS83s that I bought for experimentation and haven't used, but it'd be close. If any got lost or damaged I could end up not being able to complete the project. I bought 1200 FDV301s even after having identified a suitable replacement in the DMN26, but the BSS83 was a Unicorn. What had I been thinking?

I started thinking about replacing some of the BSS83s in the second and third categories with either FDV301s or DMN26s. The reason I hadn't done so before was the difficulty in creating an automated process for identifying those that have to be a BSS83.

Then it occurred to me that I'd already looked at one such second-category situation: push-pull drivers. I'd identified 58 places where the FDV301 would work better than the BSS83. More importantly, I'd made the substitution before purchasing the FETs, meaning I actually need 1099 FDV301s and only 650 BSS83s.

Instead of being short 8 BSS83s I have 50 extra, plus any in my little strips.
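The bookkeeping above is easy to get wrong from memory, so here is a minimal Python sketch of the same arithmetic. The counts and purchase quantities are the ones quoted in this post; the dictionary keys and function names are made up for illustration and have nothing to do with the actual spreadsheet or the Eagle program.

```python
# FET counts by circuit configuration, as quoted from the spreadsheet.
NEEDED = {
    "source_to_gnd": 1041,  # good fit for the FDV301
    "drain_to_vdd": 105,
    "other": 603,
}
PUSH_PULL_SWAPS = 58  # positions already changed from BSS83 to FDV301
PURCHASED = {"FDV301": 1200, "BSS83": 700}


def requirements(needed, swaps):
    """Return parts required per type after moving `swaps` positions to FDV301."""
    fdv301 = needed["source_to_gnd"] + swaps
    bss83 = needed["drain_to_vdd"] + needed["other"] - swaps
    return {"FDV301": fdv301, "BSS83": bss83}


def spares(purchased, required):
    """Return purchased minus required per part; negative means a shortfall."""
    return {part: purchased[part] - required[part] for part in required}


before = requirements(NEEDED, 0)
after = requirements(NEEDED, PUSH_PULL_SWAPS)
print("before substitution:", before, spares(PURCHASED, before))
print("after substitution: ", after, spares(PURCHASED, after))
```

Running it with and without the push-pull substitution shows the difference between the scare and the real situation.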

I'd have trouble respinning any of the boards, but if I don't screw up I have enough. To double-check this I reran an Eagle user-language program I'd written to count components on each board. The results of an early version of this program can be seen at the end of my posting on partitioning, but that version didn't differentiate between the transistor types. The updated version does and confirms the transistor counts above.


The demise of the BSS83

I learned yesterday that the NXP BSS83 has been discontinued as "non-manufacturable" and is no longer available. The last order date was December 2014 and the last delivery date December 2015. When I bought components for this project my intention was to order more than enough for the complete board set, but now I'm going to have to make sure what I have on hand is enough. Fortunately 4-terminal FETs are only required in the pass-gate configuration and a few other critical spots.

I heard this from someone who is also interested in building a discrete component version of the 4004. Welcome to my insanity, Aston!

Over the years I've come across various FETs that could be used for such a project. The BSS83 RF FET was a good choice because it has a reasonable threshold voltage and a low gate capacitance (1.5pF), though it was N-channel rather than the 4004's P-channel construction. Others I identified included the N-channel Calogic SST215 (which might be a drop-in replacement for the BSS83) and the P-channel Micrel MIC94050. Tim McNerney suggested using the P-channel dual-FET BSS84 with the two transistors wired in series source-to-source (or drain-to-drain).

Aston has started experimenting with the MIC94050, which is now manufactured by Microchip. He asked whether it would work in the bootstrap load circuit given its low on-state resistance. My answer is that it depends on what you're trying to do with it. My attempt to recreate the bootstrap load circuit was only to understand why this circuit was needed and how it worked. My test circuit added several other components including current-limiting resistors, and I was able to recreate the circuit's operation well enough to understand it.

Tim McNerney, who is far more focused on recreating the exact design than I am, told me that Federico Faggin told him not to try to recreate the bootstrap load in a discrete component implementation because he'd never be able to match the characteristics with off-the-shelf components. I concur. You end up with a complicated circuit where a simple resistor will do the job.

Fortunately bootstrap loads aren't necessary if you use true resistors as loads, as resistors don't turn off when the voltage across them drops below a FET's turn-on threshold the way FET loads do.

What other challenges might a hobbyist face when using the MIC94050 rather than a BSS83? I'd guess speed and power consumption will be the biggest factors. The MIC94050 has an input capacitance of 600pF, compared to the BSS83's 1.5pF and the FDV301's 9.5pF. For a given load resistor value, this will greatly increase propagation delays. To put some numbers to it, let's look at the rise times with a 4.7K load pulling up the gate of a single FET. The RC time constant (Trc) with a BSS83 is about 7ns, and with an FDV301 about 45ns. With the MIC94050 it's 2.8us, which is more than double the 1.35us cycle time of the 4004.
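For anyone who wants to try other load values, here is a small Python sketch of the RC calculation. It assumes a single resistor charging a single gate, with the gate capacitances quoted above, and ignores wiring and drain capacitance; the names are illustrative only.

```python
LOAD_OHMS = 4700.0      # 4.7K pull-up load
CYCLE_TIME_NS = 1350.0  # 4004 cycle time, 1.35us

# Input (gate) capacitance in picofarads, as quoted in the text.
GATE_CAPACITANCE_PF = {"BSS83": 1.5, "FDV301": 9.5, "MIC94050": 600.0}


def rc_time_constant_ns(r_ohms, c_pf):
    """RC time constant in nanoseconds (ohms * pF gives ps; /1000 gives ns)."""
    return r_ohms * c_pf / 1000.0


for part, c_pf in GATE_CAPACITANCE_PF.items():
    tau_ns = rc_time_constant_ns(LOAD_OHMS, c_pf)
    print(f"{part:9s} {tau_ns:8.1f} ns  ({tau_ns / CYCLE_TIME_NS:.3f} cycles)")
```

Changing `LOAD_OHMS` shows how quickly the MIC94050 number moves relative to the cycle time.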

To achieve a 45ns rise time we'd need to use 75 ohm load resistors, which would draw 67mA when pulled low. Assuming half of the 473 resistors are being pulled low at any time that's about 16 amps at 5 volts, or about 79 watts. That'd keep your hands warm in the winter!
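The same trade-off can be run in the other direction. This Python sketch starts from a target time constant and works out the load resistor, the current through one load when it's pulled low, and the total for a given number of loads. It assumes a 5V supply, an ideal switch pulling the load fully to ground, and a caller-supplied guess for the fraction of loads that are low; all names are illustrative.

```python
VDD = 5.0             # supply voltage in volts
MIC94050_PF = 600.0   # input capacitance quoted in the text
NUM_LOADS = 473       # load resistors in the design


def load_resistor_ohms(target_ns, c_pf):
    """Resistor giving an RC time constant of target_ns into c_pf."""
    return target_ns * 1000.0 / c_pf  # ns -> ps, then ps / pF = ohms


def low_state_current_a(vdd, r_ohms):
    """Current through one load resistor while its output is pulled low."""
    return vdd / r_ohms


def total_power_w(vdd, r_ohms, n_loads, fraction_low):
    """Static power when fraction_low of n_loads are pulled low at once."""
    return vdd * low_state_current_a(vdd, r_ohms) * n_loads * fraction_low


r = load_resistor_ohms(45.0, MIC94050_PF)
i = low_state_current_a(VDD, r)
print(f"load resistor: {r:.0f} ohms, {i * 1000:.0f} mA each when low")
print(f"total with half low: {total_power_w(VDD, r, NUM_LOADS, 0.5):.1f} W")
```

The `fraction_low` argument is the softest number here; the real duty cycle depends on the logic state and isn't something this post measures.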

Worse, I haven't yet identified the longest combinational path through the 4004 logic. My recreated Instruction Pointer board appears to work quite nicely with a 2.0us cycle time, with the rest of the CPU emulated by an FPGA, but I think the critical path is in the ALU circuits. Thus I'm not sure even 45ns is fast enough to achieve a 1.35us cycle time.

Also consider the drive requirements for the CLK1 and CLK2 paths. CLK1 has to drive 28 FETs in parallel, and CLK2 has to drive 51. A little math says that's 17,000pF and 31,000pF respectively. The TC4427A Power MOSFET driver I used on my FPGA interface board is specified to drive 1,000pF at 5V in about 50ns, and has performance curves that show it driving 2,200pF in about 90ns, but we're still off by a factor of about 14.
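The "little math" is just fan-out times gate capacitance. Here it is as a Python sketch, assuming every clocked FET is a MIC94050 at 600pF and comparing against the largest load the driver is characterized for in the curves mentioned above. Wiring capacitance is ignored and the names are illustrative.

```python
GATE_PF = 600.0                       # MIC94050 input capacitance
CLOCK_FANOUT = {"CLK1": 28, "CLK2": 51}
DRIVER_CHARACTERIZED_PF = 2200.0      # largest load on the TC4427A curves cited


def clock_load_pf(fanout, gate_pf):
    """Total gate capacitance seen by a clock driving `fanout` FETs in parallel."""
    return fanout * gate_pf


for clock, fanout in CLOCK_FANOUT.items():
    load = clock_load_pf(fanout, GATE_PF)
    ratio = load / DRIVER_CHARACTERIZED_PF
    print(f"{clock}: {load:,.0f} pF, {ratio:.1f}x the characterized load")
```

The factor of about 14 comes from the CLK2 line; CLK1 is less extreme but still well beyond the characterized range.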

All is not lost. The leakage current in modern discrete components is a lot lower than it was early in the development of integrated circuit technology. My DRAM test circuit appeared to hold its state for several seconds, while the 4004 refreshed its IP and Scratchpad registers every 500us or so. Since running a recreated 4004 built with MIC94050s at 741kHz isn't going to be an option, take advantage of the low leakage and run the system at a (much) lower speed. That will make everything easier to work with too (not everyone can splurge on a 1GHz digital oscilloscope).

Friday, March 25, 2016

More musing on a 6th board

As I mentioned a few days ago, I'm considering a sixth board. This one would carry a modern FPGA and components and serve two purposes:
  1. An easily reconfigurable test board to allow testing of any single board or set of boards in the i4004 CPU board stack.
  2. A system emulator for the i4004 CPU board stack, emulating the rest of the components of a Busicom PF-141 calculator.

Thursday, March 24, 2016

A bigger, almost as cheap FPGA experimenter's board

When I started looking at inexpensive FPGA reference boards a few years back, one I considered was the Lattice Semiconductor iCE40-HX1K. My Verilog i4004 implementation consumes 42% of the logic cells (and more than 58% of the 1280 PLBs). I expect I could probably squeeze the emulation for a few other MCS-4 chips needed for a simple system into it, but there wouldn't be a lot of resources left for anything else. Routing in an FPGA starts to become the limiting factor at some point even when there are free logic cells (75% seems to be a commonly quoted number).

The same line of Lattice chips has 4K and 8K parts. The 4K part comes in a 144-lead TQFP with 0.5mm lead pitch, which is manageable, but the 8K only comes in BGA packages. At the time there were no reference boards for these but that's changed; there is now a breakout board for the iCE40-HX8K:

This breakout board costs only $43 and makes 120 I/Os available to the experimenter.

If the 1K part and only a few I/Os suit your needs, the iCEblink board seems to have been supplanted by the iCEstick, which is not much bigger than a large memory stick. It's dirt cheap at $22.

Thursday, March 17, 2016

I/O & Timing board progress

I've made good progress on the I/O and Timing board, so I thought I'd share a screen capture of the layout. There are currently 249 unrouted signal airwires, plus another 201 to GND and 92 to VDD.

Monday, March 14, 2016

Perhaps a different calculator?

When I started this project I wanted to find a printing calculator with a keyboard that was similar in layout to that of the Busicom 141-PF to serve as an input/output device for my CPU. I vaguely remember looking specifically for calculators with 00 keys, 12-digit capability, and a "number of decimal digits" switch. After looking at a bunch of models in my local Staples office products store I chose the Canon P170-DH Printing Calculator.

A sixth board?

When I started this project I was very worried about how I'd test this thing. I didn't want to build all five boards, only to discover I'd made some major mistake that would require scrapping them and starting over. Doing so would not only take a lot of time, but between the cost of the board and the components on it, each assembled board costs around $250. Redoing one would be frustrating and annoying, but if I had to redo all five I might well drop the project.

Starting the I/O and Timing board

I figured the best way to be sure all the right components got on the I/O and Timing board was to create an Eagle project for it. So I did. And promptly realized that a lot of the Eagle configuration I'd gotten used to has to be done on a per-board basis, like trace width and via drill diameters. This is the first i4004 board I've started since late 2012, and you can forget a lot in three years.

There are three major functional groups on this board:
  1. A self-initializing 8-bit shift register that produces the one-hot CPU phase of execution signals (A12, A22, A32, M12, M22, X12, X22, X32) and the SYNC signal.
  2. The 5-bit Chip Select decoder and external output drivers (CMROM, CMRAM0 to CMRAM3). This is the chunk of logic that I moved off the ALU board.
  3. The 4-bit, tri-state, bi-directional data bus external pin I/O drivers.
I haven't made a screen shot of the layout yet. I'm at about the halfway point, with the first two groups placed and partially routed. The layout is moving so rapidly because there is a lot of repetition within each group. The shift register is essentially 8 copies of the same 1-bit pattern, with a minor variation on the first (A12) and last (X32) bits, plus some logic to generate the SYNC signal. Once I found a layout that worked well for two adjacent bits, the rest followed the pattern. The five Chip Select output drivers are all the same and the decode logic has common elements. I haven't started laying out the data bus I/O drivers yet, but it will be four instances of the same layout pattern -- one for each bit -- plus a little decode logic.
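To illustrate what "self-initializing" means for a one-hot shift register, here is a behavioral Python sketch. It is not taken from the 4004 schematics: the feedback rule (shift in a 1 only when the first seven stages are all 0) is one common way to build a self-correcting ring counter, and tying SYNC to the X32 stage is an assumption made for the example.

```python
PHASES = ["A12", "A22", "A32", "M12", "M22", "X12", "X22", "X32"]


def step(state):
    """Advance the 8-stage register by one clock.

    Assumed feedback: a 1 enters the first stage only when stages 0..6
    are all 0, so any power-up state drains into a single circulating 1.
    """
    feed = 0 if any(state[:-1]) else 1
    return [feed] + state[:-1]


def run(state, clocks):
    """Apply `clocks` clock edges and return the resulting state."""
    for _ in range(clocks):
        state = step(state)
    return state


def active_phases(state):
    """Names of the phase signals currently asserted."""
    return [name for name, bit in zip(PHASES, state) if bit]


def sync(state):
    """Assumed SYNC rule for this sketch: asserted while X32 is active."""
    return state[-1] == 1


state = [1, 1, 0, 1, 0, 0, 1, 1]  # arbitrary power-up garbage
for clock in range(12):
    print(clock, active_phases(state), "SYNC" if sync(state) else "")
    state = step(state)
```

With this feedback rule every one of the 256 possible starting states settles to a single active phase within eight clocks, after which the 1 circulates with a period of eight.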

I expect the parts I've already placed will shift position (and possibly rotate in the case of the Chip Select logic) to accommodate the data I/O drivers but I don't expect any problems making it all fit. This board has the fewest components of any board in the set and there's quite a bit of free space left.

Thursday, March 10, 2016

A reminder to myself

I got to thinking about the I/O and Timing board last night, and discovered I hadn't started an Eagle project for it. This is a worry, because in September 2012 I did a little re-partitioning, which involved moving the ROM/RAM chip select logic and drivers off the ALU board and onto the I/O and Timing board; a board for which a project does not exist. I don't want to lose this change.

With any luck this change was just moving an entire schematic sheet from one board project to another, but I have suspicions that it involved splitting Sheet 13 of the partitioned master into two separate sheets. Since I no longer remember what did and didn't stay in the ALU project I need to go back to the partitioned master and compare that with the ALU schematics so that chunk of logic doesn't get lost when I create the IOT board project.

Wednesday, March 9, 2016

Preliminary Scratchpad placement complete

I was having trouble getting to sleep last night, so I finished placing the remaining components on the Scratchpad board. Here's a screenshot:

With plenty of free space on the board I kept the components making up the various functions grouped and separated. The DRAM array is pretty obvious. The row drivers are to its immediate right, column pre-charge above and column sense and mux below. The control and data busses are below that (on the bottom of the board, shown in blue), and the write data latches below those.

Outlined on the right edge of the array are the row read and write enable drivers, and to their right is the 3-to-8 row address decoder. Continuing to the right is the 3-bit refresh counter, with each bit outlined. Bit 0 is on the bottom and bit 2 is on the top.
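For readers less familiar with these two blocks, here is a behavioral Python sketch of a 3-bit counter feeding a 3-to-8 one-hot decoder. It assumes the refresh counter simply increments and wraps and that its value selects the row to refresh; the actual gating with the read and write enables isn't modeled, and the function names are illustrative.

```python
def decode_3to8(address):
    """One-hot row select: exactly one of 8 outputs is 1 for a 3-bit address."""
    if not 0 <= address <= 7:
        raise ValueError("address must fit in 3 bits")
    return [1 if row == address else 0 for row in range(8)]


def next_refresh_count(count):
    """3-bit up-counter that wraps from 7 back to 0 (assumed behavior)."""
    return (count + 1) & 0b111


def counter_bits(count):
    """Counter value as individual bits, bit 0 first."""
    return [(count >> bit) & 1 for bit in range(3)]


count = 0
for _ in range(8):
    print(counter_bits(count), decode_3to8(count))
    count = next_refresh_count(count)
```

Eight steps of the counter visit every row exactly once, which is the property a refresh cycle needs.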

The other groupings are logic functions to generate various signals such as read and write enables for the odd and even nibbles, row read and write strobes, etc. When placing these groups I worked from the output drivers back toward the input signals. I started laying them out left-to-right, but after finishing them I decided they'd work better with each group rotated 90 degrees clockwise. Originally the order of the groups was the same as in the schematic, but I've since rearranged them into what seems better from a signal routing perspective.

Speaking of routing, I also routed the refresh counters and the row decoder logic. This seemed to be the easiest way to be sure the layout was workable. There's still a lot of routing to be done: there are 639 signal airwires, plus 424 ground and 84 VDD airwires. I also need to add power decoupling capacitors, and decide whether to add provisions for charge storage capacitors in the DRAM array as I did with the IP board.

Tuesday, March 8, 2016

Scratchpad refresh counter placement

Thinking about how best to pin out the rest of the inter-board connections, last night I decided I should also look at the Scratchpad board. To my surprise I discovered I really hadn't gotten much done on this when I shelved this project back in 2013. Looking back at the blog entry I wrote about Scratchpad Array Placement in August 2012 I see I'd done the placement of the rectangular DRAM array components but not much more. Almost none of the signals are routed.