Video Display Cloning

Timing

To duplicate the displays of old micros, you need to know the signal timings.

Horizontal Timing

NTSC and PAL have very similar timings to display horizontal lines.

Standard  Frames/sec  Fields/sec  Lines/frame  Lines/sec  µs/line
PAL       25          50          625          15625      64 (exactly)
NTSC      30          60          525          15750      63.492
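
The line rate and line period in the table follow directly from the frame rate and the number of lines per frame. A minimal sketch of that arithmetic, using the table's nominal figures (the names `STANDARDS` and `line_timing` are my own, for illustration only):

```python
# Frames per second and lines per frame, as given in the table above.
STANDARDS = {"PAL": (25, 625), "NTSC": (30, 525)}

def line_timing(frames_per_sec, lines_per_frame):
    """Return (lines per second, microseconds per line)."""
    lines_per_sec = frames_per_sec * lines_per_frame
    return lines_per_sec, 1_000_000 / lines_per_sec

for name, (frames, lines) in STANDARDS.items():
    rate, period_us = line_timing(frames, lines)
    print(f"{name}: {rate} lines/s, {period_us:.3f} µs per line")
```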

For simplicity, this discussion will assume 64 µs per line.
This is split into the following regions:

Region          Duration
Active display  52 µs
Front porch     1.5 µs
Sync pulse      4.5 µs
Back porch      6 µs

I started by looking at the timings of the 6847 chip, used in relatively simple micros (the Acorn Atom and the Tandy Color Computer). For economy it uses a single crystal (3.579545 MHz) to drive the colour signal, the dot clock, and the entire timing chain for memory accesses. It clocks out pixels on both clock edges (so a 50% duty cycle is required), i.e. at twice the crystal frequency. There are 8 pixels per character cell width, so characters are accessed at a quarter of the crystal rate, i.e. 0.8948863 MHz.
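
The clock chain described above can be written out as a short calculation. This is only a sketch of the arithmetic in the paragraph; the variable names are my own.

```python
CRYSTAL_MHZ = 3.579545   # NTSC colour crystal frequency, from the text
PIXELS_PER_CHAR = 8      # pixels per character cell width, from the text

# One pixel is clocked out on each clock edge, so the dot clock is twice the crystal.
dot_clock_mhz = CRYSTAL_MHZ * 2

# One character (memory access) per 8 pixels, i.e. a quarter of the crystal rate.
char_rate_mhz = dot_clock_mhz / PIXELS_PER_CHAR

print(f"dot clock  {dot_clock_mhz:.5f} MHz")
print(f"char rate  {char_rate_mhz:.7f} MHz")
```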

Interleaving

It is possible to interleave CPU/VDG memory accesses so that each gets alternate access without interfering with the other. This is just what the Tandy CoCo does. The Acorn Atom, however, does not, which allows it to run at its own maximum speed of 1 MHz. The Atom avoids disrupting video by restricting access to the non-display periods.

I wanted to produce a 6847-like system that would suit both micros. I have a PAL-standard TV, and wish to drive it via the RGB inputs of its SCART socket, so the NTSC colour crystal frequency is not required. I am free to use other crystal frequencies for the timing.

A later Acorn machine - the BBC micro - used a 6845 video sequencer chip, and interleaved access at 1 (or 2) MHz in 40 (or 80) characters per line modes.

At 8 pixels per character width and 40 characters per line, the BBC micro produces 320 pixels in a 40 µs time period. The 6847, in contrast, displays 32 characters = 256 pixels in 35.76 µs. That is 80% as many pixels in about 89% of the time, so the display is narrower and each pixel is slightly wider.
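
The two active display times can be compared with a small calculation. A sketch only, using the dot clocks quoted in the text (8 MHz for the BBC micro, twice the NTSC colour crystal for the 6847); `active_time_us` is a name I made up.

```python
def active_time_us(pixels, dot_clock_mhz):
    """Time in microseconds to clock out one line of pixels."""
    return pixels / dot_clock_mhz

bbc_us = active_time_us(320, 8.0)            # 40 characters x 8 pixels
vdg_us = active_time_us(256, 2 * 3.579545)   # 32 characters x 8 pixels

print(f"BBC micro: {bbc_us:.2f} µs, 6847: {vdg_us:.2f} µs")
print(f"6847 line is {100 * vdg_us / bbc_us:.0f}% as long as the BBC micro's")
print(f"pixel width: {1000 * bbc_us / 320:.1f} ns vs {1000 * vdg_us / 256:.1f} ns")
```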

Square pixels

At this point it is worth pointing out the desirability of square pixels - i.e. when you draw 100 x 100 pixels on a screen it appears square. This simplifies display geometry. Pixel height is of course fixed by the TV standard, but the pixel width depends on the pixel clock.

Clearly the BBC micro's 6845 and the 6847 cannot have the same pixel aspect ratio, so I measured their displays on my TV to find out the ideal pixel clock for square pixels. I found the BBC micro in 320x240 pixel mode (8 MHz pixel rate) gave almost square pixels. The few percent error I can attribute to the TV's own width-adjustment settings being slightly off.

So it seems the nominal horizontal display time is 40 µs. This is smaller than the 52 µs TV picture display time because a TV picture normally goes from edge to edge, while a computer image must not lose anything at the edges, hence a 6 µs margin left and right.

A 40-character display shows 320 pixels in the 40 µs. For a 32-character display to clock out 256 pixels in the same time, the character rate is reduced from 1 MHz to 0.8 MHz, and the dot clock drops from 8 MHz to 6.4 MHz.

Suddenly, a few facts clicked into place. The ZX81 (designed for the UK and PAL) has a 256x192 pixel display, and a 6.5MHz dot clock which is near enough the ideal 6.4 MHz.

I reckon that 40 µs is the ideal time, and 6.4 MHz is the ideal for 256 pixels in that time.
8 MHz is the ideal for 320 pixels in that time.
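
These "ideal" dot clocks are simply the pixel count divided by the 40 µs window. A small sketch, which also shows how far the ZX81's 6.5 MHz is from the 6.4 MHz figure (function names are hypothetical):

```python
ACTIVE_US = 40.0   # nominal horizontal display time, from the measurements above

def ideal_dot_clock_mhz(pixels, active_us=ACTIVE_US):
    """Dot clock (MHz) that fits `pixels` into `active_us` microseconds."""
    return pixels / active_us

def percent_error(actual_mhz, ideal_mhz):
    """Signed deviation of an actual clock from the ideal, in percent."""
    return 100.0 * (actual_mhz - ideal_mhz) / ideal_mhz

print(ideal_dot_clock_mhz(256))   # 32-character display
print(ideal_dot_clock_mhz(320))   # 40-character display
print(f"ZX81 at 6.5 MHz is {percent_error(6.5, ideal_dot_clock_mhz(256)):.2f}% fast")
```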

Pragmatic decisions

Ditching the NTSC colour crystal based timing allows both the 6502 and the 6809 to run at their standard maximum ratings of 1 MHz and have square pixels. As an added bonus, it makes higher-resolution displays (40 chars / 320 pixels wide) possible.

Video RAM decisions

The original CoCo teamed the 6847 with the 6883, a chip that multiplexed addresses and generated control signals for dynamic RAM. Close inspection of the timings for typical 150 ns rated DRAMs shows they are safely met by the 6883 with an NTSC colour crystal, but there is not much room for 'overclocking'. Running the CPU at 1 MHz shortens the access time required. The 6502A of the BBC micro can run at 2 MHz, shortening access time even further, and hence the BBC micro used 100 ns rated DRAM chips.

Static RAM chips have simpler timing diagrams. These have more address lines to drive, but this seems a fair trade in return for simpler interfacing. I would rather get a simple design running first and improve it later than spend longer getting a more complex design running.

Usually, static CMOS RAM chips are designed for a balance of speed (70 ns to 150 ns) and power. However, the PC industry created a demand for cache RAM that was faster than the main DRAM, and static RAM chips were produced optimised for this task. As a result, you can get very fast static RAM (15 to 20 ns) pretty easily. I picked up an old 486 motherboard with nine UM61256 SRAM chips (32K x 8 bit, 15 ns). I thought they would be expensive (because of the speed) but they are only £2 ($3) from Maplin (the UK equivalent of Tandy).

Fast SRAM also opens the door to some serious overclocking: 15 ns = 1/66 MHz! The 6502 is available up to 4 MHz tops, but if I stick a VHDL 6502 into my Xilinx X2C200, I could be running it at around 40 MHz! I'll need to have the EPROM copy itself into the RAM first (I don't know of any ROM that fast and cheap), then switch the clock to the higher speed. It would be amusing to see an old micro with a 'turbo' button, running many times faster than normal. It might blow away all those software emulators running on Pentiums!
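
The "15 ns = 1/66 MHz" figure is just the reciprocal of the access time. A rough sketch of that conversion follows; it is an upper bound only, since it assumes the whole cycle is available for the RAM access and ignores address decoding and setup margins.

```python
def max_access_rate_mhz(access_time_ns):
    """Upper bound on accesses per microsecond for a given access time in ns."""
    return 1000.0 / access_time_ns

# Access times mentioned in the text: fast cache SRAM and the two DRAM grades.
for access_ns in (15, 100, 150):
    print(f"{access_ns:>3} ns  ->  at most {max_access_rate_mhz(access_ns):.1f} MHz")
```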

This recent look at RAMs made me realise how slow these things are compared with the CPU.
My PC's CPU is running at nearly 1 GHz, main DRAM modules run about a tenth of that, and only then through clever access modes.

Proposed New '6847' Display Horizontal Timing

The 6845 bases timings from the top left corner of the active display area, and the diagram below follows suit.

Characters  Duration  Region
00-31       32 µs     Active display (32 characters, 32 µs)
32-39       8 µs      Rest of the active display in 40-character mode (00-39: 40 characters, 40 µs)
40-45       6 µs      Right border
46-47       1.5 µs    Front porch
48-51       4.5 µs    Sync pulse
52-57       6 µs      Back porch
58-63       6 µs      Left border

Of the 52 µs active picture region, the margins are defined to centre a 40 µs (320 pixel) window. It would also be possible to make the timing centre the 32 µs (256 pixel) window by changing the margins to be 10 µs each ((52 - 32) / 2), but for now I would like to keep things simple by using this one timing, which will suit both sizes.

The front porch and sync pulse have been rounded to whole microseconds (2 µs and 4 µs) to fit the character positions shown.
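
As a cross-check on the horizontal timing, the character positions can be derived from the region widths. This is a sketch under the assumption that each character period is 1 µs and that the front porch and sync pulse occupy 2 and 4 whole character periods; the region names and `column_ranges` are my own.

```python
# Region widths in whole character periods (1 µs each at a 1 MHz character rate).
REGIONS = [
    ("Active display (32 chars)", 32),
    ("Extra active display (40-char mode)", 8),
    ("Right border", 6),
    ("Front porch", 2),    # nominally 1.5 µs
    ("Sync pulse", 4),     # nominally 4.5 µs
    ("Back porch", 6),
    ("Left border", 6),
]

def column_ranges(regions):
    """Return (name, first_char, last_char) for each region, counting from 0."""
    ranges, start = [], 0
    for name, width in regions:
        ranges.append((name, start, start + width - 1))
        start += width
    return ranges

ranges = column_ranges(REGIONS)
total_chars = sum(width for _, width in REGIONS)

for name, first, last in ranges:
    print(f"{first:02d}-{last:02d}  {name}")
print("Total:", total_chars, "character periods per line")
```

The total should come to 64 character periods, matching register 0 = 63 (total minus one) in a 6845 running at a 1 MHz character rate.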

Proposed New '6847' Display Vertical Timing

The 6847 had a significant nuisance in the UK - it ran at 60 Hz field rate (NTSC standard) while the UK expected 50 Hz (PAL). Most cheap TVs managed to cope with this but some did not. Imaging coils are designed to run at their resonant frequency. Anything else is less efficient and generates more heat. A good TV would be quite right to refuse a non-standard signal. Designing with programmable logic allows this problem to be fixed.

The Atom monitored the frame sync pulse to detect when it was safe to write to the screen without causing snow. With an interleaved memory system, any write is 'snow-free', so the CPU can omit 'anti-snow' procedures and have full-time access to the screen. This accelerates video-intensive operations.

The Atom also sensed the frame sync pulse to provide timing information. Some software might rely on this to regulate the speed of events - e.g. the ticking of a real-time clock. For this reason it might be worth generating a 'fake' 60Hz signal that is nothing to do with the real frame rate but exists solely for the benefit of such programs.

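NTSC non-interlaced: 262 lines per field, 243 of them in the picture (borders plus active display)

Lines  Region
13     Vertical blanking
25     Top border
192    Active display (256x192)
26     Bottom border
6      Vertical retrace

PAL non-interlaced: 312 lines per field, 292 of them in the picture (borders plus active display)

Lines  Region
10     Vertical blanking
26     Top border
240    Active display area: 192 lines for 256x192, plus a further 48 (240 in all) for 320x240
26     Bottom border
10     Vertical retrace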
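
The line budgets above can be checked by summing the regions. A minimal sketch using the figures from the table (the list names are my own):

```python
# (region, scan lines) per non-interlaced field, from the table above.
NTSC_FIELD = [
    ("Vertical blanking", 13),
    ("Top border", 25),
    ("Active display", 192),
    ("Bottom border", 26),
    ("Vertical retrace", 6),
]
PAL_FIELD = [
    ("Vertical blanking", 10),
    ("Top border", 26),
    ("Active display area", 240),
    ("Bottom border", 26),
    ("Vertical retrace", 10),
]

def total_lines(field):
    """Total scan lines in one field."""
    return sum(lines for _, lines in field)

def picture_lines(field):
    """Lines between blanking and retrace: borders plus active display."""
    return sum(lines for _, lines in field[1:-1])

print("NTSC:", total_lines(NTSC_FIELD), "total,", picture_lines(NTSC_FIELD), "picture")
print("PAL: ", total_lines(PAL_FIELD), "total,", picture_lines(PAL_FIELD), "picture")
```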

The BBC micro's default 6845 register values

Values cross-checked with BBC documentation.

Register  Modes 0-2  Mode 3  Modes 4-5  Mode 6  Mode 7  Function
0         127        127     63         63      63      Total number of characters on a line, minus one
1         80         80      40         40      40      Number of displayed character cells per line
2         98         98      49         49      51      Position of horizontal sync
3         0x28       0x28    0x24       0x24    0x24    Sync widths (V/H in hi/lo nibbles), e.g. 0x24 = 2 scan lines, 4 chars
4         38         30      38         30      30      Character lines refreshed in 1/50th second, minus one
5         0          2       0          2       2       Fractional part of the number of lines scanned per frame (1/50th second), a.k.a. Vertical Total Adjust
6         32         25      32         25      25      Number of displayed character rows
7         34         27      34         27      27      Position of the vertical sync (character rows), i.e. at scan line 34 x 8 = 272 or 27 x 10 = 270
8         0x01       0x01    0x01       0x01    0x93    Interlace and display blanking delay. D0=1 selects interlaced mode
9         7          9       7          9       18      Number of scanlines per character, minus one
10        103        103     103        103     114     Scanline at which the cursor starts
11        8          9       8          9       19      Cursor end scan line. It must be >= register 10.
12,13     0x0600     0x0800  0x0B00     0x0C00  0x2800  Location in memory of the top left-hand corner of the screen
14,15     0x0600     0x0800  0x0B00     0x0C00  0x2800  Address of the cursor

PAL requires 625 lines = 312 lines in even field, 313 in odd field.
Registers 4 and 5: (38+1)x8+0 = 312, (30+1)x10+2 = 312.
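
The same sum can be written as a function of the 6845 registers: character rows (register 4 plus one) times scanlines per row (register 9 plus one), plus the adjust value in register 5. A small sketch for the two non-interlaced cases worked above; I have not tried to model the interlaced mode 7 settings here, and the function name is my own.

```python
def scanlines_per_field(r4, r5, r9):
    """Total scan lines per field from 6845 registers 4, 5 and 9."""
    return (r4 + 1) * (r9 + 1) + r5

# 8-line character rows: R4=38, R5=0, R9=7
print(scanlines_per_field(38, 0, 7))
# 10-line character rows: R4=30, R5=2, R9=9
print(scanlines_per_field(30, 2, 9))
```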

6845 register settings to emulate the 6847

The sixteen 6847 modes (0 to F) fall into two groups: text/semigraphic modes and graphic modes.

Register  Text/semigraphic  Graphic  Function
0         63                63       64 µs per scan line at 1 MHz CPU clock
1         32                32       Displayed character cells per line
2         49                49       Position of horizontal sync (adjust to centre the image)
3         0x24              0x24     Sync widths (V/H in hi/lo nibbles)
4         25                38       Character lines refreshed per field, minus one
5         0                 0        Fractional part of the lines per field
6         16                24       Number of displayed character rows
7         22                34       Position of the vertical sync. Not critical? 22x12 = 264, 23x12 = 276, 34x8 = 272
8         0                 0        Interlace (off) and display blanking delay
9         11                7        12 scanlines per character in text/semigraphic modes, 8 scanlines per character cell in graphic modes
10        0x20              0x20     Scanline at which the cursor starts (0x20 = no cursor)
11        xx                xx       Cursor end scan line. It must be >= register 10.
12,13     0x8000            0x8000   Location in memory of the top left-hand corner of the screen
14,15     xx                xx       Address of the cursor

Cursor registers are not relevant to 6847 emulation because the 6847 has no hardware cursor control.
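
As a sanity check on the two sets of vertical register values, both should give 312 lines per field and 192 displayed lines. A sketch, assuming the usual 6845 meaning of registers 4, 5, 6, 7 and 9; the class and attribute names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class VerticalRegs:
    r4: int   # character rows per field, minus one
    r5: int   # extra scan lines (vertical total adjust)
    r6: int   # displayed character rows
    r7: int   # vertical sync position, in character rows
    r9: int   # scan lines per character row, minus one

    @property
    def total_lines(self):
        return (self.r4 + 1) * (self.r9 + 1) + self.r5

    @property
    def displayed_lines(self):
        return self.r6 * (self.r9 + 1)

    @property
    def vsync_line(self):
        return self.r7 * (self.r9 + 1)

TEXT = VerticalRegs(r4=25, r5=0, r6=16, r7=22, r9=11)     # 12-line rows
GRAPHIC = VerticalRegs(r4=38, r5=0, r6=24, r7=34, r9=7)   # 8-line rows

for name, regs in (("text", TEXT), ("graphic", GRAPHIC)):
    print(name, regs.total_lines, regs.displayed_lines, regs.vsync_line)
```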

Ideally the memory would be scanned as rows of characters with either 1, 2, 3, 4, 6, or 12 scan lines per character, but register 4 can only go up to 128 character rows, so it is not possible to choose 312 character rows which are one scan line high. The best it could do is 312/3 = 104. Instead, the 6845 is set for 39 rows of 8 lines per character row. This requires the character row outputs from the 6845 to drive address lines of the display instead of the font-generator ROM.
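
The row limit can be checked for each candidate row height. A short sketch, taking the 128-row limit and the 312-line field from the text (names are my own):

```python
TOTAL_LINES = 312   # scan lines per PAL field
MAX_ROWS = 128      # largest row count register 4 can express

def rows_needed(lines_per_row):
    """Character rows needed to cover the whole field at this row height."""
    rows, remainder = divmod(TOTAL_LINES, lines_per_row)
    return rows, remainder

CANDIDATES = (1, 2, 3, 4, 6, 12)
feasible = [n for n in CANDIDATES if rows_needed(n)[0] <= MAX_ROWS]

for n in CANDIDATES:
    rows, remainder = rows_needed(n)
    status = "ok" if rows <= MAX_ROWS else "too many rows"
    print(f"{n:>2} lines/row -> {rows:>3} rows, remainder {remainder}: {status}")
```

Row heights of 1 and 2 lines need more than 128 rows, which is why 3 lines per row (104 rows) is the finest the unmodified 6845 could manage.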

Of course the 6845 model could be extended to count beyond the 128-character-row limit, but for now it would be simplest to see if the 6845 can work with the fewest modifications.

Shift Registers

This seems a fairly simple function at first sight, and indeed it is for a simple 8-bit shifter with 1 bit per pixel. If you want more modes than this, you need more registers arranged in more complex ways and clocked at different rates, which all adds up to a lot more wiring. Video data handling is typically the highest-speed operation in a machine.

For these reasons, video circuitry is a top priority for cramming into custom chips. Of the two ULA chips in the BBC micro, one contains the video clocks, shift registers, and colour palette. The ULA in the ZX81 is largely the video generation. All the early micros were founded on the strength of their video generation circuitry. Even in the early days, the ROM, RAM and Z80 or 6502 CPUs could be bought for just a few pounds, but the video generation was most of the gap between something that could merely compute and a practical system.

Many machines require a shifter that will cope with 1, 2, 4 or 8 bits per pixel.

1 bit per pixel:   d7 d6 d5 d4 d3 d2 d1 d0  --> 1 bit to colour palette

2 bits per pixel:  d6 d4 d2 d0  --> 2 bits to colour palette
                   d7 d5 d3 d1

4 bits per pixel:  d4 d0  --> 4 bits to colour palette
                   d5 d1
                   d6 d2
                   d7 d3
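
In software terms, the shifter arrangements above amount to splitting each video byte into groups of adjacent bits. The sketch below models that; it assumes the most significant bits (d7 end) form the first pixel displayed, which the diagram does not actually state, and `unpack_pixels` is a name of my own.

```python
def unpack_pixels(byte, bits_per_pixel):
    """Split one video byte into pixel values, first-displayed pixel first.

    Assumption: the pixel built from the highest-numbered bits comes out first.
    """
    if bits_per_pixel not in (1, 2, 4, 8):
        raise ValueError("bits_per_pixel must be 1, 2, 4 or 8")
    mask = (1 << bits_per_pixel) - 1
    count = 8 // bits_per_pixel
    return [
        (byte >> (8 - bits_per_pixel * (i + 1))) & mask
        for i in range(count)
    ]

sample = 0b11100100
for bpp in (1, 2, 4, 8):
    print(bpp, unpack_pixels(sample, bpp))
```

Each value in the returned list is what would be presented to the colour palette for one pixel period; with more bits per pixel there are fewer pixels per byte, so the shifter is clocked correspondingly less often per byte fetched.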