Decoded: QBasic Gorillas

QBasic Gorillas was a game that Microsoft distributed with later versions of MS-DOS to demonstrate the capabilities of their QBasic IDE. I recall finding it shortly after an OS update on my 386 SX 25. My programming experience at the time had been limited to vi, edit, and the BASIC interface. The QBasic IDE was familiar and looked a lot like edit. In retrospect, I realize that the edit-style layout was very popular across many development IDEs, such as Borland's.

The game is very simple by 1990 standards. If I had to date the concept and features, I'd probably put Gorillas closer to 1982, especially when we consider that Scorched Earth was released the next year. One weapon, hot seat play, 6 or so colors. Yes, this game falls well short of the bar in 1990. Nevertheless, it does occupy a special place for me because I learned two important programming concepts from it: storing graphics data as integers, and manipulating computer hardware. The latter ended up becoming my full-time interest, so I cut over to learning C not long after.

Before I continue, let me throw up the code and the walkthrough:
(Original Source) (w/ line numbers) (Code Walkthrough)

Let's take a look at the (now very obsolete) lessons that QBasic Gorillas taught me back in the day.

Accessing BIOS Services in QBasic

The idea that you could control hardware with programming was magic to me at the time. In Gorillas, the usage is so trivial that you barely notice that the game turns on the NumLock key when you start. However, I was hooked. Back in the days before the Internet, there weren't good resources for researching how things worked (at least in my small town). I spent days messing around with the POKE command in and around the memory offset 1047 with various bit masks. Fortunately, we no longer need to experiment like that to get things done.

BIOS data in DOS can be accessed in BASIC via PEEK and POKE by pointing to the base memory segment 0 and offsetting 1024 bytes (0x400). BIOS data occupies the next 256 bytes (1024-1279 / 0x400-0x4FF). The first keyboard status byte is offset 0x17 from the start of the BIOS data, which puts it at raw memory address 0x417 (decimal 1047). The bits of this byte mean the following:

Bit 0 - Right-Shift status
Bit 1 - Left-Shift status
Bit 2 - Control key status
Bit 3 - Alt key status
Bit 4 - ScrollLock key status
Bit 5 - NumLock key status
Bit 6 - CapsLock key status
Bit 7 - Insert key status
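
The bit table above can be tried out without a DOS machine. Here is a minimal sketch in Python (chosen over QBasic only so that it runs on a modern system) that treats the status byte as a plain integer. The names KEYBOARD_FLAGS, decode_flags, and set_numlock are made up for illustration and do not come from the Gorillas source. In QBasic, the equivalent read and write would go through PEEK and POKE at offset 1047 after DEF SEG = 0.

```python
# Bit positions of the first BIOS keyboard status byte (address 0x417),
# in the order listed above. All names are illustrative assumptions.
KEYBOARD_FLAGS = [
    "Right-Shift", "Left-Shift", "Control", "Alt",
    "ScrollLock", "NumLock", "CapsLock", "Insert",
]

BIOS_DATA_START = 0x400                    # 1024
KEYBOARD_STATUS = BIOS_DATA_START + 0x17   # 1047, the offset used with POKE

NUMLOCK_MASK = 1 << 5                      # bit 5 has the value 32


def decode_flags(status):
    """Return the names of all flags that are set in the status byte."""
    return [name for bit, name in enumerate(KEYBOARD_FLAGS)
            if status & (1 << bit)]


def set_numlock(status):
    """Turn NumLock on while leaving the other seven bits untouched."""
    return (status | NUMLOCK_MASK) & 0xFF


status = 0b01000001                        # pretend CapsLock and Right-Shift are set
print(KEYBOARD_STATUS)                     # 1047
print(decode_flags(set_numlock(status)))   # ['Right-Shift', 'NumLock', 'CapsLock']
```

OR-ing with the mask is the important part: writing 32 directly would switch NumLock on but also clear whatever else was set in the byte.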

There are plenty of other useful areas to reach from QBasic in DOS. For example, segment 0xA000 is a direct portal to video memory, including the frame buffer, and interrupt 0x21 lets you send and receive arbitrary data through the DOS interrupt vector.

The sky is the limit!

For more information about the BIOS Data map in DOS, see this resource:

Image Data Formats in QBasic

The second challenge buried in QBasic Gorillas was the storage format of the animated bananas. It seemed like nonsensical number values and I only had a vague sense that they depicted pixel colors at the binary level. It took quite a bit of prodding to figure out how it worked back in 1990 and looking back on it today (2017), I can understand why. The format doesn't really compare to anything we see today. Yet, it does make sense at the lowest level of managing bit fields for arbitrary image dimensions. We need a description of how to put the image together.

The first 4 bytes of the DATA sequence provide a header describing the dimensions of the subsequent data. The first 2 bytes hold the width (X), and the next 2 bytes hold the height (Y). Note that this header format is slightly different across screen modes 1, 9, and 13 (CGA, EGA, and VGA).

For example, a 3x4 image would be stored with these 4-byte headers in various screen modes:

CGA - Integer value is 262150. Byte 1 (X) is 6 (00000110)
EGA - Integer value is 262147. Byte 1 (X) is 3 (00000011)
VGA - Integer value is 262168. Byte 1 (X) is 24 (00011000)
All modes have the same byte 3 (Y), which is 4 (00000100)

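Only the EGA screen mode stores the X dimension directly. CGA mode left shifts the base value by 1 (2 bits per pixel), while VGA left shifts it by 3 (8 bits per pixel). In other words, the X field holds the width of a row in bits rather than in pixels.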
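
To check the three header values listed above, here is a small Python sketch that splits a header integer back into width and height. The function name decode_header and the BITS_PER_PIXEL table are my own naming for illustration. EGA is given a value of 1 on the assumption that each plane holds one bit per pixel.

```python
# Bits that one pixel occupies in the header's width field.
# EGA is 1 because its pixels are split across planes (assumption stated above).
BITS_PER_PIXEL = {"CGA": 2, "EGA": 1, "VGA": 8}


def decode_header(value, mode):
    """Split a 4-byte header integer into (width, height) in pixels."""
    width_bits = value & 0xFFFF        # low 2 bytes: row width in bits
    height = (value >> 16) & 0xFFFF    # high 2 bytes: number of rows
    return width_bits // BITS_PER_PIXEL[mode], height


for mode, header in [("CGA", 262150), ("EGA", 262147), ("VGA", 262168)]:
    print(mode, decode_header(header, mode))   # each line ends with (3, 4)
```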

The image data following the header describes the color of individual pixels. The image is described row by row from top to bottom, and each row from left to right, like reading a Western book. Data is read little-endian between bytes, but within a byte we stream from the most-significant bit to the least.

There's one gotcha lurking in here: when the next pixel in the target image starts a new row, we stop reading the current byte and move to the next one. This is likely because of the way the interpreter handles address offsets. You can't directly address individual bits in a byte and point them to a new row simply by offsetting from a base (without storing in an intermediate format, at least). If we tried to do that, the pixels of the next row wouldn't be properly byte addressable, leading to a Big Mess (tm). It's easier to just skip to the next byte.
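
The row padding described above comes down to rounding each row up to a whole number of bytes. A quick Python sketch of that arithmetic, assuming packed pixels as in CGA and VGA (bytes_per_row and unused_bits are hypothetical helper names):

```python
def bytes_per_row(width_px, bits_per_pixel):
    """Bytes consumed by one row, rounded up to a whole byte."""
    row_bits = width_px * bits_per_pixel
    return (row_bits + 7) // 8


def unused_bits(width_px, bits_per_pixel):
    """Bits skipped at the end of each row before the next row starts."""
    return bytes_per_row(width_px, bits_per_pixel) * 8 - width_px * bits_per_pixel


# A 3-pixel-wide CGA row needs 6 bits, so it takes 1 byte and skips 2 bits.
print(bytes_per_row(3, 2), unused_bits(3, 2))   # 1 2
# A 5-pixel-wide CGA row needs 10 bits, so it takes 2 bytes and skips 6 bits.
print(bytes_per_row(5, 2), unused_bits(5, 2))   # 2 6
```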

Enough chatter, let's take a look at specific examples from the Gorillas code:

CGA Banana Left

In CGA, 2 bits represent a single pixel for a total of 4 possible colors.

DATA 327686, -252645316, 60

A CGA Image stored in QBASIC DATA format
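
To see the CGA DATA line turn back into a banana, here is a Python sketch that applies the rules described earlier: a 4-byte header, little-endian bytes, 2 bits per pixel read from the most-significant bit down, and a fresh byte at the start of every row. The function name decode_cga is my own, and this is only an illustration of the format as I understand it, not code from the game.

```python
import struct


def decode_cga(values):
    """Decode a CGA (screen mode 1) DATA sequence into rows of color indexes."""
    header, *body = values
    width = (header & 0xFFFF) // 2       # header stores the row width in bits
    height = (header >> 16) & 0xFFFF
    # Each DATA value is a signed 32-bit integer stored little-endian.
    raw = b"".join(struct.pack("<i", v) for v in body)
    stride = (width * 2 + 7) // 8        # every row starts on a new byte
    rows = []
    for y in range(height):
        row_bytes = raw[y * stride:(y + 1) * stride]
        pixels = []
        for x in range(width):
            byte = row_bytes[x // 4]     # 4 pixels fit in one byte
            shift = 6 - 2 * (x % 4)      # leftmost pixel sits in the top 2 bits
            pixels.append((byte >> shift) & 0b11)
        rows.append(pixels)
    return rows


banana = decode_cga([327686, -252645316, 60])
for row in banana:
    print("".join("#" if color else "." for color in row))
```

This prints a 3x5 shape:

```
.##
##.
##.
##.
.##
```

Every lit pixel has the value 3, the last of the four CGA colors, and the trailing bytes of the final integer are simply unused.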

EGA Banana Left

EGA allows for 4-bit color (16 total), but the bits are stored in planes, not consecutively. Reading the bit at the same position in each plane determines the color of that pixel.

DATA 458758, 202116096, 471604224, 943208448, 943208448, 943208448, 471604224, 202116096, 0

An EGA Image stored in QBASIC DATA format

Notice that the banana is visible in the three most significant planes. Thus, the value of the pixel in that position is 8 + 4 + 2 == 14. The color 14 in the standard EGA palette is yellow.
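
The plane arithmetic can be checked with a Python sketch along the same lines. It assumes that each row is stored as four consecutive plane bytes, with the first byte being the least significant plane (value 1) and the last the most significant (value 8). That ordering matches the reading above, but treat it as my assumption. The name decode_ega is hypothetical.

```python
import struct


def decode_ega(values):
    """Decode an EGA (screen mode 9) DATA sequence into rows of color indexes."""
    header, *body = values
    width = header & 0xFFFF              # EGA stores the pixel width directly
    height = (header >> 16) & 0xFFFF
    raw = b"".join(struct.pack("<i", v) for v in body)
    plane_bytes = (width + 7) // 8       # bytes per plane per row
    rows = []
    for y in range(height):
        base = y * plane_bytes * 4       # 4 planes per row (assumed layout)
        pixels = []
        for x in range(width):
            color = 0
            for plane in range(4):
                byte = raw[base + plane * plane_bytes + x // 8]
                bit = (byte >> (7 - x % 8)) & 1   # leftmost pixel is the top bit
                color |= bit << plane             # plane n contributes 2**n
            pixels.append(color)
        rows.append(pixels)
    return rows


banana = decode_ega([458758, 202116096, 471604224, 943208448, 943208448,
                     943208448, 471604224, 202116096, 0])
for row in banana:
    print(" ".join(f"{color:2d}" for color in row))
```

The result is a 6x7 grid in which every lit pixel is 14 and everything else is 0. The final 0 in the DATA line is never read.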