
3D Software Rendering on the GBA

Game Boy Advance Fixed-point math, here we come...



When I started programming the gba-sprite-engine two years ago, I knew I would be getting myself into trouble. The Game Boy Advance only runs at 16 MHz, and its whole software library is written in low-level C using DMA (Direct Memory Access) and memory-mapped IO. Translation: pointers! ** - Yay!

In the end, switching to C++11 while trying to unit test and stub out BIOS code as much as possible did help soften the pain. I’m glad I got my hands dirty again, and writing “closer to the metal” was a welcome change from the usual high-level stuff I produce.

But the tiled GBA modes 0, 1, and 2 are not the only way to write a GBA game. There are also the bitmap modes, 3, 4, and 5, which let you write pixel colors (or palette indices) yourself. This opens up the possibility of rendering things in software. 90% of the GBA library did not do that. But a few games did:

How do you render things in 3D without hardware acceleration and without an FPU on the circuit board to handle floating-point numbers, taking into account the (mostly) 16-bit bus and the 16 MHz CPU? Well… It does not exactly produce 30+ FPS:

Wireframing 507 vertices and 968 faces

Trying to rasterize the same thing

Drawing a lot of lines is not exactly something the GBA loves to do. And I did use tonclib’s optimized routines after a failed attempt to implement Bresenham myself. MODE4 has weird byte-write requirements and you can optimize DMA writing of horizontal lines.
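To make the byte-write remark concrete: as far as I understand the hardware, VRAM does not accept plain 8-bit stores (the byte ends up in both halves of the 16-bit halfword), while MODE4 packs two palette indices into each halfword. So a single pixel needs a 16-bit read-modify-write, and a horizontal line can be filled two pixels at a time. Below is a minimal sketch of that idea, not tonclib's actual routines. The names `fakeVram`, `plotMode4` and `hlineMode4` are made up for illustration, and a plain array stands in for VRAM so it also compiles on a desktop.

```cpp
#include <cstdint>

// MODE4: 240x160, one 8-bit palette index per pixel, two pixels per halfword.
constexpr int kWidth = 240;
constexpr int kHeight = 160;

// Stand-in for VRAM so the sketch runs off-device (hypothetical name).
// On hardware this would be a volatile pointer to the active MODE4 page.
static uint16_t fakeVram[kWidth * kHeight / 2] = {};

// Plot one palette index with a 16-bit read-modify-write.
void plotMode4(uint16_t* vram, int x, int y, uint8_t paletteIndex) {
    if (x < 0 || x >= kWidth || y < 0 || y >= kHeight) return;  // clip
    uint16_t* cell = &vram[(y * kWidth + x) / 2];
    if (x & 1) {
        // Odd x lives in the high byte of the halfword.
        *cell = static_cast<uint16_t>((*cell & 0x00FF) | (paletteIndex << 8));
    } else {
        // Even x lives in the low byte.
        *cell = static_cast<uint16_t>((*cell & 0xFF00) | paletteIndex);
    }
}

// Horizontal line from x0 to x1 (inclusive): fix up the unaligned ends
// pixel by pixel, then write the middle as whole halfwords.
void hlineMode4(uint16_t* vram, int x0, int x1, int y, uint8_t paletteIndex) {
    if (y < 0 || y >= kHeight) return;
    if (x0 > x1) { int t = x0; x0 = x1; x1 = t; }
    if (x0 < 0) x0 = 0;
    if (x1 >= kWidth) x1 = kWidth - 1;
    if (x0 > x1) return;  // completely off-screen

    if (x0 & 1) plotMode4(vram, x0++, y, paletteIndex);                 // left edge
    if (x0 <= x1 && !(x1 & 1)) plotMode4(vram, x1--, y, paletteIndex);  // right edge

    const uint16_t pair = static_cast<uint16_t>(paletteIndex | (paletteIndex << 8));
    for (int x = x0; x < x1; x += 2) {
        vram[(y * kWidth + x) / 2] = pair;  // two pixels per store
    }
}
```

The middle loop is the part that a DMA fill could replace on real hardware, since it is just the same halfword repeated.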

But the worst part was fixed-point math, sine lookup tables, and calling the BIOS just to get a square root of something. Math.sin() takes input in radians, in any common programming language. The above imported Babylon JS mesh expects the same, but my sine table is split into 512 slices ([1-512]) and expects a 16-bit input. More needless bit-shifting.
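Here is a rough sketch of what that bit-shifting looks like. It is not the engine's actual code: I am assuming a signed 24.8 fixed-point format, a 512-entry table, and a 16-bit angle where 0x0000 to 0xFFFF is one full turn. All names (`Fx`, `fxMul`, `fxSin`, `radToAngle`) are made up for the example. The table is filled with `std::sin` at start-up purely so the snippet runs on a desktop; on the GBA it would be precomputed and stored in ROM.

```cpp
#include <cmath>
#include <cstdint>

// Assumed format: signed 24.8 fixed point (8 fractional bits).
using Fx = int32_t;
constexpr int kFxShift = 8;
constexpr Fx kFxOne = 1 << kFxShift;  // 1.0

// Multiply: the raw product has 16 fractional bits, so shift 8 back out.
// (Right-shifting a negative value is arithmetic on GCC/ARM.)
inline Fx fxMul(Fx a, Fx b) {
    return static_cast<Fx>((static_cast<int64_t>(a) * b) >> kFxShift);
}

constexpr int kSinEntries = 512;  // one full circle in 512 slices
constexpr double kPi = 3.14159265358979323846;

// Desktop-only convenience: build the table once with floating point.
struct SinTable {
    int16_t values[kSinEntries];
    SinTable() {
        for (int i = 0; i < kSinEntries; ++i) {
            values[i] = static_cast<int16_t>(
                std::lround(std::sin(2.0 * kPi * i / kSinEntries) * kFxOne));
        }
    }
};

inline const SinTable& sinTable() {
    static const SinTable table;
    return table;
}

// 16-bit angle in, 24.8 sine out. The top 9 bits pick one of the 512 slices.
inline Fx fxSin(uint16_t angle) { return sinTable().values[angle >> 7]; }

// Cosine is sine shifted by a quarter turn (0x4000); the cast wraps around.
inline Fx fxCos(uint16_t angle) {
    return fxSin(static_cast<uint16_t>(angle + 0x4000));
}

// Radians (24.8) to 16-bit angle: angle = rad * 65536 / (2 * pi).
// 65536 / (2 * pi) is about 10430.38, truncated to an integer here.
inline uint16_t radToAngle(Fx radians) {
    return static_cast<uint16_t>(
        (static_cast<int64_t>(radians) * 10430) >> kFxShift);
}
```

With only 8 fractional bits, pi/2 becomes 402 and lands one slice short of the exact quarter turn, which is the kind of small error you learn to live with.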

I intended to design the engine again as high-level as possible taking advantage of C++'s objects and operator overloading. How about worldMatrix * viewMatrix;? Everything is unit-tested (thank god for that, it took out a lot of bugs). But passing objects around in limited RAM sounds ridiculous - and it probably is, even if it’s a const MatrixFx& reference or a std::shared_ptr<Mesh>.
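For illustration, this is roughly what an overloaded fixed-point matrix product can look like. It is a sketch under my own assumptions (24.8 fixed point, row-major 4x4 storage, translation in the last row), and `Mat4Fx` is a hypothetical stand-in, not the engine's real MatrixFx class.

```cpp
#include <cstdint>

using Fx = int32_t;                   // assumed 24.8 fixed point
constexpr int kFxShift = 8;
constexpr Fx kFxOne = 1 << kFxShift;  // 1.0

// Hypothetical 4x4 fixed-point matrix, stored row-major.
struct Mat4Fx {
    Fx m[16];

    static Mat4Fx identity() {
        Mat4Fx r{};  // all zeros
        r.m[0] = r.m[5] = r.m[10] = r.m[15] = kFxOne;
        return r;
    }

    // Translation in the last row (row-vector convention).
    static Mat4Fx translation(Fx x, Fx y, Fx z) {
        Mat4Fx r = identity();
        r.m[12] = x;
        r.m[13] = y;
        r.m[14] = z;
        return r;
    }
};

// worldMatrix * viewMatrix: operands are passed by const reference,
// the result is returned by value (64 bytes on the stack).
inline Mat4Fx operator*(const Mat4Fx& a, const Mat4Fx& b) {
    Mat4Fx r{};
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            int64_t acc = 0;  // sum in 48.16, rescale once at the end
            for (int k = 0; k < 4; ++k) {
                acc += static_cast<int64_t>(a.m[row * 4 + k]) * b.m[k * 4 + col];
            }
            r.m[row * 4 + col] = static_cast<Fx>(acc / kFxOne);
        }
    }
    return r;
}
```

Summing in a 64-bit accumulator and rescaling once per element keeps a little more precision than shifting after every multiply, at the cost of 64-bit arithmetic on a 32-bit CPU.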

Reverting to a simple box sped up the FPS:

A BabylonJS-exported Box. (including a bug)

A rasterized octahedron, with back-face culling.

Even calculating the frames per second is a pain. What’s a “second”? Okay, so we need a hardware timer interrupt. When does this thing overflow? How many cycles does the CPU take before that happens? Are you seriously using the divide operator instead of fxdiv()?
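The answers to those questions are mostly arithmetic. The sketch below uses the commonly documented figures (a 2^24 Hz clock, 16-bit timers, prescalers of 1, 64, 256 and 1024); the constant and function names are my own, and the FPS value is returned as 24.8 fixed point, which is an assumption of this example.

```cpp
#include <cstdint>

// The "16 MHz" CPU clock is 2^24 Hz.
constexpr uint32_t kCpuHz = 1u << 24;  // 16,777,216

// With the slowest prescaler (1024 cycles per tick) the timer counts
// 16384 ticks per second.
constexpr uint32_t kPrescaler = 1024;
constexpr uint32_t kTicksPerSecond = kCpuHz / kPrescaler;

// Timers are 16 bits wide: counting from 0 at this rate, they overflow
// after 65536 / 16384 = 4 seconds (without a prescaler: after 65536
// cycles, roughly 3.9 ms).
constexpr uint32_t kTimerRange = 1u << 16;

// Start value that makes the timer overflow exactly once per second.
constexpr uint16_t kOneSecondReload =
    static_cast<uint16_t>(kTimerRange - kTicksPerSecond);

static_assert(kTicksPerSecond == 16384, "2^24 / 2^10 = 2^14");
static_assert(kOneSecondReload == 0xC000, "65536 - 16384");

// Frames per second as 24.8 fixed point, from a frame count and the
// number of timer ticks that passed. Stays within 32 bits as long as
// frames < 1024, which is plenty for a window of a second or so.
constexpr int32_t fpsFx(uint32_t frames, uint32_t elapsedTicks) {
    return elapsedTicks == 0
        ? 0
        : static_cast<int32_t>((frames * (kTicksPerSecond << 8)) / elapsedTicks);
}
```

If the timer is set up to overflow once per second, the division disappears entirely: the frame counter at the moment of the interrupt is the FPS.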

Also, I could not remember most of the math needed to project 3D vertices into a 2D view, so I let myself be guided by David’s excellent 3D soft engine tutorial in JavaScript. Of course I had to port in all Matrix/Vector operations myself.
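The core of that projection step is the perspective divide. The snippet below is a simplified pinhole projection in fixed point, not the full world/view/projection matrix chain from the tutorial. It assumes 24.8 fixed point, a vertex already in camera space with +z pointing into the screen, and a 240x160 screen; `Vec3Fx`, `Point2` and `project` are hypothetical names.

```cpp
#include <cstdint>

using Fx = int32_t;                   // assumed 24.8 fixed point
constexpr int kFxShift = 8;
constexpr Fx kFxOne = 1 << kFxShift;  // 1.0
constexpr int kScreenW = 240;
constexpr int kScreenH = 160;

struct Vec3Fx { Fx x, y, z; };              // camera-space vertex
struct Point2 { int x, y; bool visible; };  // screen pixel

// focalFx: focal length in pixels (24.8). nearFx: near plane distance (24.8).
inline Point2 project(const Vec3Fx& v, Fx focalFx, Fx nearFx) {
    // Reject anything at or behind the camera, or closer than the near
    // plane. This also guards the division below against z == 0.
    if (v.z <= 0 || v.z < nearFx) return {0, 0, false};

    // (focal * x) has 16 fractional bits; dividing by z (8 bits) leaves 8.
    const Fx sx = static_cast<Fx>((static_cast<int64_t>(focalFx) * v.x) / v.z);
    const Fx sy = static_cast<Fx>((static_cast<int64_t>(focalFx) * v.y) / v.z);

    // Drop the fraction, center on screen, and flip y (screen y grows down).
    const int px = kScreenW / 2 + sx / kFxOne;
    const int py = kScreenH / 2 - sy / kFxOne;
    return {px, py, true};
}
```

For example, the point (1, 1, 2) with a focal length of 100 pixels lands 50 pixels right of and above the screen center. The two divisions per vertex are the expensive part on a CPU without a hardware divider.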

Future work: texturing - I’m curious to see what frame rate we could get out of a simple box textured with a Mario “?” block. I won’t even attempt portal rendering like the 007 Nightfire devs.

Check out the source code here:
