slembcke.net

2020-08-21

Tina: The header only coroutine libary

Critical Match

A couple years ago I was working on an NES game, and I figured coroutines would perfectly solve a problem I had. The game play code was written in C, but implementing coroutines in assembly felt completely normal considering most of the project was in assembly. I was pretty pleased with the result, and it only took up ~200 bytes in the ROM! In retrospect it was actually pretty complicated because it had to accommodate the dual hardware + software stack cc65 used to implement it's C runtime on the 6502 CPU.

Fast forward a year and I decided some amd64 assembly practice would be fun. I picked coroutines as an interesting example since I'd done it before and it's not something you can just implement in C. Boy was it so much simpler than the cc65 version. Since I was all excited about coroutines at that point I started looking for some “real” libraries. There are some clever (but not so practical) options like Protothreads, a bunch of wrappers of Windows fibers and ucontext on Unix (both of which are not-quite-deprecated), and some interesting options like Boost contexts. There weren't really any simple options though. To be fair, you need a bunch of platform/ABI specific assembly so how could it be simpler?

Enter Tina

So I had this really stupid idea… My coroutine implementation had a .s, a .c, and a .h file. The whole thing was dozens of lines of code. Almost as a joke, I made a header only library out of it. I mean inline assembly in a header seemed like a bad joke, and I like bad jokes! It only supported the amd64 System V ABI, so it was still just a silly novelty. Right?

That's when I had a less stupid idea: You can mix assembly code for different ISAs in a single C file and use the preprocessor to select code for different platforms. So I read some more ABI docs and wrote dozens more lines of code so I could support the System V ABIs for aarch32, aarch64, and amd64. Suddenly it didn't seem like such a dumb idea. It now supported all of the platforms except for weird niche ones like Windows. (Hmm…)

Visual Studio

The problem with adding Win64 support was that Visual C stopped supporting inline assembly for 64 bit. (Dang!) If Visual C didn't support inline assembly, what about inline binary blobs? Yup! You can create custom linker sections with a special pragma, and then tag variables with an attribute to place them in your new section. Clang/GCC have a similar attribute which is nice because I didn't really want to duplicate the code if I didn't have to. Voila!

#pragma section("tina", execute)

__declspec(allocate("tina"))
const char executable_binay_code_goes_here[] = {...};

At this point, my “joke” coroutine library supported all the modern desktop ABIs, mobile ABIs, and as far as I can tell, console ABIs. Not bad for a little over 200 sloc of code. :)

Going Further

While coroutines themselves are pretty useful, I remembered watching Christian Gyrling's Parallelizing the Naughty Dog Engine video a couple years ago. It described a neat fiber based job system that seemed pretty easy to implement and use. The catch was that you need sturdy fibers (coroutines) to build on top of. :D While I didn't really need a job system for my hobby games, I did want one. It also seemed like a good test of Tina and it's performance so I went ahead and implemented that too. Naturally I wrote a fractal explorer to test the job system that was itself a test of Tina. >_< However, it was kinda fun to use. As a result, I've been happily using Tina jobs in my hobby game engine ever since.

fractal

At this point I've successfully run Tina and Tina Jobs on 64 bit Linux, Mac, Windows, FreeBSD, and Raspberry Pi (32 and 64 bit). I haven't tried mobile yet, but I'm pretty sure it will work with a few ifdef changes. It's no longer a joke, and I intend to keep supporting it though I still enjoy the absurdity of a little binary blob snippet in a headerlib. :)

Get the code

Anyway, that's the story of why I put binary code in a header. (Thanks for sticking with me this far. This is probably the most I've written in one sitting for years.) If you are interested in the code, it's on Github

(EOF)