About the Code Generator Generator

OASM is the Orange C assembler for x86. When it was written, its parser and object code generator were described in XML. A meta-compiler was then written to generate C++ code from this description. The generated C++ code is incorporated directly into the assembler, where it describes the assembly language input and the object code output.
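To give a feel for the idea, here is a minimal sketch of the kind of table-driven C++ a meta-compiler could emit from such a description. It is not the actual output of the OASM meta-compiler: the names InstructionForm, kForms, registerNumber and encode are made up for this illustration, and only a handful of single-byte x86 forms are covered.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical table row: one entry per instruction form in the description.
struct InstructionForm {
    const char* mnemonic;   // e.g. "push"
    bool takesRegister;     // true if the form has one 32-bit register operand
    std::uint8_t opcode;    // base opcode; the register number is added to it
};

// Hypothetical generated table. The opcode values are the standard x86
// single-byte encodings for these forms.
static const InstructionForm kForms[] = {
    {"nop",  false, 0x90},
    {"ret",  false, 0xC3},
    {"push", true,  0x50},
    {"pop",  true,  0x58},
};

// 32-bit register names in x86 encoding order (eax = 0 ... edi = 7).
static const char* const kRegisters[] = {
    "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"};

// Returns the register number, or -1 if the name is not a known register.
int registerNumber(const std::string& name) {
    for (int i = 0; i < 8; ++i)
        if (name == kRegisters[i]) return i;
    return -1;
}

// Looks up the mnemonic in the table and appends the encoded bytes to 'out'.
// Returns false if no form matches; 'out' is left untouched in that case.
bool encode(const std::string& mnemonic, const std::string& operand,
            std::vector<std::uint8_t>& out) {
    for (const InstructionForm& form : kForms) {
        if (mnemonic != form.mnemonic) continue;
        if (!form.takesRegister) {
            if (!operand.empty()) return false;
            out.push_back(form.opcode);
            return true;
        }
        const int reg = registerNumber(operand);
        if (reg < 0) return false;
        out.push_back(static_cast<std::uint8_t>(form.opcode + reg));
        return true;
    }
    return false;
}
```

In this sketch the assembler core would only ever call encode(); everything specific to the processor lives in the tables, which is the part a description file can replace.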

This approach has a couple of advantages over hand-rolling the parser and code generator. First, working in a descriptive language is easier because you don't have to wade through all the implementation detail that comes with source code in a general-purpose programming language. Second, it is now possible to retarget the assembler just by writing the description for another processor, compiling it, and attaching the compiled code to the assembler core.

The description for the x86 processor is currently in 'src\oasm\x86.adl'. There is no documentation for it yet, although that will eventually be forthcoming. This file also contains parts of an early mock-up for the next phase of this project.

In the next phase, this file will also describe how the compiler should generate assembly language for each of the Intermediate Language statements that can be generated. This is similar to what was done for the assembler, but the focus shifts from the syntax and opcodes of specific instructions to groups of instructions that can be assembled. Combined with the code generator description for the assembler, this should make it possible to specify a complete backend for the compiler as an XML description. In addition to the benefits listed for the assembler, this keeps the details of how the compiler works out of the code generation description. All such details would be hidden in the meta-compiler that generates the backend from the description, or possibly in 'glue' code written to attach the meta-compiler's output to the compiler.
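As a rough illustration of mapping an Intermediate Language statement to a group of instructions, here is a small sketch. Everything in it is an assumption made for the example: the ILOp and ILStatement types, the %d/%l/%r placeholder syntax, and the templates themselves are invented here and are not taken from the compiler or from x86.adl. In the real design the templates would come from the XML description rather than being written by hand.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical IL: dst = left <op> right (Assign ignores 'right').
enum class ILOp { Assign, Add, Sub };

struct ILStatement {
    ILOp op;
    std::string dst, left, right;   // operands already rendered as assembly text
};

// Hypothetical template: the instruction group emitted for one IL operation.
// Placeholders: %d = destination, %l = left operand, %r = right operand.
struct Template {
    ILOp op;
    std::vector<std::string> lines;
};

// Stand-in for what a meta-compiler might generate from the description.
static const std::vector<Template> kTemplates = {
    {ILOp::Assign, {"mov eax, %l", "mov %d, eax"}},
    {ILOp::Add,    {"mov eax, %l", "add eax, %r", "mov %d, eax"}},
    {ILOp::Sub,    {"mov eax, %l", "sub eax, %r", "mov %d, eax"}},
};

// Replaces every placeholder in 'line' with the matching operand text.
static std::string substitute(std::string line, const ILStatement& s) {
    const struct { const char* key; const std::string* value; } slots[] = {
        {"%d", &s.dst}, {"%l", &s.left}, {"%r", &s.right}};
    for (const auto& slot : slots) {
        std::string::size_type pos = 0;
        while ((pos = line.find(slot.key, pos)) != std::string::npos) {
            line.replace(pos, 2, *slot.value);
            pos += slot.value->size();
        }
    }
    return line;
}

// Expands one IL statement into assembly lines; empty if no template matches.
std::vector<std::string> lower(const ILStatement& s) {
    std::vector<std::string> out;
    for (const Template& t : kTemplates) {
        if (t.op != s.op) continue;
        for (const std::string& line : t.lines)
            out.push_back(substitute(line, s));
        break;
    }
    return out;
}
```

The point of the sketch is the split: lower() and substitute() know nothing about x86, so swapping the template table is all it would take to target a different instruction set. Register allocation and instruction selection among alternatives are deliberately left out.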

Long term, the goal is to write a description of the ARM processor so that the toolchain can generate ARM code. While I'm working the kinks out, though, I may first try my hand at processors I'm used to working with.

A lot of this work has been planned for a long time; various aspects of the compiler were specifically designed to facilitate it.