Bzo Devlog #1: Parsing, Machine Code, Solvers
I’ve made several attempts to build compilers in the past. Generally what has happened is that my vision of what the end result might look like has changed fairly quickly, while development was rather slow. Give it enough time, and eventually the two don’t line up anymore and it makes sense to start over.
Compilers have a regular structure: you load the files, parse the code into an abstract syntax tree (AST), perhaps run typechecking and similar checks on the tree to make sure everything makes sense, then generate bytecode that can be analyzed and optimized until it’s ready to be converted to machine code or some other target format.
This structure can be fairly cleanly divided into discrete stages, and generally you need some progress on the earlier stages before it makes any sense to build the later ones. After all, it’s a lot easier to generate machine code from an AST if you already have a parser that outputs the AST. Further, many of these compiler components can get rather complex, so trying to build everything at once, before you have a good grasp of the edge cases in the previous stage, rarely works well.
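To make the staged structure concrete, here is a minimal sketch in Python of a toy pipeline for arithmetic expressions. It is not Bzo's code: the grammar, the tuple-based AST, the stack bytecode, and every function name (`tokenize`, `parse`, `emit`, `run`) are assumptions chosen for illustration, and the typechecking and optimization stages are left out entirely.

```python
import re

def tokenize(src):
    # Stage 1: split source text into number and operator tokens.
    # (Toy assumption: any character that is not a digit or + * ( ) is skipped.)
    return re.findall(r"\d+|[+*()]", src)

def parse(tokens):
    # Stage 2: recursive descent into a tuple-based AST.
    # Grammar: expr := term ('+' term)* ; term := atom ('*' atom)*
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def atom():
        tok = advance()
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok.isdigit():
            return ("num", int(tok))
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        node = atom()
        while peek() == "*":
            advance()
            node = ("mul", node, atom())
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("add", node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"trailing token {peek()!r}")
    return tree

def emit(node, code=None):
    # Stage 3: flatten the AST into stack-machine bytecode (post-order walk).
    code = [] if code is None else code
    if node[0] == "num":
        code.append(("push", node[1]))
    else:
        emit(node[1], code)
        emit(node[2], code)
        code.append((node[0],))
    return code

def run(code):
    # Stand-in for the final target: interpret the bytecode directly.
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "add" else a * b)
    return stack.pop()

print(run(emit(parse(tokenize("1+2*3")))))
```

Each stage only consumes the output of the one before it, which is why the later stages are hard to write, or even to design, until the earlier ones produce something real.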
Compilers make a good case study for some of the ideas I have about a theoretical approach to software planning, though that’s a discussion for another time.
One thing I’ve learned, though, is that when building software, a breadth-first approach to construction is often much slower and worse than a depth-first approach. Writing code productively can be as much a psychological challenge as a technical one, and minimizing the time spent unnecessarily slogging through hard problems is valuable.
My previous attempts at building a compiler have largely been breadth-first, trying to build the parser in its entirety before building anything else. This was largely a mistake, and the approach I’m taking now is a depth-first traversal: building very little of the parser, and immediately trying to convert that parsed code into runnable machine code. There’s even been some work on some of the stranger and more unique parts of the compiler.
Progress there has been going well. Let’s go through some of it.