Oct 10

Nand2Tetris: assembler implemented and verified (project 6)

Nand2Tetris' assembler/comparator reports that the 20000-line binary file produced by my assembler for the Pong game is correct to the bit, which means that my assembler, although I know it is nowhere near robust, is now good enough for my purpose.
As usual, the book contains a very detailed analysis of the problem to solve, and a clean design proposal. What is left is a fairly straightforward implementation. Still, it is not entirely trivial, and one gets the satisfaction of having gone one step further towards the goal of a computer built from Nand gates that will be able to run graphics programs written in a high-level language.
From a software and hardware development process perspective, the course is also very instructive, providing the means to test the results of every project. Encouraged by that mindset, I implemented a test class for the assembler parser, which helped me verify that I had not broken anything when I added more functionality. In fact, I wrote the test cases and ran them before even starting to write the corresponding parser code, so one could say that I applied the principles of test-driven development.
Given the small scope of the project, I implemented support for this bit of unit testing directly in my main() function:

In order for PARSERTESTER_HPP to be defined, I only have to add:

This way, I can keep the rest of my file and Makefile structure untouched. When the #include is there, my application will be a unit test application instead of being the full assembler. The test code is written to throw an exception any time a test does not pass. The exception won’t be caught and will lead to a crash of the application. If the test application writes “Test successful”, it means that it ran to completion without hitting a throw. Primitive, but simple.
Most of the time I spent on this project went into researching a good solution for the parser in C++ (see my 3 previous articles).
The times I showed in Performance of C++11 regular expressions were for a one-pass implementation of the assembler that had no support for labels.
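The mechanism described above can be sketched roughly as follows. This is only an illustration, not my actual code: the header name ParserTester.hpp, the helpers isAInstruction(), expect() and runParserTests(), and the function runApplication() are all made up for the example. To keep the sketch in a single file, the part that would normally live in the header is pasted inline, and its include guard is what defines PARSERTESTER_HPP.

```cpp
#include <iostream>
#include <stdexcept>
#include <string>

// --- Stand-in for a hypothetical ParserTester.hpp --------------------------
// In a real project this section would be its own header, pulled in with
//   #include "ParserTester.hpp"
// Its include guard defines PARSERTESTER_HPP. Without this section (or the
// #include), the same source builds the full assembler instead.
#ifndef PARSERTESTER_HPP
#define PARSERTESTER_HPP

// Toy function under test: true if the line is an A-instruction ("@value").
inline bool isAInstruction(const std::string& line) {
    return !line.empty() && line[0] == '@';
}

// Throws on a failing check. Nothing catches the exception, so the test
// application terminates on the first failure.
inline void expect(bool ok, const std::string& what) {
    if (!ok) throw std::runtime_error("Test failed: " + what);
}

inline void runParserTests() {
    expect(isAInstruction("@21"), "@21 is an A-instruction");
    expect(!isAInstruction("D=M"), "D=M is not an A-instruction");
}

#endif
// ---------------------------------------------------------------------------

// main() would simply forward to this function:
//   int main(int argc, char* argv[]) { return runApplication(argc, argv); }
int runApplication(int argc, char* argv[]) {
#ifdef PARSERTESTER_HPP
    (void)argc;
    (void)argv;
    runParserTests();  // an uncaught throw ends the run before the next line
    std::cout << "Test successful\n";
    return 0;
#else
    if (argc < 2) {
        std::cerr << "usage: assembler <file.asm>\n";
        return 1;
    }
    std::cout << "assembling " << argv[1] << "\n";  // placeholder for the assembler
    return 0;
#endif
}
```

The appeal of this arrangement is that a single line decides which application gets built, at the price of having no test report beyond "it crashed" or "it did not".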
Interestingly, the times for the complete version, which has two passes, i.e. parses the whole source file twice, are not much longer.
One pass:

Two passes:

It would therefore seem that most of the time is spent reading from and writing to the hard disk. A buggy version of the assembler, which did not write the output file and which I happened to time, seemed to show that most of the “sys” time in a working version is spent writing the file to disk. Maybe that could be optimized in some way (I haven’t done the math).
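For readers wondering why label support needs a second pass at all: a jump can refer to a label that is only defined further down in the file, so the addresses of all labels have to be known before any @symbol can be translated. Below is a minimal sketch of the idea, not my actual implementation. The function names collectLabels() and resolveSymbols() are invented for the example, the input lines are assumed to be already stripped of whitespace and comments, and the predefined symbols (R0 to R15, SCREEN, KBD and so on) are left out for brevity.

```cpp
#include <cctype>
#include <map>
#include <string>
#include <vector>

// First pass: record the ROM address of every (LABEL) pseudo-instruction.
// A label takes no space in the output, so only real instructions advance
// the address counter.
std::map<std::string, int> collectLabels(const std::vector<std::string>& lines) {
    std::map<std::string, int> symbols;
    int address = 0;
    for (const std::string& line : lines) {
        if (line.size() >= 2 && line.front() == '(' && line.back() == ')') {
            symbols[line.substr(1, line.size() - 2)] = address;
        } else if (!line.empty()) {
            ++address;
        }
    }
    return symbols;
}

// Second pass: replace every symbolic @name by a numeric @address.
// Names that are not labels are treated as variables and get consecutive
// RAM addresses starting at 16, as the Hack specification prescribes.
std::vector<std::string> resolveSymbols(const std::vector<std::string>& lines) {
    std::map<std::string, int> symbols = collectLabels(lines);
    int nextVariable = 16;
    std::vector<std::string> out;
    for (const std::string& line : lines) {
        if (line.empty() || line.front() == '(') continue;  // labels emit nothing
        const bool symbolic = line.size() > 1 && line.front() == '@' &&
                              !std::isdigit(static_cast<unsigned char>(line[1]));
        if (symbolic) {
            const std::string name = line.substr(1);
            auto it = symbols.find(name);
            if (it == symbols.end()) {
                it = symbols.emplace(name, nextVariable++).first;
            }
            out.push_back("@" + std::to_string(it->second));
        } else {
            out.push_back(line);  // numeric A-instructions and C-instructions
        }
    }
    return out;
}
```

The output of resolveSymbols() still has to be translated to binary, but at that point every instruction can be handled on its own, exactly as in the one-pass version.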
I will now move on to chapter 7, entitled VM I: Stack Arithmetic. :-)