Oct 04

Building a Modern Computer from First Principles

The free and open source NAND to Tetris course has been teasing me for a few months.
I believe I first heard about it in the text attached to the following video.

The video shows an implementation of the ALU from the course in Minecraft, which is pretty amazing, but more importantly it made me aware of the course. The course site, in turn, made me discover that designing one’s own general purpose computer from scratch is not beyond reach. I will humbly admit that before discovering the course, and despite having an academic degree in computer science, I would not have thought that one could design a general purpose 16-bit computer within the homework of a one-semester course.
After watching several videos by Shimon Schocken, a likeable human being who is one of the two professors behind the course (see Quick presentation, Google talk and TED talk), I was convinced.

Designing my own general purpose computer ranks very high on the list of my fantasies (had I already mentioned I was a nerd?), but I do not think I could achieve that without some serious support. That course might just be the support I need. One should be aware that, once the course is completed, the hardware built in it only exists in a simulator. It is described in a Hardware Description Language (simply called HDL). Real hardware has however been generated from the chips described in HDL in the course, and the way it was done is very much like modern real-life hardware construction: by burning the design synthesized from HDL (or actually from Verilog, a real-life counterpart of the course’s HDL) to an FPGA.
Of course, the design in the course is not my design, but who knows how much inspiration I could get from taking the course?

Anyway, I decided to give it a go and have taken module 0 today. :-)
The hardware simulator (written in Java, like all the tools for the course) works on my Linux computer. Scrolling horizontally in the text editor windows does not work for me (the text is no longer displayed correctly when I move the sliders), but the simulator is still usable, and editing the HDL code requires an external editor anyway.
Thanks a lot to the Nand to Tetris team!

Aug 28

Alexandre Astier is brilliant

I largely missed out on Kaamelott, among other reasons because I live abroad, but I have just watched the show Que ma joie demeure !, written and performed by Astier, on DVD, and I am won over. I had thought on several occasions about the largely untapped comic potential of knowledge in general. The idea would be to take specific knowledge in just about any field and turn it into the material of a comedy show, possibly with a partly educational aim. Coluche’s sketch on capitalism and trade unionism is, to a very limited extent, an example. Alexandre Astier’s sketch on quantum physics is another, and its beginning is very successful. It nevertheless seems to me that Astier is held back there by the limits of his knowledge of the subject. That problem is, by contrast, entirely absent from Que ma joie demeure !, which is about Johann Sebastian Bach and music. The parts of the show that consist of a master class for dummies are absolutely hilarious, and rest on Astier’s knowledge of music and of Bach, which appears to be very extensive. The human and dramatic parts of the show are also very successful. In short, I can only recommend this DVD/Blu-ray, which is highly original and filled with the talent of its author and performer. I am now very curious about L’exoconférence, his current show, and I wonder to what extent it might become the reason for my next trip to France.

Aug 27

Podcasts are great!

While setting up the smartphone I recently bought, I decided to explore podcasts further. On my previous smartphone, I occasionally listened to a few France Inter programmes as downloaded podcasts, notably Rendez-vous avec X, which I like a lot. On holiday in France recently, in an apartment with no traditional hi-fi or video equipment but with a mobile phone equipped with an FM radio, I often let France Inter lull me, and noticed with surprise that there was not much to throw away in a whole day of broadcasting from that station.

Back in Sweden, I bought a new smartphone to replace my previous one, sadly lost after slyly escaping from a slightly too open pocket during a slightly too frantic run. A few days after the purchase, I had to face the facts, to my dismay: I had entered the era of smartphones without FM radio. That technology was now obsolete, and so, no doubt, was I. Having dried my tear, and determined not to let the young whippersnappers leave me behind, I threw myself, reluctantly but headlong, into replacing FM entirely with podcasts. I paid for the pro version of the BeyondPod app, which I had used on my previous smartphone, and became a fervent subscriber to the France Inter and France Culture feeds, never ceasing to be impressed by the breadth of what these two stations offer.

Every night, without disturbing me, BeyondPod now downloads to my phone the latest episodes of the programmes I subscribe to, which makes them available at any time, as long as I have my phone on me, even if I am in the middle of nowhere, disconnected from the rest of the world, or even out of reach of FM. Since then I have already enjoyed a few hours of Ça peut pas faire de mal, Le masque et la plume, Le gai savoir, La tête au carré and who knows what else. My frantic search for interesting podcasts also led to an unexpected discovery: La folle histoire de l’Univers and its author, a future Martian called Florence Porcel, who has energy enough for twelve (see for example her short-film CV in the style of “Amélie Poulain”). She is also a columnist on La tête au carré.
Happy podcasting! :-)

Jul 01

While_one project on the STM32F4-Discovery with a GNU ARM Eclipse template

In our previous post, we reported how easy it was to produce and run a blinking program on the STM32F4-Discovery with Eclipse IDE for C/C++ Developers, GNU ARM Eclipse, GNU Tools for ARM Embedded Processors and OpenOCD. It did, however, leave me with the impression that the project and executable were quite large. Let’s check how large they really are.
Who says that “Hello world” is a simple program? It certainly isn’t in bare metal programming. Even blinking an LED is too advanced for our purpose, which is to study in detail the structure of the STM32F4xx C/C++ project template in GNU ARM Eclipse:

  • Source code
  • Makefile
  • Map file
  • To a lesser extent, or just for fun, the processor instruction level

For that purpose, we need no more and no less than the while_one program:

In the Eclipse setup described in our previous post, we create a C project called while_one. It will be an STM32F4xx C/C++ Project, with the Cross ARM GCC toolchain.
Under “Target processor settings”, we choose an STM32F407xx with a Flash size of 1024 KB. The content is “empty”, we use no POSIX system calls and no trace output. We check “some” and “most” warnings, and leave the other settings as they are. We leave the standard folders as they are. We select the Debug and the Release configurations. We use the toolchain “GNU Tools for ARM Embedded Processors (arm-none-eabi-gcc)” and set the correct path to the bin folder (where arm-none-eabi-gcc is located).
The generated code is just what we need:

I have however a number of issues with the resulting project:

  • It does not run on target! More precisely, when trying to step over from the first row in main(), the OpenOCD console ends in a "Info : halted: PC: 0x08000cb4" forever loop. This is in contrast to the blinky program, that just runs as expected. Since the main() function is trivial, the problem must be related to the initialization that happens before main() is invoked.
  • The project includes 10 files from the STM32CubeF4 HAL. I have a hard time believing that while_one needs so much hardware support.
  • The project includes some files, for example _initialize_hardware.c, that are part of “the µOS++ III distribution”. Firstly, I find it a bit strange to have files included from a project that I did not intend to use (at least not right now). Secondly, just to take one example, __initialize_hardware() only enables the FPU, which is also done by SystemInit() in system_stm32f4xx.c, provided by STM32CubeF4 as specified in CMSIS. In other words, the template provides code that duplicates what ST provides, which is also included in the project.

The observations above are pretty much enough for me to avoid using the STM32F4xx C/C++ Project template from GNU ARM Eclipse. The rest of it, GNU ARM C/C++ Cross Compiler Support and GNU ARM OpenOCD Debugging support, still seems interesting, however. My next step will probably be to keep these two plugins, to remove GNU ARM C/C++ STM32Fx Project Templates from Eclipse, and to rebuild the while_one project from STM32CubeF4 instead.

Jun 29

Running an STM32CubeF4 template on the STM32F4-Discovery

STM32CubeF4
Led by The Definitive Guide to ARM® Cortex®-M4…, we have quite easily managed to compile and run a sample from GNU Tools for ARM Embedded Processors (see earlier post). However, we only got a generic Cortex-M4 startup assembly file and the corresponding linker script from the sample. According to The Definitive Guide to ARM® Cortex®-M4…, there is more we can get from our vendor, ST in this case: HAL headers and code, drivers and, more generally, all sorts of boilerplate code we want to have when we make full use of the board’s resources, instead of reinventing the wheel. STM32CubeF4 is just that, and quite a lot more (especially plenty of example applications and templates). It complies with ARM’s CMSIS (Cortex Microcontroller Software Interface Standard), a vendor-independent hardware abstraction layer for the Cortex-M processor series that also specifies debugger interfaces.
What I am most interested in is the contents of Projects/STM32F4-Discovery/Templates/, as it should contain exactly what we need to develop applications for the board. I am not sure whether it includes support for C++ compilation, which I intend to use, but the ARM variant used in an earlier post did have such support, so it should be easy enough to copy/paste.
Projects/STM32F4-Discovery/Templates/ contains project files for several development environments, but no Makefile. One of the supported environments is TrueSTUDIO, which seems to make use of a GNU toolchain, which is good for us.
I might as well take the opportunity to digress a little about the development environment topic. I won’t apologize for loving open source. No single software vendor has a chance to have nearly as many reviewers as an open source tool. Many reviewers just means higher quality, it’s that simple. Using a GNU toolchain is not even a topic of discussion for me. Using openocd, including its integration with GDB has been very positive so far, so I do not see a reason for looking elsewhere. What is left to choose is:

  • The editor.
  • The debugger GUI (living without a debugger GUI is not really an alternative).
  • Last but not least: the build tool.

Concerning the editor, although I have used Emacs for many years, I am leaning towards Eclipse because it is the de facto standard. I also develop software for a living, and Eclipse is probably preferable for a potential customer: it is easier to reach a consensus around it. The debugger GUI issue is then solved as well (with the right plugins). When it comes to the build tool, I want to be able to build both inside and outside of Eclipse. I reckon that will ease the generation of production binaries, and I also reckon that GNU make and its Makefile are the natural solution to that issue.

GNU ARM Eclipse
I have investigated the fastest way to get a blinking LED example running/debugged under Eclipse:

  • Download Eclipse from Eclipse IDE for C/C++ Developers. Unpack it wherever you like and start it from there.
  • Install GNU ARM Eclipse, as documented under GNU ARM Eclipse plugins installation. GNU ARM Eclipse is a set of plugins, quoting the site: “currently maintained by Liviu Ionescu, a senior IT engineer, with expertise in operating systems, compilers, embedded systems and Internet technologies”.

GNU ARM Eclipse is certainly impressive. Once it was installed, using the documentation from the same site, GNU Tools for ARM Embedded Processors and OpenOCD, I could run and debug an LED blinking example and see printouts from the program in an Eclipse console in no time, without even using STM32CubeF4. It should be noted, however, that some code in the plugin, I guess most of what is specific to ST MCUs and boards, comes from STM32CubeF4.

Jun 28

Running ARM samples on the STM32F4-Discovery

Now that we have an original flash image that we know how to restore, it is time to start building and running our own software on the board.

When it comes to the toolchain, I started with the version provided by Manjaro, but I ran into an issue related to Newlib-Nano, which is the C library that is supposed to be used with that toolchain. After a few other tries, I was finally successful with the toolchain provided by ARM at GNU Tools for ARM Embedded Processors, which the Arch/Manjaro packages are built on anyway. As mentioned in a previous post, the installation is no more intrusive than unpacking a compressed folder and adding it to my PATH.
Led by The Definitive Guide to ARM® Cortex®-M4…, which recommends using the linker scripts provided by ARM in their toolchain samples, I decided to start by building and running the actual samples.
To start with, I reused the exact code structure provided by ARM in their samples. My purpose was to be able to just run make after as few adaptations as possible. The structure is the following:

The dump directory is mine. The rest is a copy/paste of the contents of ARM’s sample folder.
Under ldscripts, I have modified the contents of the mem.ld file to match my board:

Since gcc.ld (used in most samples) and nokeep.ld had the same rows, I replaced the redundancy by some INCLUDE commands:

The default processor in the samples being a Cortex-M0, I also changed the processor to a Cortex-M4:
[nilo@floor arm-none-eabi]$ head src/makefile.conf

And then, under the src directory, I just ran make. :-)
Here is the short version:

The simplest of these examples being minimum, that is the one I decided to test.

Under openocd telnet:

The PC and the MSP match the disassembled image:

Now debugging in gdb (openocd still started, telnet closed, gdb connected instead):

Jun 28

Useful commands in Manjaro Linux

List files included in an installed package:

Upgrading all packages:

Installing an AUR package:

Updating an AUR package:

List block devices and their mount points:

Jun 28

Restoring original flash contents to the STM32F4-Discovery

Now we will test restoring the binary image that we earlier got from dumping the original contents of the flash memory.
Our unique flash bank looks as follows:

We can first verify the image file:

We can then naively test to restore the image without first erasing the bank:

This is not surprising, although I have seen it go through without an error message before (I am not sure what really happened in that case).
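As general background on why flash is normally erased before being written: on NOR flash such as the internal flash of this kind of MCU, an erase sets every bit to 1 and programming can only clear bits to 0. The sketch below is a simplified host-side model of that rule only. It does not claim to reproduce the exact error reported by openocd above, and the function names and sample bytes are made up for illustration.

```python
def erase(size):
    """Return an erased flash region: every byte reads back as 0xFF."""
    return bytearray([0xFF] * size)


def program(flash, offset, data):
    """Program data into flash; bits can only go from 1 to 0 (bitwise AND)."""
    for i, byte in enumerate(data):
        flash[offset + i] &= byte


def restore(flash, image):
    """Program image at offset 0 and report whether it reads back intact."""
    program(flash, 0, image)
    return bytes(flash[:len(image)]) == bytes(image)


# Illustrative contents only, not taken from a real dump.
old_contents = bytes([0x80, 0x0C, 0x00, 0x20])
new_image = bytes([0x2D, 0x42, 0x00, 0x08])

# Writing over non-erased, different contents corrupts the result.
print(restore(bytearray(old_contents), new_image))

# Erasing first makes the same write succeed.
print(restore(erase(len(new_image)), new_image))
```

In this model, a write over different, non-erased contents reads back wrong, while the same write after an erase reads back intact.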
Let’s now try to first erase the whole bank.
We check the contents of the first word:

We recognize the first word from earlier. Now we erase the whole bank (i.e. the whole flash memory):

It does look like the flash memory is erased. Now let’s restore the original image:

This worked too! After the reset, the LEDs are flashing as they did before, instead of staying unlit as they did when I ran reset just after erasing.

Jun 27

Disassembling original flash contents from the STM32F4-Discovery

To work with bare metal ARM programming, I need a bare metal ARM toolchain. Being a Manjaro Linux user, I first installed the following packages from the regular repositories:
– arm-none-eabi-binutils
– arm-none-eabi-gcc
– arm-none-eabi-gdb
This works well enough for what I am doing in this post. However, when trying to compile some samples from ARM it complains as follows:

Newlib-Nano was produced as part of ARM’s “GNU Tools for ARM Embedded Processors” initiative in order to provide a version of Newlib focused on code size. The error is apparently a known issue in arch/Manjaro. The easiest solution I found was to uninstall the packages above, unpack the pre-built toolchain provided by ARM at GNU Tools for ARM Embedded Processors to my home folder and to adapt my PATH to that location, as mentioned in readme.txt.
Now, we disassemble the binary we previously got in openocd:

The -Mforce-thumb option is required because this version of objdump, although recent (binutils 2.24), does not have an explicit armv7 option or equivalent. Cortex-M4 processors implement the ARMv7-M architecture, which uses the Thumb-2 instruction set, i.e. a seamless mix of 16-bit and 32-bit instructions. Without the -Mforce-thumb option, objdump interprets the binary as 32-bit instructions only, which is totally incorrect. In fact, most of the instructions in that binary happen to be 16 bits wide.
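To illustrate how a disassembler can tell 16-bit from 32-bit Thumb-2 instructions, here is a small host-side sketch. It relies on the ARMv7-M encoding rule that a half word whose top five bits are 0b11101, 0b11110 or 0b11111 is the first half of a 32-bit instruction. The function names and the sample bytes are mine, chosen for illustration, and are not what objdump or openocd actually run.

```python
import struct


def thumb_insn_size(halfword):
    """Return 4 if halfword starts a 32-bit Thumb-2 instruction, else 2."""
    # Top five bits 0b11101, 0b11110 or 0b11111 mark a 32-bit encoding.
    return 4 if (halfword >> 11) in (0b11101, 0b11110, 0b11111) else 2


def split_instructions(code):
    """Split little-endian Thumb code bytes into (offset, size) pairs."""
    result = []
    offset = 0
    while offset + 2 <= len(code):
        (halfword,) = struct.unpack_from("<H", code, offset)
        size = thumb_insn_size(halfword)
        result.append((offset, size))
        offset += size
    return result


# Illustrative half words: a 16-bit, a 32-bit pair, then another 16-bit.
sample = struct.pack("<4H", 0x0C80, 0xF000, 0xF800, 0x2000)
print(split_instructions(sample))
```

A decoder that assumes every instruction is 32 bits wide would misread the first and last half words of this sample.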
As a matter of fact, openocd can disassemble too:

That is a straight disassembly of the first ten instructions located at address 0x00000000 which, as mentioned in an earlier post, is mapped to the start of the internal flash. It seems that openocd does not need to be told about the detailed architecture, probably because that information is already contained in the configuration files used when starting the program.
So, the processor starts by executing lsrs r0, r0, #0x12, right? Wrong. As explained in The Definitive Guide to ARM® Cortex®-M3 and Cortex®-M4 Processors, Third Edition, the first thing the processor does when it comes out of reset is to fetch the MSP (Main Stack Pointer) value from address 0x0000 0000. That value is a 32-bit address, in our case 0x2000 0c80, which unsurprisingly lies in SRAM (0x2000 0000 - 0x2001 FFFF according to the STM32F407VG datasheet). The stack grows downwards, so that address is the top of the stack.
Next, the processor fetches the reset vector from address 0x0000 0004, in our case 0x0800 422d, which is in flash (0x0800 0000 - 0x080F FFFF according to the same datasheet).
The processor then starts to execute the program from the reset vector address and begins normal operations:

The fetched vector address ends with 422d instead of 422c because vector addresses in the vector table should have their LSB set to 1 to indicate that they point to Thumb code.
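The reset sequence described above can be replayed on the host from the first eight bytes of the dumped image. This is a minimal sketch: the sample bytes are simply the little-endian form of the two values quoted above, the memory ranges are the ones quoted from the datasheet, and the function names are mine.

```python
import struct

# Address ranges as quoted above from the STM32F407VG datasheet.
SRAM = (0x20000000, 0x2001FFFF)
FLASH = (0x08000000, 0x080FFFFF)


def region(address):
    """Name the memory region an address falls into."""
    if SRAM[0] <= address <= SRAM[1]:
        return "SRAM"
    if FLASH[0] <= address <= FLASH[1]:
        return "flash"
    return "other"


def decode_reset_entries(image):
    """Decode the initial MSP and the reset vector from a flash image."""
    # The first two 32-bit little-endian words are the MSP and reset vector.
    msp, reset_vector = struct.unpack_from("<II", image, 0)
    return {
        "msp": msp,
        "reset_vector": reset_vector,
        "thumb": bool(reset_vector & 1),   # LSB set means Thumb code
        "entry": reset_vector & ~1,        # address of the first instruction
    }


# The two words quoted in the text: 0x20000c80 and 0x0800422d.
image = bytes.fromhex("800c00202d420008")
entries = decode_reset_entries(image)
print(f"MSP   {entries['msp']:#010x} ({region(entries['msp'])})")
print(f"reset {entries['reset_vector']:#010x}, thumb={entries['thumb']}")
print(f"entry {entries['entry']:#010x} ({region(entries['entry'])})")
```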
The first instruction loads the value located at address 0x0800 4240, namely 0xe000 ed88, into r0 (the disassembler interprets that literal as an unknown 32-bit instruction, assuming that the first half word is the most significant, which explains the half word inversion in the listing). The ARMv7-M ARM (Architecture Reference Manual) tells us that 0xe000 ed88 is the address of the Coprocessor Access Control Register (CPACR). The three following instructions set the so-called CP10 and CP11 bit fields to 0b11, which gives full access to the floating point coprocessor.
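The bit manipulation and the half word inversion can both be checked with a few lines on the host. This is only a sketch of the arithmetic, not the startup code itself. It assumes the CPACR layout from the ARMv7-M ARM, where CP10 occupies bits 21:20 and CP11 bits 23:22, and the function names are mine.

```python
CPACR_ADDRESS = 0xE000ED88
CP10_SHIFT = 20      # CP10 access field, bits 21:20
CP11_SHIFT = 22      # CP11 access field, bits 23:22
FULL_ACCESS = 0b11


def enable_fpu(cpacr):
    """Return the CPACR value with CP10 and CP11 set to full access."""
    return cpacr | (FULL_ACCESS << CP10_SHIFT) | (FULL_ACCESS << CP11_SHIFT)


def shown_as_thumb32(literal):
    """Show a 32-bit little-endian literal the way a Thumb disassembler does.

    The low half word comes first in memory, and the disassembler prints
    it as the most significant half of a 32-bit instruction.
    """
    low, high = literal & 0xFFFF, literal >> 16
    return (low << 16) | high


print(f"{enable_fpu(0):#010x}")
print(f"{shown_as_thumb32(CPACR_ADDRESS):#010x}")
```

The first half word of the literal, 0xed88, has its top five bits equal to 0b11101, which is why the disassembler takes the literal for a 32-bit instruction in the first place.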