Disassembling original flash contents from the STM32F4-Discovery

To work with bare metal ARM programming, I need a bare metal ARM toolchain. Being a Manjaro Linux user, I first installed the following packages from the regular repositories:
– arm-none-eabi-binutils
– arm-none-eabi-gcc
– arm-none-eabi-gdb
This works well enough for what I am doing in this post. However, when trying to compile some samples from ARM it complains as follows:

Newlib-Nano was produced as part of ARM’s “GNU Tools for ARM Embedded Processors” initiative to provide a version of Newlib focused on code size. The error is apparently a known issue on Arch/Manjaro. The easiest solution I found was to uninstall the packages above, unpack the pre-built toolchain provided by ARM at GNU Tools for ARM Embedded Processors into my home folder, and add that location to my PATH, as described in readme.txt.
Now, we disassemble the binary we previously got in openocd:

The -Mforce-thumb option is required because this version of objdump, although recent (binutils 2.24), does not have an explicit armv7 option or equivalent. Cortex-M4 processors implement the ARMv7-M architecture, which uses the Thumb-2 instruction set, i.e. a seamless mix of 16-bit and 32-bit instructions. Without the -Mforce-thumb option, objdump interprets the binary as 32-bit ARM instructions only, which is totally incorrect. In fact, most of the instructions in that binary happen to be 16 bits wide.
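To illustrate how a Thumb-2 decoder tells 16-bit from 32-bit instructions, here is a minimal Python sketch. It relies on the rule from the ARMv7-M Architecture Reference Manual that a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 is the first half of a 32-bit instruction. The function name is mine and purely illustrative; this is not how objdump is implemented.

```python
def thumb_instruction_size(first_halfword: int) -> int:
    """Return the length in bytes (2 or 4) of a Thumb-2 instruction,
    judged only from its first halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    # These three prefixes mark the first half of a 32-bit encoding;
    # every other prefix is a complete 16-bit instruction.
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


# 0x0c80 is a 16-bit shift instruction; 0xf8df starts a 32-bit ldr.w.
for halfword in (0x0C80, 0xF8DF):
    print(f"{halfword:#06x}: {thumb_instruction_size(halfword)} bytes")
```

A decoder that ignores this rule and always consumes four bytes, as in ARM mode, goes out of sync with the real instruction stream almost immediately.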
As a matter of fact, openocd can disassemble too:

That is a straight disassembly of the first ten instructions located at address 0x00000000 which, as mentioned in an earlier post, is mapped to the start of the internal flash. It seems that openocd does not need to be told about the detailed architecture, probably because that information is already contained in the configuration files used when starting the program.
So, the processor starts by executing lsrs r0, r0, #0x12, right? Wrong. As explained in The Definitive Guide to ARM® Cortex®-M3 and Cortex®-M4 Processors, Third Edition, the first thing the processor does when it comes out of reset is fetch the MSP (Main Stack Pointer) value from address 0x0000 0000. That value is a 32-bit address, in our case 0x2000 0c80, which unsurprisingly lies in SRAM (0x2000 0000 - 0x2001 FFFF according to the STM32F407VG datasheet). The stack grows downwards, so that address is the top of the stack.
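Why does the disassembler print lsrs r0, r0, #0x12 there at all? The memory is little-endian, so the low halfword of 0x2000 0c80, i.e. 0x0c80, comes first, and it happens to be a valid 16-bit Thumb encoding. The sketch below decodes just that one instruction form (LSRS with an immediate); the function name is hypothetical and the decoder deliberately ignores every other encoding.

```python
def decode_lsrs_imm(halfword: int):
    """Decode a 16-bit Thumb 'LSRS Rd, Rm, #imm' or return None.

    Layout: bits 15:11 = 0b00001, bits 10:6 = imm5,
    bits 5:3 = Rm, bits 2:0 = Rd.
    """
    if (halfword >> 11) != 0b00001:
        return None
    imm5 = (halfword >> 6) & 0x1F
    rm = (halfword >> 3) & 0x7
    rd = halfword & 0x7
    shift = imm5 if imm5 != 0 else 32  # imm5 == 0 encodes a shift by 32
    return f"lsrs r{rd}, r{rm}, #{shift:#x}"


msp = 0x20000C80
low_halfword = msp & 0xFFFF  # stored first in little-endian memory
print(decode_lsrs_imm(low_halfword))  # lsrs r0, r0, #0x12
```

So the "instruction" is just the stack pointer value seen through the disassembler's eyes.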
Next, the processor fetches the reset vector from address 0x0000 0004. In our case it is 0x0800 422d, which is in flash (0x0800 0000 - 0x080F FFFF according to the same datasheet).
The processor then starts to execute the program from the reset vector address and begins normal operations:

The fetched vector address ends with 422d instead of 422c because addresses in the vector table must have their LSB set to 1 to indicate Thumb code. The code itself starts at the even address 0x0800 422c.
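The reset sequence described above can be sketched in a few lines of Python: read the first two little-endian words of the flash image, then strip the Thumb bit from the reset vector to get the address where execution really starts. The function and key names are mine; the sample bytes are built from the two values quoted in this post rather than read from an actual dump.

```python
import struct


def parse_vector_table_head(image: bytes) -> dict:
    """Extract the initial MSP and the reset vector from the start of a
    Cortex-M flash image (two little-endian 32-bit words)."""
    initial_msp, reset_vector = struct.unpack_from("<II", image, 0)
    return {
        "initial_msp": initial_msp,
        "reset_vector": reset_vector,
        "is_thumb": bool(reset_vector & 1),    # LSB = 1 means Thumb code
        "entry_address": reset_vector & ~1,    # real address of the first instruction
    }


# Stand-in for the first 8 bytes of the dumped binary.
sample = struct.pack("<II", 0x20000C80, 0x0800422D)
info = parse_vector_table_head(sample)
print(f"MSP   = {info['initial_msp']:#010x}")
print(f"entry = {info['entry_address']:#010x} (thumb: {info['is_thumb']})")
```

To try it on a real dump, replace sample with the bytes read from the binary file.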
The first instruction loads the value located at address 0x0800 4240, that is 0xe000 ed88, into r0 (the disassembler interprets it as an unknown 32-bit instruction, assuming that the first halfword is the most significant one, which explains the halfword inversion in its output). The ARMv7-M ARM (Architecture Reference Manual) tells us that 0xe000 ed88 is the address of the Coprocessor Access Control Register (CPACR). The three following instructions set the so-called CP10 and CP11 bit fields to 0b11, which gives full access to the floating point coprocessor.
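As a rough model of that read-modify-write sequence, the Python sketch below computes the new CPACR value. The bit positions (CP10 in bits 21:20, CP11 in bits 23:22) come from the ARMv7-M ARM; the constant and function names are my own, and the code only does the arithmetic, it does not touch any hardware.

```python
CPACR_ADDRESS = 0xE000ED88  # Coprocessor Access Control Register

CP10_SHIFT = 20      # CP10 field occupies bits 21:20
CP11_SHIFT = 22      # CP11 field occupies bits 23:22
FULL_ACCESS = 0b11


def enable_fpu_access(cpacr: int) -> int:
    """Return the CPACR value with CP10 and CP11 set to full access,
    leaving all other bits unchanged."""
    return cpacr | (FULL_ACCESS << CP10_SHIFT) | (FULL_ACCESS << CP11_SHIFT)


# Starting from a cleared register, only bits 23:20 end up set.
print(hex(enable_fpu_access(0)))  # 0xf00000
```

In other words, the startup code ORs the register with 0x00f0 0000 before any floating point instruction is executed.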
