Hello World
For many people, programming computers comes in two flavors. The first is rather theoretical, with data structures, algorithms, and various library functions. The second is when the computer is either too simple to host an interactive operating system, or is dedicated to running a piece of code that is intimately tied into the goings-on of the world around it. Robots typically run such embedded systems, and their idea of an operating system can be quite a bit simpler than something like Linux.
If you are learning to program computers (as opposed to learning to program a specific computer, but we’ll get to that in a moment), the first program most people write is called “hello world,” so named because the only thing the program does is print out or display the text “Hello World!”.
The origin of this was the book “The C Programming Language” by Kernighan and Ritchie, which has the following as its original “hello.c”:
    #include <stdio.h>

    main()
    {
        printf("hello, world\n");
    }
It is a really simple program. The reason it exists in the tutorial is to give the person reading an opportunity to go through all of the steps that their computer uses to compile, assemble, and link together a new piece of software and turn it into a program that the computer will run.
In the embedded world, the equivalent tutorial program is called “Blink.” Unlike the C example above, there are rarely “standard” libraries for managing I/O pins or doing delays. As a result, the canonical blink example is more of a structure than a compilable program. It is shown next.
    #include <custom-include-file.h>

    void main()
    {
        initialize_io_pin(LED_PIN, OUTPUT_PIN);
        while (1) {
            toggle_io_pin(LED_PIN);
            delay(1000);
        }
    }
Like hello.c, the goal is not so much the function of the program; rather it is to verify that one can successfully do all the steps necessary to create, compile, load, and run a program on the embedded system.
Unlike hello.c, there aren’t any standards or libraries one can use, so every version of blink.c is a bit different from the others. For the Arduino, there are standard libraries, and so blink.cpp for the Arduino can call those libraries to fill in the gaps from our example above.
Let’s look at the minimum blink for the ST Micro STM32F4 chip, and in the process we will see how the tools work to put code onto this chip.
Important Tools
C Compiler
Being able to compile your code is essential. While it was common on small machines to write directly in machine code or assembler, on a “larger” architecture like the 32 bit Cortex M series that is moderately impractical.
The choice of language is a personal preference. I find C to be an excellent language for writing systems code, from drivers to operating systems to large applications. There is also a very well developed and well supported compiler system available, the GNU C Compiler or GCC. This combination made it ideal for me, but others have preferred other languages. Perhaps the most widespread in robotics groups was BASIC, and especially PBASIC from Parallax, which was used in their BASIC Stamp controllers.
One of the downsides of GCC is that, as an open source product, if you are just learning, you are at the mercy of whether or not the compiler supports your chip. Once you are an expert this becomes a super power, and you can port GCC to target any chip you want. But before we get there, the folks at ARM have done a great job of porting it to the Cortex M series and have hosted it on Launchpad as the GNU Tools for ARM Embedded Processors project.
There are executables for Windows, Mac OS X, and Linux on that page, so pretty much whatever system you are running you can cross compile for the Cortex M family of embedded processors.
Bootloading and Debugging
In pre-historic times computers had a series of switches on their front panel that you could use to deposit binary data directly into their memory. During the Dot Com days an engineer would use something called a PROM programmer to write binary data into a memory chip. These days, and pretty much ever since the Motorola 68HC11 and the Microchip 16C84, there has been a way to use a generic port to program the non-volatile memory of a microprocessor. The Cortex M series is no exception.
ARM defined a “standard” way of programming and debugging the Cortex M series, based on something they called a “Debug Access Port”. This port can use industry standard JTAG pins or Serial Wire Debug (SWD) pins. There are off-the-shelf products that will talk to those pins, like the Black Magic Probe. Manufacturers, however, like a single one-step procedure and will often include a captive processor to run a debugging tool. In the ST Micro case that is called “ST Link”, and in the Freescale case they have adopted a new standard from ARM called CMSIS-DAP, which Freescale calls OpenSDAv2 when combined with their Cortex M0 chip.
What these things have in common is that an open source tool, OpenOCD, can be used to talk to that protocol over USB and download code to the chip you are programming. This is much simpler than either the lights and switches model or the erase, program, test, repeat model.
Using these two tools, GCC Embedded and OpenOCD, we’ll compile and load our blink example on to the STM32F4-Discovery board which ST Micro sells from Digikey and others for about $15.
The Blink Eco-system
When you are starting from “bare metal” (which is one way of thinking about a completely unprogrammed microprocessor system), it helps to think about the components that have to be present for that machine to run.
Because this is a software article and not a hardware one, we’ll just gloss over the fact that the board has a crystal for its clock, a power supply from the USB port, and an LED already attached to GPIO pin 15 on port D. The hardware engineer took care of that for us and we’re here to make it blink.
The things we’ll be looking at are how we start this CPU up from “scratch”, how to configure the I/O pins into something useful, and what it takes to get this into the hardware and running.
Startup Configuration
The hello.c example has all of its startup issues taken care of by the operating system; for blink programs the programmer is responsible for everything. Starting up the board and chip isn’t necessarily complicated, but it is a piece of code that embedded systems programmers deal with while people writing code for operating system hosted applications don’t.
The first thing startup has to be concerned about is just what happens when the chip comes out of reset. In the case of the ARM Cortex M, the CPU takes the first 32 bit value in memory address 0 and puts that into the stack pointer, and then it takes the second 32 bit value and puts that into the program counter, effectively jumping to that address. In fact the first hundred plus 32 bit words contain the addresses of functions that are interrupt service routines (ISRs) and are called when an interrupt occurs. This is called the ‘interrupt_vectors’ table or the ‘isr_vector’.
The gcc sample code defines an interrupt table like this:
        .section .isr_vector
        .align  2
        .globl  __isr_vector
    __isr_vector:
        .long   __StackTop            /* Top of Stack */
        .long   Reset_Handler         /* Reset Handler */
        .long   NMI_Handler           /* NMI Handler */
        .long   HardFault_Handler     /* Hard Fault Handler */
        .long   MemManage_Handler     /* MPU Fault Handler */
        .long   BusFault_Handler      /* Bus Fault Handler */
        .long   UsageFault_Handler    /* Usage Fault Handler */
        .long   0                     /* Reserved */
        .long   0                     /* Reserved */
        .long   0                     /* Reserved */
        .long   0                     /* Reserved */
        .long   SVC_Handler           /* SVCall Handler */
        .long   DebugMon_Handler      /* Debug Monitor Handler */
        .long   0                     /* Reserved */
        .long   PendSV_Handler        /* PendSV Handler */
        .long   SysTick_Handler       /* SysTick Handler */

        /* External interrupts */
The first thing to notice about this code is that it gives these interrupt vectors their own section name, “.section .isr_vector”, which we will see again when we talk about the linker script.
The second thing to notice is that the vector for “reset” is filled in with the function name Reset_Handler. This function gets control whenever the device is reset, so it has the responsibility of setting up a couple of things that are important to C programs.
        .globl  Reset_Handler
        .type   Reset_Handler, %function
    Reset_Handler:
    /*  Firstly it copies data from read only memory to RAM.
     *
     *  The ranges of copy from/to are specified by following symbols
     *    __etext: LMA of start of the section to copy from. Usually end of text
     *    __data_start__: VMA of start of the section to copy to
     *    __data_end__: VMA of end of the section to copy to
     *
     *  All addresses must be aligned to 4 bytes boundary.
     */
        ldr     r1, =__etext
        ldr     r2, =__data_start__
        ldr     r3, =__data_end__

    .L_loop1:
        cmp     r2, r3
        ittt    lt
        ldrlt   r0, [r1], #4
        strlt   r0, [r2], #4
        blt     .L_loop1

    /*
     *  Clear the bss section
     *
     *  The BSS section is specified by following symbols
     *    __bss_start__: start of the BSS section.
     *    __bss_end__: end of the BSS section.
     *
     *  Both addresses must be aligned to 4 bytes boundary.
     */
        ldr     r1, =__bss_start__
        ldr     r2, =__bss_end__

        movs    r0, 0
    .L_loop3:
        cmp     r1, r2
        itt     lt
        strlt   r0, [r1], #4
        blt     .L_loop3

    /*
     *  On to initializing the system
     */
        bl      SystemInit

        .pool
This code does two things: for variables that are ‘initialized’ it makes sure that the initialization is done, and for ‘uninitialized’ variables it makes sure they are set to zero. In the code below we use delay_time to hold the number of iterations in our delay loop. Technically that isn’t necessary, but when we start talking about gdb in a minute it will make sense.
Clock Considerations
Back in the day it was easy to start up a microprocessor: hook up a crystal, or even more simply a crystal oscillator “can”, to the Clock In pin and voila. While there is some flexibility with today’s microcontrollers like the PIC or ATMega, the clock options are set by fuses that are programmed during flashing. For the ARM Cortex M, there are a number of options you can choose from, and you program them after the chip is running.
In this particular example, I’ve not changed the clock. This means that the chip is running on its default internal oscillator, which is not particularly accurate. This is fine for the demo, and so when Reset_Handler exits by calling SystemInit, that code just goes and calls the function main right away. We will do a bit more in SystemInit when there are more interesting things to do, but remember that the blink example, like hello.c, is a way to make sure you can use all of your tools; what the program does is primarily incidental.
System Configuration
Another thing that makes blink.c more complicated in the embedded world is that there is no standard for how various I/O devices on an embedded chip should work. Further, while 8 bit microprocessors had dedicated I/O ports, the number of pins on something like the Cortex M4 cannot hope to bring all of the possible I/O features out to actual pins. This situation came about because the ability to put more transistors on silicon greatly increased, but the ability to put more pins on an integrated circuit did not increase nearly so much.
In order to avoid an explosion of variations on a single chip, microcontroller companies have taken a page from the FPGA design book and made the I/O pins into programmable hardware.
All ARM Cortex M series, and perhaps all ARM CPUs, now have a system of registers which are used to dynamically assign the function of a package pin to the part of the chip inside that implements that feature. Sometimes there are multiple choices, so the transmit line for a UART may appear on one pin or another depending on configuration. That configuration then needs to be taken into consideration if your program is going to use them. In our simple example we’ll do that configuration in our program, but it will be useful to have that taken care of in later examples by the SystemInit code.
So the datasheet is consulted, and the memory addresses of the clock control peripheral and GPIO port D are identified. Port D was chosen since, on the Butterfly, the upper four bits of that I/O port are connected to four different LEDs. Using that information several defines are made for the code:
    #include <stdint.h>

    #define GPIOD        0x40020C00ul    // from the data sheets
    #define GPIOD_MODE   (*(volatile uint32_t *)(GPIOD + 0x00ul))
    #define GPIOD_TYPE   (*(volatile uint32_t *)(GPIOD + 0x04ul))
    #define GPIOD_SPEED  (*(volatile uint32_t *)(GPIOD + 0x08ul))
    #define GPIOD_PUPD   (*(volatile uint32_t *)(GPIOD + 0x0cul))
    #define GPIOD_IN     (*(volatile uint32_t *)(GPIOD + 0x10ul))
    #define GPIOD_OUT    (*(volatile uint32_t *)(GPIOD + 0x14ul))

    #define RCC_BASE     0x40023800ul
    #define RCC_ENABLE   (*(volatile uint32_t *)(RCC_BASE + 0x30ul))  // AHB1 clock enables
    #define GPIOD_ENA    0x8             // bit 3 enables the GPIOD clock
And then in the main code, those addresses are used to first enable the I/O port, and then to program it as an output port for those four pins.
    RCC_ENABLE |= GPIOD_ENA;     // Enable clocks to GPIOD, leave other enables alone
    GPIOD_MODE = 0x55000000;     // Pins 15 - 12 all output
Finally we implement the core of the “blink” logic, which is to turn the LEDs on for a bit, then off for a bit. Except in this case we alternate two on, then the other two on, which gives us a pleasing flip flop kind of display.
    while (1) {
        GPIOD_OUT = 0xa000;    // two LEDs on
        for (i = 0; i < delay_time; i++) {
            asm("nop");
        }
        GPIOD_OUT = 0x5000;    // other two LEDs on
        for (i = 0; i < delay_time; i++) {
            asm("nop");
        }
    }
Feel free to pop over to the full source to see all of that in context.
Compiling This Application
Now that we’ve got the code in place, the next step is converting it into something we can install on the target system. The concept is fairly simple: we want to create a binary image of what should be in memory on the controller to make our system blink. The differences between various systems make that a bit more challenging than we would like.
We are using the arm-none-eabi compiler. That sequence means it compiles for the “ARM” architecture, for the “none” operating system, and uses the embedded application binary interface (EABI). Or more specifically, we’re using it to compile things that run directly on ARM processors without an operating system present. There are however a number of different kinds of ARM processors, and we tell the compiler which one we are targeting using the -mcpu= flag.
In the Makefile we tell the compiler to make code for the Cortex M4 cpu, and to use the Thumb instruction set.
The two source files we compile are the blink source, blink.c, and the startup code for the Cortex M4, in the startup_ARMCM4.S file.
Once compiled we use the linker to assemble them into the binary image and we have to tell the linker how that assembly should happen. To do that, we create a linker script describing our target system.
The entire script is here and it is pretty much a copy of the gcc.ld script that comes with the compiler in the samples directory. The only changes made were to describe the memory on the board and to allow for creating the copy table and the zero table for the startup code to use.
    MEMORY
    {
      FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 1024K /* 1M */
      RAM (rwx)  : ORIGIN = 0x20000000, LENGTH = 112K  /* 112K */
    }
The above snippet shows the specification for FLASH and RAM on the STM32F407G chip with its 1MB of flash and 112K of general purpose RAM. The other change was to uncomment the copy table section, shown below.
    .copy.table :
    {
        . = ALIGN(4);
        __copy_table_start__ = .;
        LONG (__etext)
        LONG (__data_start__)
        LONG (__data_end__ - __data_start__)
        __copy_table_end__ = .;
    } > FLASH
This looks fairly complex, but what it is doing is putting a small table into the FLASH section, a simple set of pointers and lengths, that lets the startup code bulk copy the pre-initialized data into RAM where it should be. The actual code in Reset_Handler that looks for those symbols is here:
    /*  Firstly it copies data from read only memory to RAM.
     *
     *  The ranges of copy from/to are specified by following symbols
     *    __etext: LMA of start of the section to copy from. Usually end of text
     *    __data_start__: VMA of start of the section to copy to
     *    __data_end__: VMA of end of the section to copy to
     *
     *  All addresses must be aligned to 4 bytes boundary.
     */
        ldr     r1, =__etext
        ldr     r2, =__data_start__
        ldr     r3, =__data_end__

    .L_loop1:
        cmp     r2, r3
        ittt    lt
        ldrlt   r0, [r1], #4
        strlt   r0, [r2], #4
        blt     .L_loop1
The loop above just copies the initialized data from FLASH into RAM, using the same start and end addresses that the linker script records in the copy table.
Once we have a linker script that describes how we want this binary put together, we tell the linker that with the options -T blink.ld, where blink.ld is the name of our script. And I’ve added -Map blink.map to the options as well, so that we get a map file of where everything was placed in memory. That can help us when we are debugging with gdb.
The resulting Makefile is shown below:
    #
    # Trivial blink example using the ARM Embedded Toolchain
    # Chuck McManis (cmcmanis@mcmanis.com)
    #
    CFLAGS = -mcpu=cortex-m4 -mthumb -g
    LDFLAGS = -T blink.ld -Map blink.map

    all: blink.elf

    startup.o: startup_ARMCM4.S
    	arm-none-eabi-as $(CFLAGS) -o startup.o $<

    blink.o: blink.c
    	arm-none-eabi-gcc $(CFLAGS) -c blink.c

    # Relink whenever an object file or the linker script changes
    blink.elf: blink.o startup.o blink.ld
    	arm-none-eabi-ld $(LDFLAGS) -o blink.elf blink.o startup.o

    clean:
    	rm -f *.o blink.elf blink.map
This demo has three files:
- blink.c - which is the actual blink code
- startup_ARMCM4.S - this file handles startup of the processor
- blink.ld - a linker script to tell the linker how to package all the pieces together
When make is run, the result is two new files, blink.elf and blink.map, which we’ll use in the final stage: running our code.
Loading Code
Now that the code is written, compiled, and laid out to run from FLASH memory space, it’s time to get it into the board to test it out.
The tool I’ve used here is OpenOCD. OpenOCD is a free software package that knows about a number of board debugging protocols and can present them as a gdb server.
Most of the boards offer a form of JTAG debugging capability. The down side is that this often plugs into either a proprietary tool, or an expensive third party tool. Fortunately ST Micro has been reasonably open with their ST Link protocol and because of that, the OpenOCD project has an option to build the tool with support built in.
So before I started using it, I downloaded the OpenOCD code from its repository and configured and built it with STLINK_V2 support. Then you create a configuration file in your home directory containing the following:
source [find board/stm32f4discovery.cfg]
And that tells OpenOCD to look for the Butterfly board when it starts up. If you are successful OpenOCD will print the following on startup:
Now in your build directory you can connect to that instance using the gdb command target extended-remote :3333. In the build directory that looks like this:
So at this point you’ve got GDB looking at your blink.elf binary (the whole “Reading symbols …” bit) and you’ve connected to OpenOCD. But on the OpenOCD console you may see something like this:
Basically that means that OpenOCD and the development board are in an odd state, but the fix is pretty easy: back in your GDB session, you type mon reset halt, which tells OpenOCD to reset the target and to leave it in the ‘halt’ state.
And now on the OpenOCD window you will see the following:
It may actually halt at a different address, but the key is that the board is now being held in reset by OpenOCD and is waiting for you to load code into it with GDB. You do that with the load command. [This works because when gdb was started you put blink.elf on the command line. If you had not done that, you could tell GDB which file you were talking about with the file blink.elf command.] To start the program once it is loaded you use the run command. Remember setting the entry point in the startup script? That is where gdb will start running your program from.
On the board you will see the LEDs alternately flashing RED/GREEN and ORANGE/BLUE. Congratulations, you’ve compiled, loaded, and run the blink program for the STM32F4-Butterfly board!
Now that it is running you can change how fast it blinks by changing the value in delay_time. The simplest way is to type ^C (control-C, or simultaneously pressing the control key and C) to get control back in GDB, and then use the set command to change the value. Then type cont to resume the program. That would look like this on the gdb console:
Summary
So this shows how you would write, compile, link, and run what is perhaps the simplest program you can run on this board. Going forward, as you write additional applications for the board, you will probably want to change the startup script to have the interrupt vectors for all of the peripherals on the STM32F407G. Alternatively you can use the sample startup code that ST Micro provides for their processors, or startup code from a third party library like libopencm3.