Blink, the HelloWorld of Hardware -- Chuck's Robotics Notebook

In this article:

Important Tools
The Blink Eco-system
Summary

Hello World

For many people, programming computers comes in two flavors. The first is rather theoretical, with data structures, algorithms, and various library functions. The second is when the computer is either too simple to host an interactive operating system, or is dedicated to running a piece of code that is intimately tied into the goings on of the world around it. Robots typically run such embedded systems, and their idea of an operating system can be quite a bit simpler than something like Linux.

If you are learning to program computers (as opposed to learning to program a specific computer, but we’ll get to that in a moment) the first program most people write is called “hello world.” It is so named because the only thing the program does is print out or display the text “Hello World!”.

The origin of this was the book “The C Programming Language” by Kernighan and Ritchie, which has the following as its original “hello.c”

Snippet from hello.c : The Canonical "Hello World"

#include <stdio.h>

main()
{
    printf("hello, world\n");
}

It is a really simple program. The reason it exists in the tutorial is to give the person reading an opportunity to go through all of the steps that their computer uses to compile, assemble, and link together a new piece of software and turn it into a program that the computer will run.

In the embedded world, the equivalent tutorial program is called “Blink.” Unlike the C example above, there are rarely “standard” libraries for managing I/O pins or doing delays. As a result the canonical blink example is more of a structure than a compilable program. It’s shown next.

Snippet from blink.c : The structure of a Blink program

#include <custom-include-file.h>

void main() {
    initialize_io_pin(LED_PIN, OUTPUT_PIN);

    while (1) {
        toggle_io_pin(LED_PIN);
        delay(1000);
    }
}

Like hello.c, the goal is not so much the function of the program, rather it is to verify that one can successfully do all the steps necessary to create, compile, load, and run a program on the embedded system.
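To make that structure a little more concrete, here is a minimal sketch that fills in the placeholder helpers so the loop can be tried on an ordinary PC. Everything hardware related is an assumption made for illustration: fake_port and fake_direction stand in for real registers, delay_ms does nothing, and blink_cycles is a bounded version of the endless loop so that it returns.

```c
#include <stdint.h>

#define LED_PIN     15u
#define OUTPUT_PIN  1u

/* Hypothetical stand-ins for hardware registers; a real chip maps
 * these to fixed addresses given in its data sheet. */
static uint32_t fake_direction;
static uint32_t fake_port;

/* Mark a pin as an output in the simulated direction register. */
void initialize_io_pin(unsigned pin, unsigned mode) {
    if (mode == OUTPUT_PIN) {
        fake_direction |= (1u << pin);
    }
}

/* Flip the state of one pin in the simulated output register. */
void toggle_io_pin(unsigned pin) {
    fake_port ^= (1u << pin);
}

/* Placeholder delay; on the host it returns immediately. */
void delay_ms(unsigned ms) {
    (void)ms;
}

/* Bounded version of the while (1) loop; returns the final port value. */
uint32_t blink_cycles(unsigned count) {
    initialize_io_pin(LED_PIN, OUTPUT_PIN);
    for (unsigned n = 0; n < count; n++) {
        toggle_io_pin(LED_PIN);
        delay_ms(1000);
    }
    return fake_port;
}
```

Calling blink_cycles with an odd count leaves the simulated LED bit set, and an even count leaves it clear, which is all a blink program really does.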

Unlike hello.c, there aren’t any standards or libraries one can use so every version of blink.c is a bit different than the others. For the Arduino, there are standard libraries and so blink.cpp for the Arduino can call those libraries to fill in the gaps from our example above.

Let’s look at the minimum blink for the ST Micro STM32F4 chip, and in the process we will see how the tools work to put code on to this chip.

Important Tools

C Compiler

Being able to compile your code is essential. While it was common on small machines to write directly in machine code or assembler, on a “larger” architecture like the 32 bit Cortex M series that is moderately impractical.

The choice of language is a personal preference. I find C to be an excellent language for writing systems code, from drivers to operating systems to large applications. There is also a very well developed and well supported compiler system available, the GNU Compiler Collection or GCC. This combination made it ideal for me, but others have preferred other languages. Perhaps the most widespread in robotics groups was BASIC, especially PBASIC from Parallax, which was used in their BASIC Stamp controllers.

One of the down sides of GCC is that as an open source product, if you are just learning, you are at the mercy of whether or not the compiler supports your chip. Once you are an expert this becomes a super power and you can port GCC to host any chip you want. But before we get there, the folks at ARM have done a great job of porting it to the Cortex M series and have hosted it on Launchpad as the GNU Tools for ARM Embedded Processors project.

There are executables for Windows, Mac OS X, and Linux on that page, so pretty much whatever system you are running, you can cross compile for the Cortex M family of embedded processors.

Bootloading and Debugging

In pre-historic times computers had a series of switches on their front panel that you could use to deposit binary data directly into their memory. During the Dot Com days an engineer would use something called a PROM programmer to write binary data into a memory chip. These days, and pretty much ever since the Motorola 68HC11 and the Microchip 16C84, there has been a way to use a generic port to program the non-volatile memory of a microprocessor. The Cortex M series is no exception.

ARM defined a “standard” way of programming and debugging the Cortex M series, based on something they called a “Debug Access Port”. This port can use industry standard JTAG pins or Serial Wire Debug (SWD) pins. There are off the shelf products that will talk to those pins, like the Black Magic Probe. Manufacturers, however, like a single one-step procedure and will often include a captive processor to run a debugging tool. In the ST Micro case that is called “ST Link”, and in the Freescale case they have adapted a new standard from ARM called CMSIS-DAP, which Freescale calls OpenSDAv2 when combined with their Cortex M0 chip.

What these things have in common is that an open source tool, OpenOCD, can be used to talk to that protocol over USB and download a program to the chip you are programming. This is much simpler than either the lights and switches model or the erase, program, test, repeat model.

Using these two tools, GCC Embedded and OpenOCD, we’ll compile and load our blink example on to the STM32F4-Discovery board which ST Micro sells from Digikey and others for about $15.

The Blink Eco-system

When you are starting from “bare metal” (which is one way of thinking about a completely unprogrammed microprocessor system), it helps to think about the components that have to be present for that machine to run.

Because this is a software article and not a hardware one, we’ll just gloss over the fact that the board has a crystal for its clock, a power supply from the USB port, and an LED already attached to GPIO pin 15 on port D. The hardware engineer took care of that for us and we’re here to make it blink.

The things we’ll be looking at are how we start this CPU up from “scratch”, how to configure the I/O pins into something useful, and what it takes to get this into the hardware and running.

Startup Configuration

The hello.c example has all of its startup issues taken care of by the operating system. For blink programs the programmer is responsible for everything. Starting up the board and chip isn’t necessarily complicated, but it is a piece of code that embedded systems programmers deal with while people writing code for operating system hosted applications don’t.

The first thing startup has to be concerned about is just what happens when the chip comes out of reset. In the case of the ARM Cortex M, the CPU takes the first 32 bit value in memory address 0 and puts that into the stack pointer, and then it takes the second 32 bit value and puts that into the program counter, effectively jumping to that address. In fact the first hundred plus 32 bit words contain the addresses of functions that are interrupt service routines (ISRs) and are called when an interrupt occurs. This is called the ‘interrupt_vectors’ table or the ‘isr_vector’.
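As a rough sketch of that reset sequence, the function below models what the core does with the first two words of the vector table. The type reset_state_t and the function simulate_reset are hypothetical names made up for this illustration. One detail is not spelled out in the text above: on the Cortex M the stored reset address has its lowest bit set to mark Thumb code, so the sketch masks that bit off to get the address where execution begins.

```c
#include <stdint.h>

/* Hypothetical model of the CPU state right after reset. */
typedef struct {
    uint32_t initial_sp;   /* loaded from word 0 of the vector table */
    uint32_t reset_pc;     /* loaded from word 1 of the vector table */
} reset_state_t;

/* Mimic the hardware: word 0 becomes the stack pointer and word 1
 * becomes the program counter. */
reset_state_t simulate_reset(const uint32_t *vector_table) {
    reset_state_t state;
    state.initial_sp = vector_table[0];
    /* Bit 0 of the stored address is the Thumb marker, not part of
     * the address that instructions are fetched from. */
    state.reset_pc = vector_table[1] & ~1u;
    return state;
}
```

Fed a table whose first two words are 0x2001c000 and 0x080000c5, this yields a stack pointer of 0x2001c000 and a program counter of 0x080000c4, the same msp and pc values that appear in the debugger output later in this article.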

The gcc sample code defines an interrupt table like this:

Snippet from startup_ARMCM4.S : Interrupt Service Routine Vectors

    .align  2
    .globl  __isr_vector
__isr_vector:
    .long   __StackTop            /* Top of Stack */
    .long   Reset_Handler         /* Reset Handler */
    .long   NMI_Handler           /* NMI Handler */
    .long   HardFault_Handler     /* Hard Fault Handler */
    .long   MemManage_Handler     /* MPU Fault Handler */
    .long   BusFault_Handler      /* Bus Fault Handler */
    .long   UsageFault_Handler    /* Usage Fault Handler */
    .long   0                     /* Reserved */
    .long   0                     /* Reserved */
    .long   0                     /* Reserved */
    .long   0                     /* Reserved */
    .long   SVC_Handler           /* SVCall Handler */
    .long   DebugMon_Handler      /* Debug Monitor Handler */
    .long   0                     /* Reserved */
    .long   PendSV_Handler        /* PendSV Handler */
    .long   SysTick_Handler       /* SysTick Handler */

    /* External interrupts */

The first thing to notice about this code is that it gives these interrupt vectors their own section name “.section .isr_vector” which we will see again when we talk about the linker script.

The second thing to notice is that the vector for “reset” is filled in with the function name Reset_Handler. This function gets control whenever the device is reset. So it has the responsibility of setting up a couple of things that are important to C programs.

Snippet from startup_ARMCM4.S : The Reset_Handler Code

    .globl  Reset_Handler
    .type   Reset_Handler, %function
Reset_Handler:
/*  Firstly it copies data from read only memory to RAM.
 *
 *  The ranges of copy from/to are specified by following symbols
 *    __etext: LMA of start of the section to copy from. Usually end of text
 *    __data_start__: VMA of start of the section to copy to
 *    __data_end__: VMA of end of the section to copy to
 *
 *  All addresses must be aligned to 4 bytes boundary.
 */
    ldr r1, =__etext
    ldr r2, =__data_start__
    ldr r3, =__data_end__

.L_loop1:
    cmp r2, r3
    ittt    lt
    ldrlt   r0, [r1], #4
    strlt   r0, [r2], #4
    blt .L_loop1

/*  
 *  Clear the bss section
 *
 *  The BSS section is specified by following symbols
 *    __bss_start__: start of the BSS section.
 *    __bss_end__: end of the BSS section.
 *
 *  Both addresses must be aligned to 4 bytes boundary.
 */
    ldr r1, =__bss_start__
    ldr r2, =__bss_end__

    movs    r0, 0
.L_loop3:
    cmp r1, r2
    itt lt
    strlt   r0, [r1], #4
    blt .L_loop3

/*
 * On to initializing the system
 */
    bl  SystemInit

    .pool

This code does two things, for variables that are ‘initialized’ it makes sure that the initialization is done, and for ‘uninitialized’ variables it makes sure they are set to zero. In the code below we use delay_time to hold the number of iterations in our delay loop. Technically that isn’t necessary but when we start talking about gdb here in a minute it will make sense.
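For readers more comfortable in C than in Thumb assembly, the two loops can be sketched roughly as follows. The function names copy_data and zero_bss are invented for this illustration; the real startup file does the work inline in Reset_Handler using the linker symbols named in its comments.

```c
#include <stdint.h>

/* Copy initialized data one word at a time from its load address
 * (src) to its run address [dst, dst_end), like .L_loop1. */
void copy_data(const uint32_t *src, uint32_t *dst, const uint32_t *dst_end) {
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Store zero into every word of [start, end), like .L_loop3. */
void zero_bss(uint32_t *start, const uint32_t *end) {
    while (start < end) {
        *start++ = 0;
    }
}
```

On the target the arguments would be the addresses of __etext, __data_start__, __data_end__, __bss_start__, and __bss_end__. On a PC the functions can be exercised with ordinary arrays.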

Clock Considerations

Back in the day it was easy to start up a microprocessor: hook up a crystal, or even more simply a crystal oscillator “can”, to the Clock In pin and voila. While there is some flexibility with today’s microcontrollers like the PIC or ATMega, the clock options are set by fuses that are programmed during flashing. For the ARM Cortex M, there are a number of options you can choose from, and you program them after the chip is running.

In this particular example, I’ve not changed the clock. This means that the chip is running on its default internal oscillator which is not particularly accurate. This is fine for the demo, and so when Reset_Handler exits by calling SystemInit, that code just goes and calls the function main right away. We will do a bit more in SystemInit when there are more interesting things to do, but remember that the blink example, like hello.c is a way to make sure you can use all of your tools, what the program does is primarily incidental.

System Configuration

Another thing that makes blink.c more complicated in the embedded world is that there is no standard for how various I/O devices on an embedded chip should work. Further, while 8 bit microprocessors had dedicated I/O ports, the number of pins on something like the Cortex M4 cannot hope to bring all of the possible I/O features out to actual pins. This situation came about because the ability to put more transistors on silicon greatly increased, but the ability to put more pins on an integrated circuit did not increase nearly so much.

In order to avoid an explosion of variations on a single chip, microcontroller companies have taken a page from the FPGA design book and made the I/O pins into programmable hardware.

All ARM Cortex M series, and perhaps all ARM CPUs, now have a system of registers which are used to dynamically assign the function of a package pin to the part of the chip inside that implements that feature. Sometimes there are multiple choices, so the transmit line for a UART may appear on one pin or another depending on configuration. That configuration then needs to be taken into consideration if your program is going to use them. In our simple example we’ll do that configuration in our program, but it will be useful to have that taken care of in later examples by the SystemInit code.

So the datasheet is consulted and the memory addresses of the peripheral controller and GPIO port D are identified. Port D was chosen since, on the Discovery board, the upper four bits of that I/O port are connected to four different LEDs. Using that information several defines are made for the code:

Snippet from blink.c : STM32F4 Specific I/O Address Defines

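#include <stdint.h>

#define GPIOD       0x40020C00ul    // GPIO port D base, from the data sheets
#define GPIOD_MODE  (*(volatile uint32_t *)(GPIOD+0x00ul))  // pin mode
#define GPIOD_TYPE  (*(volatile uint32_t *)(GPIOD+0x04ul))  // output type
#define GPIOD_SPEED (*(volatile uint32_t *)(GPIOD+0x08ul))  // output speed
#define GPIOD_PUPD  (*(volatile uint32_t *)(GPIOD+0x0cul))  // pull-up/pull-down
#define GPIOD_IN    (*(volatile uint32_t *)(GPIOD+0x10ul))  // input data
#define GPIOD_OUT   (*(volatile uint32_t *)(GPIOD+0x14ul))  // output data

#define RCC_BASE    0x40023800ul    // reset and clock control base
#define RCC_ENABLE  (*(volatile uint32_t *)(RCC_BASE + 0x30ul)) // AHB1 clock enables
#define GPIOD_ENA   0x8             // bit 3 enables the clock to GPIOD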
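An alternative to one define per register, sketched here as an assumption rather than as what the article’s blink.c does, is to describe the whole port as a struct and let the compiler work out the offsets. The type gpio_port_t, its field names, and the helper gpio_write_out are all invented for this example. The offsets follow the STM32F4 GPIO register order of mode, type, speed, pull-up/pull-down, input, output.

```c
#include <stddef.h>
#include <stdint.h>

/* Register layout of one GPIO port; field names are illustrative. */
typedef struct {
    volatile uint32_t mode;    /* offset 0x00 */
    volatile uint32_t type;    /* offset 0x04 */
    volatile uint32_t speed;   /* offset 0x08 */
    volatile uint32_t pupd;    /* offset 0x0c */
    volatile uint32_t in;      /* offset 0x10 */
    volatile uint32_t out;     /* offset 0x14 */
} gpio_port_t;

/* Compile-time checks that the struct matches the register map. */
_Static_assert(offsetof(gpio_port_t, pupd) == 0x0c, "pupd must sit at 0x0c");
_Static_assert(offsetof(gpio_port_t, out) == 0x14, "out must sit at 0x14");

/* Write a value to the output data register of the given port.
 * On the target the port would be ((gpio_port_t *)0x40020C00ul). */
void gpio_write_out(gpio_port_t *port, uint32_t value) {
    port->out = value;
}
```

The static assertions catch a mistyped offset at compile time, which is harder to do with a list of hand-written hexadecimal constants.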

And then in the main code, those addresses are used to first enable the I/O port, and then to program it as an output port for those four pins.

Snippet from blink.c : Setting GPIO D, 12 - 15 to outputs

    RCC_ENABLE |= GPIOD_ENA; // Enable clocks to GPIOD, leaving other enable bits alone
    GPIOD_MODE = 0x55000000; // Bits 15 - 12 all output
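The constant 0x55000000 is easier to trust once you see where it comes from. Each pin owns a two-bit field in the mode register, and the value 01 in that field selects general purpose output. The helper names below are hypothetical, and the sketch only computes the value; it does not touch any hardware.

```c
#include <stdint.h>

#define GPIO_MODE_OUTPUT 0x1u   /* two-bit field value for "output" */

/* Return mode_reg with the two-bit mode field for 'pin' set to output. */
uint32_t gpio_mode_set_output(uint32_t mode_reg, unsigned pin) {
    unsigned shift = pin * 2u;
    mode_reg &= ~(0x3u << shift);             /* clear the old field */
    mode_reg |= (GPIO_MODE_OUTPUT << shift);  /* 01 selects output */
    return mode_reg;
}

/* Build the mode register value for pins 12 through 15. */
uint32_t gpio_mode_leds(void) {
    uint32_t value = 0;
    for (unsigned pin = 12; pin <= 15; pin++) {
        value = gpio_mode_set_output(value, pin);
    }
    return value;
}
```

Pins 12 through 15 occupy bits 24 through 31, so four fields of 01 side by side give 0x55000000.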

Finally we implement the core of the “blink” logic, which is to turn the LEDs on for a bit, then off for a bit. Except in this case we alternate two on, then the other two on, which gives us a pleasing flip flop kind of display.

Snippet from blink.c : Alternating LED activity

    while (1) {
        GPIOD_OUT = 0xa000; // two LEDs on
        for (i = 0; i < delay_time; i++) {
            asm("nop");
        }
        GPIOD_OUT = 0x5000; // other two LEDs on
        for (i = 0; i < delay_time; i++) {
            asm("nop");
        }
    }

Feel free to pop over to the full source to see all of that in context.
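The two magic numbers in the loop can also be written in terms of the individual LEDs. The sketch below assumes the usual STM32F4-Discovery assignment of green, orange, red, and blue to pins 12 through 15 of port D; the names LED_GREEN and so on, and the two pattern functions, are made up for this example.

```c
#include <stdint.h>

/* LED bit positions on GPIO port D (assumed Discovery board wiring). */
enum {
    LED_GREEN  = 1u << 12,
    LED_ORANGE = 1u << 13,
    LED_RED    = 1u << 14,
    LED_BLUE   = 1u << 15
};

/* First half of the blink cycle: orange and blue lit. */
uint32_t pattern_a(void) {
    return LED_ORANGE | LED_BLUE;
}

/* Second half of the blink cycle: green and red lit. */
uint32_t pattern_b(void) {
    return LED_GREEN | LED_RED;
}
```

The first pattern works out to 0xa000 and the second to 0x5000, and because the two share no bits, every LED is on in exactly one half of the cycle.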

Compiling This Application

Now that we’ve got the code in place, the next step is converting it into something we can install on the target system. The concept is fairly simple: we want to create a binary image of what should be in memory on the controller to make our system blink. The differences between various systems make that a bit more challenging than we would like.

We are using the arm-none-eabi compiler. That sequence means it compiles for the “ARM” architecture, for the “none” operating system, and uses the embedded application binary interface (EABI). More specifically, we’re using it to compile things that run directly on ARM processors without an operating system present. There are, however, a number of different kinds of ARM processors, and we tell the compiler which one we are targeting using the -mcpu= flag.

In the Makefile we tell the compiler to make code for the Cortex M4 cpu, and to use the Thumb instruction set.
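One way to confirm that those flags reached the compiler is to look at the macros GCC predefines for the target. GCC defines __thumb__ when it is generating Thumb code and __arm__ for 32 bit ARM targets. The function name target_description is hypothetical, and this is only a sanity check, not part of the blink program itself.

```c
/* Report which instruction set this file was compiled for, based on
 * the macros the compiler predefines for the target. */
const char *target_description(void) {
#if defined(__thumb__)
    return "ARM, Thumb instruction set";
#elif defined(__arm__)
    return "ARM, ARM instruction set";
#else
    return "not a 32 bit ARM target";
#endif
}
```

Built with the cross compiler and -mthumb this returns the first string. Built with the native compiler on a PC it returns the last one, which is a quick hint that the wrong compiler was picked up.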

The two source files we compile are the blink source, from blink.c and the startup code for the Cortex M4 in startup_ARMCM4.S file.

Once compiled we use the linker to assemble them into the binary image and we have to tell the linker how that assembly should happen. To do that, we create a linker script describing our target system.

The entire script is here and it is pretty much a copy of the gcc.ld script that comes with the compiler in the samples directory. The only changes made were to describe the memory on the board and to allow for creating the copy table and the zero table for the startup code to use.

Snippet from blink.ld : FLASH and RAM specification

MEMORY
{
  FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 1024K /* 1M */
  RAM (rwx)  : ORIGIN = 0x20000000, LENGTH = 112K /* 112K */
}

The above snippet shows the specification for FLASH and RAM on the STM32F407G chip with its 1MB of flash and 112K of general purpose RAM. The other change was to uncomment the copy table section shown below.

Snippet from blink.ld : The copy table construct

    .copy.table :
    {
        . = ALIGN(4);
        __copy_table_start__ = .;
        LONG (__etext)
        LONG (__data_start__)
        LONG (__data_end__ - __data_start__)
        __copy_table_end__ = .;
    } > FLASH

This looks fairly complex but what it is doing is putting a copy of pre-initialized data into the FLASH section, with a simple set of pointers and lengths to allow the startup script to bulk copy that data into RAM where it should be. The actual code in Reset_Handler that looks for those symbols is here:

Snippet from startup_ARMCM4.S : Initializing Data Sections

/*  Firstly it copies data from read only memory to RAM.
 *
 *  The ranges of copy from/to are specified by following symbols
 *    __etext: LMA of start of the section to copy from. Usually end of text
 *    __data_start__: VMA of start of the section to copy to
 *    __data_end__: VMA of end of the section to copy to
 *
 *  All addresses must be aligned to 4 bytes boundary.
 */
    ldr r1, =__etext
    ldr r2, =__data_start__
    ldr r3, =__data_end__

.L_loop1:
    cmp r2, r3
    ittt    lt
    ldrlt   r0, [r1], #4
    strlt   r0, [r2], #4
    blt .L_loop1

The loop above just copies from FLASH into RAM, using the same __etext, __data_start__, and __data_end__ symbols that the copy table is built from.
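If startup code walked the copy table itself rather than using the symbols directly, it might look something like the sketch below. The record layout mirrors the three LONG entries in the linker snippet (source, destination, length in bytes), but copy_entry_t and run_copy_table are hypothetical names, and pointers are used in place of raw 32 bit addresses so the sketch also runs on a PC.

```c
#include <stddef.h>
#include <stdint.h>

/* One copy table record: where to read, where to write, and how
 * many bytes to move. */
typedef struct {
    const uint32_t *src;
    uint32_t *dst;
    size_t length;
} copy_entry_t;

/* Process every record in [entry, end), copying a word at a time. */
void run_copy_table(const copy_entry_t *entry, const copy_entry_t *end) {
    for (; entry < end; entry++) {
        size_t words = entry->length / sizeof(uint32_t);
        for (size_t i = 0; i < words; i++) {
            entry->dst[i] = entry->src[i];
        }
    }
}
```

The benefit of a table is that more regions, for instance a second RAM bank, can be added in the linker script without touching the startup code.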

Once we have a linker script that describes how we want this binary put together, we tell the linker that with options -T blink.ld where blink.ld is the name of our script. And I’ve added -Map blink.map to the options as well so that we get a map file of where everything was placed in memory. That can help us when we are debugging with gdb.

The resulting Makefile is shown below.

Snippet from Makefile : Makefile to build Blink

#
# Trivial blink example using the ARM Embedded Toolchain
# Chuck McManis (cmcmanis@mcmanis.com)
#
# Note: each recipe line must begin with a tab character.
#
CFLAGS = -mcpu=cortex-m4 -mthumb -g
LDFLAGS = -T blink.ld -Map blink.map

all:	blink.elf

# Assemble the startup code
startup.o: startup_ARMCM4.S
	arm-none-eabi-as $(CFLAGS) -o startup.o $<

# Compile the blink source
blink.o: blink.c
	arm-none-eabi-gcc $(CFLAGS) -c blink.c

# Link; relink if the linker script changes
blink.elf:	blink.o startup.o blink.ld
	arm-none-eabi-ld $(LDFLAGS) -o blink.elf blink.o startup.o

clean:
	rm -f *.o blink.elf blink.map

.PHONY: all clean

This demo has three files:

blink.c - which is the actual blink code
startup_ARMCM4.S - this file handles startup of the processor
blink.ld - a linker script to tell the linker how to package all the pieces together

When make is run, the result is two new files, blink.elf and blink.map, which we’ll use in the final stage: running our code.

Loading Code

Now that the code is written, compiled, and laid out to run from FLASH memory space, it’s time to get it into the board to test it out.

The tool I’ve used here is OpenOCD. OpenOCD is a free software package that knows about a number of board debugging protocols and can present them as a gdb server.

Most of the boards offer a form of JTAG debugging capability. The down side is that this often plugs into either a proprietary tool, or an expensive third party tool. Fortunately ST Micro has been reasonably open with their ST Link protocol and because of that, the OpenOCD project has an option to build the tool with support built in.

So before I started using it, I downloaded the OpenOCD code from its repository and configured and built it with STLINK_V2 support. Then you create a configuration file in your home directory containing the following:

source [find board/stm32f4discovery.cfg]

And that tells OpenOCD to look for the Discovery board when it starts up. If you are successful OpenOCD will print the following on startup:

chuck@mint:~$ openocd
Open On-Chip Debugger 0.7.0 (2014-01-21-21:09)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.sourceforge.net/doc/doxygen/bugs.html
srst_only separate srst_nogate srst_open_drain connect_deassert_srst
Info : This adapter doesn't support configurable speed
Info : STLINK v2 JTAG v14 API v2 SWIM v0 VID 0x0483 PID 0x3748
Info : Target voltage: 2.861485
Info : stm32f4x.cpu: hardware has 6 breakpoints, 4 watchpoints

Now in your build directory you can connect to that instance using the gdb command target extended-remote :3333. In the build directory that looks like this:

chuck@mint:~/projects/new-blink$ arm-none-eabi-gdb blink.elf
GNU gdb (GNU Tools for ARM Embedded Processors) 7.6.0.20131129-cvs
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=i686-linux-gnu --target=arm-none-eabi".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/chuck/projects/new-blink/blink.elf...done.
(gdb) target extended-remote :3333
Remote debugging using :3333
0x00000000 in ?? ()

So at this point you’ve got GDB looking at your blink.elf binary (the whole “Reading symbols …” bit) and you’ve connected to OpenOCD. But on the OpenOCD console you may see something like this:

Info : accepting 'gdb' connection from 3333
Info : stm32f4x errata detected - fixing incorrect MCU_IDCODE
Info : device id = 0x10006413
Info : flash size = 1024kbytes
Warn : acknowledgment received, but no packet pending
undefined debug reason 6 - target needs reset

Basically that means that OpenOCD and the development board are in an odd state, but the fix is pretty easy. Back in your GDB session, type mon reset halt, which tells OpenOCD to reset the target and to leave it in the ‘halt’ state.

(gdb) mon reset halt
target state: halted
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x080000c4 msp: 0x2001c000

And now on the OpenOCD window you will see the following:

target state: halted
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x080000c4 msp: 0x2001c000

It may actually halt at a different address, but the key is that the board is now being held in reset by OpenOCD and is waiting for you to load code into it with GDB. You do that with the load command. [This works because when gdb was started you put blink.elf on the command line. If you had not done that, you could tell GDB which file you were talking about with the file blink.elf command.] To start the program once it is loaded you use the run command. Remember setting the entry point in the startup script? That is where gdb will start running your program from.

(gdb) load
Loading section .text, size 0x104 lma 0x8000000
Loading section .copy.table, size 0xc lma 0x8000104
Loading section .zero.table, size 0x8 lma 0x8000110
Loading section .data, size 0x4 lma 0x8000118
Start address 0x80000c4, load size 284
Transfer rate: 701 bytes/sec, 71 bytes/write.
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/chuck/projects/new-blink/blink.elf
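The numbers in that load output can be cross-checked with a little arithmetic: the four section sizes should add up to the reported load size, and each section should begin where the previous one ended. The sketch below does that check using the values printed above. The struct and function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* One line of the gdb load output. */
typedef struct {
    const char *name;
    uint32_t size;   /* bytes */
    uint32_t lma;    /* load address in FLASH */
} section_t;

/* Values copied from the load output shown above. */
static const section_t sections[] = {
    { ".text",       0x104, 0x8000000 },
    { ".copy.table", 0x00c, 0x8000104 },
    { ".zero.table", 0x008, 0x8000110 },
    { ".data",       0x004, 0x8000118 },
};

#define SECTION_COUNT (sizeof sections / sizeof sections[0])

/* Add up the sizes of all loaded sections. */
uint32_t total_load_size(void) {
    uint32_t total = 0;
    for (size_t i = 0; i < SECTION_COUNT; i++) {
        total += sections[i].size;
    }
    return total;
}

/* Return 1 when each section starts where the previous one ended. */
int sections_are_contiguous(void) {
    for (size_t i = 1; i < SECTION_COUNT; i++) {
        if (sections[i].lma != sections[i - 1].lma + sections[i - 1].size) {
            return 0;
        }
    }
    return 1;
}
```

The sizes come to 260 + 12 + 8 + 4 = 284 bytes, matching the load size gdb reports, and the sections pack together with no gaps starting at the base of FLASH.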

On the board you will see the LEDs alternately flashing RED/GREEN and ORANGE/BLUE. Congratulations, you’ve compiled, loaded, and run the blink program for the STM32F4-Discovery board!

Now that it is running you can change how fast it blinks by changing the value in delay_time. The simplest way is to type ^C (control-C, or simultaneously pressing the control key and C) to get control back in GDB, and then use the set command to change the value. Then type cont to resume the program. That would look like this on the gdb console:

[continues from previous screen above]
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/chuck/projects/new-blink/blink.elf
^C0x08000076 in main () at blink.c:38
38          for (i = 0; i < delay_time; i++) {
(gdb) print delay_time
$1 = 100000
(gdb) set delay_time=500000
(gdb) continue
Continuing.
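Changing the variable from the debugger works because delay_time is an initialized variable that lives in RAM and the loop reads it again on each pass. The sketch below shows one way to make sure that stays true even when the optimizer is turned on, by declaring the variable volatile. Whether the article’s blink.c declares it this way is not shown, so treat the qualifier and the function busy_wait as assumptions.

```c
#include <stdint.h>

/* Initialized variable: stored in .data, copied to RAM at startup,
 * and therefore changeable from gdb while the program runs. */
volatile uint32_t delay_time = 100000;

/* Spin for delay_time iterations and report how many were executed. */
uint32_t busy_wait(void) {
    uint32_t i;
    for (i = 0; i < delay_time; i++) {
        /* volatile forces the limit to be re-read from memory on
         * every pass, so a new value takes effect immediately. */
    }
    return i;
}
```

Without the qualifier an optimizing compiler is free to read the limit once, or to delete an empty loop entirely, and the new value set from gdb might appear to be ignored.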

Summary

So this shows how you would write, compile, link, and run what is perhaps the simplest program you can run on this board. Going forward and writing additional applications for the board, you will probably want to change the startup code to have the interrupt vectors for all of the peripherals on the STM32F407G. Alternatively you can use the sample startup code that ST Micro provides for their processors, or startup code from a third party library like libopencm3.

TITLE: Blink, the HelloWorld of Hardware
AUTHOR: Chuck McManis	LAST UPDATE: 07-Sep-2013