Monday, February 20, 2012

Advanced Embedded C++ Development with Arduino

My first experience with using C++ for embedded development was circa 1988. I was teaching a university course for seniors and graduate students in real-time software design. We did everything in assembler, and very occasionally experimentally in C. I wanted to see if I could write embedded code in C++, which I was just learning at the time. I used cfront, Bell Labs' early preprocessor which input C++ source code and emitted C source code. No one could have been more surprised than I was when it worked.

C++ has evolved a lot since then, and so have I. Coincidentally, I ended up working for Bell Labs starting in 1996. The Labs had by then experience using C++ for embedded development, and thanks to some very patient mentors (Tam, Doug, and Randy: you know who you are), I ended up exploiting many of the advanced features of C++ to do development in code bases of hundreds of thousands or even millions of lines of C++ that was shared across entire product lines.

That experience has informed my professional career in the years hence, so it was only natural to see if I could use some of these same C++ features when writing code for Arduino. Ever since my days teaching CEG431 at Wright State University I've been looking for a modern and cost-effective hardware and software platform on which to duplicate that pedagogical environment. I've written about this quest in this blog: Diminuto and Arroyo (AT91RM9200-EK board using an Atmel ARM9 microprocessor with Linux/GNU), Cascada and Contraption (BeagleBoard using a TI Cortex-A8 microprocessor with Linux/GNU and Android), and finally Amigo (Arduino Uno and Mega boards using Atmel AVR microcontrollers with Arduino).

It would be easy for one to assume that a 16MHz eight-bit microcontroller with only 2KB of SRAM would be too resource constrained to really pull the stops out on C++. One would be wrong. Not only can the C++ features that are routinely used in large embedded code bases also be used effectively in Arduino, there are lots of good reasons to do so, and they are all the same reasons, like code reuse and easier integration of disparate code bases.

A good example of my efforts so far is LC100, an Arduino software library that implements the most-used VT100 display and keyboard escape sequences on a small liquid crystal display (LCD), and TinyTerminal, an application that uses LC100 to implement a minuscule VT100-compatible wireless remote terminal.

Here's a side view of the TinyTerminal hardware stack. It includes an Arduino Uno at the bottom, a SparkFun XBee shield with a Digi Series 1 XBee radio in the middle, and on top a DFRobot two-line by sixteen-column LCD shield with joystick buttons to indicate up, down, left, right, and select. It's being powered by a 9-Volt transistor radio battery.

Arduino Uno, XBee Shield, LCD Shield

Here's a top view of the TinyTerminal stack talking to my desktop Mac Mini via a USB-connected SparkFun XBee Explorer. The characters on the LCD display that you see below were generated by me using the Mac screen utility, pointing it to the USB serial port provided by the XBee Explorer, telling it to set the serial rate to 9600, and using the arrow keys on my Mac keyboard to position the cursor on the LCD display as I typed. In turn, I can move the cursor of the screen utility on my Mac around by pressing the joystick buttons on the LCD shield.

TinyTerminal using LC100

The TinyTerminal application uses an object of the LC100 class, which derives from the LC100Base class where most of the implementation is located. The interface to the LCD shield is abstracted out into a pure interface class called Display. This makes it easy to adapt LC100 to other LCD hardware with no source code changes through the use of dependency injection. TinyTerminal implements Display both for the DFRobot shield and for a simple mock object so that you can debug applications without any actual LCD hardware.

The source code to do all this, which is definitely a work in progress, is a little lengthy to cut and paste into this article. But you can download the tarball containing all the source code for both the LC100 library and the TinyTerminal example application from a link on the Amigo web page.

Here's a list of some of the C++ features I used in this project that in my experience are commonly used in much larger embedded projects.

Namespaces. When you start integrating software from a variety of other projects, you start running into name collisions. C++ gets around this by allowing you to partition the global C++ namespace into hierarchically nested subspaces.


namespace com {
namespace diag {
namespace amigo {

...

}
}
}



This is a lot easier than it sounds. Basically it just makes your C++ names longer. All of the LC100 symbols are in the com::diag::amigo namespace, following a standard I've used on other C++ projects of incorporating my company's unique domain name as part of the namespace name.
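As a minimal sketch of how this plays out at the point of use, assume a second library (the hypothetical `vendor` namespace below) that also defines a symbol named COLS. Only the com::diag::amigo nesting comes from the article; `clampColumn`, `columnDifference`, and the `byte` typedef are illustrative stand-ins so the example compiles off-target.

```cpp
#include <cstdint>

namespace com { namespace diag { namespace amigo {

typedef std::uint8_t byte;          // host stand-in for the Arduino byte type
static const byte COLS = 16;

// Hypothetical helper: clamp a column number to the display width.
inline byte clampColumn(int col) {
    if (col < 0) { return 0; }
    if (col >= COLS) { return static_cast<byte>(COLS - 1); }
    return static_cast<byte>(col);
}

} } }

// A hypothetical second library reuses the name COLS without colliding.
namespace vendor {
static const int COLS = 80;
}

// A namespace alias keeps the longer qualified names manageable.
namespace amigo = com::diag::amigo;

inline int columnDifference() {
    return vendor::COLS - amigo::COLS;
}
```

Both COLS symbols coexist because each is qualified by its namespace, and the alias means the caller rarely has to spell out the full three-level name.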

Unique Preprocessor Symbols. You need to worry about name collisions even more with C preprocessor symbols because they are in a global namespace about which C++ knows nothing. That's one of the reasons I minimize the use of preprocessor symbols in TinyTerminal and LC100. But when they are necessary, such as in header file guards, I also include my domain name as part of the preprocessor symbol name, as shown in the example below.

#ifndef _COM_DIAG_AMIGO_LC100_H_
#define _COM_DIAG_AMIGO_LC100_H_

...

#endif /* _COM_DIAG_AMIGO_LC100_H_ */


Static Constants. Typically in C we are used to defining manifest constants, like the dimensions of the LCD display, as preprocessor symbols. That gets more and more problematic the more preprocessor symbols there are, for just the reasons cited above. But when you declare a primitive variable in C++ to be both static and const and provide its value in its declaration, for example

static const byte COLS = 16;


most C++ compilers won't even assign storage to the variable unless you force them to do so by taking the address of it in your code. Instead they treat the variable much like a preprocessor symbol and substitute its numeric value at each use. Unlike preprocessor symbols, static constants still have all the type safety and namespace capabilities of normal C++ variables.
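A small sketch of the idea, assuming a hypothetical `lcd` namespace and a `ROWS` constant alongside the COLS shown above: the typed constants can size an array at compile time, just as a #define would, while remaining scoped and type checked.

```cpp
#include <cstddef>
#include <cstdint>

typedef std::uint8_t byte;          // host stand-in for the Arduino byte type

namespace lcd {                     // hypothetical namespace for illustration

// Typed, scoped constants, usable wherever a compile-time constant is required.
static const byte COLS = 16;
static const byte ROWS = 2;

// The constants size this array at compile time; no preprocessor involved.
static char frame[ROWS][COLS];

// Number of character cells, derived from the array the constants sized.
inline std::size_t cellCount() {
    return sizeof(frame) / sizeof(frame[0][0]);
}

}
```

Whether a given compiler actually elides storage for the constants depends on the toolchain and optimization level, so treat that part as something to confirm in the linker map rather than assume.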

Abstract Classes and Dependency Injection. LC100 is not tied to any specific LCD hardware. It expects the LCD that it uses to implement a dirt simple interface that is defined as an abstract or pure class, that is, a class without any implementation.

struct Display
{
    ...
    virtual void setCursor(byte col, byte row) = 0;
    ...
};


All LCD interfaces, like the one that TinyTerminal implements for the DFRobot LCD shield, derive from class Display. TinyTerminal provides LC100 with an object of this subclass. This makes it easy to write generic software that works with lots of different hardware.
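Here is a rough sketch of that arrangement. Only `Display` and `setCursor` come from the article; `Panel16x2` and `Cursor` are hypothetical names, and the virtual destructor and `byte` typedef are additions so the example stands alone on a host compiler. A real driver would call the LCD library where this one just remembers the coordinates.

```cpp
#include <cstdint>

typedef std::uint8_t byte;          // host stand-in for the Arduino byte type

// Pure interface modeled on the article's Display.
struct Display {
    virtual ~Display() {}
    virtual void setCursor(byte col, byte row) = 0;
};

// Hypothetical driver for a 16x2 panel; coordinates wrap modulo the dimensions.
class Panel16x2 : public Display {
public:
    Panel16x2() : col_(0), row_(0) {}
    virtual void setCursor(byte col, byte row) {
        col_ = static_cast<byte>(col % 16);
        row_ = static_cast<byte>(row % 2);
    }
    byte col() const { return col_; }
    byte row() const { return row_; }
private:
    byte col_;
    byte row_;
};

// Hypothetical client: it is handed its Display by the caller (dependency
// injection) and never names a concrete panel type.
class Cursor {
public:
    explicit Cursor(Display & display) : display_(display) {}
    void home() { display_.setCursor(0, 0); }
    void moveTo(byte col, byte row) { display_.setCursor(col, row); }
private:
    Display & display_;
};
```

Supporting a different panel means writing another subclass of Display and passing it to the same client; the client's source does not change.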

Printable Enumerations. This is a trick long known to C programmers, too, but I see it used all too seldom. States and actions used by LC100 are defined in enumerations. But the actual enumeration values are explicitly defined to be printable ASCII characters.

enum State {
    DATA = 'D',
    ESCAPE = 'E',
    DECIMAL = 'N',
    BRACKET = 'B',
    FIRST = 'F',
    SECOND = 'S',
    QUESTION = 'Q'
};


This allows them to be printed directly in debug output without requiring any mapping at run-time. This is only useful in smallish to middle-sized enumerations, but it can really simplify debugging.
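To illustrate, the sketch below uses a subset of the State values above and a hypothetical helper, `traceTransition`, that formats a state change for debug output. It builds a std::string so that it runs on a host; on Arduino the same cast to char could be passed to Serial.print instead.

```cpp
#include <string>

// Subset of the article's State enumeration.
enum State {
    DATA = 'D',
    ESCAPE = 'E',
    BRACKET = 'B'
};

// Hypothetical helper: the enumeration value is already the character to print,
// so no lookup table or switch is needed to make it readable.
inline std::string traceTransition(State from, State to) {
    std::string line;
    line += static_cast<char>(from);
    line += "->";
    line += static_cast<char>(to);
    return line;
}
```

A trace of a parser run then reads as a compact string of letters, one per state visited.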

Templates and Variable Length Objects. LC100 is a templatized class, which is to say its exact definition is parameterized and determined at compile-time. LC100 uses its template parameters _COLS_ and _ROWS_ to specify the dimensions of the LCD display, then uses these values to size arrays that are part of the object.

template <byte _COLS_, byte _ROWS_> class LC100
: public LC100Base
{
    ...
    byte lineLength[_ROWS_];
    ...
};


This eliminates any need for a heap and dynamic memory allocation. Templates perform code generation at compile-time, which is a very powerful form of code reuse. Because they generate code, they must be used with discipline. But the LC100 template generates no executable code other than its constructor and destructor; its implementation is in the LC100Base super class that is common to all uses of LC100.
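The pattern of a thin template over a non-template base can be sketched as follows. This is not the LC100 source: `GridBase` and `Grid` are hypothetical names, and the example assumes the template hands the base a pointer to storage that the template itself sizes at compile time.

```cpp
#include <cstddef>
#include <cstdint>

typedef std::uint8_t byte;          // host stand-in for the Arduino byte type

// Hypothetical non-template base: all the logic lives here and is compiled once.
class GridBase {
public:
    byte cols() const { return cols_; }
    byte rows() const { return rows_; }
    void put(byte col, byte row, char ch) { cells_[index(col, row)] = ch; }
    char get(byte col, byte row) const { return cells_[index(col, row)]; }
    void clear() {
        for (std::size_t ii = 0; ii < std::size_t(cols_) * rows_; ++ii) {
            cells_[ii] = ' ';
        }
    }
protected:
    GridBase(char * cells, byte cols, byte rows)
    : cells_(cells), cols_(cols), rows_(rows) {}
private:
    // Coordinates wrap modulo the dimensions, so an index is always in range.
    std::size_t index(byte col, byte row) const {
        return std::size_t(row % rows_) * cols_ + std::size_t(col % cols_);
    }
    char * cells_;
    byte cols_;
    byte rows_;
};

// Thin template: contributes only storage sized at compile time, so each
// instantiation adds little beyond its constructor.
template <byte COLS, byte ROWS>
class Grid : public GridBase {
public:
    Grid() : GridBase(storage_, COLS, ROWS) { clear(); }
private:
    char storage_[COLS * ROWS];
};
```

Declaring `Grid<16, 2>` and `Grid<20, 4>` in the same program yields two object sizes but one copy of put, get, and clear, with no heap involved.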

Grammars, Finite State Machines, and Push Down Automata. The large set of VT100 escape sequences can be abstractly described as a formal grammar. If you have ever seen a programming language described in terms of Backus-Naur Form (BNF) then you already have some familiarity with a formal grammar. But it turns out that all sorts of things can be described in terms of a formal grammar. This is very good indeed, because even though a grammar can seem complicated, if it is in the right form then a parser for it can be routinely, even mechanically, generated, and implemented trivially as a finite state machine (FSM). A push down automaton (PDA) is merely a finite state machine with a stack so that a production or rule of the grammar can be called like a subroutine; this allows a much broader class of grammars to be parsed. LC100 uses a PDA to parse incoming escape sequences a character at a time. This eliminates any need for buffering of incoming data, and can be implemented extremely efficiently in C++ using switch statements. Once you become accustomed to using grammars and implementing FSMs and PDAs, you will start to see all sorts of problems as parsing problems.
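To make the character-at-a-time idea concrete, here is a deliberately tiny sketch. It is a plain FSM rather than a PDA (there is no stack), it recognizes only the cursor position sequence ESC [ row ; col H, and `CursorParser` and its members are hypothetical names, not the LC100 implementation. It assumes the usual VT100 convention that the parameters are one-based and that a missing or zero parameter means one.

```cpp
// Hypothetical single-sequence parser driven by a switch on the current state.
class CursorParser {
public:
    enum State { DATA = 'D', ESCAPE = 'E', FIRST = 'F', SECOND = 'S' };

    CursorParser() : state_(DATA), first_(0), second_(0), row_(0), col_(0) {}

    // Consume one character. Returns true if it is ordinary data to display,
    // false if it was absorbed as part of an escape sequence.
    bool write(char ch) {
        switch (state_) {
        case DATA:
            if (ch != ESC) { return true; }
            state_ = ESCAPE;
            break;
        case ESCAPE:
            first_ = 0;
            second_ = 0;
            state_ = (ch == '[') ? FIRST : DATA;
            break;
        case FIRST:
            if (isDigit(ch)) { first_ = first_ * 10 + digit(ch); }
            else if (ch == ';') { state_ = SECOND; }
            else { finish(ch); }
            break;
        case SECOND:
            if (isDigit(ch)) { second_ = second_ * 10 + digit(ch); }
            else { finish(ch); }
            break;
        }
        return false;
    }

    unsigned row() const { return row_; }   // zero-based
    unsigned col() const { return col_; }   // zero-based
    char state() const { return static_cast<char>(state_); }

private:
    static const char ESC = 0x1b;

    static bool isDigit(char ch) { return (ch >= '0') && (ch <= '9'); }
    static unsigned digit(char ch) { return static_cast<unsigned>(ch - '0'); }

    // Final character of a sequence: act on 'H', silently drop anything else.
    void finish(char ch) {
        if (ch == 'H') {
            row_ = (first_ > 0) ? (first_ - 1) : 0;
            col_ = (second_ > 0) ? (second_ - 1) : 0;
        }
        state_ = DATA;
    }

    State state_;
    unsigned first_;
    unsigned second_;
    unsigned row_;
    unsigned col_;
};
```

The only memory the parser needs between characters is its state and the two partially accumulated numbers, which is why no input buffer is required. Extending it toward a PDA would mean adding a small stack of states so that a shared rule, such as "decimal number", could return to whichever state invoked it.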

Mock Objects. TinyTerminal has a compile time option to build a Display interface object not for the DFRobot LCD shield but for a dirt simple mock object that merely traces its calls to the serial monitor.

class Mock
: public com::diag::amigo::Display
{
    ...
};


The LC100 software has no idea this is happening, thanks to the abstract interface and dependency injection. This makes it easy to test on Arduino but without actual LCD hardware, and to determine whether a fault lies with the LCD hardware, the Arduino LiquidCrystal library, the Arduino-side application, or even a host-side application, when wackiness ensues.
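A host-side sketch of the same idea follows. The article's mock traces to the serial monitor; this hypothetical `TraceMock` records the trace in a string instead, so that a test can compare it against the expected call sequence. `drawCorners` is an invented piece of code under test, and the one-method `Display` is a cut-down stand-in for the real interface.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

typedef std::uint8_t byte;          // host stand-in for the Arduino byte type

// Cut-down stand-in for the article's Display interface.
struct Display {
    virtual ~Display() {}
    virtual void setCursor(byte col, byte row) = 0;
};

// Hypothetical mock: does nothing but record each call as a line of text.
class TraceMock : public Display {
public:
    virtual void setCursor(byte col, byte row) {
        std::ostringstream line;
        line << "setCursor(" << int(col) << "," << int(row) << ")\n";
        log_ += line.str();
    }
    const std::string & log() const { return log_; }
private:
    std::string log_;
};

// Hypothetical code under test; it sees only the Display interface.
inline void drawCorners(Display & display, byte cols, byte rows) {
    display.setCursor(0, 0);
    display.setCursor(static_cast<byte>(cols - 1), static_cast<byte>(rows - 1));
}
```

Because the mock is just another Display, the code under test is compiled unchanged, and a mismatch between the recorded trace and the expected one points at the caller rather than at the LCD or its driver.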

Embedded Unit Tests. TinyTerminal has embedded in it functions that perform a variety of unit tests. These include things like scrolling, erasing, and cursor placement.

#if 0
/**
 * Cursor control unit test.
 */
static char TEST[] = {
    lc100.ESC, '[', 'H',
    '0',
    lc100.ESC, '[', '1', ';', '1', '6', 'H',
    '1',
    lc100.ESC, '[', '2', ';', '1', '6', 'H',
    '2',
    lc100.ESC, '[', '2', ';', '1', 'H',
    '3'
};

static void test() {
    for (byte ii = 0; ii < (sizeof(TEST)/sizeof(TEST[0])); ++ii) {
        lc100.write(TEST[ii]);
    }
}
#endif


These unit tests can be selectively enabled at compile-time, and can be run on either the actual LCD hardware or the mock object. Since they are not included in the normal build, they incur zero overhead if they are not used. And like all unit tests, they are a form of documentation about how I expect LC100 to be used. I confess I had the passing notion of porting the entire Google Test platform to Arduino. But this simpler approach, which I've used in several production embedded systems, is a huge win all by itself.

Doxygen. Doxygen is a powerful open source tool that scans your source code for comments of a very particular format and then generates documentation based on those comments.

/**
 * Place the cursor at the specified position. The column and row
 * coordinates are taken modulo the actual display dimensions.
 * @param col is the zero-based column number.
 * @param row is the zero-based row number.
 */
void setCursor(byte col, byte row);


Doxygen output can be web pages, PDF manuals, etc. Files, classes, functions, and macros can all be documented using Doxygen. I've used Doxygen obsessively for years. But even if I never installed the Doxygen software, I'd still use Doxygen-format comments. They encourage a specific discipline for documenting source code that has value all by itself. Both LC100 and TinyTerminal are commented using Doxygen-style comments.

I predict that you are now asking yourself how expensive it is to use all of these fancy features of C++ in your Arduino code. So here it is: the entire TinyTerminal application, including the LC100 library, the LiquidCrystal library that controls the DFRobot LCD display, and all of the other Arduino and C++ run-time, takes 8764 bytes of flash and 128 bytes of SRAM for the BSS segment. Almost all of the overhead in using these features is at compile-time. And they all can make your life much, much easier.
