This exercise introduces the ARM Cortex and FP coprocessor assembly languages, their instruction
sets and their addressing modes. The ARM calling convention must be respected so that the
assembly code can be called from the C programming language. The lab and a prior tutorial will
introduce you to the STM32CubeIDE, including the compiler and associated tools. In the second
part of the exercise, the code developed here will be used in a larger program written in C and the
Cortex Microcontroller Software Interface Standard (CMSIS-DSP) application programming
interface (API), which incorporates a large set of routines optimized for different ARM Cortex
processors.
Hence, this lab consists of two components, each requiring a week to complete:
Part 1: Assembly language exercise – Kalman filter in one dimension
Part 2: Combining assembly/embedded C and optimizing performance; CMSIS-DSP
Background – ARM Calling Convention
In assembly and C, parameters for a subroutine are passed via the stack or in registers. On ARM
processors, registers R0 to R3 are used for passing integer or pointer arguments. Up to four
parameters are placed in these registers, and the result is returned in R0 (and also R1 when it is
wider than 32 bits). If any parameter requires more than 32 bits, then multiple registers are used. If
there are no free scratch registers, or the parameter requires more registers than remain, then the
parameter is pushed onto the stack.
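The rules above can be illustrated with a few C functions. This is only a sketch: the function names are hypothetical, and the register assignments in the comments describe what the procedure call standard prescribes on a 32-bit ARM target, not something the C code itself enforces.

```c
#include <stdint.h>

/* Four 32-bit arguments: a -> R0, b -> R1, c -> R2, d -> R3.
 * The 32-bit result is returned in R0. */
int32_t sum4(int32_t a, int32_t b, int32_t c, int32_t d)
{
    return a + b + c + d;
}

/* A 64-bit argument occupies an even/odd register pair:
 * base -> R0, R1 is skipped for alignment, offset -> R2:R3.
 * The 64-bit result is returned in R0:R1. */
int64_t add_offset(int32_t base, int64_t offset)
{
    return (int64_t)base + offset;
}

/* R0-R3 hold a..d; the fifth argument e no longer fits
 * and is passed on the stack. */
int32_t sum5(int32_t a, int32_t b, int32_t c, int32_t d, int32_t e)
{
    return a + b + c + d + e;
}
```

An assembly routine that replaces any of these functions would read its arguments from the same registers (or from the stack for `e`) and leave its result in R0, or R0:R1 for the 64-bit case.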
Since we will also be dealing with floating-point parameters on hardware that performs
floating-point arithmetic, be aware that you can use either software or hardware floating-point
linkage, which determines whether the parameters are passed via general-purpose or
floating-point registers. The objective here is to use the hardware linkage, hence the floating-point
registers will be used for parameter and result passing.
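As a hedged illustration of the difference between the two linkages, consider the function below. The name `scale_sum` and its parameters are hypothetical; the register assignments in the comment follow the procedure call standard and assume a toolchain configured for hardware linkage (for GCC, the `-mfloat-abi=hard` option).

```c
/* Hardware floating-point linkage:
 *   data -> R0, n -> R1        (general-purpose registers)
 *   gain -> S0, offset -> S1   (floating-point registers)
 *   result -> S0
 * Software floating-point linkage, for comparison:
 *   data -> R0, n -> R1, gain -> R2, offset -> R3, result -> R0
 */
float scale_sum(const float *data, unsigned int n, float gain, float offset)
{
    float acc = 0.0f;
    for (unsigned int i = 0; i < n; i++) {
        acc += gain * data[i] + offset; /* accumulate scaled samples */
    }
    return acc;
}
```

Note that with hardware linkage the integer and floating-point arguments are allocated independently, so `gain` lands in S0 even though it is the third parameter.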
In addition to the class notes, please refer to the document “Procedure Call Standard for the ARM
Architecture”, especially its sections describing the Base Procedure Call Standard. Other
documents of importance include the Cortex-M4 programming manual and the quick reference
cards for the ARM ISA and the (vector) floating-point instructions, all available within the course
online documentation. The parameter-passing order defined by this standard is applied by the
major compilers.
Using the STM32CubeIDE Integrated Development Environment Tool
To prepare for Lab 1, you will need to go through Tutorial 1, where you will learn how to create and
define projects, including assembly code projects. The tutorial shows you how to let the tool insert
the proper startup code for the given processor and how to write and compile the code, and it
covers the basics of program debugging.
Lab 1: Definition
You will develop working assembly language code for a single-variable Kalman filter that can be
used in later exercises. The single-variable version avoids the matrix operations required for
larger Kalman filters, which makes it amenable to an assembly code implementation, while it still
allows experimenting with and appreciating the features of this filter.
The Kalman filter is a state-based adaptive estimator of a physical process. Its estimation error is
provably minimal for linear systems with Gaussian noise. It is a type of adaptive filter, which
is generally preferred to fixed linear filters. The state-space adaptation is performed as a
sequence of discrete steps, during which the parameters of the filter change depending on the
observed physical value, as well as the current state.
The Kalman filter performs the adaptation by maintaining the internal state, consisting of the estimated
value x, the adaptive tuning factor k and the estimation error, represented by its covariance p. To
obtain these values, it requires knowledge of the noise parameters of the state estimation (process
noise) and of the input measurements, represented by their respective covariances q and r.
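One adaptation step can be sketched in C as follows. This is a minimal sketch of the textbook scalar Kalman update, not necessarily the exact code of the lab's reference program. It assumes the common convention in which q is the process noise covariance and r is the measurement noise covariance; the names `kalman_state` and `kalman_update_ref` are hypothetical.

```c
/* Filter state; field names follow the symbols used in the text. */
typedef struct {
    float q; /* process noise covariance (assumed convention) */
    float r; /* measurement noise covariance (assumed convention) */
    float x; /* estimated value */
    float p; /* estimation error covariance */
    float k; /* adaptive tuning factor (Kalman gain) */
} kalman_state;

/* One discrete adaptation step; returns the new estimate. */
float kalman_update_ref(kalman_state *s, float measurement)
{
    s->p = s->p + s->q;                         /* predict: error grows by q */
    s->k = s->p / (s->p + s->r);                /* gain from predicted error and r */
    s->x = s->x + s->k * (measurement - s->x);  /* move estimate toward measurement */
    s->p = (1.0f - s->k) * s->p;                /* error shrinks after correction */
    return s->x;
}
```

For example, starting from q = 0, r = 1, x = 0, p = 1 and feeding a measurement of 2 gives k = 0.5, x = 1 and p = 0.5. The sketch does not guard against p + r being zero, which would cause a division by zero.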
The high-level description of the Kalman filter code is given in the working Python program in
Figure 1. While the code is fully functional and can be run directly within a larger (Python)
program, it is used here as a compact high-level specification. Please note that only the update
function is required in the assembly part of Lab 1. In the second part, when you include your
assembly code with the C code, the initialization function will be needed. That part will be written
in C. Note also that there will be differences in the code caused by the different syntax and
semantics of C compared to Python. For instance, you will need to carefully specify the data types
and include the function prototypes in the code to be able to correctly link the assembly and C code.
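As a rough sketch of what explicit data types and prototypes might look like on the C side, consider the declarations below. All names here (`kalman_state`, `kalman_init`, `kalman_update_asm`), the field order and the parameter lists are assumptions made for illustration; the actual interface is defined by the lab specification.

```c
/* Explicit single-precision fields. On the target each float is 4 bytes,
 * so the fields sit at byte offsets 0, 4, 8, 12 and 16, which is what
 * the assembly code would use to access them. */
typedef struct {
    float q; /* noise covariance q */
    float r; /* noise covariance r */
    float x; /* estimated value */
    float p; /* estimation error covariance */
    float k; /* adaptive tuning factor */
} kalman_state;

/* Hypothetical prototype of the routine implemented in assembly.
 * With hardware floating-point linkage: state -> R0,
 * measurement -> S0, result -> S0. */
extern float kalman_update_asm(kalman_state *state, float measurement);

/* Initialization written in C (hypothetical parameter list). */
void kalman_init(kalman_state *state, float q, float r, float p,
                 float initial_value)
{
    state->q = q;
    state->r = r;
    state->p = p;
    state->x = initial_value;
    state->k = 0.0f;
}
```

Without the prototype, the C compiler would have to guess the argument and return types of the assembly routine, and the values could end up in registers other than those the assembly code reads.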