Firmware
Low-level firmware design, bootloaders, and embedded software architecture.
Firmware is the software that runs directly on embedded hardware — closer to the silicon than an operating system, yet responsible for everything the product does. Firmware design determines how reliably the product operates, how safely it updates in the field, and how easily it is maintained and extended as the product evolves.
What Is Firmware?
Firmware is the low-level software that controls embedded hardware. Unlike application software running on an OS, firmware typically:
- Runs directly on the bare metal of a microcontroller — no virtual memory, no process isolation, no scheduler unless one is explicitly included (as an RTOS).
- Has direct hardware access — reads and writes peripheral registers, configures clocks, manages DMA transfers, and handles interrupts with deterministic timing.
- Operates within constrained resources — RAM measured in kilobytes, flash in kilobytes or low megabytes, CPU cycles shared between application logic and interrupt servicing.
- Must handle hardware faults safely — a firmware bug that causes a hard fault or watchdog reset in an industrial controller or medical device is a safety issue, not just a software bug.
This subtopic focuses on firmware architecture, bootloader design, and embedded software patterns. For microcontroller selection and peripheral interfacing, see the Embedded Systems topic.
Key Firmware Concepts
- Bootloader — a small program that runs before the main application firmware. Responsibilities: hardware initialisation, image validation, field update (DFU), and jumping to the application. See What Is a Bootloader in Embedded Systems?
- DFU (Device Firmware Update) — the mechanism by which deployed products receive new firmware without being returned for reprogramming. Can be implemented over USB, UART, BLE, Wi-Fi, or any available interface.
- RTOS (Real-Time Operating System) — a lightweight OS providing tasks, scheduling, semaphores, queues, and timers. FreeRTOS is one of the most widely deployed open-source RTOSes in embedded systems. See What Is an RTOS?
- Bare-metal — firmware that runs without an RTOS, relying on a main loop and interrupts for all scheduling. Simpler and more deterministic for low-complexity applications.
- ISR (Interrupt Service Routine) — the handler function that runs when an interrupt occurs. Must be short, non-blocking, and side-effect aware; long ISRs degrade real-time performance.
- Watchdog timer — a hardware timer that resets the MCU unless the firmware periodically reloads it ("kicks" the watchdog), providing a recovery mechanism for firmware hangs.
- Stack overflow — a firmware fault that occurs when a task or interrupt uses more stack than allocated. In bare-metal systems, stack overflow typically corrupts adjacent memory with no immediate indication; in RTOS systems, stack canary values can detect overflow.
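The stack overflow entry above mentions canary values. A closely related technique is stack painting: fill the stack region with a known pattern at start-up, then scan it later to estimate peak usage and to check whether the lowest word has been overwritten. The sketch below is a host-runnable illustration only. The names `STACK_FILL`, `stack_paint`, `stack_unused_words`, and `stack_canary_intact` are hypothetical, and the region is passed in as a plain array, whereas on a real MCU its bounds would normally come from linker symbols.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u  /* hypothetical fill pattern */

/* Fill a stack region with the known pattern (done once, before use). */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Count untouched words starting at the lowest address.
 * Assumes a descending stack (as on Arm Cortex-M): the stack grows from
 * the high end of the region toward base, so words near base are used last.
 * The result is an estimate: stack data that happens to equal the fill
 * pattern is counted as unused. */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t unused = 0;
    while (unused < words && base[unused] == STACK_FILL) {
        unused++;
    }
    return unused;
}

/* The lowest word doubles as a canary: if it no longer holds the
 * pattern, the stack has reached or passed the end of its region. */
int stack_canary_intact(const uint32_t *base)
{
    return base[0] == STACK_FILL;
}
```

Checking the canary periodically (for example from the main loop or an idle hook) turns a silent corruption into a detectable fault, and the unused-word count gives a high-water mark to size the stack against during development.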
Common Tools and Software
- IDEs — STM32CubeIDE (STM32), VS Code with Cortex-Debug extension (multi-platform, widely used), IAR EWARM (commercial, common in safety-critical and industrial firmware).
- Debug probes — J-Link (SEGGER, industry standard, excellent RTT real-time logging), ST-Link V3 (STM32), and CMSIS-DAP compatible probes (open-source, inexpensive). OpenOCD is an open-source debug server rather than a probe; it drives many of these probes from the host.
- Firmware update tools — dfu-util (USB DFU, cross-platform), STM32CubeProgrammer (STM32 flashing and bootloader interaction), nrfutil (Nordic OTA and bootloader packaging), esptool.py (ESP32 flashing over UART).
- RTOS analysis — SEGGER SystemView (real-time task timing and scheduler trace, free with J-Link), FreeRTOS+Trace (Percepio Tracealyzer, commercial), and the built-in thread-aware debugging in STM32CubeIDE and VS Code with Cortex-Debug.
Common Mistakes
- Overwriting the running firmware image in place during a field update — if power is lost mid-write, the device boots into a corrupt image and cannot recover. Always write to an inactive flash bank (A/B partition scheme) and switch the boot bank only after verifying the new image.
- Exposing raw flash write access without integrity checking — a DFU handler that writes any data it receives to flash, without CRC or signature verification, will accept a corrupt or malicious image and brick or compromise the device. Verify every image before marking it bootable.
- ISRs that are too long — an ISR preempts every task and all lower-priority interrupts, so a long ISR adds latency across the entire system. Capture data and post a flag or queue event in the ISR; do all processing in task context.
- Not enabling the watchdog from the start of development — adding a watchdog late in development can reveal firmware hangs that were always present but undetected. Design the watchdog service into the firmware architecture from the beginning.
- Stack overflow with no detection mechanism — bare-metal firmware typically has no stack guard; an overflow corrupts adjacent memory silently. In RTOS systems, enable stack canary checking from the start and monitor stack high-water marks during development.
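One way to design the watchdog service into the architecture, as the watchdog item above recommends, is a check-in scheme: each task reports that it is alive, and a supervisor reloads the hardware watchdog only when every task has reported. The following is a minimal host-runnable sketch under stated assumptions. All names (`WDG_TASK_SENSOR`, `wdg_checkin`, `wdg_service`, and so on) are hypothetical, and `hw_watchdog_kick` is a stand-in that counts calls where real firmware would write the watchdog reload register.

```c
#include <stdint.h>

#define WDG_TASK_SENSOR  (1u << 0)   /* hypothetical task identifiers */
#define WDG_TASK_COMMS   (1u << 1)
#define WDG_ALL_TASKS    (WDG_TASK_SENSOR | WDG_TASK_COMMS)

volatile uint32_t wdg_checkins;   /* bits set by tasks since the last kick */
unsigned wdg_kick_count;          /* host-side stand-in for the hardware */

/* Stand-in for the real watchdog reload (a register write on an MCU). */
static void hw_watchdog_kick(void)
{
    wdg_kick_count++;
}

/* Called by each task once per healthy loop iteration.
 * Note: this read-modify-write is not atomic; with preemption it would
 * need a critical section or an atomic OR. */
void wdg_checkin(uint32_t task_bit)
{
    wdg_checkins |= task_bit;
}

/* Called periodically by a supervisor. Returns 1 if the watchdog was kicked. */
int wdg_service(void)
{
    if ((wdg_checkins & WDG_ALL_TASKS) != WDG_ALL_TASKS) {
        return 0;   /* a task is late: let the hardware timer run down */
    }
    wdg_checkins = 0;
    hw_watchdog_kick();
    return 1;
}
```

The design choice here is that a single hung task is enough to withhold the kick, so the watchdog covers the whole system rather than only the loop that happens to reload it.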
Common Questions
What should a bootloader do?
At minimum: initialise essential hardware (clock, power), verify the application image (CRC or cryptographic signature), and jump to the application if it is valid. For field-updatable products, the bootloader also needs a DFU entry condition (dedicated boot pin, magic value in retained RAM, or timeout with no communication) and a DFU protocol handler (USB DFU class, UART XMODEM, BLE OTA, etc.). The handler writes the new image to flash, verifies it, and only then boots it. Never expose raw flash write access without integrity checking: a corrupt firmware image can permanently brick a deployed product.
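The image verification step described above can be sketched with a CRC-32 check against a small image header. This is an illustration, not a specific vendor's format: the header layout (`image_header_t`), the `IMAGE_MAGIC` value, and the function names are assumptions. The CRC itself is the common reflected CRC-32 (polynomial 0xEDB88320), computed bit by bit for clarity rather than speed.

```c
#include <stddef.h>
#include <stdint.h>

#define IMAGE_MAGIC 0x46574D47u  /* hypothetical marker value */

/* Hypothetical header stored alongside the application image. */
typedef struct {
    uint32_t magic;    /* identifies a well-formed header */
    uint32_t length;   /* payload length in bytes */
    uint32_t crc32;    /* CRC-32 of the payload */
} image_header_t;

/* Reflected CRC-32, polynomial 0xEDB88320, init and final XOR 0xFFFFFFFF. */
uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0xEDB88320u;
            } else {
                crc >>= 1;
            }
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Returns 1 only if the header is sane and the payload CRC matches. */
int image_is_valid(const image_header_t *hdr, const uint8_t *payload,
                   size_t slot_size)
{
    if (hdr->magic != IMAGE_MAGIC) {
        return 0;
    }
    if (hdr->length == 0 || hdr->length > slot_size) {
        return 0;   /* reject lengths that would read outside the slot */
    }
    return crc32_compute(payload, hdr->length) == hdr->crc32;
}
```

A CRC only detects accidental corruption. Rejecting a deliberately modified image requires a cryptographic signature check in place of, or in addition to, the CRC comparison.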
How do I implement a safe firmware update mechanism?
Use an A/B (dual-bank) flash partition scheme in which the new image is written to the inactive bank, verified completely, and only made active on the next boot. Never overwrite the running image in place: if power is lost mid-write, the device is left with a corrupt image and is bricked. Validate the image with a CRC at minimum; for security-critical products, use a cryptographic signature that the bootloader verifies before booting (e.g. ST's SBSFU on STM32, or MCUboot on Nordic devices). Maintain a permanent fallback (recovery) bootloader in protected flash that the application cannot overwrite.
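The bank-selection logic behind an A/B scheme can be sketched as follows. This is a simplified model, not the behaviour of any particular bootloader: the `slot_state_t` fields, the `MAX_TRIAL_BOOTS` limit, and the trial-boot-then-confirm rollback rule are assumptions added for illustration. In real firmware this state would live in flash so that it survives resets.

```c
#include <stdint.h>

typedef enum { SLOT_NONE = -1, SLOT_A = 0, SLOT_B = 1 } slot_id_t;

/* Hypothetical per-slot metadata, persisted in flash on a real device. */
typedef struct {
    int      verified;       /* image passed its CRC/signature check */
    int      confirmed;      /* application booted and marked itself good */
    uint8_t  boot_attempts;  /* trial boots used while unconfirmed */
    uint32_t version;        /* higher value means newer image */
} slot_state_t;

#define MAX_TRIAL_BOOTS 3u   /* assumed limit before rolling back */

static int slot_bootable(const slot_state_t *s)
{
    if (!s->verified) {
        return 0;
    }
    return s->confirmed || s->boot_attempts < MAX_TRIAL_BOOTS;
}

/* Pick the newest bootable slot; count a trial boot if it is unconfirmed. */
slot_id_t select_boot_slot(slot_state_t slots[2])
{
    int a = slot_bootable(&slots[SLOT_A]);
    int b = slot_bootable(&slots[SLOT_B]);
    slot_id_t pick;

    if (a && b) {
        pick = (slots[SLOT_B].version > slots[SLOT_A].version) ? SLOT_B : SLOT_A;
    } else if (a) {
        pick = SLOT_A;
    } else if (b) {
        pick = SLOT_B;
    } else {
        return SLOT_NONE;   /* nothing bootable: stay in recovery/DFU mode */
    }

    if (!slots[pick].confirmed) {
        slots[pick].boot_attempts++;
    }
    return pick;
}
```

With this rule, a new image that never confirms itself (for example because it crashes on start-up) is abandoned after the trial limit and the bootloader falls back to the previous confirmed image.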
What is the difference between a task and an interrupt in RTOS firmware?
An interrupt is a hardware event that preempts the current execution context and runs its ISR, provided the interrupt is enabled and has a higher priority than whatever is currently executing. ISRs must be short and non-blocking. An RTOS task is a software thread with its own stack; the scheduler switches between tasks based on priority and blocking state. The typical pattern: the ISR captures data or sets a flag (often by posting to an RTOS queue or giving a semaphore through the ISR-safe API variants), then returns; the waiting task wakes, processes the data, and performs any lengthy operations.
Zeus Design designs and implements embedded firmware for STM32, ESP32, nRF, and custom MCU platforms.
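The ISR-to-task handoff described in the answer above can be illustrated without an RTOS using a single-producer, single-consumer ring buffer: the ISR only stores a byte and returns, and the task drains the buffer later. This is a minimal host-runnable sketch. The names (`ring_buffer_t`, `rb_push_from_isr`, `rb_pop`) and the buffer size are hypothetical, and it assumes a single-core MCU with exactly one producer and one consumer. In FreeRTOS firmware, `xQueueSendFromISR()` and `xQueueReceive()` would typically take this role.

```c
#include <stdint.h>

#define RB_SIZE 8u  /* must be a power of two for the index mask below */

typedef struct {
    volatile uint32_t head;   /* written only by the ISR (producer) */
    volatile uint32_t tail;   /* written only by the task (consumer) */
    uint8_t data[RB_SIZE];
} ring_buffer_t;

/* ISR side: store one byte and return immediately.
 * Returns 0 (dropping the byte) if the buffer is full. */
int rb_push_from_isr(ring_buffer_t *rb, uint8_t byte)
{
    uint32_t head = rb->head;
    if (head - rb->tail == RB_SIZE) {
        return 0;
    }
    rb->data[head & (RB_SIZE - 1u)] = byte;
    rb->head = head + 1u;   /* publish only after the data is written */
    return 1;
}

/* Task side: fetch one byte if available. Returns 0 if the buffer is empty. */
int rb_pop(ring_buffer_t *rb, uint8_t *out)
{
    uint32_t tail = rb->tail;
    if (tail == rb->head) {
        return 0;
    }
    *out = rb->data[tail & (RB_SIZE - 1u)];
    rb->tail = tail + 1u;
    return 1;
}
```

Because each index has a single writer, neither side needs to disable interrupts under the stated assumptions. On multi-core parts or with multiple producers, memory barriers or a proper queue primitive would be needed instead.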
Knowledge Base
Bootloaders and Updates
- What Is a Bootloader in Embedded Systems? — bootloader responsibilities, application jump sequence, and field update implementation
- How Do You Use the STM32 DFU Bootloader to Flash Firmware? — STM32's built-in USB DFU bootloader: BOOT0 configuration, dfu-util, and the jump sequence
- How Does an OTA Firmware Update Work? — A/B partition scheme, MCUboot swap mechanism, ESP32 OTA API, image signing, and power-loss-safe rollback design
- How Do You Use the nRF52840 USB Port and Update Firmware Over DFU? — nRF52840 native USB setup, CDC ACM serial, MCUboot dual-slot partitioning, imgtool image signing, and DFU trigger via boot_request_upgrade()
Memory and Build System
- What Is a Linker Script and What Does It Do? — MEMORY regions, SECTIONS, LMA/VMA, and the startup symbols that initialise RAM before main() runs
- How Does the Memory Map Work in an Embedded Microcontroller? — Cortex-M address regions, STM32 flash/RAM/CCM addresses, memory-mapped peripheral registers, and why volatile is required
- What Does Embedded Startup Code Do Before main()? — reset handler, SystemInit(), .data copy, .bss zero-init, __libc_init_array, and what breaks when any step is missing
Reliability and Fault Recovery
- What Is a Watchdog Timer and How Do You Use It? — IWDG vs WWDG on STM32, prescaler calculation, kick strategy in bare-metal and RTOS firmware, and reading the reset cause register
RTOS and Architecture
- What Is an RTOS? — tasks, priorities, schedulers, semaphores, queues, and when an RTOS is the right choice
- Bare-Metal vs RTOS: Which Should You Use for Your Firmware? — the complete decision framework for architecture selection
- How Do You Create and Schedule Tasks in FreeRTOS? — xTaskCreate() parameters, priority assignment, stack sizing and overflow detection, task states, and common mistakes on ARM Cortex-M
- How Do FreeRTOS Queues, Semaphores, and Mutexes Work? — queues, binary and counting semaphores, mutexes with priority inheritance, task notifications, event groups, and ISR-safe variants
- What Are Interrupts in Embedded Systems? — interrupt vectors, ISR constraints, NVIC configuration, and interrupt-safe programming patterns
Forum Discussions
- FreeRTOS high-priority task blocks indefinitely on semaphore — priority inheritance not kicking in? — xSemaphoreCreateBinary() vs xSemaphoreCreateMutex(), why binary semaphores cannot do priority inheritance, vTaskList() for diagnosing task starvation, and tight CPU-loop design smell