Real-Time Control on Raspberry Pi: PREEMPT-RT and the Co-Processor Pattern
Last updated 29 June 2026 · 10 min read
Direct Answer
A standard Raspberry Pi running Linux cannot deliver hard real-time control: scheduler jitter is typically a few microseconds to tens of microseconds, but the worst case is not bounded and can exceed 1 ms under load. The PREEMPT-RT kernel patch reduces worst-case latency to typically 50–100 µs on the Pi 4/CM4, adequate for soft real-time tasks with millisecond-level tolerances. For hard real-time requirements below 1 ms (motor control, encoder counting, or precise pulse timing), add a co-processor MCU (RP2040 or STM32) that handles timing-critical I/O deterministically while the Pi manages high-level logic, connectivity, and user interface.
Detailed Explanation
The Raspberry Pi runs a general-purpose Linux kernel, the same class of kernel that runs on servers, desktops, and smartphones. This gives the Pi access to a rich software ecosystem, including the full range of Python packages, but it also means the kernel scheduler can preempt your application code at any time and for a duration that is not bounded. For most Pi applications this is fine. For timing-critical hardware I/O, it is the central problem to design around.
Why Linux on the Pi Is Not Real-Time
A standard Linux kernel is designed for throughput and fairness, not timing determinism. Your application can be preempted to service:
- Kernel interrupt handlers (network RX, USB transfers, storage I/O)
- Kernel workqueues and deferred work items (software interrupts, tasklets)
- Memory management activity (page reclaim, cache writeback)
- Other processes competing for CPU time
On a standard Raspberry Pi OS kernel, measured worst-case GPIO toggle jitter from a userspace application is typically in the range of 100 µs to several milliseconds under normal system load, and can spike further under heavy USB, network, or SD card I/O. A control loop that must execute at a precise 1 kHz interval, or a step pulse generator that must produce evenly spaced edges, will therefore show unpredictable timing errors on a standard Linux kernel.
This is a fundamental property of a general-purpose OS design, not a hardware limitation of the BCM2711 SoC.
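The effect is easy to observe from userspace. The following is a minimal sketch, not a calibrated benchmark: it asks for a periodic wake-up and records how late each one arrives. The function name and default values are illustrative assumptions, and the Python interpreter adds its own overhead, so a dedicated tool such as cyclictest gives more trustworthy numbers.

```python
import time


def measure_wakeup_lateness(period_s=0.001, iterations=1000):
    """Return how late (in microseconds) each periodic wake-up was."""
    lateness_us = []
    next_deadline = time.monotonic() + period_s
    for _ in range(iterations):
        remaining = next_deadline - time.monotonic()
        if remaining > 0:
            # The scheduler, not this code, decides when execution resumes.
            time.sleep(remaining)
        lateness_us.append((time.monotonic() - next_deadline) * 1e6)
        # Absolute deadlines keep one late wake-up from shifting all later ones.
        next_deadline += period_s
    return lateness_us
```

Calling `measure_wakeup_lateness()` while the system is idle and again during heavy SD card or network activity, then comparing `max()` of the two result lists, shows how far the worst case moves under load.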
PREEMPT-RT: What It Does and When It Helps
The PREEMPT-RT patch converts non-preemptible regions of the Linux kernel into preemptible sections:
- Spinlocks become sleeping locks (mutexes): a thread holding one can be preempted by a higher-priority thread instead of keeping the CPU until the lock is released
- Hard IRQ handlers become threaded interrupt handlers: they run as schedulable kernel threads, so a higher-priority real-time thread can preempt them
- Preemption latency is reduced and bounded: worst-case latency on Pi 4/CM4 drops to typically 50–100 µs under normal load
Raspberry Pi OS (Bookworm and later) ships a PREEMPT-RT kernel installable via:
sudo apt install linux-image-rt-arm64
No kernel compilation is required on current Raspberry Pi OS.
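After installing and rebooting, it is worth confirming that the running kernel really is a PREEMPT-RT build. The sketch below checks the kernel version string for the `PREEMPT_RT` marker (older patch sets printed `PREEMPT RT`). It also reads `/sys/kernel/realtime`, which RT-patched kernels have traditionally exposed; treat the presence of that file on your kernel as an assumption. The function names are illustrative.

```python
import os
from pathlib import Path


def version_indicates_rt(version: str) -> bool:
    """True if a kernel version string (as in `uname -v`) names an RT build."""
    return "PREEMPT_RT" in version or "PREEMPT RT" in version


def running_rt_kernel() -> bool:
    """Best-effort check that the running kernel is a PREEMPT-RT build."""
    flag = Path("/sys/kernel/realtime")  # assumed to exist only on RT kernels
    try:
        if flag.read_text().strip() == "1":
            return True
    except OSError:
        pass  # file missing or unreadable: fall back to the version string
    return version_indicates_rt(os.uname().version)
```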
When PREEMPT-RT is sufficient:
- Control loops running at 200 Hz or slower (5 ms or longer cycle times)
- Soft real-time applications where occasional 100 µs latency spikes are acceptable
- Slow actuator control — valve actuation, relay switching, slow stepper movement at a few hundred steps per second
- Sensor polling and data aggregation at millisecond intervals
When PREEMPT-RT is not sufficient:
- Step/direction pulse generation above approximately 5–10 kHz (step-to-step jitter must be well under 100 µs)
- Quadrature encoder counting at high shaft speeds where missed edges cause position error
- Hard real-time motor current control loops at 10–100 kHz
- Safety-critical interlock timing with guaranteed sub-millisecond response
- Bit-banged protocols where timing tolerances are tighter than 50 µs
For these applications, a co-processor MCU is required.
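The two lists above can be condensed into a rough placement rule. The sketch below encodes one possible reading of them: the 100 µs and 1 ms thresholds come from the figures in this article, while the function, the enum, and the exact comparison policy are illustrative assumptions rather than a validated design rule.

```python
from enum import Enum


class Placement(Enum):
    PREEMPT_RT = "Linux with PREEMPT-RT"
    CO_PROCESSOR = "co-processor MCU"


PREEMPT_RT_WORST_CASE_US = 100.0  # upper end of the 50-100 us figure above
CO_PROCESSOR_DEADLINE_US = 1000.0  # requirements below 1 ms go to the MCU


def place_task(period_us, jitter_tolerance_us, safety_critical=False):
    """Suggest where a periodic task should run, given its timing needs."""
    if safety_critical:
        return Placement.CO_PROCESSOR
    if jitter_tolerance_us <= PREEMPT_RT_WORST_CASE_US:
        # The kernel's own worst-case latency would consume the whole budget.
        return Placement.CO_PROCESSOR
    if period_us < CO_PROCESSOR_DEADLINE_US:
        return Placement.CO_PROCESSOR
    return Placement.PREEMPT_RT
```

For example, `place_task(period_us=5000, jitter_tolerance_us=1000)` describes a 200 Hz loop that tolerates 1 ms of jitter and returns `Placement.PREEMPT_RT`, whereas a 10 kHz step generator (`period_us=100`) lands on the co-processor.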
The Co-Processor Pattern
A co-processor MCU runs bare-metal or RTOS firmware directly on hardware, responding to hardware events within tens of nanoseconds to a few microseconds, with a determinism that Linux userspace cannot match. The division of responsibility in a Pi + co-processor design is:
Raspberry Pi (typically CM4) handles:
- High-level application logic and state management
- Cloud connectivity, REST APIs, and MQTT
- Web server, database, and local UI
- Configuration management and OTA firmware distribution
- Data logging and non-time-critical processing
Co-processor MCU handles:
- Deterministic hardware I/O — PWM generation, timer-driven ADC, encoder counting
- Protocol-specific state machines — Modbus RTU, custom serial framing
- Safety-critical interlocks with guaranteed response time
- Relay and actuator control with hardware watchdog protection
The two communicate via a protocol bridge over UART, SPI, or USB.
Choosing a Co-Processor: RP2040 vs STM32
| Factor | RP2040 | STM32 (G0/G4 family) |
|---|---|---|
| CPU | Dual Cortex-M0+ at 133 MHz (typically) | Cortex-M0+ to M4, 32–170 MHz |
| PIO state machines | 8 independent state machines — custom GPIO protocols at up to 125 Msamples/s | Not available; GPIO sequencing uses timers and DMA |
| ADC | 12-bit, 4 channels, 500 kSa/s; no differential input | 12-bit, multiple ADCs with simultaneous sampling; differential input on G4/H7 |
| CAN bus | Not natively supported | Native CAN/CAN FD on STM32G0/G4 |
| USB | USB 1.1 dual-mode (host and device) natively | USB device on most families; host on selected parts |
| BOM cost | Typically under AUD $2 per unit | STM32G0 series typically AUD $1–3 per unit |
| Toolchain | pico-sdk (C/C++), Arduino (Mbed), MicroPython | STM32CubeIDE + CubeMX, bare-metal CMSIS, FreeRTOS |
| Best for | Flexible GPIO, custom protocol via PIO, USB, cost-sensitive designs | Precision ADC, CAN bus, industrial applications, complex timer configurations |
Rule of thumb: Use RP2040 for GPIO-intensive tasks, custom protocol implementations via PIO, and cost-sensitive designs. Use STM32 for applications requiring precision analog, CAN bus, or a feature-rich HAL with well-supported industrial peripheral sets.
Designing the Pi ↔ Co-Processor Interface
UART (most common and recommended)
UART is the simplest and most reliable interface for most Pi + co-processor designs. Use the Pi's primary hardware UART (/dev/ttyAMA0 on CM4 and Pi 4) rather than the mini-UART. Enable it by adding dtoverlay=disable-bt to /boot/firmware/config.txt — this reassigns the hardware UART from the Bluetooth controller to GPIO14/GPIO15 (40-pin header pins 8 and 10).
Design considerations:
- Baud rate: 115200 for low-throughput command/response; 460800–921600 for tighter latency requirements. At 921600 baud, a 10-byte frame takes approximately 108 µs.
- Framing: Use length-prefixed or delimiter-framed packets rather than raw ASCII; add a CRC-8 or CRC-16 for error detection at the application layer.
- Protocol pattern: Pi sends a command frame; co-processor executes and sends a response within a fixed timeout; Pi retries on timeout. Keep the command set small and stateless where possible.
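The command/response pattern in the last bullet can be sketched on the Pi side as follows. The `send` and `receive` callables are hypothetical stand-ins for whatever serial layer is in use (`receive` is assumed to be non-blocking and to return `None` when no complete response has arrived); the timeout and retry defaults are placeholders, not recommendations.

```python
import time
from typing import Callable, Optional


def send_with_retry(
    send: Callable[[bytes], None],
    receive: Callable[[], Optional[bytes]],
    frame: bytes,
    timeout_s: float = 0.05,
    retries: int = 3,
    poll_interval_s: float = 0.001,
) -> bytes:
    """Send a command frame and wait for a response, retrying on timeout."""
    for _attempt in range(retries):
        send(frame)
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            response = receive()
            if response is not None:
                return response
            time.sleep(poll_interval_s)
    raise TimeoutError(f"no response after {retries} attempts")
```

Retrying blindly is only safe when repeating a command has the same effect as sending it once, which is one reason to keep the command set stateless.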
The mini-UART (/dev/ttyS0) is clocked from the BCM core clock, which varies with CPU frequency scaling. Its baud rate drifts under load. Do not use it for co-processor communication.
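As an illustration of the framing advice above, here is a minimal sketch of a length-prefixed frame protected by a CRC-16. The layout (one length byte, one command byte, payload, then a big-endian CRC) and the choice of the CRC-16/CCITT-FALSE variant are assumptions made for this example, not a format the article prescribes.

```python
import struct


def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def encode_frame(command: int, payload: bytes = b"") -> bytes:
    """Build [length][command][payload...][crc_hi][crc_lo]."""
    if len(payload) > 254:
        raise ValueError("payload too long for a one-byte length field")
    # The length byte counts the command byte plus the payload.
    body = bytes([1 + len(payload), command]) + payload
    return body + struct.pack(">H", crc16_ccitt(body))


def decode_frame(frame: bytes) -> tuple[int, bytes]:
    """Validate a complete frame and return (command, payload)."""
    if len(frame) < 4:
        raise ValueError("frame too short")
    if len(frame) != 1 + frame[0] + 2:
        raise ValueError("length field does not match frame size")
    (received_crc,) = struct.unpack(">H", frame[-2:])
    if crc16_ccitt(frame[:-2]) != received_crc:
        raise ValueError("CRC mismatch")
    return frame[1], frame[2:-2]
```

The same three functions are straightforward to mirror in C on the MCU side, which keeps both ends of the link checking frames in the same way.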
SPI
SPI supports higher throughput. The Pi normally acts as the SPI master, so the MCU cannot start a transfer on its own; instead it signals the Pi through a separate interrupt GPIO line (driven high when data is ready). This is useful when the co-processor is streaming continuous data, such as ADC samples or encoder position, that the Pi either polls or is notified of asynchronously. Complexity is higher: full-duplex synchronous transfers require frame alignment and buffering on the MCU side.
USB CDC (RP2040 only)
The RP2040 can enumerate as a USB CDC virtual serial port (/dev/ttyACM0 on the Pi). Convenient for prototyping — no device tree configuration needed. For production, UART is typically more predictable: USB CDC introduces enumeration latency on boot and can have higher scheduling overhead under load.
I2C
Suitable for very low-speed command exchange. The MCU acts as an I2C slave. The bus clock is limited to 400 kbit/s (fast-mode) or 1 Mbit/s (fast-mode plus), and usable throughput is lower once addressing and acknowledge bits are counted. Not recommended for time-sensitive or high-throughput data paths.
Co-Processor Firmware Architecture
Bare-metal is the right choice when the co-processor has a single primary function. A main loop polls for incoming UART bytes, assembles frames, and dispatches commands. Hardware events (encoder pulses, timer interrupts, ADC conversions) are handled in ISRs that update shared state read by the main loop. For stepper control, relay switching, and data acquisition, bare-metal on a Cortex-M0+ is entirely adequate and produces the most predictable timing.
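The frame-assembly and dispatch steps of such a main loop can be modelled in a few lines. Real co-processor firmware would normally be written in C; the Python below is only a behavioural sketch of the logic, using a hypothetical one-byte length prefix and omitting the CRC check and the inter-byte timeout a production parser would need in order to resynchronise.

```python
class FrameAssembler:
    """Collects bytes one at a time and returns complete length-prefixed frames."""

    def __init__(self, max_length=64):
        self.max_length = max_length
        self._expected = None  # None means "waiting for a length byte"
        self._buffer = bytearray()

    def feed(self, byte):
        """Consume one received byte; return a frame body when one completes."""
        if self._expected is None:
            if 1 <= byte <= self.max_length:
                self._expected = byte
            return None  # out-of-range length bytes are silently dropped
        self._buffer.append(byte)
        if len(self._buffer) < self._expected:
            return None
        frame = bytes(self._buffer)
        self._expected = None
        self._buffer.clear()
        return frame


def dispatch(frame, handlers):
    """Route a frame body (command byte + payload) to its handler, if any."""
    handler = handlers.get(frame[0])
    return handler(frame[1:]) if handler else None
```

In firmware, `feed` corresponds to the code that drains the UART receive buffer in the main loop, and `handlers` to a small table of command functions.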
RTOS (FreeRTOS is standard for both RP2040 and STM32) suits designs where the co-processor manages multiple concurrent tasks — a separate task for UART parsing, a task for PID control, and a task for watchdog servicing. The overhead is acceptable on a 133 MHz Cortex-M0+ for most applications. See bare-metal vs RTOS firmware for guidance on the trade-offs.
Watchdog design: If the Pi stops sending keepalive messages, whether due to a software crash, a network hang, or a deliberate power-down, the co-processor must enter a defined safe state (relays open, motors stopped, outputs at zero) rather than continuing to execute the last received command. A hardware watchdog timer on the MCU, refreshed only after the firmware has received a valid keepalive frame from the Pi, enforces this: if the keepalives stop, or if the MCU firmware itself hangs, the watchdog resets the MCU. For that reset to be safe, the outputs must default to the safe state when the MCU comes out of reset.
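The keepalive behaviour can be expressed as a small model. On a real MCU the timeout would be enforced by the watchdog peripheral rather than by application code; the Python below only illustrates the intended behaviour, with time passed in explicitly so the logic is easy to test. Class names and the relay example are hypothetical.

```python
class KeepaliveWatchdog:
    """Tracks the time since the last valid keepalive frame."""

    def __init__(self, timeout_s, now):
        self.timeout_s = timeout_s
        self._last_feed = now

    def feed(self, now):
        """Call only after a keepalive frame has passed validation."""
        self._last_feed = now

    def expired(self, now):
        return now - self._last_feed > self.timeout_s


class RelayController:
    """Drives relay outputs and forces them open when keepalives stop."""

    def __init__(self, channels, watchdog):
        self.closed = [False] * channels
        self.watchdog = watchdog

    def set_relay(self, channel, closed, now):
        if self.watchdog.expired(now):
            return False  # refuse commands once the link is considered dead
        self.closed[channel] = closed
        return True

    def tick(self, now):
        """Periodic check: enter the safe state if the keepalive has lapsed."""
        if self.watchdog.expired(now):
            self.closed = [False] * len(self.closed)
```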
Practical Implementation Examples
Stepper motor control: The Pi sends target position, speed, and acceleration parameters over UART. An STM32 or RP2040 (via PIO state machine) generates step/direction pulses in hardware at step rates up to 250 kHz (a 4 µs step period), with jitter far below the step period. Linux userspace on the Pi cannot reliably generate step pulses above approximately 5 kHz.
Precision data acquisition: The Pi requests a burst of 1,000 ADC samples at a 10 kHz rate. The STM32 triggers its ADC from a hardware timer at exactly 100 µs intervals, DMA-transfers results into a ring buffer, and transmits the completed buffer over UART at 921600 baud. The Pi cannot reliably time ADC reads from userspace at 10 kHz.
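The timing budget for this example is simple arithmetic. The sketch below assumes each 12-bit sample is sent as 2 bytes, that the UART uses 8N1 framing (10 bits on the wire per byte), and that protocol overhead is negligible; all three are assumptions for illustration.

```python
def burst_timing(samples=1000, sample_rate_hz=10_000, bytes_per_sample=2,
                 baud=921_600, bits_per_byte=10):
    """Return (acquisition time, UART transfer time) in seconds for one burst."""
    acquisition_s = samples / sample_rate_hz
    transfer_s = samples * bytes_per_sample * bits_per_byte / baud
    return acquisition_s, transfer_s


def max_streaming_rate_hz(baud=921_600, bytes_per_sample=2, bits_per_byte=10):
    """Highest sample rate the link could carry continuously, ignoring overhead."""
    return baud / (bits_per_byte * bytes_per_sample)
```

Under these assumptions the burst takes 100 ms to acquire and roughly 21.7 ms to transmit, and the link could sustain about 46,000 samples per second, so buffering the burst and sending it afterwards leaves comfortable margin.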
Industrial relay controller: The Pi sends structured commands (relay channel, state) over UART at 115200 baud. The STM32 drives 16 relay channels via GPIO expanders, enforces a hardware watchdog (all relays open if no keepalive received within 500 ms), and reports relay coil fault conditions back to the Pi. The STM32 also handles timing-critical safety interlocks that must respond within 10 ms regardless of Pi state.
Design Considerations
- Identify real-time requirements at architecture time — list every function with a timing requirement before choosing MCUs. Anything below 1 ms belongs on the co-processor; anything above can run in Linux userspace on the Pi.
- Hardware UART, not software serial — always use the Pi's primary hardware UART. Software serial implementations from Linux userspace suffer from the same scheduler jitter the co-processor is solving.
- Route a reset GPIO from Pi to MCU: one Pi GPIO connected to the co-processor's RESET pin allows the Pi to hardware-reset the MCU on startup, after a watchdog timeout, or before initiating a firmware update. This prevents the Pi from waiting indefinitely on an MCU that is stuck in a fault state.
- Plan the firmware update path early: RP2040 can be put into UF2 boot mode programmatically; STM32 parts include a ROM bootloader that accepts firmware over UART or over USB (DFU). Design an OTA path where the Pi can flash updated co-processor firmware received from a server. See how does OTA firmware update work for an overview of the update mechanisms available on common MCU families.
- Include a protocol version handshake — add a version field to the command frame so that Pi and co-processor can confirm firmware compatibility before entering the main control loop. Mismatched firmware on opposite sides of the UART interface is a common source of difficult-to-diagnose field failures.
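A version handshake can be as small as two bytes. The sketch below assumes a hypothetical major/minor scheme in which the major number must match and the co-processor must implement at least the minor version the Pi application expects; the policy and names are illustrative, not something this article specifies.

```python
from typing import NamedTuple


class ProtocolVersion(NamedTuple):
    major: int
    minor: int


def parse_version(payload: bytes) -> ProtocolVersion:
    """Decode a two-byte version payload: [major][minor]."""
    if len(payload) != 2:
        raise ValueError("version payload must be exactly 2 bytes")
    return ProtocolVersion(payload[0], payload[1])


def is_compatible(required: ProtocolVersion, reported: ProtocolVersion) -> bool:
    """Assumed policy: same major version, and reported minor >= required minor."""
    return required.major == reported.major and reported.minor >= required.minor
```

The Pi would run this check once at startup and refuse to enter the main control loop, or trigger a firmware update, when it fails.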
If you are building a product around the Raspberry Pi CM4 with a real-time co-processor, Zeus Design's embedded firmware team develops both the embedded Linux application layer and the co-processor firmware for commercial product designs.
Common Mistakes
- Assuming PREEMPT-RT eliminates the need for a co-processor — PREEMPT-RT significantly reduces worst-case latency but does not eliminate it. Sub-100 µs requirements still need a co-processor regardless of the kernel patch.
- Using the mini-UART (/dev/ttyS0) for co-processor communication: the mini-UART is clocked from the BCM core clock, which varies with CPU frequency scaling. Use the primary hardware UART (/dev/ttyAMA0) after the disable-bt overlay.
- Not implementing a watchdog on the co-processor: if the Pi crashes, the co-processor will continue executing its last command indefinitely. A hardware watchdog timer with Pi-sourced keepalive frames ensures the system returns to a safe state automatically.
- Treating the inter-processor protocol as an afterthought — the command/response interface between Pi and MCU is a real API. Define frame format, error handling, timeout behaviour, and firmware versioning before writing firmware on either side. Protocol changes are expensive after both sides are implemented.
- Choosing USB CDC for latency-sensitive applications — USB CDC is convenient for prototyping but introduces enumeration latency on boot and USB scheduling overhead under load. For production latency-sensitive communication, UART or SPI is more predictable.
Frequently Asked Questions
- Does PREEMPT-RT make the Raspberry Pi a real-time system?
- PREEMPT-RT significantly reduces worst-case scheduler latency, from potentially milliseconds on a standard kernel to typically 50–100 µs on the Pi 4/CM4, but does not make Linux a hard real-time OS. Latency is more predictable but not guaranteed deterministic. PREEMPT-RT is appropriate for soft real-time control with millisecond-level tolerances (process control, slow motor control, UI-driven actuators). For hard real-time requirements below 1 ms, or any safety-critical timing, a dedicated co-processor MCU is still required.
- When should I use RP2040 vs STM32 as a Raspberry Pi co-processor?
- Use RP2040 when you need flexible GPIO state machines (PIO can implement custom protocols at nanosecond precision), USB dual-mode, or a very low BOM cost. Use STM32 when you need precision ADC with accurate sample timing, CAN bus, a mature HAL with extensive peripheral support, or higher clock speeds for compute-intensive tasks. Both communicate with the Pi over UART, SPI, or USB CDC; the interface choice is largely independent of MCU family.
- Which UART should I use between a Raspberry Pi and a co-processor?
- Use the primary hardware UART (/dev/ttyAMA0 on Pi 4/CM4) after reassigning it from the Bluetooth controller via the disable-bt device tree overlay. The mini-UART (/dev/ttyS0) is clocked from the variable core clock and its baud rate drifts under CPU frequency scaling. At 115200 baud a 10-byte frame takes approximately 870 µs; at 921600 baud the same frame takes approximately 108 µs.
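The frame times quoted above follow directly from the baud rate. A minimal sketch, assuming 8N1 framing so that each byte occupies 10 bits on the wire (start bit, 8 data bits, stop bit):

```python
def frame_time_us(frame_bytes, baud, bits_per_byte=10):
    """Time on the wire, in microseconds, for a frame of the given size."""
    return frame_bytes * bits_per_byte / baud * 1e6
```

`frame_time_us(10, 115_200)` evaluates to about 868 µs and `frame_time_us(10, 921_600)` to about 108.5 µs, matching the rounded figures in the answer.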