Can't decide between FreeRTOS and bare-metal for a simple sensor node — what's the tipping point?
Asked by stale_biscuit_03
Working on a temperature and humidity monitoring node — STM32F103 target, BME280 over I2C, reports data every 60 seconds over UART to a Raspberry Pi gateway. That's basically the whole thing: wake, read sensor, format string, transmit, wait for ACK, loop.
Got a super-loop version running in about 150 lines of C. Then I started reading about FreeRTOS and now I'm second-guessing myself. Most of the embedded tutorials I've found seem to treat "use an RTOS" as the obvious default for anything more complex than blinking an LED, and I genuinely can't tell if I'm missing something or if my design is just simple enough that the super-loop is correct.
My worries with adding FreeRTOS:
- Flash and RAM overhead on an MCU with 64KB flash and 20KB RAM
- New API to learn for something that already works
- If it's overkill, I've made a simple project harder to reason about for no gain
But also worried I'm missing something. The design might eventually grow — a second sensor, maybe a button interrupt for triggering a manual report. Don't want to back myself into a corner.
Is there an actual rule of thumb for where the RTOS threshold is, or is it always "it depends"?
3 Replies
Your current design doesn't need an RTOS.
The rule of thumb I use: does your loop have multiple independent things that each need their own timing? You have one — read, format, transmit, wait for ACK. That's a pipeline, not concurrent tasks. A super-loop handles pipelines cleanly.
FreeRTOS on an F103 costs roughly 5–6 KB of flash and around 3–5 KB of RAM for the scheduler plus one task stack. On 20 KB of RAM, that's 15–25% gone before you've allocated any application data. For a sensor-read-then-transmit loop, you'd end up with one task (your old super-loop) running inside scheduler overhead. Not wrong, but not buying you anything either.
What actually pushes me toward an RTOS:
- Multiple independent time domains running simultaneously: if you need to sample at 10 Hz while reporting every 60 s and kicking a watchdog every 500 ms with its own deadline, managing all of that in a single loop gets tangled fast. Three tasks with priorities are cleaner.
- Blocking I/O at multiple points: if the loop has to wait for a sensor ready-flag in one place and a UART ACK in another, a super-loop forces a state machine. RTOS tasks just block.
- Design scale: the point where you can't hold the whole loop in your head as a single flow any more. Subjective, but when you find yourself drawing flowcharts to understand what happens when X interrupts Y, that's the signal.
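To make the "super-loop forces a state machine" point concrete, here is a minimal sketch in C. The names (`node_state_t`, `node_step`) and the two boolean inputs are hypothetical stand-ins for a sensor ready-flag and a UART ACK flag; none of this comes from the original poster's code.

```c
#include <stdbool.h>

/* Hypothetical states for a node that has to wait at two separate points. */
typedef enum {
    NODE_IDLE,          /* nothing in flight; the next step starts a measurement */
    NODE_WAIT_SENSOR,   /* measurement started, waiting for the ready flag */
    NODE_WAIT_ACK       /* report sent, waiting for the gateway's ACK */
} node_state_t;

/*
 * Advance the state machine by one step and return the next state.
 * sensor_ready and ack_received stand in for flags that real code would
 * read from a status register or have set by an ISR.
 */
node_state_t node_step(node_state_t state, bool sensor_ready, bool ack_received)
{
    switch (state) {
    case NODE_IDLE:
        /* Real code would trigger the sensor measurement here. */
        return NODE_WAIT_SENSOR;
    case NODE_WAIT_SENSOR:
        /* Real code would read, format and transmit once data is ready. */
        return sensor_ready ? NODE_WAIT_ACK : NODE_WAIT_SENSOR;
    case NODE_WAIT_ACK:
        return ack_received ? NODE_IDLE : NODE_WAIT_ACK;
    }
    return NODE_IDLE;   /* not reached for valid states */
}
```

The loop calls `node_step()` once per pass and never spins, so other checks can run between calls. With an RTOS the same flow would be straight-line code that blocks at each wait, which is the trade being described.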
Second sensor and a button interrupt? Still doesn't need an RTOS. Handle the button in an ISR and set a flag. The second sensor is another read in your pipeline.
The bare-metal vs RTOS comparison goes through these decision points in detail if you want the full breakdown of when each approach scales better. Your 150-line loop is not a sign you did it wrong.
Mostly agree with soggy_waffle42. I'd add one distinction that might sharpen the decision for you as the project grows.
The thing task scheduling actually buys you isn't concurrency per se; it's explicit temporal contracts. When a task calls vTaskDelayUntil(), you've stated in code: this needs to run every N ms, even if the previous iteration ran long (as long as it still fits inside the period). In a super-loop, timing is implicit, and the loop as a whole either meets its timing or it doesn't.
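The difference is easy to see with a little arithmetic. The sketch below is a plain-C model, not FreeRTOS code: it compares a delay measured from when the work finished (what a `delay(period)` at the end of a loop gives you) with a delay measured from the previous wake time (the contract behind `vTaskDelayUntil(&xLastWakeTime, xPeriod)`). The function names are made up for illustration, and it assumes the work always fits inside one period.

```c
#include <stdbool.h>
#include <stdint.h>

/* Relative delay: the next wake is measured from when the work finished,
 * so the time spent working is added on top of every period. */
uint32_t next_wake_relative(uint32_t finished_at, uint32_t period)
{
    return finished_at + period;
}

/* Absolute delay: the next wake is measured from the previous wake time,
 * so the time spent working does not shift the schedule. */
uint32_t next_wake_absolute(uint32_t previous_wake, uint32_t period)
{
    return previous_wake + period;
}

/* Run `iterations` cycles in which the work takes `work` ticks (assumed to
 * be less than `period`) and return the tick of the final wake. */
uint32_t simulate_last_wake(uint32_t period, uint32_t work,
                            uint32_t iterations, bool absolute)
{
    uint32_t wake = 0;
    for (uint32_t i = 0; i < iterations; i++) {
        uint32_t finished_at = wake + work;
        wake = absolute ? next_wake_absolute(wake, period)
                        : next_wake_relative(finished_at, period);
    }
    return wake;
}
```

With a 100-tick period and 7 ticks of work, ten cycles end at tick 1000 in the absolute model and tick 1070 in the relative one. For a 60-second report that nobody is timing, that drift is harmless, which is why it doesn't matter for the current design.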
For your current design this doesn't matter. You have no real-time deadline anywhere — if the report is 50 ms late, nothing breaks. So bare-metal is correct.
Where it starts to matter: if that button interrupt needs to trigger a manual report that must complete before some feedback timeout (say, an LED needs to light within 200 ms of the press), you now have a soft real-time constraint. A super-loop can handle it, but you'll need to reason carefully about what's happening in the UART wait when the button fires. That reasoning gets harder every time you add a feature.
On the resource side: FreeRTOS lets you tune stack sizes aggressively on an F103. configMINIMAL_STACK_SIZE sets the idle task's stack, and every task you create gets its own depth, both counted in words rather than bytes. But with 20 KB RAM and two sensors, you need to size task stacks explicitly, not guess. Stack overflow on a constrained MCU is a fun bug to find at 2am.
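A quick way to keep yourself honest is to add the stacks up before you create anything. This is a minimal sketch, assuming a Cortex-M3 port where one stack word is 4 bytes; the helper name `stack_budget_bytes` is hypothetical, and the total deliberately leaves out per-task control blocks and other kernel objects.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: a Cortex-M3 port, where one stack word is 4 bytes. */
#define STACK_WORD_BYTES 4u

/* Total bytes used by task stacks whose depths are given in words.
 * Does not include task control blocks or other kernel objects. */
size_t stack_budget_bytes(const uint16_t *depth_words, size_t task_count)
{
    size_t total = 0;
    for (size_t i = 0; i < task_count; i++) {
        total += (size_t)depth_words[i] * STACK_WORD_BYTES;
    }
    return total;
}
```

Two tasks at 128 words plus one at 256 words comes to 2048 bytes, about a tenth of the F103's 20 KB. Once the firmware is running, uxTaskGetStackHighWaterMark() and configCHECK_FOR_STACK_OVERFLOW are the usual tools for checking the guess against reality.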
The RTOS explainer covers what the scheduler is actually guaranteeing if the vTaskDelayUntil model is new territory.
One reframe that sometimes helps for designs at this scale: the real choice isn't "super-loop vs RTOS." It's "super-loop vs interrupt-driven bare-metal."
Your button is the clearest example. You don't want button detection buried inside the 60-second wait between reports, or inside the UART ACK wait. The natural answer is an ISR that sets a volatile flag, which your loop checks on every pass, including while it is waiting. Fast, deterministic, no scheduler needed.
The thing that breaks interrupt-driven bare-metal is shared data with race conditions. The ISR sets the flag; the loop reads and clears it. On Cortex-M, a single-byte load or store is atomic on its own, but the read-then-clear sequence isn't: a press that lands between the read and the clear can get wiped out. For a single event flag you can usually get away with a critical section around the read-and-clear: disable the interrupt for two instructions, re-enable, then handle the event. Ugly but correct.
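Here is roughly what that read-and-clear looks like, as a minimal sketch. `irq_disable()` and `irq_enable()` are hypothetical hooks: on a Cortex-M target they would typically map to the CMSIS intrinsics `__disable_irq()` and `__enable_irq()`, but they are empty stubs here so the sketch also builds on a host. `button_isr` and `button_take` are made-up names as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hooks; on target these would wrap the real interrupt
 * enable/disable mechanism. Empty stubs keep the sketch host-buildable. */
static void irq_disable(void) { }
static void irq_enable(void)  { }

/* Written by the ISR, read and cleared by the main loop. */
static volatile uint8_t button_pending = 0;

/* Body of the (hypothetical) button ISR: a single store and nothing else. */
void button_isr(void)
{
    button_pending = 1;
}

/* Read and clear the flag inside a critical section.
 * Returns true if a press was pending. */
bool button_take(void)
{
    irq_disable();
    bool was_pending = (button_pending != 0);
    button_pending = 0;
    irq_enable();
    return was_pending;   /* the caller handles the press with interrupts on */
}
```

The main loop would call `button_take()` wherever it polls and send the manual report when it returns true. One caveat of this simple form: it re-enables interrupts unconditionally, so code that might call it with interrupts already masked should save and restore the previous state instead.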
Where I reach for FreeRTOS: when I need to block at multiple independent points within the same logical flow. In your current loop, the UART ACK wait is polling a flag in a tight spin, which burns CPU and means the button ISR fires but nothing processes it until the ACK arrives. With a task, you call ulTaskNotifyTake() and genuinely yield until the UART ISR notifies you. The button can preempt. That's when the scheduler earns its stack allocation.
The interrupts article is worth reading alongside this — understanding when an ISR flag is sufficient versus when you need a proper notification mechanism makes this call a lot clearer.