What Is a Watchdog Timer and How Do You Use It?

Last updated 28 June 2026 · 10 min read

Direct Answer

A watchdog timer is a hardware countdown timer that resets the MCU if firmware does not periodically refresh it within a configured timeout period. When working firmware "kicks" the watchdog before it expires, nothing happens. If a firmware hang, infinite loop, or deadlock prevents the kick from occurring, the watchdog expires and resets the MCU, recovering from the fault automatically. On STM32 devices there are two watchdogs: the Independent Watchdog (IWDG), clocked from the internal LSI oscillator, which continues running in low-power modes and detects any failure to kick within the configured timeout; and the Window Watchdog (WWDG), clocked from the peripheral bus, which additionally detects kicks that arrive too early, enabling detection of timing anomalies in real-time firmware loops.

Detailed Explanation

A processor that runs firmware can get stuck. A tight loop that never exits, a deadlock between two RTOS tasks waiting on each other, a corrupted stack pointer, or a peripheral that stalls mid-transfer — any of these can leave the MCU in a state where it is powered on but not doing useful work. Software running on the MCU cannot detect this condition because software cannot observe itself being stuck.

A watchdog timer solves this by offloading detection to hardware. The watchdog is a countdown timer that runs independently of the firmware. Firmware must periodically reset this timer — "kicking" or "refreshing" it. If the kick does not arrive within the configured timeout, the watchdog assumes something has gone wrong and resets the MCU.

IWDG — The Independent Watchdog

The STM32 IWDG (Independent Watchdog) is the simpler of the two watchdog peripherals. Its key properties:

  • Clock source: The internal LSI oscillator, nominally 32 kHz on STM32F4. The LSI is independent of the main clock tree: it runs even when the system clock is stopped, unstable, or wrong (including during clock source failures), and continues running in Stop and Standby low-power modes. Note: the actual LSI frequency can vary from approximately 17 to 47 kHz across operating temperature and supply voltage ranges (see RM0090 and the LSI characteristics in the device datasheet). For reliable IWDG operation, design timeouts with margin on both ends, and do not configure a timeout so tight that LSI variation causes false resets.
  • Timeout formula: timeout_ms = (1000 × prescaler × (reload + 1)) / LSI_Hz
  • Prescaler options: 4, 8, 16, 32, 64, 128, 256 (set via IWDG_PR register).
  • Reload register: 12-bit value (0–4095), set via IWDG_RLR. Countdown starts from this value.

Example timeout calculation (STM32 IWDG at nominal 32 kHz LSI):

Prescaler | Reload | Timeout (nominal)
4         | 4095   | ~512 ms
32        | 4095   | ~4.1 s
256       | 4095   | ~32.8 s
8         | 999    | ~250 ms
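
As a minimal sketch, the timeout formula above can be written as a small C helper. The name iwdg_timeout_us is illustrative and not part of the STM32 HAL. It returns microseconds rather than milliseconds to avoid rounding on short timeouts, and it takes the LSI frequency as a parameter so the same helper can be evaluated away from the nominal 32 kHz.

```c
#include <assert.h>
#include <stdint.h>

/* Nominal IWDG timeout in microseconds.
   prescaler: divider value (4, 8, ... 256)
   reload:    12-bit IWDG_RLR value (0..4095)
   lsi_hz:    LSI frequency in Hz (32000 nominal on STM32F4) */
static uint32_t iwdg_timeout_us(uint32_t prescaler, uint32_t reload, uint32_t lsi_hz)
{
    assert(lsi_hz > 0u);
    assert(reload <= 4095u);
    /* 64-bit intermediate: 1e6 * 256 * 4096 does not fit in 32 bits */
    uint64_t counts = (uint64_t)prescaler * (reload + 1u);
    return (uint32_t)((counts * 1000000u) / lsi_hz);
}
```

With a 32 kHz LSI, iwdg_timeout_us(8, 999, 32000) evaluates to 250000 µs, which matches the last row of the table.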

To initialise the IWDG in STM32 HAL:

IWDG_HandleTypeDef hiwdg;

hiwdg.Instance = IWDG;
hiwdg.Init.Prescaler = IWDG_PRESCALER_64;   /* LSI / 64 */
hiwdg.Init.Reload = 499;                     /* timeout ≈ (64 × 500) / 32000 ≈ 1 s */

if (HAL_IWDG_Init(&hiwdg) != HAL_OK) {
    Error_Handler();
}

/* Once Init is called, the watchdog is running and cannot be stopped */

A critical property of the IWDG: once started, it cannot be stopped in software. There is no disable register; only a reset stops it. If the product needs to enter an extended low-power mode without firmware activity, the IWDG must be kicked before entering sleep, and the sleep duration must be shorter than the watchdog timeout. For longer sleep periods, extend the watchdog timeout accordingly, up to the maximum of about 32.8 s at nominal LSI (prescaler 256, reload 4095). Beyond that, firmware has to wake periodically (for example from an RTC wakeup) to kick the watchdog.
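
Changing the timeout means choosing a new prescaler and reload pair. The sketch below searches the prescaler options listed earlier for the smallest one that can reach a target timeout, which gives the finest timing resolution. The type iwdg_config_t and the function iwdg_pick_config are hypothetical helpers, not HAL APIs, and the result is computed for whatever LSI frequency the caller assumes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t prescaler;  /* divider value: 4, 8, ... 256 */
    uint32_t reload;     /* 12-bit IWDG_RLR value */
} iwdg_config_t;

/* Find the smallest prescaler whose 12-bit reload range can reach
   target_ms at the given LSI frequency. The count is rounded up, so the
   resulting timeout is never shorter than target_ms.
   Returns false if the target is zero or beyond the maximum timeout. */
static bool iwdg_pick_config(uint32_t target_ms, uint32_t lsi_hz, iwdg_config_t *out)
{
    assert(out != NULL);
    assert(lsi_hz > 0u);
    for (uint32_t prescaler = 4u; prescaler <= 256u; prescaler *= 2u) {
        /* counts = target_ms * lsi_hz / (1000 * prescaler), rounded up */
        uint64_t num = (uint64_t)target_ms * lsi_hz;
        uint64_t den = 1000ull * prescaler;
        uint64_t counts = (num + den - 1u) / den;
        if (counts >= 1u && counts <= 4096u) {
            out->prescaler = prescaler;
            out->reload = (uint32_t)(counts - 1u);
            return true;
        }
    }
    return false;
}
```

For a 1000 ms target at 32 kHz this selects prescaler 8 and reload 3999. The numeric prescaler still has to be mapped to the matching HAL constant (for example IWDG_PRESCALER_8) before it is written to hiwdg.Init.Prescaler.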

Kicking the IWDG:

HAL_IWDG_Refresh(&hiwdg);    /* Reloads the counter; must happen before timeout */

WWDG — The Window Watchdog

The WWDG (Window Watchdog) adds a constraint that the IWDG does not have: the kick must arrive within a defined time window. Kicking the WWDG too early triggers a reset, just as kicking too late does.

How this works:

  • The WWDG has a 7-bit down-counter (T[6:0]) that decrements from a loaded value toward 0x40. When bit T6 clears (counter passes below 0x40), the WWDG resets the MCU.
  • An upper window limit (W[6:0]) is set in the WWDG_CFR register. If firmware kicks the watchdog while the counter is still above W (i.e. the counter has not decremented far enough yet), that is also treated as a fault, and the WWDG resets the MCU immediately.
  • The valid kick window is: W[6:0] ≥ counter > 0x3F. In other words, the counter must have decremented to W or below, but bit T6 must still be set.

This makes the WWDG useful for detecting timing anomalies — a main loop that runs too fast (possibly due to skipping intended processing steps) triggers the same reset as one that runs too slow or hangs. The IWDG cannot detect "too fast."
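
The window rule can be modelled on a host PC as a small classification function. This is only an illustrative model of the behaviour described above, not code that runs against the peripheral. The names wwdg_result_t and wwdg_classify_refresh are made up for the example, and a counter value of exactly 0x40 is treated as still valid because the reset occurs when bit T6 clears.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    WWDG_REFRESH_OK,        /* inside the window: counter would be reloaded */
    WWDG_RESET_TOO_EARLY,   /* counter still above W: immediate reset */
    WWDG_RESET_EXPIRED      /* counter already fell below 0x40 */
} wwdg_result_t;

/* Classify a refresh attempt given the current 7-bit counter value
   T[6:0] and the 7-bit window value W[6:0]. */
static wwdg_result_t wwdg_classify_refresh(uint8_t counter, uint8_t window)
{
    assert(counter <= 0x7Fu && window <= 0x7Fu);
    if (counter < 0x40u) {
        return WWDG_RESET_EXPIRED;    /* T6 is clear: the reset has already fired */
    }
    if (counter > window) {
        return WWDG_RESET_TOO_EARLY;  /* kick arrived before the window opened */
    }
    return WWDG_REFRESH_OK;
}
```

With W = 0x50, a kick while the counter reads 0x7F is classified as too early, a kick anywhere from 0x50 down to 0x40 is accepted, and 0x3F means the watchdog has already expired.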

The WWDG is clocked from PCLK1 (the APB1 peripheral bus clock), which means:

  • It does not run in Stop or Standby modes.
  • Its timeout is affected by clock configuration changes.
  • Timeout = t_PCLK1 × 4096 × 2^WDGTB × (T[5:0] + 1), where t_PCLK1 is the APB1 clock period and WDGTB is the prescaler field in WWDG_CFR (detailed calculation in the WWDG chapter of RM0090).
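
As a rough sketch of the timing arithmetic, assuming the RM0090 relationship in which the counter decrements once every 4096 × 2^WDGTB PCLK1 cycles, the two helpers below estimate when the WWDG expires and when its window opens. The function names are illustrative only. The window-open figure is an approximation, since the phase of the free-running prescaler at the moment of the refresh is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Time in microseconds until the WWDG resets the MCU, counted from a
   refresh that loads `counter` (0x40..0x7F) into T[6:0].
   wdgtb is the 2-bit prescaler field (0..3, i.e. divide by 1, 2, 4, 8). */
static uint32_t wwdg_timeout_us(uint32_t pclk1_hz, uint32_t wdgtb, uint32_t counter)
{
    assert(pclk1_hz > 0u && wdgtb <= 3u);
    assert(counter >= 0x40u && counter <= 0x7Fu);
    uint64_t pclk_cycles = 4096ull * (1u << wdgtb) * ((counter & 0x3Fu) + 1u);
    return (uint32_t)((pclk_cycles * 1000000u) / pclk1_hz);
}

/* Approximate time in microseconds from the refresh until the window
   opens, i.e. until the counter has decremented from `counter` to `window`. */
static uint32_t wwdg_window_opens_us(uint32_t pclk1_hz, uint32_t wdgtb,
                                     uint32_t counter, uint32_t window)
{
    assert(pclk1_hz > 0u && wdgtb <= 3u);
    assert(window <= counter && counter <= 0x7Fu);
    uint64_t pclk_cycles = 4096ull * (1u << wdgtb) * (counter - window);
    return (uint32_t)((pclk_cycles * 1000000u) / pclk1_hz);
}
```

At an assumed PCLK1 of 30 MHz with WDGTB = 3 and the counter loaded with 0x7F, the timeout comes out at roughly 69.9 ms, and a window value of 0x5F opens the window after roughly 35 ms.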

WWDG is typically used alongside the IWDG rather than instead of it: the IWDG provides a long-timeout safety net that survives low-power modes; the WWDG provides tight timing enforcement on the main control loop.

IWDG vs WWDG: When to Use Each

Criterion                     | IWDG                        | WWDG
Clock source                  | LSI (independent)           | PCLK1 (bus clock)
Runs in low-power modes       | Yes (Stop, Standby)         | No
Detects firmware hang         | Yes                         | Yes
Detects loop running too fast | No                          | Yes
Can be disabled once started  | No                          | No
Typical use                   | Always-on system safety net | Real-time loop timing enforcement

Kicking Strategy in Bare-Metal Firmware

In a bare-metal (no-RTOS) design with a main loop:

/* Bare-metal main loop */
while (1) {
    process_sensor_data();
    update_outputs();
    handle_comms();
    HAL_IWDG_Refresh(&hiwdg);    /* kick at end of every successful loop iteration */
}

Placing the kick at the end of the main loop means all of process_sensor_data(), update_outputs(), and handle_comms() must complete before the watchdog is refreshed. If any of those functions hangs, the watchdog expires.

Do not kick the watchdog in an interrupt handler. Interrupts continue to fire even when the main loop is stuck in an infinite loop. If the kick is in an ISR, the watchdog is refreshed even while the main loop is frozen — defeating its purpose entirely.
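
One way to make the end-of-loop kick stricter is to require every stage to report completion before the kick is allowed, so that an early return or a skipped stage also withholds the refresh. The sketch below is a hypothetical helper layer, not part of the HAL: the CP_* bits, wdg_guard_t, wdg_guard_mark, and wdg_guard_should_kick are all illustrative names. It only decides whether to kick; the caller still performs the actual HAL_IWDG_Refresh call from the main loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CP_SENSORS  (1u << 0)
#define CP_OUTPUTS  (1u << 1)
#define CP_COMMS    (1u << 2)
#define CP_ALL      (CP_SENSORS | CP_OUTPUTS | CP_COMMS)

typedef struct {
    uint32_t checkpoints;   /* bits set by stages that completed this iteration */
} wdg_guard_t;

/* Called by a stage once it has finished successfully. */
static void wdg_guard_mark(wdg_guard_t *g, uint32_t checkpoint)
{
    assert(g != NULL);
    g->checkpoints |= checkpoint;
}

/* Called once at the end of the main loop. Returns true only if every
   stage reported in, and clears the checkpoints for the next iteration. */
static bool wdg_guard_should_kick(wdg_guard_t *g)
{
    assert(g != NULL);
    bool all_done = (g->checkpoints & CP_ALL) == CP_ALL;
    g->checkpoints = 0u;
    return all_done;
}
```

In the main loop this would replace the unconditional kick with something like if (wdg_guard_should_kick(&guard)) { HAL_IWDG_Refresh(&hiwdg); }, with each stage calling wdg_guard_mark when it completes.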

Kicking Strategy in RTOS Firmware

In RTOS firmware, each task has its own stack and runs independently. A single kick location in one task cannot verify that all other tasks are still running. The standard pattern:

  1. Each task maintains a bit in a shared task_alive_flags bitmask, setting its bit after completing its work cycle each iteration.
  2. A low-priority watchdog task clears all bits, then waits for one period. If all bits are set at the end of the period, every task ran: kick the hardware watchdog and repeat. If any bit is not set, at least one task failed to run: do not kick, and let the watchdog expire and reset the MCU.

/* Simplified watchdog task (FreeRTOS) */
#include "FreeRTOS.h"
#include "task.h"
#include "stm32f4xx_hal.h"

#define TASK_A_BIT  (1UL << 0)
#define TASK_B_BIT  (1UL << 1)
#define ALL_TASKS   (TASK_A_BIT | TASK_B_BIT)

extern IWDG_HandleTypeDef hiwdg;         /* initialised elsewhere */
volatile uint32_t task_alive_flags = 0;

void WatchdogTask(void *pvParameters) {
    (void)pvParameters;
    const TickType_t period = pdMS_TO_TICKS(500);
    TickType_t last_wake = xTaskGetTickCount();
    for (;;) {
        vTaskDelayUntil(&last_wake, period);

        /* Snapshot and clear inside a critical section so that a check-in
           arriving between the read and the clear is not lost */
        taskENTER_CRITICAL();
        uint32_t flags = task_alive_flags;
        task_alive_flags = 0;            /* clear for next period */
        taskEXIT_CRITICAL();

        if ((flags & ALL_TASKS) == ALL_TASKS) {
            HAL_IWDG_Refresh(&hiwdg);
        }
        /* If not all tasks checked in, don't kick: watchdog expires and resets */
    }
}

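Each application task adds task_alive_flags |= TASK_X_BIT; at the point in its loop that represents successful completion of one work cycle. Because |= is a read-modify-write on a shared variable, wrap it in a critical section (taskENTER_CRITICAL() / taskEXIT_CRITICAL()) so that a context switch in the middle of the update cannot lose another task's bit.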
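
An alternative to a plain shared variable is to use C11 atomics for the check-in flags, so that setting a bit and the read-and-clear step are each a single indivisible operation. The sketch below is a swapped-in variant of the pattern, not the code shown above. It assumes a toolchain with <stdatomic.h> support and lock-free 32-bit atomics on the target, and task_check_in and all_tasks_checked_in are illustrative names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TASK_A_BIT  (1u << 0)
#define TASK_B_BIT  (1u << 1)
#define ALL_TASKS   (TASK_A_BIT | TASK_B_BIT)

static atomic_uint_fast32_t alive_flags;   /* static storage: starts at zero */

/* Called by each application task after one successful work cycle. */
static void task_check_in(uint32_t task_bit)
{
    assert((task_bit & ~ALL_TASKS) == 0u);  /* only known task bits allowed */
    atomic_fetch_or(&alive_flags, task_bit);
}

/* Called by the watchdog task once per period. Reads and clears the
   flags in one atomic step, so no check-in can fall between the two. */
static bool all_tasks_checked_in(void)
{
    uint_fast32_t seen = atomic_exchange(&alive_flags, 0u);
    return (seen & ALL_TASKS) == ALL_TASKS;
}
```

The watchdog task would then kick the hardware watchdog only when all_tasks_checked_in() returns true, exactly as in the flag-based version.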

This pattern is often called a "software watchdog task" and is a widely used approach for RTOS-based safety monitoring. For more on task structure and scheduling, see bare-metal vs RTOS: which should you use?

Watchdog on Reset: Reading the Reset Cause

After any reset — watchdog, power-on, software, external — the MCU's reset status register records the cause. On STM32, this is the RCC_CSR register. Reading and clearing it at startup allows firmware to distinguish a watchdog reset from a normal power-on reset:

if (__HAL_RCC_GET_FLAG(RCC_FLAG_IWDGRST)) {
    /* Unexpected reset — log, enter safe mode, or signal host */
    log_watchdog_reset_event();
}
__HAL_RCC_CLEAR_RESET_FLAGS();

This is useful for field diagnostics: a product that resets unexpectedly can log the cause and report it on the next communications session, even if the reset itself looked like a clean restart from the outside.
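
For logging, it can help to turn the raw register value into a single reset cause. The sketch below decodes a copy of RCC_CSR using the STM32F4 flag bit positions as listed in RM0090 (verify them for other families). The enum, decode_reset_cause, and reset_cause_name are hypothetical helpers, and the priority order is a design assumption: several flags can be set at once (the pin reset flag in particular tends to accompany other causes), so the most specific causes are checked first.

```c
#include <assert.h>
#include <stdint.h>

/* RCC_CSR reset flag bit positions on STM32F4 (per RM0090). */
#define CSR_BORRSTF   (UINT32_C(1) << 25)
#define CSR_PINRSTF   (UINT32_C(1) << 26)
#define CSR_PORRSTF   (UINT32_C(1) << 27)
#define CSR_SFTRSTF   (UINT32_C(1) << 28)
#define CSR_IWDGRSTF  (UINT32_C(1) << 29)
#define CSR_WWDGRSTF  (UINT32_C(1) << 30)
#define CSR_LPWRRSTF  (UINT32_C(1) << 31)

typedef enum {
    RESET_UNKNOWN, RESET_POWER_ON, RESET_BROWN_OUT, RESET_PIN,
    RESET_SOFTWARE, RESET_IWDG, RESET_WWDG, RESET_LOW_POWER
} reset_cause_t;

/* Decode a saved copy of RCC_CSR, most specific cause first. */
static reset_cause_t decode_reset_cause(uint32_t csr)
{
    if (csr & CSR_LPWRRSTF) return RESET_LOW_POWER;
    if (csr & CSR_WWDGRSTF) return RESET_WWDG;
    if (csr & CSR_IWDGRSTF) return RESET_IWDG;
    if (csr & CSR_SFTRSTF)  return RESET_SOFTWARE;
    if (csr & CSR_PORRSTF)  return RESET_POWER_ON;
    if (csr & CSR_BORRSTF)  return RESET_BROWN_OUT;
    if (csr & CSR_PINRSTF)  return RESET_PIN;
    return RESET_UNKNOWN;
}

/* Short name for log output; order matches reset_cause_t. */
static const char *reset_cause_name(reset_cause_t cause)
{
    static const char *const names[] = {
        "UNKNOWN", "POWER_ON", "BROWN_OUT", "PIN",
        "SOFTWARE", "IWDG", "WWDG", "LOW_POWER"
    };
    assert((unsigned)cause <= (unsigned)RESET_LOW_POWER);
    return names[cause];
}
```

On target, the value passed in would be a copy of RCC->CSR taken before the flags are cleared.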

In safe OTA firmware update designs, the new firmware image must confirm itself before the watchdog expires — a failure to confirm causes a watchdog reset and the bootloader rolls back to the previous image, making the reset-cause flag an essential diagnostic when troubleshooting update failures.

For a complete diagnostic checklist covering watchdog, brown-out, hard fault, and power-glitch reset causes on STM32, see why does my STM32 keep resetting unexpectedly?

For firmware architecture that incorporates watchdog design from the start, Zeus Design's embedded firmware team delivers fault-tolerant firmware with systematic watchdog integration.

Design Considerations

  • Configure the watchdog timeout to be longer than the worst-case legitimate interval between kicks, with margin. The kick must arrive within the timeout during normal operation. If the main loop legitimately takes up to 800 ms under maximum load, configure a nominal timeout of roughly 1.5–2 seconds (on the IWDG, a fast LSI can shorten the real timeout to about two-thirds of nominal), but not so long that a real hang takes minutes to recover from.
  • Account for LSI oscillator tolerance on IWDG. The STM32F4 LSI varies across process, temperature, and voltage. With the LSI anywhere between roughly 17 and 47 kHz, a 1-second nominal timeout (calculated at 32 kHz) could actually range from approximately 0.68 to 1.88 seconds. Design the kick interval (e.g. 500 ms) so that even the shortest possible timeout is not missed, and the longest possible timeout is acceptable for the application's recovery latency requirement.
  • Enable the debug freeze for IWDG during development. In STM32, the DBGMCU_APB1_FZ register has an IWDG_STOP bit. Setting this bit pauses the IWDG counter when the CPU is halted at a breakpoint. Without this, the watchdog expires and resets the MCU within seconds of hitting any breakpoint. STM32CubeIDE enables this automatically for debug configurations; verify it is set if using a custom toolchain.
  • For safety-critical applications (IEC 61508, ISO 26262), a single watchdog is often insufficient. Standards typically require diverse redundancy — two independent watchdog timers, or a watchdog combined with a dedicated external voltage supervisor/watchdog IC, configured so that either can reset the system. Consult the applicable functional safety standard for specific requirements.
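
The LSI tolerance check described above is easy to automate. The sketch below assumes the 17–47 kHz LSI range quoted earlier for STM32F4 and computes the shortest and longest real timeout for a given prescaler and reload, then checks them against a worst-case kick interval and a recovery-latency limit. All names here are illustrative, and the frequency limits should be replaced with the figures for the actual device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LSI_MIN_HZ  17000u   /* slowest LSI: longest real timeout */
#define LSI_MAX_HZ  47000u   /* fastest LSI: shortest real timeout */

typedef struct {
    uint32_t shortest_ms;    /* timeout when the LSI runs fast */
    uint32_t longest_ms;     /* timeout when the LSI runs slow */
} iwdg_bounds_t;

/* Real timeout range for a prescaler/reload pair across the LSI range. */
static iwdg_bounds_t iwdg_timeout_bounds(uint32_t prescaler, uint32_t reload)
{
    assert(reload <= 4095u);
    uint64_t counts_ms = 1000ull * prescaler * (reload + 1u);
    iwdg_bounds_t b = {
        .shortest_ms = (uint32_t)(counts_ms / LSI_MAX_HZ),
        .longest_ms  = (uint32_t)(counts_ms / LSI_MIN_HZ),
    };
    return b;
}

/* True if the worst-case kick interval still beats the shortest timeout
   and the longest timeout still meets the recovery-latency requirement. */
static bool iwdg_config_is_safe(uint32_t prescaler, uint32_t reload,
                                uint32_t worst_kick_interval_ms,
                                uint32_t max_recovery_ms)
{
    iwdg_bounds_t b = iwdg_timeout_bounds(prescaler, reload);
    return worst_kick_interval_ms < b.shortest_ms &&
           b.longest_ms <= max_recovery_ms;
}
```

For the 1-second nominal configuration (prescaler 64, reload 499) the bounds come out at about 680 ms and 1882 ms, so a 500 ms kick interval passes while an 800 ms worst-case loop does not.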

Common Mistakes

  • Kicking the watchdog from an ISR. Interrupts run even when the main task is hung. An ISR-based kick means the watchdog never expires during a main-loop deadlock, the most common fault the watchdog exists to detect.
  • Kicking too early in the main loop. If the kick is at the top of the loop before the bulk of processing, the watchdog only detects hangs in the code preceding the next kick — not hangs in the long processing sections that follow. Put the kick at the end of the loop after all critical processing completes.
  • Starting the watchdog in a branch that is conditionally reached. If the IWDG start is inside an if (production_mode) block, and that branch is not taken in a particular firmware configuration, the watchdog never starts — and the MCU has no automatic recovery mechanism. Always start the watchdog unconditionally.
  • Disabling the watchdog to simplify debugging, then forgetting to re-enable it. The DBGMCU freeze bit is the correct mechanism for transparent debugging without disabling the watchdog globally. Use it instead of commenting out watchdog initialisation.
  • Setting the timeout too long. A watchdog with a 30-second timeout means a hung product waits up to 30 seconds before recovering. For products where a 30-second outage is unacceptable, configure a shorter timeout and use the RTOS task monitoring pattern to ensure the kick is always timely.

Frequently Asked Questions

Can I kick the watchdog from an interrupt handler?
It is possible but generally wrong. An ISR can run even when a task-level bug has caused the firmware to hang in a loop — if you kick the watchdog from the ISR, the watchdog never expires and the hang goes undetected. Kick the watchdog only from the main application loop or a dedicated watchdog task at the same level where the fault you want to detect would occur. In RTOS firmware, each task should set a flag when it completes its work cycle; a low-priority watchdog task checks all flags and only kicks the hardware watchdog when every task has checked in.
Should I enable the watchdog during development?
Yes, and as early as possible. A common mistake is to add the watchdog only late in development. Enabling it from the start means any firmware hang or timing regression that prevents the kick is caught during development — when it's easy to debug — rather than in a deployed product where a reset loop is a field failure. During debugging over JTAG/SWD, most debuggers pause the watchdog counter when the MCU is halted at a breakpoint (this is configurable via the Debug freeze bit in the DBGMCU register on STM32 — see the reference manual). Verify that this freeze is enabled in your debug configuration.
What is the difference between a watchdog reset and a power-on reset?
Most MCUs set a flag in a reset status register that distinguishes the cause of the last reset: power-on reset (POR), external NRST pin, software reset (SYSRESETREQ), watchdog reset (IWDG or WWDG), and others. On STM32 these flags are in the RCC_CSR register. Reading this register at startup allows firmware to take different actions depending on why it restarted, for example logging a watchdog reset event, entering a safe mode, or notifying the host system that an unexpected reset occurred. Always read and clear the reset flags at the start of main() to capture the reason for the most recent restart.
