How Do You Configure STM32 Peripherals with HAL and CubeMX?

Last updated 26 June 2026 · 8 min read

Direct Answer

STM32 peripheral configuration uses two layers: STM32CubeMX, a GUI tool for assigning pins, setting clock dividers, and configuring peripheral parameters; and the HAL (Hardware Abstraction Layer), a C library that the CubeMX-generated initialisation code is built on and that your application calls to send and receive data. CubeMX generates the MX_USARTx_UART_Init(), MX_SPIx_Init(), and MX_I2Cx_Init() functions; your code then calls HAL_UART_Transmit(), HAL_SPI_Transmit(), and HAL_I2C_Master_Transmit() to move data.

Detailed Explanation

STM32 peripheral configuration involves two tools that work together: STM32CubeMX for visual pin assignment and peripheral parameter setup, and the HAL (Hardware Abstraction Layer) C library that provides the runtime API for your firmware to use those peripherals. CubeMX generates the initialisation code; HAL provides the functions your application calls.

Understanding the clock tree is a prerequisite for correct peripheral configuration — peripheral bus clock speeds (APB1, APB2) determine the achievable baud rates, SPI clock dividers, and I2C timing parameters. See how does the STM32 clock tree work? before configuring peripherals. Available peripherals also vary between STM32 families — if you haven't finalised your device choice, see Which STM32 Family Should You Use? first.

How STM32CubeMX Works

CubeMX is a graphical tool (free, from ST) that lets you:

  1. Select your STM32 device or board.
  2. Assign peripheral functions to physical pins via a pinout diagram.
  3. Configure peripheral parameters (baud rate, word length, phase/polarity, clock speed, I2C timing) through parameter panels.
  4. Generate a complete project with initialisation code in C, structured for the STM32 HAL library, in your chosen IDE format (Keil MDK, IAR, STM32CubeIDE, Makefile).

The generated code includes a main.c with an MX_ initialisation function per enabled peripheral (e.g. MX_USART2_UART_Init()), called in sequence from main() before the application loop. You add your application code in the designated USER CODE sections, which CubeMX preserves across re-generations.

Re-generating from CubeMX does not overwrite your application code in USER CODE BEGIN ... USER CODE END blocks. However, it will overwrite any modifications made to generated code outside those blocks. Keep all application logic inside the protected zones.

Configuring UART

CubeMX steps:

  1. In the Pinout view, click a USART/UART-capable pin (e.g. PA2 for USART2 TX) and select USART2_TX. CubeMX automatically assigns the complementary pin (PA3 for USART2 RX).
  2. In the Connectivity → USART2 panel, set Mode to Asynchronous.
  3. Set Baud Rate (e.g. 115200), Word Length (8 bits), Parity (None), Stop Bits (1).
  4. Generate code.

Generated initialisation:

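static void MX_USART2_UART_Init(void)
{
  huart2.Instance = USART2;
  huart2.Init.BaudRate = 115200;
  huart2.Init.WordLength = UART_WORDLENGTH_8B;
  huart2.Init.StopBits = UART_STOPBITS_1;
  huart2.Init.Parity = UART_PARITY_NONE;
  huart2.Init.Mode = UART_MODE_TX_RX;
  huart2.Init.HwFlowCtl = UART_HWCONTROL_NONE;      /* no RTS/CTS */
  huart2.Init.OverSampling = UART_OVERSAMPLING_16;
  /* Newer families set further fields here (e.g. OneBitSampling). */
  if (HAL_UART_Init(&huart2) != HAL_OK)
  {
    Error_Handler();  /* generated in main.c; add your own handling there */
  }
}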

Application usage (blocking transmit):

uint8_t msg[] = "Hello\r\n";
HAL_UART_Transmit(&huart2, msg, sizeof(msg) - 1, HAL_MAX_DELAY);

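Baud rate accuracy: the UART baud rate is derived by dividing the peripheral's clock, which by default is the clock of the bus the instance sits on (for example PCLK1 for USART2 and PCLK2 for USART1 on STM32F4 and G4; STM32G0 has a single PCLK). CubeMX shows the achieved baud rate and the error percentage. For 115200 baud from a 72 MHz clock the divider is exactly 625, so the error is zero; at other typical bus clocks it is usually a small fraction of a percent. That is comfortably inside the mismatch an asynchronous receiver can tolerate (there is no single UART standard, but a total of a few percent between transmitter and receiver is a common rule of thumb). See what is UART? for how UART framing and baud rate work.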
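
The divider arithmetic behind that error figure can be checked on a host PC. The following is a minimal sketch, assuming 16x oversampling and a divider equal to the nearest integer to the peripheral clock divided by the baud rate; the function names are hypothetical helpers, not part of the HAL.

```c
#include <stdint.h>

/* Divider for 16x oversampling: nearest integer to clk_hz / baud.
 * (Assumption: this matches how the baud rate register is derived.) */
uint32_t uart_divider(uint32_t clk_hz, uint32_t baud)
{
    return (clk_hz + baud / 2u) / baud;
}

/* Baud rate actually produced by that integer divider. */
double uart_actual_baud(uint32_t clk_hz, uint32_t baud)
{
    return (double)clk_hz / (double)uart_divider(clk_hz, baud);
}

/* Signed baud rate error in percent (negative means slower than requested). */
double uart_baud_error_pct(uint32_t clk_hz, uint32_t baud)
{
    return (uart_actual_baud(clk_hz, baud) - (double)baud) * 100.0 / (double)baud;
}
```

With a 72 MHz clock and 115200 baud the divider is 625 and the error is zero; with a 42 MHz clock the divider is 365 and the error is about -0.11%.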

Configuring SPI

CubeMX steps:

  1. Click the SPI-capable pins (e.g. PA5/SCK, PA6/MISO, PA7/MOSI for SPI1) and assign SPI1 functions.
  2. For a single peripheral with a chip-select line, configure an ordinary GPIO output as the CS pin rather than using the hardware NSS signal; driving CS from firmware is usually simpler than hardware NSS management.
  3. In Connectivity → SPI1, set Mode to Full-Duplex Master.
  4. Set Prescaler (SPI clock = PCLK2 / prescaler), CPOL, CPHA, and Data Size (8 or 16 bit).

The prescaler determines the SPI clock frequency. For a peripheral specified to run at up to 10 MHz with a 72 MHz PCLK2, choose a prescaler of 8 (72 / 8 = 9 MHz — within spec) rather than 4 (18 MHz — exceeds spec).
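
The same selection can be expressed as a small helper. This is a sketch only, assuming the usual power-of-two prescaler range of 2 to 256; spi_pick_prescaler() is a hypothetical name, and mapping the result to a HAL prescaler constant is left to the caller.

```c
#include <stdint.h>

/* Returns the smallest power-of-two prescaler (2..256) for which
 * pclk_hz / prescaler does not exceed max_sck_hz.
 * Returns 0 if even a prescaler of 256 is still too fast. */
uint32_t spi_pick_prescaler(uint32_t pclk_hz, uint32_t max_sck_hz)
{
    for (uint32_t presc = 2u; presc <= 256u; presc *= 2u) {
        /* Compare in 64 bits so the product cannot overflow. */
        if ((uint64_t)max_sck_hz * presc >= pclk_hz) {
            return presc;
        }
    }
    return 0u;
}
```

For a 72 MHz bus clock and a 10 MHz limit this returns 8 (9 MHz); for an 84 MHz bus clock it returns 16 (5.25 MHz), because 84 / 8 = 10.5 MHz would exceed the limit.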

CPOL and CPHA: these set the clock idle state and the capture edge. Mode 0 (CPOL=0, CPHA=0) is the most common SPI mode — clock idles low, data captured on the rising edge. Always check the peripheral's datasheet SPI timing diagram. See what is SPI? for the full explanation of SPI modes and wiring, or the SPI CPOL/CPHA troubleshooting discussion for a real-world diagnosis of the wrong mode setting causing all reads to return 0xFF.
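
Datasheets often quote only a mode number. As an illustration, the conventional mapping from mode number to CPOL, CPHA, and sampling edge can be written out as below; the type and function names are hypothetical and exist only for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t cpol;           /* 0: clock idles low, 1: clock idles high */
    uint8_t cpha;           /* 0: sample on first edge, 1: sample on second edge */
    bool sample_on_rising;  /* derived: true when data is captured on a rising edge */
} spi_mode_t;

/* Decode SPI mode 0..3 using the conventional numbering mode = (CPOL << 1) | CPHA. */
spi_mode_t spi_mode_decode(uint8_t mode)
{
    spi_mode_t m;
    m.cpol = (uint8_t)((mode >> 1) & 1u);
    m.cpha = (uint8_t)(mode & 1u);
    /* Idle low + first edge, or idle high + second edge, are both rising edges. */
    m.sample_on_rising = (m.cpol == m.cpha);
    return m;
}
```

Modes 0 and 3 both capture on the rising edge and differ only in the clock idle level, which is why some peripherals accept either.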

Application usage (blocking, with software CS):

HAL_GPIO_WritePin(SPI1_CS_GPIO_Port, SPI1_CS_Pin, GPIO_PIN_RESET); // assert CS
HAL_SPI_TransmitReceive(&hspi1, tx_buf, rx_buf, length, HAL_MAX_DELAY);
HAL_GPIO_WritePin(SPI1_CS_GPIO_Port, SPI1_CS_Pin, GPIO_PIN_SET);   // deassert CS

Configuring I2C

CubeMX steps:

  1. Click the I2C-capable pins (e.g. PB6/SCL, PB7/SDA for I2C1) and assign I2C1 functions.
  2. In Connectivity → I2C1, set Mode to I2C.
  3. Set Speed Mode: Standard Mode (100 kHz) or Fast Mode (400 kHz). For STM32 devices that support it, Fast Mode Plus (1 MHz) is also available.
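  4. On families with the newer I2C peripheral (e.g. STM32F0, F3, F7, G0, G4, L4, H7), CubeMX calculates the I2C timing register value (TIMINGR) from the selected speed and the peripheral clock, so there is no need to calculate this register manually. On older families such as STM32F1 and F4 there is no TIMINGR; the HAL takes the bus speed directly as ClockSpeed.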

Application usage (blocking master transmit to address 0x48):

uint8_t data[] = {0x01, 0xC3}; // register address + value
HAL_I2C_Master_Transmit(&hi2c1, (0x48 << 1), data, 2, HAL_MAX_DELAY);

The I2C address is left-shifted by 1 in the HAL: HAL_I2C_Master_Transmit and HAL_I2C_Master_Receive take a 7-bit address in bits [7:1], with bit 0 as the R/W flag set by the HAL internally. See what is I2C? for address format, pull-up resistor sizing, and the ACK/NACK signalling that troubleshooting relies on.
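
To make the shift explicit, the conversion can be wrapped in helpers like the ones below. This is a sketch with hypothetical function names; only the shift itself comes from the text above.

```c
#include <stdint.h>

/* Convert a 7-bit device address to the left-shifted form the HAL expects. */
uint16_t i2c_hal_addr(uint8_t addr7)
{
    return (uint16_t)((addr7 & 0x7Fu) << 1);
}

/* Address byte as it appears on the bus: 7-bit address plus the R/W bit
 * (0 = write, 1 = read). Useful when comparing against a logic analyser. */
uint8_t i2c_wire_byte(uint8_t addr7, int is_read)
{
    return (uint8_t)(((addr7 & 0x7Fu) << 1) | (is_read ? 1u : 0u));
}
```

For a device at 0x48, i2c_hal_addr() yields 0x90, and the bus shows 0x90 for a write and 0x91 for a read.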

DMA for High-Throughput Transfers

For SPI peripherals driving displays (240×320 px at 30 fps is about 36.9 Mbit/s of pixel data, assuming 16 bits per pixel) or UART at high baud rates, blocking HAL calls waste CPU cycles. Enable DMA in CubeMX:

  1. In the peripheral's parameter panel, go to the DMA Settings tab.
  2. Add a DMA request for Tx (and Rx if bidirectional). CubeMX proposes a DMA stream and channel automatically (or just a channel on families whose DMA controller has no streams).
  3. In NVIC Settings, enable the DMA stream interrupt.

The DMA transfer is then initiated with HAL_SPI_Transmit_DMA() or HAL_UART_Transmit_DMA(), and completion is signalled via HAL_SPI_TxCpltCallback() or HAL_UART_TxCpltCallback() called from the DMA interrupt handler — which CubeMX also wires up in the generated stm32xxxx_it.c file. For a deep-dive into DMA stream/channel architecture, Normal vs Circular mode, cache coherency on STM32H7, and production pitfalls, see how to configure STM32 HAL DMA for UART, SPI, and ADC.
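
When deciding whether a transfer justifies DMA, it helps to estimate the raw data rate first. The helper below is a minimal sketch for the display case, assuming uncompressed full-frame updates; the function name and the 16 bits per pixel used in the example (RGB565) are assumptions, not figures taken from a specific display.

```c
#include <stdint.h>

/* Raw pixel data rate in bits per second for full-frame refreshes.
 * Ignores command bytes and any gaps between transfers. */
uint64_t display_bits_per_second(uint32_t width, uint32_t height,
                                 uint32_t bits_per_pixel, uint32_t fps)
{
    return (uint64_t)width * height * bits_per_pixel * fps;
}
```

A 240×320 panel at 16 bits per pixel and 30 fps needs 36,864,000 bit/s, so the SPI clock has to be faster than that before protocol overhead is even considered.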

Design Considerations

  • Clock tree first: peripheral baud rates and clock dividers are derived from APB1/APB2 clock speeds, which depend on the clock tree configuration. Configure the clock tree in CubeMX before finalising peripheral settings, or re-check all peripheral speeds after any clock change. See how does the STM32 clock tree work?.
  • Pin remapping and alternate functions: STM32 peripherals can usually be mapped to multiple pins via alternate function (AF) numbers. CubeMX handles this automatically, but verify the assigned AF number matches your PCB layout — wiring SPI1 to PA5/PA6/PA7 requires AF5 on STM32F4 series, and the generated GPIO initialisation code must reflect this.
  • Interrupt priority: if the HAL is used alongside an RTOS, any interrupt whose handler calls RTOS API functions must not be more urgent than the kernel's syscall ceiling. On Cortex-M a larger priority number means lower urgency, so for FreeRTOS with configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY = 5, those HAL peripheral interrupts must be set to priority 5 or numerically higher in the NVIC settings. CubeMX does not enforce this automatically.
  • I2C pull-up resistors: I2C requires external pull-up resistors on both SDA and SCL. The HAL does not configure these; they must be present on the PCB. For 400 kHz with a 3.3 V supply, a 4.7 kΩ pull-up is typically adequate up to roughly 75 pF of bus capacitance (the Fast-mode rise-time limit is 300 ns); higher capacitance or longer buses may require 2.2 kΩ or stronger pull-ups.
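
The pull-up sizing in the last bullet can be checked against the rise-time limit. The sketch below assumes the standard formulas from the I2C-bus specification, Rp(max) = t_r / (0.8473 × Cb) and Rp(min) = (VDD − VOL(max)) / IOL; the function names are hypothetical, and the example values (300 ns, 0.4 V, 3 mA) are the usual Fast-mode figures and should be confirmed against the device datasheets.

```c
#include <stdint.h>

/* Largest pull-up that still meets the rise-time limit.
 * rise_time_s: maximum allowed rise time in seconds
 * bus_cap_f:   total bus capacitance in farads */
double i2c_pullup_max_ohms(double rise_time_s, double bus_cap_f)
{
    return rise_time_s / (0.8473 * bus_cap_f);
}

/* Smallest pull-up the devices can still pull low.
 * vdd:     supply voltage in volts
 * vol_max: maximum low-level output voltage in volts
 * iol_a:   sink current at vol_max in amperes */
double i2c_pullup_min_ohms(double vdd, double vol_max, double iol_a)
{
    return (vdd - vol_max) / iol_a;
}
```

With 300 ns and 100 pF the upper limit is about 3.54 kΩ, so 4.7 kΩ is too weak at that capacitance; at 3.3 V with 0.4 V and 3 mA the lower limit is about 967 Ω.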

Common Mistakes

  • Forgetting to enable the peripheral clock in application code that bypasses CubeMX init: if peripheral initialisation is written manually rather than generated by CubeMX, forgetting the __HAL_RCC_USARTx_CLK_ENABLE() macro leaves the peripheral unclocked, and HAL calls will hang or return HAL_TIMEOUT. CubeMX-generated code handles this in the generated HAL_xxx_MspInit() callbacks (for example HAL_UART_MspInit()), which the HAL init functions call.
  • Setting SPI prescaler too low: the SPI peripheral is clocked from a bus clock (PCLK1 or PCLK2 depending on the STM32 series and the SPI instance). Selecting a prescaler of 2 when the bus clock is 84 MHz produces 42 MHz SPI — far beyond what most SPI peripherals support. Always check the peripheral's maximum SPI clock frequency in its datasheet.
  • Modifying generated code outside USER CODE blocks: any changes made to HAL initialisation code outside the CubeMX-protected USER CODE BEGIN / END sections are overwritten the next time CubeMX regenerates the project. Keep all application code inside the protected sections.
  • Using HAL_Delay() from an ISR: HAL_Delay() spins on the HAL tick counter, which by default is incremented only from the SysTick interrupt. If it is called from an ISR whose priority is equal to or higher than the SysTick priority (often 15, the lowest, in CubeMX projects), SysTick cannot preempt that ISR, the tick never advances, and HAL_Delay() blocks forever. Never call HAL_Delay() from an ISR.
  • I2C address 8-bit vs 7-bit confusion: hardware I2C addresses are 7-bit values. The HAL expects a 7-bit address shifted left by 1 (e.g. device address 0x48 → pass 0x48 << 1 = 0x90 to HAL functions). Passing the raw 7-bit address without shifting causes the HAL to address the wrong device on the bus.
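
The HAL_Delay() pitfall above, like the RTOS priority rule under Design Considerations, comes down to Cortex-M priority numbering, where a smaller number is more urgent. The two rules can be written as predicates. This is a sketch with hypothetical function names, and it assumes all priority bits are used for preemption (no sub-priority).

```c
#include <stdbool.h>
#include <stdint.h>

/* The tick interrupt can preempt a running ISR only if its priority
 * number is strictly smaller. If this returns false, HAL_Delay()
 * called from that ISR never sees the tick advance. */
bool tick_can_preempt(uint32_t tick_prio, uint32_t isr_prio)
{
    return tick_prio < isr_prio;
}

/* An ISR may call RTOS API functions only if its priority number is
 * at or above the kernel's maximum syscall priority value. */
bool isr_may_call_rtos_api(uint32_t isr_prio, uint32_t max_syscall_prio)
{
    return isr_prio >= max_syscall_prio;
}
```

With the tick at priority 15, tick_can_preempt(15, 5) is false, so a HAL_Delay() inside a priority-5 ISR would hang. With a syscall ceiling of 5, isr_may_call_rtos_api(4, 5) is false.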

Frequently Asked Questions

Should I use HAL or LL (Low-Layer) drivers for STM32?
HAL (Hardware Abstraction Layer) is higher level — easier to use and portable across STM32 families, but it adds function call overhead and sometimes performs unnecessary work. LL (Low-Layer) drivers map closely to peripheral registers, giving more control and lower overhead, but are not portable between STM32 series without code changes. For most product firmware, start with HAL — it is well-documented and its generated code from CubeMX is correct. Move to LL only for specific bottlenecks (e.g. a DMA-driven SPI at maximum speed, or a UART ISR where HAL's callbacks add unacceptable latency).

Why does CubeMX generate HAL_Delay() in the initialisation sequence?
HAL_Delay() is used in a few HAL initialisation functions so that certain peripherals (I2C bus reset, USB enumeration timing) meet hardware timing requirements. These delays rely on the SysTick timer being correctly configured, which in turn depends on the SystemCoreClock variable matching the actual HCLK frequency. HAL_RCC_ClockConfig() updates both for you. If you change the clock tree any other way after calling HAL_Init(), call SystemCoreClockUpdate() and then HAL_InitTick() to resynchronise them; otherwise HAL_Delay() durations will be wrong, which can cause intermittent initialisation failures.

What is the difference between blocking, interrupt-driven, and DMA HAL modes?
HAL provides three transfer modes for most communication peripherals. Blocking (HAL_UART_Transmit, no suffix) waits until the transfer completes or times out before returning: simple, but it wastes CPU time. Interrupt-driven (HAL_UART_Transmit_IT) starts the transfer and returns immediately; the peripheral interrupt then moves each data item, and HAL_UART_TxCpltCallback(), which you implement, is called when the transfer completes. DMA (HAL_UART_Transmit_DMA) offloads the data movement to the DMA controller, leaving the CPU free for other work, with a callback on completion. Use blocking mode for infrequent, short transfers (config writes, debug output). Use DMA for high-bandwidth or continuous transfers (SPI display updates, I2S audio, ADC streaming).
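
To judge whether a blocking call is acceptable, it can help to estimate how long the transfer occupies the CPU. The helper below is a rough sketch for UART, assuming back-to-back frames with no gaps; the function name is hypothetical, and 10 bits per frame corresponds to 8 data bits, no parity, 1 stop bit plus the start bit.

```c
#include <stdint.h>

/* Approximate time on the wire, in microseconds (rounded down),
 * for n_bytes sent as frames of bits_per_frame bits at the given baud rate. */
uint32_t uart_tx_time_us(uint32_t n_bytes, uint32_t bits_per_frame, uint32_t baud)
{
    return (uint32_t)(((uint64_t)n_bytes * bits_per_frame * 1000000u) / baud);
}
```

A 64-byte message at 115200 baud takes about 5.6 ms, which is how long a blocking transmit would hold the CPU.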
