ESP32 keeps dropping Wi-Fi after 20–30 minutes in deployed location — reconnect loop doesn't always recover
Asked by stale_biscuit_03
Having a frustrating one. Built an ESP32 environmental monitor (SHT40 temp/humidity, reports to an MQTT broker every 5 minutes). Works flawlessly on my desk and at home. Installed it in a shed about 150m from the house router a week ago and it keeps dropping off after 20–30 minutes. Sometimes reconnects, sometimes just hangs until I power-cycle it.
Using ESP-IDF v5.1. My reconnection code in the event handler is:
case WIFI_EVENT_STA_DISCONNECTED:
    ESP_LOGI(TAG, "disconnected, reconnecting...");
    esp_wifi_connect();
    break;
Signal's around -65 to -70 dBm measured at the install location, which I thought was decent enough. I checked the Wi-Fi setup article but it focuses on the initial connection setup — I can't figure out why stable lab testing doesn't translate once it's deployed.
Does this pattern (works in testing, dies in the field) point to anything obvious?
3 Replies
Classic production deployment pattern. Three things to look at, roughly in probability order for your description.
1. Power save mode — most likely cause
ESP-IDF's default Wi-Fi power save mode is WIFI_PS_MIN_MODEM. In this mode,
the Wi-Fi modem sleeps between DTIM beacon intervals to save power. If the ESP32
misses enough consecutive beacons — because of timing jitter, RF interference, or
the AP's DTIM interval being longer than expected — the driver declares a beacon
timeout and drops the connection. Your home/lab AP probably has a 100ms beacon
interval with DTIM 1; the router at the shed might have different settings, or the
greater distance makes beacon reception less reliable.
For a sensor sending every 5 minutes, power save on the radio is almost certainly not something you need. Disable it:
/* call this after esp_wifi_start(), before or after connecting */
esp_wifi_set_ps(WIFI_PS_NONE);
This is the single most common fix for intermittent production disconnections on ESP32 devices that aren't battery-constrained.
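To put rough numbers on the beacon timing described above: beacon intervals are expressed in time units (TU) of 1024 µs, and a station that listens at every DTIM wakes up roughly once per beacon interval times DTIM period. A minimal sketch of that arithmetic follows; the name dtim_wake_interval_us is made up for illustration and is not an ESP-IDF function.

```c
#include <stdint.h>

/* One 802.11 time unit (TU) is 1024 microseconds. */
#define TU_US 1024u

/*
 * Hypothetical helper (not an ESP-IDF API): approximate gap between
 * modem wake-ups when the station listens at every DTIM beacon.
 */
uint32_t dtim_wake_interval_us(uint32_t beacon_interval_tu,
                               uint32_t dtim_period)
{
    return beacon_interval_tu * TU_US * dtim_period;
}
```

With a 100 TU beacon interval, DTIM 1 works out to about 102 ms between wake-ups and DTIM 3 to about 307 ms, so a longer DTIM period means fewer listening opportunities in any given window.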
2. Log the disconnect reason
Before changing anything, add the reason code to your log:
case WIFI_EVENT_STA_DISCONNECTED: {
    wifi_event_sta_disconnected_t *evt =
        (wifi_event_sta_disconnected_t *)event_data;
    ESP_LOGW(TAG, "disconnected, reason=%d", evt->reason);
    esp_wifi_connect();
    break;
}
Key reason codes: 200 (WIFI_REASON_BEACON_TIMEOUT) means missed beacons, which
points to power save or signal. Code 2 (WIFI_REASON_AUTH_EXPIRE) means the AP's
authentication timer expired, an AP-side keepalive issue. Code 8
(WIFI_REASON_ASSOC_LEAVE) is a clean disassociation because a station is leaving
the BSS; on ESP32 it most often shows up when your own code calls
esp_wifi_disconnect() or esp_wifi_stop(), so it is not necessarily the AP kicking
you. Knowing which one you're hitting tells you where to focus.
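If you'd rather not decode numbers by eye in the field logs, a small lookup can sit next to the log call. This is only a sketch: disconnect_hint is a hypothetical helper, not part of ESP-IDF, and the numeric values are the ones quoted in this thread, so check them against the headers for your IDF version.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of ESP-IDF): map a disconnect reason
 * code to a short hint for the log. Covers only the codes discussed
 * above; everything else falls through to the default.
 */
const char *disconnect_hint(uint8_t reason)
{
    switch (reason) {
    case 2:   return "auth expired: look at AP-side timers";
    case 8:   return "assoc leave: clean disassociation";
    case 200: return "beacon timeout: power save or signal";
    default:  return "other: look it up in esp_wifi_types.h";
    }
}
```

In the handler you could then log both values, for example ESP_LOGW(TAG, "disconnected, reason=%d (%s)", evt->reason, disconnect_hint(evt->reason)).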
3. Reconnection robustness
Calling esp_wifi_connect() directly in the disconnect handler is fine for basic
cases, but it needs backoff and a retry limit. Without backoff, failed reconnect
attempts can pile up and some APs will rate-limit or temporarily block a client
that keeps hammering auth requests.
More robust pattern: use a retry counter and, after a few consecutive failures, do a full driver reset:
#define MAX_RETRY 5   /* example value, tune for your deployment */
static int s_retry = 0;

case WIFI_EVENT_STA_DISCONNECTED:
    if (s_retry < MAX_RETRY) {
        esp_wifi_connect();
        s_retry++;
    } else {
        /* full stack reset clears driver state more thoroughly */
        esp_wifi_stop();
        esp_wifi_start();
        /* reconnects only if your WIFI_EVENT_STA_START case calls esp_wifi_connect() */
        s_retry = 0;
    }
    break;
case WIFI_EVENT_STA_CONNECTED:
    s_retry = 0;
    break;
Start with esp_wifi_set_ps(WIFI_PS_NONE) — that alone fixes this for most
people. If it doesn't, the reason code will tell you the next step.
To expand on the reason codes: the numeric values come from two sources. Codes
1–45 are 802.11 standard deauthentication/disassociation reason codes defined
in the IEEE 802.11 spec. Codes 200+ are ESP-IDF additions — WIFI_REASON_BEACON_TIMEOUT
(200) and WIFI_REASON_NO_AP_FOUND (201) being the most common non-standard ones
you'll see in practice. They're all defined in esp_wifi_types.h in the ESP-IDF
headers if you want the full list.
One thing worth knowing about calling esp_wifi_connect() inside the event handler:
with the usual esp_event_handler_register() setup, the handler runs in the context
of the default event loop task (not the Wi-Fi task). The call is fine in simple
cases, but anything that blocks there delays every other event. If you have complex
reconnection logic (exponential backoff, MQTT reconnection sequencing, NVS reads)
it's cleaner to use a FreeRTOS event group and handle reconnection from your main
task or a dedicated network management task.
Something like:
/* in event handler */
case WIFI_EVENT_STA_DISCONNECTED:
    xEventGroupClearBits(s_wifi_event_group, WIFI_CONNECTED_BIT);
    xEventGroupSetBits(s_wifi_event_group, WIFI_DISCONNECTED_BIT);
    break;

/* in your main/network task; backoff_ms is whatever delay you choose */
EventBits_t bits = xEventGroupWaitBits(s_wifi_event_group,
                                       WIFI_DISCONNECTED_BIT,
                                       pdTRUE,   /* clear the bit on exit */
                                       pdFALSE,  /* any one bit is enough */
                                       portMAX_DELAY);
if (bits & WIFI_DISCONNECTED_BIT) {
    vTaskDelay(pdMS_TO_TICKS(backoff_ms));
    esp_wifi_connect();
}
This separates the event detection from the reconnect logic and makes the backoff
timing straightforward. The ESP-IDF wifi_station example in
examples/wifi/getting_started/station/ uses an event group to signal the connection
result to app_main (it still calls esp_wifi_connect() from the handler), so it's a
reasonable reference for the plumbing.
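For the backoff_ms value in the snippet above, one common choice is exponential backoff with a ceiling. A rough sketch, assuming a 1 s starting delay and a 60 s cap (both numbers are placeholders, and reconnect_backoff_ms is a hypothetical name):

```c
#include <stdint.h>

#define BACKOFF_BASE_MS 1000u    /* assumed first delay */
#define BACKOFF_MAX_MS  60000u   /* assumed ceiling */

/*
 * Hypothetical helper: delay before reconnect attempt number `attempt`
 * (0-based). Doubles on each attempt and is capped at BACKOFF_MAX_MS.
 */
uint32_t reconnect_backoff_ms(unsigned attempt)
{
    uint32_t delay = BACKOFF_BASE_MS;
    /* Stop doubling once the cap is reached, so large counts cannot overflow. */
    while (attempt-- > 0 && delay < BACKOFF_MAX_MS) {
        delay *= 2;
    }
    return delay > BACKOFF_MAX_MS ? BACKOFF_MAX_MS : delay;
}
```

Keep an attempt counter in the network task, pass it in before each vTaskDelay(), and reset it to zero once the connection is back up.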
Also: esp_wifi_set_ps(WIFI_PS_NONE) should definitely be your first test. In
ESP-IDF 5.x it can be set before esp_wifi_start() if you call it after
esp_wifi_init() and esp_wifi_set_mode().
One more angle if the power save fix doesn't resolve it: the AP itself.
Some consumer routers have a client idle timeout — they deauthenticate stations that haven't exchanged data for a period (anything from 5 minutes to 30 minutes depending on firmware). In the lab on your home AP this might not kick in because you're doing other network activity that keeps the AP's client table warm. In the shed it's the only device on that AP, sending only every 5 minutes. If the AP's idle timer is shorter than your reporting interval, it'll disconnect you between transmissions.
You can distinguish this from power save issues by looking at the reason code: AP-side
deauth for idle timeout usually shows up as reason 2 (WIFI_REASON_AUTH_EXPIRE) or
reason 4 (WIFI_REASON_ASSOC_EXPIRE, disassociated due to inactivity), not 200
(beacon timeout).
Quick test to confirm: swap the router for a mobile hotspot temporarily. Phone hotspots generally don't have aggressive idle timeouts, so if the device stays connected for hours on the hotspot but not the router, you're hitting the AP's keepalive behaviour rather than a power save or signal problem. From there you can look at the router's settings (if it exposes them) or set your MQTT keepalive interval shorter than the AP's idle threshold so there is regular traffic to keep the association alive.
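If you do end up tuning the keepalive against a known or guessed idle timeout, the arithmetic is simple enough to pin down in code. The sketch below assumes a safety margin of half the AP's idle timeout, which is a rule of thumb rather than anything the router or ESP-IDF specifies; pick_keepalive_s is a hypothetical helper.

```c
#include <stdint.h>

/*
 * Hypothetical helper: choose a keepalive period (seconds) that stays
 * well inside the AP's idle timeout by using half of it (assumed
 * margin), and never longer than the reporting interval.
 */
uint32_t pick_keepalive_s(uint32_t ap_idle_timeout_s,
                          uint32_t report_interval_s)
{
    uint32_t keepalive = ap_idle_timeout_s / 2;
    if (keepalive > report_interval_s) {
        keepalive = report_interval_s;
    }
    /* Never return 0, which MQTT treats as "keepalive disabled". */
    return keepalive > 0 ? keepalive : 1;
}
```

For a 5-minute idle timeout and a 5-minute reporting interval this gives 150 s, which you would then put in the keepalive field of your MQTT client config.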
At -65 to -70 dBm the signal is usable rather than great, but it's generally well
above the level where beacons start getting dropped, so signal being the primary cause
is less likely than power save or AP behaviour at that RSSI. Still worth calling
esp_wifi_sta_get_ap_info() periodically and logging the RSSI, just to confirm
it's not worse than expected at certain times of day.
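For the periodic RSSI logging, a running min/max/mean is usually more telling than individual samples. A minimal sketch, assuming you feed it the rssi field of the wifi_ap_record_t that esp_wifi_sta_get_ap_info() fills in; the rssi_stats_t type and its functions are hypothetical, not ESP-IDF APIs.

```c
#include <stdint.h>

/* Hypothetical running summary of RSSI samples, in dBm. */
typedef struct {
    int8_t   min;
    int8_t   max;
    int32_t  sum;
    uint32_t count;
} rssi_stats_t;

/* Fold one sample into the summary; the first sample seeds min and max. */
void rssi_stats_add(rssi_stats_t *s, int8_t rssi)
{
    if (s->count == 0 || rssi < s->min) s->min = rssi;
    if (s->count == 0 || rssi > s->max) s->max = rssi;
    s->sum += rssi;
    s->count++;
}

/* Mean RSSI, or 0 if no samples have been added yet. */
int32_t rssi_stats_mean(const rssi_stats_t *s)
{
    return s->count ? s->sum / (int32_t)s->count : 0;
}
```

Zero-initialise the struct, add a sample every minute or so, and publish min/max/mean alongside each sensor report so you can see whether the link degrades at particular times.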
The deep sleep and power management article
has a useful breakdown of the five ESP32 power modes and their modem behaviour, which
is worth reading alongside this to understand the WIFI_PS_NONE vs
WIFI_PS_MIN_MODEM vs WIFI_PS_MAX_MODEM trade-offs.