Electronics Design AU
Thread · Solved

OpenThread joiner times out every time — `ot-ctl joiner start` just returns nothing until it fails, no error

Original Question

Asked by stale_biscuit_03

Got my OTBR running on a Pi (following the setup guide here, thanks for that btw — the Docker deployment part saved me a lot of pain). Border router is up, ot-ctl state shows leader, network is formed. Now trying to join a second device (nRF52840 dongle running the OpenThread CLI joiner sample) to the same network and it just... doesn't. joiner start <pskd> on the device sits there for a while and then prints a failure with no useful detail. No error on the border router side either.

Things I've checked: both devices are definitely powered and the dongle shows up fine over USB CDC-ACM. I copied the PSKd from the same place I generated it. Channel matches what ot-ctl channel shows on the border router. Genuinely not sure where to look next — is there a step I'm missing on the commissioner side, or is this a device-side radio/config problem?

3 Replies

zephyr_devotee
Accepted Answer

Almost certainly the commissioner side — this is the single most common OpenThread joining mistake and it produces exactly this "joiner just times out, nothing useful on either side" symptom. Forming the network and having the border router act as Leader does not automatically put it into commissioning mode. You have to explicitly start the commissioner role and register the joiner before the device's joiner start call has anything to actually authenticate against:

# On the border router (via ot-ctl):
ot-ctl commissioner start
ot-ctl commissioner joiner add <EUI64> <PSKd>

If commissioner start was never run, or the joiner entry had already expired by the time your dongle attempted to join, there is nothing on the other end to answer the joiner's DTLS handshake, which is exactly why you're seeing a silent timeout rather than an explicit rejection. Joiner entries are time-bounded, not indefinite: commissioner joiner add takes an optional timeout in seconds as a third argument, and as far as I know the CLI default is 120 seconds. A rejection would mean the commissioner did see the request and said no (wrong PSKd, EUI64 not in the allowlist); a pure timeout with zero border-router-side activity usually means the commissioner role wasn't active at all.

To confirm this is the cause:

  1. Run ot-ctl commissioner state on the border router — if it reports disabled, that's your answer.
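  2. Start the commissioner, then add the specific joiner: commissioner joiner add <EUI64> <PSKd>. You can use * in place of the EUI64 to accept any joiner with the correct PSKd during the session, which is easier for bring-up (tighten this back up for anything beyond your bench). If you pass it through ot-ctl from a shell rather than typing it at the interactive prompt, quote it as '*' so the shell doesn't expand it to filenames.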
  3. Then run joiner start <pskd> on the device — order matters, the commissioner window needs to already be open.
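
The order of operations above can be sketched as a small shell helper for the border-router side. This is only a sketch: it assumes ot-ctl is on the PATH (OT_CTL is a hypothetical override, e.g. "sudo ot-ctl" if yours needs root), and the function names and the roughly 10-second wait for the petition are illustrative choices, not anything OTBR mandates.

```shell
# Sketch only: open a joiner window on the border router.
# OT_CTL is a hypothetical override, e.g. OT_CTL="sudo ot-ctl".
OT_CTL="${OT_CTL:-ot-ctl}"

# Print the first line of `commissioner state`
# (disabled, petitioning or active), without the trailing CR.
commissioner_state() {
    $OT_CTL commissioner state | head -n 1 | tr -d '\r'
}

# Usage: open_joiner_window <EUI64 or '*'> <PSKd>
open_joiner_window() {
    eui64="$1"
    pskd="$2"

    if [ "$(commissioner_state)" != "active" ]; then
        $OT_CTL commissioner start || return 1
        # The petition takes a moment; poll for up to ~10 seconds.
        tries=0
        while [ "$(commissioner_state)" != "active" ]; do
            tries=$((tries + 1))
            if [ "$tries" -ge 10 ]; then
                echo "commissioner did not become active" >&2
                return 1
            fi
            sleep 1
        done
    fi

    # Quoting "$eui64" keeps a wildcard '*' from being glob-expanded.
    $OT_CTL commissioner joiner add "$eui64" "$pskd"
}
```

On the bench that would be open_joiner_window '*' J01NME (J01NME is a placeholder PSKd), followed straight away by ifconfig up and joiner start J01NME on the dongle's CLI, since the joiner needs its interface up before it can start.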

One more thing worth double-checking since you mentioned copying the PSKd: OpenThread PSKds are case-sensitive and have a restricted character set. The Thread spec allows 6 to 32 characters drawn from digits 0-9 and uppercase letters A-Y, excluding I, O, Q and Z (so 0 and 1 are fine; it's the look-alike letters that are left out). If you hand-typed it anywhere in the process rather than copy-pasting the exact string, that's a second common source of silent failures worth ruling out.
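
A quick way to rule out a malformed PSKd is to check it before it goes anywhere near the commissioner. The helper below is a minimal sketch of the Thread rule (6 to 32 characters, digits plus uppercase A-Y without I, O, Q, Z); the function name is made up for illustration.

```shell
# Sketch: check a PSKd against the Thread character rules.
# Allowed: 6-32 characters from 0-9 and A-Y, excluding I, O, Q and Z.
# The body runs in a subshell so the LC_ALL change does not leak out.
is_valid_pskd() (
    LC_ALL=C   # byte-wise bracket ranges, so lowercase never matches
    case "$1" in
        *[!0-9A-HJ-NPR-Y]*) exit 1 ;;   # contains a disallowed character
    esac
    [ "${#1}" -ge 6 ] && [ "${#1}" -le 32 ]
)
```

For example, is_valid_pskd J01NME succeeds, while j01nme (lowercase), HELLO1 (contains O) and ABCDE (too short) are all rejected.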

nrf_nordic_nerd

Adding the EUI64-specific gotcha since it's bitten me before: make sure you're reading the IEEE EUI-64, not the Thread Extended Address. On Nordic hardware these are two different values and it's easy to grab the wrong one from a debug log or CLI output. Run eui64 on the joining device's own CLI (the dongle, not ot-ctl on the border router, which would report the border router's EUI-64) to get the factory-programmed EUI-64 that the commissioner's joiner allowlist actually needs; extaddr gives you the randomized Thread extended address, which changes per-network and isn't what you register with the commissioner. If you added the wrong one to the joiner list with a specific (non-wildcard) entry, the join fails even with the correct PSKd because the joiner doesn't match anything in the list, and depending on OpenThread version/logging verbosity that may not surface a loud error on the CLI.
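
If you script the registration, a small guard can catch the mix-up before it reaches the commissioner. This is a sketch under the assumption that you have copied both the eui64 and extaddr values from the device CLI; the function name is hypothetical.

```shell
# Sketch: refuse a joiner ID that is malformed or is really the extaddr.
# Usage: check_joiner_eui64 <eui64> <extaddr>  (both copied from the device CLI)
check_joiner_eui64() (
    LC_ALL=C
    eui64="$1"
    extaddr="$2"

    case "$eui64" in
        *[!0-9a-fA-F]*) echo "not hex: $eui64" >&2; exit 1 ;;
    esac
    if [ "${#eui64}" -ne 16 ]; then
        echo "expected 16 hex digits, got ${#eui64}" >&2
        exit 1
    fi

    # Compare case-insensitively by lowercasing both values.
    a=$(printf '%s' "$eui64" | tr 'A-F' 'a-f')
    b=$(printf '%s' "$extaddr" | tr 'A-F' 'a-f')
    if [ "$a" = "$b" ]; then
        echo "that value is the extaddr, not the EUI-64" >&2
        exit 1
    fi
)
```

It only checks shape and that the two values differ; it cannot tell you which of two well-formed values is the real EUI-64, so the eui64 command on the device is still the source of truth.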

kettledrum47

For what it's worth: once you're past bring-up and moving toward an actual product, I'd push back slightly on using the wildcard * joiner beyond the bench, which zephyr_devotee already flagged. On a deployed product you generally want each device pre-provisioned with its specific EUI64 registered against a PSKd generated per unit (not one shared PSKd for the whole fleet). Otherwise anyone who extracts the PSKd from one unit holds the joining credential for every unit, and could commission your devices onto their own Thread network whenever one of them is in joining mode. Doesn't matter for what you're doing right now on the bench, but worth building your provisioning flow around unique per-device credentials from the start rather than retrofitting it later. It's a much bigger rework once manufacturing is already flashing a shared PSKd onto every unit.
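
As a rough illustration of per-unit credentials, the sketch below draws a random PSKd from the allowed Thread alphabet. It assumes a Unix-like provisioning host with /dev/urandom and a head that supports -c; the function name and the default length of 12 are arbitrary choices, and how the value gets paired with each unit's EUI64 and stored is left to your provisioning flow.

```shell
# Sketch: generate a random per-unit PSKd from the allowed alphabet
# (0-9 and A-Y without I, O, Q, Z; 32 symbols in total).
# Usage: gen_pskd [length]   (defaults to 12; the spec range is 6-32)
gen_pskd() {
    len="${1:-12}"
    # Drop every byte that is not in the alphabet, keep the first $len.
    LC_ALL=C tr -dc '0-9A-HJ-NPR-Y' < /dev/urandom | head -c "$len"
    echo
}
```

Calling gen_pskd 16 once per unit gives each device its own credential to register with commissioner joiner add <EUI64> <PSKd>.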