Skip to content

dcd/musb: defer EP0 SETUP during DATA_IN/STATUS race#3643

Merged
hathach merged 23 commits into
masterfrom
musb_ep0_race
Jun 17, 2026
Merged

dcd/musb: defer EP0 SETUP during DATA_IN/STATUS race#3643
hathach merged 23 commits into
masterfrom
musb_ep0_race

Conversation

@HiFiPhile

@HiFiPhile HiFiPhile commented May 14, 2026

Copy link
Copy Markdown
Collaborator

Handle cases where a new SETUP arrives before the previous control transfer fully completes by buffering the SETUP and replaying it after status completion. Split EP0 DATA state into DATA_IN/DATA_OUT and finalize pending status-out completion before processing deferred SETUP.

Take care these 2 cases if new SETUP packet arrived while old control transfer is not finished yet. This could happen in following scenarios when CPU load is high:

  • Status IN/OUT finished, IRQ and new setup packet IRQ arrive at the same time.
  • Data IN finished and status OUT is received, both IRQs and new setup packet IRQ arrive at the same time.

Supercede #3626

Tested intensively with DFU example with Interrupt turned on/off every 20ms simulating high CP load:

diff --git a/examples/device/dfu/src/main.c b/examples/device/dfu/src/main.c
index fb3c22630..d4e75a44b 100644
--- a/examples/device/dfu/src/main.c
+++ b/examples/device/dfu/src/main.c
@@ -45,6 +45,7 @@
 #include "bsp/board_api.h"
 #include "tusb.h"
 
+#include "device/dcd.h"
 //--------------------------------------------------------------------+
 // MACRO CONSTANT TYPEDEF PROTYPES
 //--------------------------------------------------------------------+
@@ -65,6 +66,7 @@ enum {
 static uint32_t blink_interval_ms = BLINK_NOT_MOUNTED;
 
 void led_blinking_task(void);
+static void musb_test(void);
 
 /*------------- MAIN -------------*/
 int main(void) {
@@ -79,6 +81,7 @@ int main(void) {
   while (1) {
     tud_task(); // tinyusb device task
     led_blinking_task();
+    musb_test();
   }
 }
 
@@ -86,6 +89,26 @@ int main(void) {
 // Device callbacks
 //--------------------------------------------------------------------+
 
+// on off usb interrupt 20ms for testing
+static void musb_test(void) {
+  static uint32_t start_ms  = 0;
+  static bool irq_enabled = true;
+
+  if (tusb_time_millis_api() - start_ms < 20) {
+    return; // not enough time
+  }
+
+  if (irq_enabled) {
+    dcd_int_disable(0);
+    tusb_time_delay_ms_api(20);
+    dcd_int_enable(0);
+  } else {
+  }
+
+  irq_enabled = !irq_enabled;
+  start_ms += 20;
+}
+
 // Invoked when device is mounted
 void tud_mount_cb(void) {
   blink_interval_ms = BLINK_MOUNTED;
@@ -122,7 +145,7 @@ uint32_t tud_dfu_get_timeout_cb(uint8_t alt, uint8_t state) {
     // For this example
     // - Atl0 Flash is fast : 1   ms
     // - Alt1 EEPROM is slow: 100 ms
-    return (alt == 0) ? 1 : 100;
+    return (alt == 0) ? 0 : 0;
   } else if (state == DFU_MANIFEST) {
     // since we don't buffer entire image and do any flashing in manifest stage
     return 0;
@@ -142,9 +165,9 @@ void tud_dfu_download_cb(uint8_t alt, uint16_t block_num, const uint8_t *data, u
 
   //printf("\r\nReceived Alt %u BlockNum %u of length %u\r\n", alt, wBlockNum, length);
 
-  for (uint16_t i = 0; i < length; i++) {
-    printf("%c", data[i]);
-  }
+  //for (uint16_t i = 0; i < length; i++) {
+  //  printf("%c", data[i]);
+  //}
 
   // flashing op for download complete without error
   tud_dfu_finish_flashing(DFU_STATUS_OK);

Copilot AI review requested due to automatic review settings May 14, 2026 17:41

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the Mentor MUSB device controller (DCD) EP0 control-transfer state machine to better handle a race where a new SETUP arrives before the prior control transfer has fully completed (typically under high CPU load). It also adjusts MUSB port headers to avoid multiple-definition linkage issues around MUSB_BASES.

Changes:

  • Split EP0 DATA state into explicit DATA_IN / DATA_OUT states and add buffering of a “deferred” SETUP request to replay after status completion.
  • Add logic to complete pending STATUS stages before dispatching the deferred SETUP.
  • Make MUSB_BASES static const in MUSB port headers to prevent ODR/multiple-definition problems.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/portable/mentor/musb/musb_ti.h Make MUSB_BASES static const to avoid multiple definitions across translation units.
src/portable/mentor/musb/musb_max32.h Same MUSB_BASES linkage fix for the MAX32 port.
src/portable/mentor/musb/dcd_musb.c Rework EP0 control-transfer state machine and add deferred-SETUP buffering/replay to handle SETUP vs status/data completion races.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/portable/mentor/musb/dcd_musb.c Outdated
Comment thread src/portable/mentor/musb/dcd_musb.c Outdated
@github-actions

github-actions Bot commented May 14, 2026

Copy link
Copy Markdown

Size Difference Report

Because TinyUSB code size varies by port and configuration, the metrics below represent the averaged totals across all example builds.

Note: If there is no change, only one value is shown.

Changes >1% in size

file .text .bss size % diff
dcd_musb.c 2225 ➙ 2595 (+370) 171 ➙ 179 (+8) 2396 ➙ 2773 (+377) +15.7%
TOTAL 2225 ➙ 2595 (+370) 171 ➙ 179 (+8) 2396 ➙ 2773 (+377) +15.7%
Changes <1% in size

No entries.

No changes
file .text .rodata .data .bss size % diff
audio_device.c 2890 0 1259 1623 4508 +0.0%
cdc_device.c 1236 16 1092 728 1963 +0.0%
cdc_host.c 6399 487 15 971 7594 +0.0%
dcd_ch32_usbfs.c 1582 0 0 2444 4026 +0.0%
dcd_ch32_usbhs.c 1892 0 0 481 2373 +0.0%
dcd_ci_fs.c 1924 0 0 1290 3214 +0.0%
dcd_ci_hs.c 1756 0 0 1344 2534 +0.0%
dcd_da146xx.c 3067 0 0 144 3211 +0.0%
dcd_dwc2.c 4223 19 0 265 4506 +0.0%
dcd_eptri.c 2273 0 0 259 2532 +0.0%
dcd_ft9xx.c 3284 0 0 172 3456 +0.0%
dcd_khci.c 1952 0 0 1290 3242 +0.0%
dcd_lpc17_40.c 1481 0 0 648 1805 +0.0%
dcd_lpc_ip3511.c 1463 0 0 264 1683 +0.0%
dcd_mm32f327x_otg.c 1477 0 0 1290 2767 +0.0%
dcd_msp430x5xx.c 1801 0 0 176 1977 +0.0%
dcd_nrf5x.c 2939 0 0 292 3231 +0.0%
dcd_nuc120.c 1096 0 0 78 1174 +0.0%
dcd_nuc121.c 1170 0 0 101 1270 +0.0%
dcd_nuc505.c 0 0 1533 157 1690 +0.0%
dcd_rp2040.c 840 0 764 653 2257 +0.0%
dcd_rusb2.c 2918 0 0 156 3074 +0.0%
dcd_samd.c 1036 0 0 266 1302 +0.0%
dcd_samg.c 1322 0 0 72 1394 +0.0%
dcd_stm32_fsdev.c 2553 0 0 291 2844 +0.0%
dfu_device.c 776 28 712 136 912 +0.0%
dfu_rt_device.c 157 0 134 0 157 +0.0%
dwc2_common.c 603 22 0 0 615 +0.0%
ecm_rndis_device.c 1059 0 1 2759 3818 +0.0%
ehci.c 2763 0 0 6274 7783 +0.0%
fsdev_common.c 180 0 0 0 180 +0.0%
hcd_ch32_usbfs.c 2491 0 0 502 2993 +0.0%
hcd_ci_hs.c 181 0 0 0 181 +0.0%
hcd_dwc2.c 5070 25 1 545 5640 +0.0%
hcd_khci.c 2443 0 0 454 2897 +0.0%
hcd_musb.c 3071 0 0 157 3228 +0.0%
hcd_pio_usb.c 262 0 240 0 502 +0.0%
hcd_rp2040.c 1996 17 4 321 2338 +0.0%
hcd_rusb2.c 2923 0 0 245 3168 +0.0%
hcd_samd.c 2220 0 0 324 2544 +0.0%
hcd_stm32_fsdev.c 3248 0 1 420 3670 +0.0%
hid_device.c 1123 44 997 119 1242 +0.0%
hid_host.c 1241 0 0 1270 2511 +0.0%
hub.c 1384 8 8 30 1419 +0.0%
midi2_device.c 2628 52 1175 561 3211 +0.0%
midi2_host.c 1802 0 0 5921 7723 +0.0%
midi_device.c 1149 0 1007 619 1765 +0.0%
midi_host.c 1339 7 7 3538 4880 +0.0%
msc_device.c 2513 108 2281 804 3318 +0.0%
msc_host.c 1633 0 0 395 2028 +0.0%
mtp_device.c 1713 22 743 589 2309 +0.0%
ncm_device.c 1754 28 815 4393 6160 +0.0%
ohci.c 1925 0 0 2503 4428 +0.0%
printer_device.c 828 0 706 560 1387 +0.0%
rp2040_usb.c 386 35 619 11 1051 +0.0%
rusb2_common.c 160 0 16 0 176 +0.0%
tusb.c 455 0 387 3 457 +0.0%
tusb_fifo.c 855 0 486 0 850 +0.0%
typec_stm32.c 820 8 2 12 842 +0.0%
usbc.c 420 2 20 166 608 +0.0%
usbd.c 3530 57 90 355 3948 +0.0%
usbh.c 4967 57 82 1165 6238 +0.0%
usbtmc_device.c 2194 24 68 313 2539 +0.0%
vendor_device.c 639 0 534 559 1197 +0.0%
video_device.c 4432 5 1235 480 4904 +0.0%
TOTAL 121907 1071 17034 51958 175444 +0.0%

@github-actions

github-actions Bot commented May 14, 2026

Copy link
Copy Markdown

MemBrowse Memory Report

Top 10 targets by memory change (%) (out of 2384 targets) View Project Dashboard →

target .text .rodata .data .bss total % diff
msp_exp432e401y/dfu_runtime 8,592 → 8,960 (+368) 1,048 → 1,056 (+8) 10,372 → 10,748 (+376) +3.6%
msp_exp432e401y/hid_generic_inout 9,644 → 10,012 (+368) 1,252 → 1,260 (+8) 11,628 → 12,004 (+376) +3.2%
msp_exp432e401y/hid_multiple_interface 10,428 → 10,796 (+368) 1,128 → 1,136 (+8) 12,288 → 12,664 (+376) +3.1%
msp_exp432e401y/hid_boot_interface 10,440 → 10,808 (+368) 1,128 → 1,136 (+8) 12,300 → 12,676 (+376) +3.1%
msp_exp432e401y/hid_composite 10,652 → 11,020 (+368) 1,116 → 1,124 (+8) 12,500 → 12,876 (+376) +3.0%
msp_exp432e401y/midi_test 10,748 → 11,120 (+372) 1,248 → 1,256 (+8) 12,728 → 13,108 (+380) +3.0%
ek_tm4c123gxl/dfu_runtime 8,540 → 8,928 (+388) 612 → 620 (+8) 13,316 → 13,712 (+396) +3.0%
msp_exp432e401y/cdc_dual_ports 11,592 → 11,964 (+372) 1,432 → 1,440 (+8) 13,756 → 14,136 (+380) +2.8%
apard32690/dfu_runtime 12,996 → 13,384 (+388) 1,356 → 1,364 (+8) 14,432 → 14,828 (+396) +2.7%
ek_tm4c123gxl/hid_generic_inout 9,536 → 9,924 (+388) 816 → 824 (+8) 14,512 → 14,908 (+396) +2.7%

Handle cases where a new SETUP arrives before the previous control transfer fully completes by buffering the SETUP and replaying it after status completion. Split EP0 DATA state into DATA_IN/DATA_OUT and finalize pending status-out completion before processing deferred SETUP.

Signed-off-by: HiFiPhile <admin@hifiphile.com>
Signed-off-by: HiFiPhile <admin@hifiphile.com>
@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown

Hardware-in-the-loop (HIL) Test Report

hfp-iar

Board cdc_dual_ports dfu cdc_msc cdc_msc_throughput audio_test_freertos dfu_runtime cdc_msc_freertos hid_boot_interface msc_dual_lun hid_generic_inout printer_to_cdc midi_test mtp
stm32l412nucleo ✅ CDC 635k/405k MSC 798k/752k
stm32f746disco ✅ CDC 12.2M/7.7M MSC 15M/29.7M
stm32f746disco-DMA ✅ CDC 12.6M/9M MSC 13M/30.6M
lpcxpresso43s67 ✅ CDC 10M/9.9M MSC 21.5M/25.3M

Legend: ✅ pass · ❌ fail · ⚪ skipped · blank not run

hfp.json

Board cdc_dual_ports dfu cdc_msc cdc_msc_throughput audio_test_freertos dfu_runtime cdc_msc_freertos hid_boot_interface msc_dual_lun hid_generic_inout printer_to_cdc midi_test mtp
stm32l412nucleo ✅ CDC 631k/404k MSC 796k/789k
stm32f746disco ✅ CDC 12.2M/9.1M MSC 14.6M/28.8M
stm32f746disco-DMA ✅ CDC 13.1M/9M MSC 13.1M/29.8M
lpcxpresso43s67 ✅ CDC 11.8M/12.3M MSC 23.8M/24.9M

Legend: ✅ pass · ❌ fail · ⚪ skipped · blank not run

tinyusb.json

Board cdc_dual_ports dfu cdc_msc cdc_msc_throughput audio_test_freertos dfu_runtime cdc_msc_freertos hid_boot_interface msc_dual_lun hid_generic_inout printer_to_cdc midi_test mtp host_info_to_device_cdc cdc_msc_hid msc_file_explorer msc_file_explorer_freertos device_info hid_composite_freertos
ek_tm4c123gxl ✅ CDC 459k/831k MSC 601k/943k
espressif_p4_function_ev rd 409KB/s
espressif_p4_function_ev-DMA rd 409KB/s
espressif_s3_devkitm rd 409KB/s
espressif_s3_devkitm-DMA rd 409KB/s
feather_nrf52840_express ✅ CDC 446k/530k MSC 407k/479k
max32666fthr ✅ CDC 7M/14.3M MSC 7.4M/21M
metro_m4_express ✅ CDC 526k/537k MSC 403k/498k
mimxrt1015_evk ✅ CDC 12.9M/7.1M MSC 22.9M/18M
mimxrt1064_evk ✅ CDC 22M/17.8M MSC 30.4M/24.1M rd 1361KB/s rd 1365KB/s
lpcxpresso11u37 ✅ CDC 372k/323k MSC 482k/552k
ra4m1_ek ✅ CDC 339k/443k MSC 517k/503k
raspberry_pi_pico ✅ CDC 489k/408k MSC 609k/969k
raspberry_pi_pico_w rd 1106KB/s rd 1022KB/s
raspberry_pi_pico2 rd 1110KB/s rd 1022KB/s
adafruit_fruit_jam ✅ CDC 490k/383k MSC 538k/981k rd 62KB/s rd 62KB/s
stm32f072disco ✅ CDC 407k/309k MSC 479k/503k
stm32f407disco ✅ CDC 911k/475k MSC 828k/1000k
stm32f723disco ✅ CDC 952k/781k MSC 1M/1M rd 17476KB/s rd 4096KB/s
stm32f723disco-DMA ✅ CDC 954k/699k MSC 1M/1M rd 20164KB/s rd 4096KB/s
stm32h743nucleo ✅ CDC 508k/535k MSC 511k/511k
stm32h743nucleo-DMA ✅ CDC 469k/464k MSC 610k/559k
stm32g0b1nucleo ✅ CDC 444k/481k MSC 422k/467k
stm32l476disco ✅ CDC 536k/553k MSC 534k/521k
stm32u083nucleo ✅ CDC 670k/517k MSC 772k/550k
nanoch32v203-fsdev ✅ CDC 869k/670k MSC 849k/1M
nanoch32v203-usbfs ✅ CDC 768k/570k MSC 927k/1M

Legend: ✅ pass · ❌ fail · ⚪ skipped · blank not run

rhgndf pushed a commit to rhgndf/tinyusb that referenced this pull request Jun 11, 2026
The review action currently runs on the default Sonnet 4.6. On PR hathach#3643
(musb EP0 race) it posted "No issues found" while an Opus pass on the
same diff surfaced substantive questions (ISR-boundary RXRDY lifetime,
regression scope of the DATA-state split). Pin the reviewer to
claude-opus-4-8 for higher-signal reviews; subagents keep their cheaper
default models.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Repository owner deleted a comment from claude Bot Jun 11, 2026
@claude

claude Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

@hathach hathach left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review — high effort, cross-checked against MUSBMHDRC spec

Reviewed the EP0 deferred-SETUP state machine and the HIL serial changes (7 finder angles → 1-vote verify), then cross-checked the DCD findings against the MUSBMHDRC Product Specification & Programmer's Guide (Mentor PSPG 40161.003, §3.3.1 CSR0L, §21.1.5 EP0 service routine, §21.1.7 additional actions).

The deferral design is sound for the STATUS_OUT and coalesced DATA_IN/STATUS_IN paths — the spec confirms a post-status SETUP arrives as plain RXRDY with no SETEND, and the TU_ASSERT(count0 == 8) in the deferral cases is spec-sound (status ZLPs don't raise RxPktRdy; oversized status packets auto-STALL). Four correctness issues and a few cleanups below, ranked by severity.

Checked and ruled out (for the record):

  • Task-context replay racing the ISR: dcd_edpt_xfer brackets edpt0_xfer with musb_dcd_int_disable/enable, so pipe0_process_deferred_setup from edpt0_xfer is serialized against the ISR. ✅
  • DATA_IN deferral with remain_wlength > 0: spec-impossible — §21.1.5.2: “If either a SETUP or an OUT token is received whilst the endpoint is in the TX state, this will cause a SetupEnd condition”, and SetupEnd sets at the token, before RxPktRdy; process_ep0 handles SETEND before the RXRDY switch. ✅
  • static MUSB_BASES: only consumer is MUSB_REGS(); the change also fixes a duplicate-definition hazard for combined host+device TI builds. ✅

Comment thread src/portable/mentor/musb/dcd_musb.c Outdated
// - Data IN finished and status OUT is received, both IRQs and new setup packet IRQ arrive at the same time.
// could happen when CPU load is high, save the new setup packet for later processing after current status stage complete.
case PIPE0_STATE_STATUS_OUT:
case PIPE0_STATE_STATUS_OUT_PENDING:

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[bug — race] Deferring in STATUS_OUT_PENDING when it was reached IRQ-first (process_ep0 STATUS_OUT case promoted it, usbd has not yet called edpt0_xfer(STATUS OUT)) consumes the status pairing early and replays the SETUP before usbd has retired the old transfer:

  1. IRQ1: last DATA IN sent → STATUS_OUT, queues xfer_complete(EP0_IN)
  2. IRQ2: status ZLP confirm (csrl==0) → STATUS_OUT_PENDING
  3. IRQ3: new OUT-data SETUP (e.g. SET_LINE_CODING) → deferral case → goto process_status fires xfer_complete(EP0_OUT,0), goes IDLE, replays the SETUP → state = DATA_OUT
  4. usbd task now processes the queued DATA-IN completion → status_stage_xact()edpt0_xfer(EP0_OUT, NULL, 0) lands in case DATA_OUT: sets _dcd.pipe0.buf = NULL and writes RXRDYC, un-NAKing the host
  5. Host's real DATA OUT arrives → tu_hwfifo_read(fifo, NULL, count0) → write through NULL

usbd has no stale-event discard for EP0 XFER_COMPLETE (usbd.c event loop processes them unconditionally), so nothing defuses this. The STATUS_OUT deferral case is fine (its pairing still routes through edpt0_xfer); the problem is only the STATUS_OUT_PENDING-via-IRQ-first ordering. Consider: in the deferral cases only save the packet and synthesize the missed pairing state, but never fire the status completion + replay until the usbd-driven edpt0_xfer confirms the old transfer is retired. A TU_ASSERT(_dcd.pipe0.buf) backstop in the DATA_OUT drain path would also catch this class cheaply.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you going to let claude work on it ?

@hathach hathach Jun 12, 2026

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #3699 (ee3dd01): STATUS_OUT_PENDING is split into PENDING_XFER/PENDING_IRQ. In the IRQ-first ordering the deferral now only holds the SETUP; usbd's edpt0_xfer(STATUS OUT) fires the status completion and replays, so it can no longer land in the replayed transfer's state. The TU_ASSERT(pipe0.buf) backstop is in the DATA OUT drain.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HiFiPhile no way, I aim to manual review all PRs. LLM is only for making it faster/easier. I am just testing a self-loop setup that have it xhigh effort review and auto-fix since it take pretty long (runing at night). Seem like I give it too much freedom :D

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, I think its suggestion to split STATUS_OUT_PENDING TO STATUS_OUT_PENDING_XFER/IRQ makes sense. There is indeed a race.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think twice and there is a race, I didn't check the fix though.

Comment thread src/portable/mentor/musb/dcd_musb.c Outdated

_dcd.pipe0.deferred_setup = setup_packet.req;
_dcd.pipe0.deferred_setup_valid = true;
goto process_status;

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[bug — race] The deferral path drains the SETUP from the FIFO but never writes ServicedRxPktRdy, and the replay also leaves RXRDY set for OUT/zero-length requests. Per spec, IntrTx is clear-on-read and the EP0 interrupt latches when RxPktRdy is set (§21.1.5 cause 1) — so if the SETUP's IRQ latches after the ISR's intr_tx read (dcd_int_handler) but before this function's csrl read, a second invocation of process_ep0 is guaranteed, and it observes stale state:

  • OUT-data replay (state DATA_OUT, RXRDY still set, FIFO drained): count0 reads 0 → fires a spurious xfer_complete(EP0_OUT, 0) which usbd interprets as a ZLP ending the new transfer's data stage while the host is still sending real data.
  • IN-data replay (RXRDYC written at replay, csrl==0, state DATA_IN): falls into process_status DATA_IN case → spurious xfer_complete(EP0_IN, stale xact_len) for a transfer usbd hasn't started.
  • Zero-length replay (state STATUS_IN, RXRDY set): re-enters the deferral case on a drained FIFO → TU_ASSERT(count0 == 8) fires, or garbage is saved as a deferred SETUP with asserts off.

Note the canonical flow (spec Fig 21-3) writes ServicedRxPktRdy immediately after unloading; the driver's delayed-ack NAK flow-control is fine, but the deferral path needs a guard against re-entry while RXRDY is parked (e.g. skip the RXRDY block when deferred_setup_valid, or track a 'rxrdy consumed' flag).

@hathach hathach Jun 12, 2026

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #3699 (851ebe6): new rxrdy_parked flag set wherever a drained packet's RXRDY is intentionally left set (deferral, OUT/zlp flow control, DATA OUT packets) and cleared at every RXRDYC write site. The RXRDY block returns early while parked, so the guaranteed stale latch is a no-op; replayed IN requests keep RXRDY parked until the edpt0_xfer(DATA IN) arm acks it, which closes the csrl==0 stale-completion variant too.

Comment thread src/portable/mentor/musb/dcd_musb.c Outdated
musb_ep_csr_t* ep_csr = get_ep_csr(musb_regs, 0);
uint_fast8_t csrl = ep_csr->csr0l;

if (csrl & MUSB_CSRL0_DATAEND) {

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[bug risk — spec conflict] Three issues with this guard, from the MUSBMHDRC spec (PSPG 40161.003):

  1. Ordering: §21.1.5 mandates “Whenever the Endpoint 0 service routine is entered, the firmware must first check … a STALL condition or a premature end of control transfer” (Fig 21-2: SentStall? → SetupEnd? → state dispatch). This return executes before both. That matters because the core's auto-STALL conditions 1–2 (§21.1.7) are literally “the host sends an OUT/IN token after the DataEnd bit has been set — i.e. SentStall is most likely to fire exactly when DataEnd may still read back set, and this guard would skip clearing it (IntrTx already consumed → recovery lost).
  2. Read-back is unspecified: the CSR0L access table (§3.3.1, p.37) marks DataEnd as CPU set-only (unlike TxPktRdy r/set), while the page note says reads “reflect the status attained … as a result of writing”. So whether this bit ever reads 1 is vendor-dependent — on cores where it reads 0 the guard is dead code and the premature-STATUS_IN race it targets is unguarded.
  3. If it does read back set: the TxPktRdy-cleared interrupt of the last DATA IN packet (armed TXRDY|DATAEND per Fig 21-4) arrives before the status stage runs; dropping it means the later csrl==0 status IRQ gets consumed as the DATA-IN completion, leaving the STATUS_OUT pairing one event short — EP0 parks in STATUS_OUT_PENDING until the next SETUP rescues it via the deferral path.

Suggest moving the check after the STALLED/SETEND handling and adding a comment stating which hardware state it encodes + the read-back caveat.

@hathach hathach Jun 12, 2026

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #3699 (da749ee): SentStall/SetupEnd are checked first per §21.1.5, and the guard now sits below the RXRDY block as well — a coalesced DATAEND|RXRDY read must not swallow a SETUP on cores where DataEnd reads back 1. Comment documents the encoded hw state and the vendor-dependent read-back. On read-back-1 cores the late DATA-IN completion described in point 3 now recovers via the PENDING_XFER deferral case.

Comment thread src/portable/mentor/musb/dcd_musb.c Outdated
if (ep_addr == TU_EP0_OUT) { /* Ignore EP0 OUT */
_dcd.pipe0.state = PIPE0_STATE_IDLE;
_dcd.pipe0.buf = NULL;
_dcd.pipe0.deferred_setup_valid = false;

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[bug — protocol] Clearing deferred_setup_valid here discards a SETUP the hardware already ACKed on the wire. Reachable interleaving: DATA IN completes and the next SETUP gets captured into deferred_setup; the class driver then rejects the old request at CONTROL_STAGE_DATA → usbd calls dcd_edpt_stall(EP0) → the deferred SETUP is wiped and SendStall armed. The host, whose new SETUP was ACKed, proceeds with that request's data/status stage and receives STALL — an innocent request fails host-side with no tud callback ever seeing it. Spec §21.1.7 frames SendStall as aborting the current transfer; here it lands on the next one. Worth either replaying the deferred SETUP after the stall handshake completes, or at minimum documenting the drop (the host will retry in practice, but it surfaces as a transient control failure).

@hathach hathach Jun 12, 2026

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #3699 (92da00c): a deferred SETUP can only exist once the old transfer's status was seen on the wire, so SendStall would land on the host's next, already-ACKed request. dcd_edpt_stall(EP0 OUT) now skips arming the stall and replays the deferred SETUP instead.

Comment thread test/hil/hil_test.py Outdated
while total < len(data):
try:
written = ser.write(data[total:])
except serial.SerialTimeoutException:

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[bug — test harness] Treating SerialTimeoutException as "wrote 0 bytes" and retrying from the same total can duplicate bytes on the wire: pyserial's posix write() loops os.write() internally and raises after partial progress with the count lost (d = d[n:] progress is discarded by the raise). Realistic when the kernel tty buffer is nearly full because the device drains slowly: part of a chunk is pushed, the 2 s write_timeout expires, the retry re-sends the whole remaining slice → duplicated bytes → the CDC echo / write_and_check comparisons fail with data-mismatch asserts that look like device firmware bugs.

Safer: treat the timeout as fatal (the pre-PR behavior, just with the configurable timeout), or write in chunks small enough that a timeout implies ~nothing was sent.

@hathach hathach Jun 12, 2026

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #3699 (2c4b905): serial_write_all writes the whole buffer once and treats SerialTimeoutException as fatal. write_timeout is already a total per-call deadline in pyserial, so the retry loop added duplication risk without extending the budget; default bumped to 10 s to keep the old overall bound.

Comment thread test/hil/hil_test.py
for ch in 'cat README.TXT\r':
ser.write(ch.encode())
ser.flush()
serial_write_all(ser, ch.encode())

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[behavior change] The old ser.write(ch); ser.flush() guaranteed each character hit the wire (tcdrain) before the 2 ms pacing sleep; serial_write_all only guarantees it reached the kernel buffer. Under backpressure the kernel can coalesce several queued chars into one CDC bulk packet, defeating the per-character pacing this loop exists for (same at the dd 1024 loop below, line 822). Cheap fix: keep a ser.flush() after serial_write_all at these two per-char call sites.

@hathach hathach Jun 12, 2026

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #3699 (2c4b905): ser.flush() restored after serial_write_all in the cat/dd per-character loops. Other call sites stay flush-free to avoid reintroducing the unbounded-tcdrain hang this PR was addressing.

Comment thread test/hil/hil_test.py Outdated
total += written
continue

if time.monotonic() >= end:

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[cleanup] Deadline shape nits: (a) the deadline is only checked after a failed attempt, and a write started at t≈deadline still blocks a full write_timeout, so the effective bound is deadline + write_timeout (~12 s); (b) the 10 ms sleep is dead weight — ser.write() already blocks up to write_timeout per attempt, so the exception path already paces retries; (c) per-character call sites (cat README.TXT\r = 15 calls) each get a fresh 10 s deadline, so a sluggish device can stretch one command to minutes against CMD_TIMEOUT=180. Checking the deadline before the write, dropping the sleep, and sharing one deadline per command string would fail fast (≤10 s).

@hathach hathach Jun 12, 2026

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved in #3699 (2c4b905) by removing the loop entirely — single write with write_timeout as the total deadline; the late deadline check, the dead 10 ms sleep, and the per-call deadline reset all go away with it.

Comment thread src/portable/mentor/musb/dcd_musb.c Outdated
case PIPE0_STATE_STATUS_IN:
case PIPE0_STATE_DATA_IN: {
TU_ASSERT(sizeof(tusb_control_request_t) == count0, );
union {

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[cleanup] This 8-byte FIFO-read block (union + two fifo[0] reads + TU_ASSERT on count0) is duplicated verbatim from the PIPE0_STATE_IDLE case above. A small pipe0_read_setup(musb_regs, &req) helper next to pipe0_start_setup() would keep the two hardware drain sequences from drifting apart — a future fix applied to one copy but not the other would only manifest on the rare deferred-race path, where it's hardest to debug.

@hathach hathach Jun 12, 2026

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #3699 (6ecd5c6): extracted pipe0_read_setup(), used by both the IDLE and deferral paths.

Comment thread src/portable/mentor/musb/dcd_musb.c Outdated
return;
}

process_status:

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[cleanup] The goto process_status arrives here with csrl != 0 (RXRDY set), so the guarding comment "When CSRL0 is zero…" is no longer true on that path. Extracting the completion switch into a helper called from both the deferral case and the csrl==0 path would remove the goto, keep the comment truthful, and incidentally match the spec's service-routine structure (Fig 21-2 dispatches on state after the abort checks).

@hathach hathach Jun 12, 2026

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #3699 (b7715a9): goto removed — the deferral block handles its per-state transitions inline, and the csrl==0 switch is only reached with csrl==0, so the comment is truthful again.

Comment thread src/portable/mentor/musb/dcd_musb.c Outdated
} else {
if (req->bmRequestType & TUSB_DIR_IN_MASK) {
_dcd.pipe0.state = PIPE0_STATE_DATA_IN;
ep_csr->csr0l = MUSB_CSRL0_RXRDYC;

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[cleanup] The pre-PR comment explaining this asymmetry was dropped in the move: “If OUT (rx) direction, let edpt0_xfer() clear RXRDY when it's ready to receive data.” It's load-bearing — acking RXRDY for OUT before edpt0_xfer arms the drain buffer would let the host send data with nowhere to put it. It's also a deliberate deviation from the spec's canonical flow (Fig 21-3 sets ServicedRxPktRdy immediately after unload), used as NAK flow-control — worth restoring the comment so nobody "unifies" the branches later.

@hathach hathach Jun 12, 2026

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restored in #3699 (bf6f3fc) in pipe0_start_setup, with the NAK flow-control rationale and a note on the replay path.

hathach added a commit that referenced this pull request Jun 12, 2026
The 8-byte EP0 SETUP drain (count0 assert + two FIFO word reads via a
union) was duplicated verbatim between the IDLE case and the deferral
case; a future fix applied to one copy but not the other would only
show up on the rare deferred-race path. Share one helper. count0 is
now read inside the only remaining user (DATA OUT drain).

Review follow-up for #3643 (dcd_musb.c l.507 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
hathach added a commit that referenced this pull request Jun 12, 2026
The pre-existing comment explaining why the OUT branch does not ack
RxPktRdy was dropped when the SETUP handling moved into
pipe0_start_setup(). It is load-bearing: acking before edpt0_xfer()
arms the drain buffer would let the host send data with nowhere to
put it. Restore it with the databook-deviation rationale so the
branches don't get "unified" later.

Review follow-up for #3643 (dcd_musb.c l.116 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
hathach added a commit that referenced this pull request Jun 12, 2026
The goto jumped into the csrl==0 completion switch with RXRDY still
set, making its "When CSRL0 is zero" guard comment untrue on that
path. Handle each deferral state in a self-contained switch instead;
the csrl==0 switch is now only reached with csrl==0 and its comment
is truthful again. Behavior unchanged.

Review follow-up for #3643 (dcd_musb.c l.523 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
hathach added a commit that referenced this pull request Jun 12, 2026
MUSBMHDRC 21.1.5 requires the EP0 service routine to check SentStall
and SetupEnd first; the early DATAEND return ran before both, and
SentStall is most likely to fire exactly while DataEnd may still read
back set (auto-STALL after DataEnd, 21.1.7), which would skip the
recovery. The guard also moves below the RXRDY block so a coalesced
DATAEND|RXRDY read cannot swallow a SETUP on cores where the
CPU-set-only DataEnd bit reads back 1; the comment documents the
vendor-dependent read-back.

Review follow-up for #3643 (dcd_musb.c l.445 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
hathach added a commit that referenced this pull request Jun 12, 2026
STATUS_OUT_PENDING conflated "edpt0_xfer(STATUS OUT) called, awaiting
confirm IRQ" with "confirm IRQ seen, awaiting edpt0_xfer". The
deferral path completed the status and replayed the saved SETUP from
the ISR in both flavors; in the IRQ-first one, usbd's still-
outstanding edpt0_xfer(STATUS OUT) for the old transfer (queued via
status_stage_xact) then landed in the replayed transfer's state and
corrupted it: NULL pipe0.buf armed plus RXRDYC, so the host's next
DATA OUT drained through a NULL pointer. usbd processes EP0
XFER_COMPLETE events unconditionally, so nothing downstream defuses
it.

Split the state into STATUS_OUT_PENDING_XFER / _IRQ. The deferral
completes and replays only in PENDING_XFER (old transfer already
retired); in PENDING_IRQ it only holds the SETUP and the usbd-driven
edpt0_xfer fires the completion and replays. The DATA_IN deferral now
synthesizes PENDING_IRQ (its remain==0 invariant asserted: a SETUP
before DataEnd raises SetupEnd instead), which also makes the old
deferred-promotion in the csrl==0 DATA_IN case unreachable - dropped.
Assert the drain buffer before the DATA OUT FIFO read as a cheap
backstop for this corruption class.

Review follow-up for #3643 (dcd_musb.c l.503 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
hathach added a commit that referenced this pull request Jun 12, 2026
The deferral path drains the SETUP but leaves RxPktRdy set, and the
SETUP's IRQ latches after the ISR's clear-on-read intr_tx read - so a
second process_ep0 pass (same ISR, via the intr_tx re-read merge) is
guaranteed and misreads the leftovers: count0==0 fires a spurious
DATA OUT completion, the replay's RXRDYC write turns the second pass
into a phantom csrl==0 DATA IN completion, and a zero-length replay
re-enters the deferral case on a drained FIFO (count0 assert or
garbage saved as a SETUP). The registers cannot expose the staleness:
RxPktRdy and count0 read unchanged until ServicedRxPktRdy is written.

Track it in software: rxrdy_parked means "RxPktRdy is set in hw but
its packet was already consumed". Set wherever a drained packet's
RXRDY is intentionally left set (OUT/zero-length flow-control parks,
every DATA OUT drain awaiting the next arm, the deferral path);
cleared at every RXRDYC write site (edpt0_xfer arms, dcd_set_address,
STALLED/SETEND recovery, bus reset). The RXRDY block returns early
while parked. Replayed IN requests skip the RXRDYC in
pipe0_start_setup and keep the packet parked until the
edpt0_xfer(DATA IN) arm acks it (before loading the shared FIFO), so
the stale pass sees RXRDY+parked instead of csrl==0. The normal IDLE
path is unchanged - master never re-entered these windows because the
single SETUP edge was always consumed by the pass that parked it; the
deferral is what introduced a pending second pass.

Review follow-up for #3643 (dcd_musb.c l.516 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
hathach added a commit that referenced this pull request Jun 12, 2026
dcd_edpt_stall(EP0 OUT) discarded the deferred SETUP and armed
SendStall. A deferred SETUP can only exist once the old transfer's
status stage was seen on the wire, so the request usbd is rejecting
(class callback failing at CONTROL_STAGE_DATA) already succeeded
host-side and the hardware already ACKed the next SETUP - the STALL
would land on that innocent request, which then fails host-side
without any tud callback ever seeing it. Skip the stall and replay
the deferred SETUP; the rejected transfer needs no wire-level stall
since it is already over.

Review follow-up for #3643 (dcd_musb.c l.860 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
hathach and others added 9 commits June 13, 2026 23:34
The 8-byte EP0 SETUP drain (count0 assert + two FIFO word reads via a
union) was duplicated verbatim between the IDLE case and the deferral
case; a future fix applied to one copy but not the other would only
show up on the rare deferred-race path. Share one helper. count0 is
now read inside the only remaining user (DATA OUT drain).

Review follow-up for #3643 (dcd_musb.c l.507 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The pre-existing comment explaining why the OUT branch does not ack
RxPktRdy was dropped when the SETUP handling moved into
pipe0_start_setup(). It is load-bearing: acking before edpt0_xfer()
arms the drain buffer would let the host send data with nowhere to
put it. Restore it with the databook-deviation rationale so the
branches don't get "unified" later.

Review follow-up for #3643 (dcd_musb.c l.116 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The goto jumped into the csrl==0 completion switch with RXRDY still
set, making its "When CSRL0 is zero" guard comment untrue on that
path. Handle each deferral state in a self-contained switch instead;
the csrl==0 switch is now only reached with csrl==0 and its comment
is truthful again. Behavior unchanged.

Review follow-up for #3643 (dcd_musb.c l.523 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
MUSBMHDRC 21.1.5 requires the EP0 service routine to check SentStall
and SetupEnd first; the early DATAEND return ran before both, and
SentStall is most likely to fire exactly while DataEnd may still read
back set (auto-STALL after DataEnd, 21.1.7), which would skip the
recovery. The guard also moves below the RXRDY block so a coalesced
DATAEND|RXRDY read cannot swallow a SETUP on cores where the
CPU-set-only DataEnd bit reads back 1; the comment documents the
vendor-dependent read-back.

Review follow-up for #3643 (dcd_musb.c l.445 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
STATUS_OUT_PENDING conflated "edpt0_xfer(STATUS OUT) called, awaiting
confirm IRQ" with "confirm IRQ seen, awaiting edpt0_xfer". The
deferral path completed the status and replayed the saved SETUP from
the ISR in both flavors; in the IRQ-first one, usbd's still-
outstanding edpt0_xfer(STATUS OUT) for the old transfer (queued via
status_stage_xact) then landed in the replayed transfer's state and
corrupted it: NULL pipe0.buf armed plus RXRDYC, so the host's next
DATA OUT drained through a NULL pointer. usbd processes EP0
XFER_COMPLETE events unconditionally, so nothing downstream defuses
it.

Split the state into STATUS_OUT_PENDING_XFER / _IRQ. The deferral
completes and replays only in PENDING_XFER (old transfer already
retired); in PENDING_IRQ it only holds the SETUP and the usbd-driven
edpt0_xfer fires the completion and replays. The DATA_IN deferral now
synthesizes PENDING_IRQ (its remain==0 invariant asserted: a SETUP
before DataEnd raises SetupEnd instead), which also makes the old
deferred-promotion in the csrl==0 DATA_IN case unreachable - dropped.
Assert the drain buffer before the DATA OUT FIFO read as a cheap
backstop for this corruption class.

Review follow-up for #3643 (dcd_musb.c l.503 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The deferral path drains the SETUP but leaves RxPktRdy set, and the
SETUP's IRQ latches after the ISR's clear-on-read intr_tx read - so a
second process_ep0 pass (same ISR, via the intr_tx re-read merge) is
guaranteed and misreads the leftovers: count0==0 fires a spurious
DATA OUT completion, the replay's RXRDYC write turns the second pass
into a phantom csrl==0 DATA IN completion, and a zero-length replay
re-enters the deferral case on a drained FIFO (count0 assert or
garbage saved as a SETUP). The registers cannot expose the staleness:
RxPktRdy and count0 read unchanged until ServicedRxPktRdy is written.

Track it in software: rxrdy_consumed means "RxPktRdy is set in hw but
its packet was already consumed". Set wherever a drained packet's
RXRDY is intentionally left set (OUT/zero-length flow-control parks,
every DATA OUT drain awaiting the next arm, the deferral path);
cleared at every RXRDYC write site (edpt0_xfer arms, dcd_set_address,
STALLED/SETEND recovery, bus reset). The RXRDY block returns early
while parked. Replayed IN requests skip the RXRDYC in
pipe0_start_setup and keep the packet parked until the
edpt0_xfer(DATA IN) arm acks it (before loading the shared FIFO), so
the stale pass sees RXRDY+parked instead of csrl==0. The normal IDLE
path is unchanged - master never re-entered these windows because the
single SETUP edge was always consumed by the pass that parked it; the
deferral is what introduced a pending second pass.

Review follow-up for #3643 (dcd_musb.c l.516 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
dcd_edpt_stall(EP0 OUT) discarded the deferred SETUP and armed
SendStall. A deferred SETUP can only exist once the old transfer's
status stage was seen on the wire, so the request usbd is rejecting
(class callback failing at CONTROL_STAGE_DATA) already succeeded
host-side and the hardware already ACKed the next SETUP - the STALL
would land on that innocent request, which then fails host-side
without any tud callback ever seeing it. Skip the stall and replay
the deferred SETUP; the rejected transfer needs no wire-level stall
since it is already over.

Review follow-up for #3643 (dcd_musb.c l.860 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The actual-STALL path (no deferred SETUP) forced EP0 to IDLE but left
rxrdy_consumed set if the aborted transfer had parked RXRDY via NAK flow
control (e.g. a rejected OUT-data request in DATA_OUT). A subsequent
SETUP IRQ would then hit the parked-gate early return and be ignored,
relying on SentStall/SetupEnd to clear the flag first. Clear it here
so recovery never depends on that ordering.

Addresses Copilot review on #3699.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
pyserial's posix write() raises SerialTimeoutException after partial
progress with the byte count lost, so the retry loop re-sent from the
same offset and could duplicate bytes on the wire — surfacing as bogus
data-mismatch failures that look like device firmware bugs.
write_timeout is already a total per-call deadline, so the loop added
duplication risk without extending the budget: write once and treat a
timeout as fatal. Default bumped 2 -> 10 s to keep the old overall
bound; HIL_SERIAL_WRITE_DEADLINE removed.

The per-character CLI loops keep their existing pacing (the 2 ms sleep
between single-byte writes already spaces them on the wire); no
unbounded ser.flush()/tcdrain is added.

Review follow-up for #3643 (hil_test.py l.257/264 findings).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
hathach added a commit that referenced this pull request Jun 13, 2026
The deferral path drains the SETUP but leaves RxPktRdy set, and the
SETUP's IRQ latches after the ISR's clear-on-read intr_tx read - so a
second process_ep0 pass (same ISR, via the intr_tx re-read merge) is
guaranteed and misreads the leftovers: count0==0 fires a spurious
DATA OUT completion, the replay's RXRDYC write turns the second pass
into a phantom csrl==0 DATA IN completion, and a zero-length replay
re-enters the deferral case on a drained FIFO (count0 assert or
garbage saved as a SETUP). The registers cannot expose the staleness:
RxPktRdy and count0 read unchanged until ServicedRxPktRdy is written.

Track it in software: rxrdy_parked means "RxPktRdy is set in hw but
its packet was already consumed". Set wherever a drained packet's
RXRDY is intentionally left set (OUT/zero-length flow-control parks,
every DATA OUT drain awaiting the next arm, the deferral path);
cleared at every RXRDYC write site (edpt0_xfer arms, dcd_set_address,
STALLED/SETEND recovery, bus reset). The RXRDY block returns early
while parked. Replayed IN requests skip the RXRDYC in
pipe0_start_setup and keep the packet parked until the
edpt0_xfer(DATA IN) arm acks it (before loading the shared FIFO), so
the stale pass sees RXRDY+parked instead of csrl==0. The normal IDLE
path is unchanged - master never re-entered these windows because the
single SETUP edge was always consumed by the pass that parked it; the
deferral is what introduced a pending second pass.

Review follow-up for #3643 (dcd_musb.c l.516 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
hathach added a commit that referenced this pull request Jun 13, 2026
dcd_edpt_stall(EP0 OUT) discarded the deferred SETUP and armed
SendStall. A deferred SETUP can only exist once the old transfer's
status stage was seen on the wire, so the request usbd is rejecting
(class callback failing at CONTROL_STAGE_DATA) already succeeded
host-side and the hardware already ACKed the next SETUP - the STALL
would land on that innocent request, which then fails host-side
without any tud callback ever seeing it. Skip the stall and replay
the deferred SETUP; the rejected transfer needs no wire-level stall
since it is already over.

Review follow-up for #3643 (dcd_musb.c l.860 finding).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
hathach added a commit that referenced this pull request Jun 13, 2026
pyserial's posix write() raises SerialTimeoutException after partial
progress with the byte count lost, so the retry loop re-sent from the
same offset and could duplicate bytes on the wire — surfacing as bogus
data-mismatch failures that look like device firmware bugs.
write_timeout is already a total per-call deadline, so the loop added
duplication risk without extending the budget: write once and treat a
timeout as fatal. Default bumped 2 -> 10 s to keep the old overall
bound; HIL_SERIAL_WRITE_DEADLINE removed.

The per-character CLI loops keep their existing pacing (the 2 ms sleep
between single-byte writes already spaces them on the wire); no
unbounded ser.flush()/tcdrain is added.

Review follow-up for #3643 (hil_test.py l.257/264 findings).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
hathach and others added 9 commits June 15, 2026 14:53
Pure cleanup, no behavior change:
- Extract the EP0 control-transfer state into a named pipe0_state_t
  typedef instead of an anonymous nested struct, and access it through
  a local pipe0_state_t* in the functions that touch it repeatedly.
- Group the pipe0 fields so the two bools sit together and the larger
  tusb_control_request_t deferred_setup is last.
- Reword the deferral comments: "coalesced" -> "combined".

Note: separating the edpt0_xfer DATA_IN/DATA_OUT case (dispatch on
state instead of dir_in) was attempted and reverted — it breaks ADI
MUSB enumeration. usbd can arm the opposite-direction status while
pipe0 is still in a DATA state, and only dir-dispatch routes that
correctly; a comment on the combined case records this.

Verified: HIL pass on ek_tm4c123gxl and max32666fthr (13/13 each).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
A short IN control response (device sends fewer bytes than wLength —
e.g. the 18-byte device descriptor answering a 64-byte GET_DESCRIPTOR)
left remain_wlength != 0, so the DATA_IN -> STATUS_OUT transition never
fired and pipe0 stayed in DATA_IN through the status stage. usbd then
armed the status-OUT while state was still DATA_IN. Set DATAEND and
transition on the last packet: remain_wlength == 0, or a short packet
(incl. a terminating ZLP) which ends the data stage.

With state now tracking the stage, split edpt0_xfer's DATA handling
into separate DATA_IN / DATA_OUT cases dispatching on state (asserting
state == call direction) instead of the combined dir_in branch.

Verified: HIL pass on ek_tm4c123gxl and max32666fthr (13/13 each),
including the #3643 high-CPU-load IRQ-toggle coalescing stress.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Rename process_ep0/process_epin/process_epout/process_bus_reset (all
invoked only from dcd_int_handler) to *_isr, making their ISR context
explicit at every call site. pipe0_process_deferred_setup is left
as-is since it also runs from task context (dcd_edpt_stall).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The deferral (RXRDY-combined) and csrl==0 tail paths in process_ep0_isr
ran the same per-state status-stage logic. Move all of it into one
pipe0_process_status_isr() helper covering every state including
DATA_IN, which picks STATUS_OUT vs STATUS_OUT_PENDING_IRQ from
deferred_setup_valid (a deferred SETUP means the status confirm was
coalesced with it). Both callers now just invoke the helper; the
deferral path saves the SETUP and sets deferred_setup_valid first.

Also drops the deferral path's TU_ASSERT(remain_wlength == 0), which
was wrong for a short last DATA-IN packet, and renames
pipe0_process_deferred_setup -> pipe0_try_deferred_setup (it no-ops
when nothing is deferred).

Verified: HIL pass on ek_tm4c123gxl and max32666fthr (13/13 each),
including the #3643 high-CPU-load IRQ-toggle coalescing stress.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
pipe0_read_setup() copied the FIFO into a local union, then copied that
into the caller's struct. Read the two FIFO words straight into the
caller's uint32_t[2] (one copy) and cast to tusb_control_request_t* in
pipe0_start_setup(). pipe0.deferred_setup becomes uint32_t[2] so the
deferral path reads directly into it as well.

Verified: HIL pass on ek_tm4c123gxl and max32666fthr (13/13 each).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Mirror the IN-side short-packet fix on the OUT drain: end the data
stage (-> STATUS_IN) when wLength is received OR a short OUT packet
(count0 < CFG_TUD_ENDPOINT0_SIZE) signals the host's end-of-data, not
only when remain_wlength hits exactly 0. Also clamp the
remain_wlength subtraction so a host that overruns wLength can't
underflow it and strand the transfer.

Without this, a control-OUT whose host sends fewer bytes than wLength
left pipe0 in DATA_OUT; usbd then armed STATUS IN and tripped the
split's TU_ASSERT(!dir_in). Found by /code-review; conformant hosts
send exactly wLength so HIL was already green.

Verified: HIL pass on ek_tm4c123gxl and max32666fthr (13/13 each).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…_isr

The helper advances the whole EP0 control state machine on a
completion/confirmation IRQ — it dispatches on pipe0->state and also
fires the DATA_IN completion, not just the status stage — so
"process_status" undersold it. Matches the process_*_isr family.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Cleanup from a code-review pass, no behavior change:
- Replace the open-coded "last DATA packet" test (remain_wlength == 0 ||
  len < CFG_TUD_ENDPOINT0_SIZE), duplicated in the edpt0_xfer DATA IN
  arm, pipe0_process_xfer_state_isr, and the DATA OUT drain, with one
  inline pipe0_data_stage_done() so IN and OUT can't drift.
- Correct the xact_len comment (only the IN path reports it; OUT reports
  count0) and the dcd_edpt_stall comment (a deferred SETUP means the old
  transfer ended on the wire, not that its status stage was "seen").

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
dcd/musb: harden & refactor the EP0 control state machine (review follow-up for #3643)

@hathach hathach left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect, thank you very much for yet another the excellent PR

@hathach hathach merged commit 52035e2 into master Jun 17, 2026
334 of 336 checks passed
@hathach hathach deleted the musb_ep0_race branch June 17, 2026 02:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants