@PetervdPerk-NXP
Contributor

Summary

This adds a Kconfig option to optionally track the allocating context of each IOB (task PID or ISR). The extra metadata makes it much easier to diagnose IOB leaks, starvation, and ownership/lifetime bugs across task/ISR boundaries without changing the IOB API seen by callers.
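
As a rough illustration of the idea (not the code in this PR), recording the allocating context could look like the sketch below. Only the field name `io_owner_flags` and the "-1 for ISR" convention come from this thread; the struct, the helper names, and the flag values are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values: the thread only shows flags=0x01 in the dump */

#define IOB_OWNER_ALLOCATED 0x01  /* Buffer is currently allocated */
#define IOB_OWNER_ISR       0x02  /* Buffer was allocated from an ISR */

/* Hypothetical per-IOB debug metadata */

struct iob_owner_s
{
  int16_t io_owner_pid;    /* Allocating task PID, or -1 for an ISR */
  uint8_t io_owner_flags;  /* Zero while the buffer is free */
};

/* Record the allocating context when a buffer is handed out */

static inline void iob_owner_record(struct iob_owner_s *owner,
                                    bool in_isr, int16_t pid)
{
  owner->io_owner_pid   = in_isr ? -1 : pid;
  owner->io_owner_flags = (uint8_t)(IOB_OWNER_ALLOCATED |
                                    (in_isr ? IOB_OWNER_ISR : 0));
}

/* Clear the metadata when the buffer returns to the free list */

static inline void iob_owner_clear(struct iob_owner_s *owner)
{
  owner->io_owner_pid   = 0;
  owner->io_owner_flags = 0;
}
```

Because the metadata lives next to the buffer and is filled in by the allocator, callers of the IOB API would not need to change, which matches the stated goal above.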

Example

nsh> cat /proc/iobinfo
    ntotal     nfree     nwait nthrottle
        24         0         2         0
00 0x2024f77c -> 0x2024f850 pid=1306 flags=0x01
01 0x2024f850 -> 0 pid=1306 flags=0x01
02 0x2024f924 -> 0x2024f77c pid=1306 flags=0x01
03 0x2024f9f8 -> 0x2024f924 pid=1306 flags=0x01
04 0x2024facc -> 0x2024f9f8 pid=1306 flags=0x01
05 0x2024fba0 -> 0x2024facc pid=1306 flags=0x01
06 0x2024fc74 -> 0x2024fba0 pid=1306 flags=0x01
07 0x2024fd48 -> 0 pid=1306 flags=0x01
08 0x2024fe1c -> 0x2024fd48 pid=1306 flags=0x01
09 0x2024fef0 -> 0x2024fe1c pid=1306 flags=0x01
10 0x2024ffc4 -> 0x2024fef0 pid=1306 flags=0x01
11 0x20250098 -> 0x2024ffc4 pid=1306 flags=0x01
12 0x2025016c -> 0x20250098 pid=1306 flags=0x01
13 0x20250240 -> 0x2025016c pid=1306 flags=0x01
14 0x20250314 -> 0x20250240 pid=1306 flags=0x01
15 0x202503e8 -> 0 pid=1306 flags=0x01
16 0x202504bc -> 0x202503e8 pid=1306 flags=0x01
17 0x20250590 -> 0x202504bc pid=1306 flags=0x01
18 0x20250664 -> 0x20250590 pid=1306 flags=0x01
19 0x20250738 -> 0x20250664 pid=1306 flags=0x01
20 0x2025080c -> 0x20250738 pid=1306 flags=0x01
21 0x202508e0 -> 0x2025080c pid=1306 flags=0x01
22 0x202509b4 -> 0x202508e0 pid=1306 flags=0x01
23 0x20250a88 -> 0 pid=1306 flags=0x01

Impact

Kconfig debug toggle

Testing

IMXRT

@github-actions bot added the labels "Area: File System", "Area: Memory Management", and "Size: M" on Jan 12, 2026

/* Print: PID (or -1 for ISR) and flags in hex */

if (iob->io_owner_flags)

Suggested change
- if (iob->io_owner_flags)
+ if (iob->io_owner_flags != 0)

Contributor

@acassis acassis left a comment

@PetervdPerk-NXP this is a new IOB feature, please add documentation for it: https://nuttx.apache.org/docs/latest/reference/os/iob.html

@cederom
Contributor

cederom commented Jan 13, 2026

Thank you @PetervdPerk-NXP :-)
TODO: syntax :-)

@Donny9
Contributor

Donny9 commented Jan 13, 2026

@PetervdPerk-NXP Very good PR, but I have some questions:

  1. How to update the owner of iob? Since iob will be exchanged between the protocol stack and the driver, the current PR only records the "allocate" information, which is far from sufficient. These pieces of information may not reflect the actual owner information.
  2. In the current PR, proc/iobinfo will search for each iob. There are duplicate pieces of information. Can they be printed in the form of iob chain, so that it is more intuitive to see the organization of iob?

We are also enhancing the debugging capabilities of iob internally to address issues such as iob memory leak, iob exhaustion, and iob trampling.

This adds a Kconfig option to optionally track the allocating context
of each IOB (task PID or ISR). The extra metadata makes it much easier
to diagnose IOB leaks, starvation, and ownership/lifetime bugs across
task/ISR boundaries without changing the IOB API seen by callers.

Signed-off-by: Peter van der Perk <[email protected]>
@PetervdPerk-NXP
Contributor Author

> How to update the owner of iob? Since iob will be exchanged between the protocol stack and the driver, the current PR only records the "allocate" information, which is far from sufficient. These pieces of information may not reflect the actual owner information.

You’re correct, the current implementation only tracks allocation, and I wasn’t aware that ownership could change. My assumption was that copyout typically results in freeing the buffer. If you’d like to improve the PR to handle ownership changes, please feel free to do so.

> In the current PR, proc/iobinfo will search for each iob. There are duplicate pieces of information. Can they be printed in the form of iob chain, so that it is more intuitive to see the organization of iob?

Regarding proc/iobinfo, the current approach searches for each IOB individually, which does lead to duplicate information. Printing the data as an IOB chain would indeed make the structure more intuitive, but that requires significant bookkeeping to reconstruct the chains. Additionally, the procfs printf API isn’t very convenient for this. For now, I’d prefer to keep it simple since the main goal is post-mortem analysis in cases like deadlocks, where you can manually rebuild the chains and identify owners.
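
The manual chain rebuild mentioned here can also be scripted on the host. The following is a minimal sketch, assuming each dump line has the shape shown in the PR description and that the address after `->` is the buffer's forward link (with `0` meaning end of chain). All type and function names are hypothetical and not part of NuttX.

```c
#include <stddef.h>
#include <stdio.h>

/* One parsed entry of the /proc/iobinfo dump (hypothetical type) */

struct iob_rec_s
{
  unsigned long addr;   /* Address of this IOB */
  unsigned long next;   /* Forward link, 0 at the end of a chain */
  int           pid;    /* Allocating PID, -1 for an ISR */
  unsigned int  flags;  /* Owner flags */
};

/* Parse a line such as "00 0x2024f77c -> 0x2024f850 pid=1306 flags=0x01".
 * Returns 1 on success, 0 if the line does not match.
 */

static inline int iobinfo_parse(const char *line, struct iob_rec_s *rec)
{
  int index;

  return sscanf(line, "%d %lx -> %lx pid=%d flags=%x", &index,
                &rec->addr, &rec->next, &rec->pid, &rec->flags) == 5;
}

/* An entry heads a chain if no other entry links to it */

static inline int iobinfo_is_head(const struct iob_rec_s *recs, size_t n,
                                  size_t i)
{
  for (size_t j = 0; j < n; j++)
    {
      if (recs[j].next == recs[i].addr)
        {
          return 0;
        }
    }

  return 1;
}

/* Follow the forward links from entry i and count the buffers visited.
 * The walk stops at a null link, at a link that leaves the dumped set,
 * or after more than n steps (which would indicate a cycle).
 */

static inline size_t iobinfo_chain_len(const struct iob_rec_s *recs,
                                       size_t n, size_t i)
{
  unsigned long next = recs[i].next;
  size_t len = 1;

  while (next != 0 && len <= n)
    {
      size_t j;

      for (j = 0; j < n; j++)
        {
          if (recs[j].addr == next)
            {
              break;
            }
        }

      if (j == n)
        {
          break;
        }

      len++;
      next = recs[j].next;
    }

  return len;
}
```

Under those assumptions, the 24-entry dump in the PR description resolves into four chains, headed by 0x2024fc74 (7 buffers), 0x20250314 (8), 0x202509b4 (8), and 0x20250a88 (1).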

> We are also enhancing the debugging capabilities of iob internally to address issues such as iob memory leak, iob exhaustion, and iob trampling.

Any additional tooling to improve IOB debugging would be extremely valuable. At present, IOB is quite complex and difficult to trace, which makes diagnosing issues challenging. Ideally, these problems would be addressed alongside the broader concerns in the NuttX networking stack (see #17299 and #5973). Unfortunately, I don’t have the bandwidth to work on this right now.

Without significant improvements, the current state effectively renders the NuttX networking stack impractical for serious use, and it may be worth considering alternatives until these issues are resolved. It seems Li Auto faced similar challenges when using NuttX for its Li Auto Halo OS and ultimately decided not to use the NuttX networking stack, which is unfortunate.

@Donny9
Contributor

Donny9 commented Jan 13, 2026

> > How to update the owner of iob? Since iob will be exchanged between the protocol stack and the driver, the current PR only records the "allocate" information, which is far from sufficient. These pieces of information may not reflect the actual owner information.
>
> You’re correct, the current implementation only tracks allocation, and I wasn’t aware that ownership could change. My assumption was that copyout typically results in freeing the buffer. If you’d like to improve the PR to handle ownership changes, please feel free to do so.

Yes, this flow is complex. We need to update the "owner" at specific locations. These locations could be within the internal APIs related to "pkt", not just the "iob" API itself.
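
One possible shape for such an update, sketched under the assumption that the allocator and the current holder are tracked separately so that the allocation record survives a handoff. Every name below is hypothetical; neither this PR nor the existing IOB or "pkt" APIs define these helpers.

```c
#include <stdint.h>

/* Hypothetical bookkeeping that separates "who allocated" from
 * "who holds the buffer now". A PID of -1 denotes ISR context,
 * following the convention used in this PR.
 */

struct iob_track_s
{
  int16_t alloc_pid;  /* Context that allocated the buffer */
  int16_t owner_pid;  /* Context that currently holds the buffer */
  uint8_t handoffs;   /* Number of ownership changes (saturating) */
};

/* Called by the allocator: allocator and owner start out identical */

static inline void iob_track_alloc(struct iob_track_s *t, int16_t pid)
{
  t->alloc_pid = pid;
  t->owner_pid = pid;
  t->handoffs  = 0;
}

/* Called wherever a buffer crosses a boundary, for example from the
 * protocol stack to a driver. The allocator field is left untouched.
 */

static inline void iob_track_handoff(struct iob_track_s *t, int16_t new_pid)
{
  if (t->owner_pid != new_pid)
    {
      t->owner_pid = new_pid;
      if (t->handoffs < UINT8_MAX)
        {
          t->handoffs++;
        }
    }
}
```

The handoff call would have to be placed at each point where a buffer changes hands, which is where most of the integration work lies.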

> > In the current PR, proc/iobinfo will search for each iob. There are duplicate pieces of information. Can they be printed in the form of iob chain, so that it is more intuitive to see the organization of iob?
>
> Regarding proc/iobinfo, the current approach searches for each IOB individually, which does lead to duplicate information. Printing the data as an IOB chain would indeed make the structure more intuitive, but that requires significant bookkeeping to reconstruct the chains. Additionally, the procfs printf API isn’t very convenient for this. For now, I’d prefer to keep it simple since the main goal is post-mortem analysis in cases like deadlocks, where you can manually rebuild the chains and identify owners.

OK, I am currently working on an optimization: removing qentry and concatenating all iob elements onto the iob queue. The end of each iob chain is marked with a special flag. After this optimization, it will be much easier to present this organizational structure.

> > We are also enhancing the debugging capabilities of iob internally to address issues such as iob memory leak, iob exhaustion, and iob trampling.
>
> Any additional tooling to improve IOB debugging would be extremely valuable. At present, IOB is quite complex and difficult to trace, which makes diagnosing issues challenging. Ideally, these problems would be addressed alongside the broader concerns in the NuttX networking stack (see #17299 and #5973). Unfortunately, I don’t have the bandwidth to work on this right now.
>
> Without significant improvements, the current state effectively renders the NuttX networking stack impractical for serious use, and it may be worth considering alternatives until these issues are resolved. It seems Li Auto faced similar challenges when using NuttX for its Li Auto Halo OS and ultimately decided not to use the NuttX networking stack, which is unfortunate.

I am currently working on adding this feature. The internal coding has been largely completed. The main idea is to implement the native iob using NuttX's multi-level memory pool. This allows us to reuse the trace functionality of the memory pool, including PID and backtrace. Additionally, we can protect the iob using KASAN and monitor memory overflows.

I will submit it around next week. At that time, please provide your review suggestions based on my patch. We can compare your current implementation and collectively assess which solution is the best. The overall aim is to enable faster identification of the iob issue. @PetervdPerk-NXP
