-
Notifications
You must be signed in to change notification settings - Fork 1.5k
net/iob: Add CONFIG_IOB_OWNER_TRACKING to record IOB owner for debugging #17847
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
fs/procfs/fs_procfsiobinfo.c
Outdated
|
|
||
| /* Print: PID (or -1 for ISR) and flags in hex */ | ||
|
|
||
| if (iob->io_owner_flags) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if (iob->io_owner_flags) | |
| if (iob->io_owner_flags != 0) |
acassis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PetervdPerk-NXP this is a new IOB feature, please include a Documentation to it: https://nuttx.apache.org/docs/latest/reference/os/iob.html
|
Thank you @PetervdPerk-NXP :-) |
|
@PetervdPerk-NXP Very good PR, but I have some questions:
We are also enhancing the debugging capabilities of iob internally to address issues such as iob memory leak, iob exhaustion, and iob trampling. |
This adds a Kconfig option to optionally track the allocating context of each IOB (task PID or ISR). The extra metadata makes it much easier to diagnose IOB leaks, starvation, and ownership/lifetime bugs across task/ISR boundaries without changing the IOB API seen by callers. Signed-off-by: Peter van der Perk <[email protected]>
0852305 to
15756da
Compare
You’re correct, the current implementation only tracks allocation, and I wasn’t aware that ownership could change. My assumption was that copyout typically results in freeing the buffer. If you’d like to improve the PR to handle ownership changes, please feel free to do so.
Regarding proc/iobinfo, the current approach searches for each IOB individually, which does lead to duplicate information. Printing the data as an IOB chain would indeed make the structure more intuitive, but that requires significant bookkeeping to reconstruct the chains. Additionally, the procfs printf API isn’t very convenient for this. For now, I’d prefer to keep it simple since the main goal is post-mortem analysis in cases like deadlocks, where you can manually rebuild the chains and identify owners.
Any additional tooling to improve IOB debugging would be extremely valuable. At present, IOB is quite complex and difficult to trace, which makes diagnosing issues challenging. Ideally, these problems would be addressed alongside the broader concerns in the NuttX networking stack (see #17299 and #5973). Unfortunately, I don’t have the bandwidth to work on this right now. Without significant improvements, the current state effectively renders the NuttX networking stack impractical for serious use, and it may be worth considering alternatives until these issues are resolved. It seems Li Auto faced similar challenges when using NuttX for its Li Auto Halo OS and ultimately decided not to use the NuttX networking stack, which is unfortunate. |
Yes, his flow is complex. We need to add updates for "ower" at specific locations. These locations could be within the internal APIs related to "pkt", not just the "iob" API itself.
ok,I am currently working on an optimization: removing qentry and concatenating all iob elements onto the iob queue. The end of the iob chain has a special flag to indicate it. After this optimization, it will be much easier to present this organizational structure.
I am currently working on adding this feature. The internal coding has been largely completed. The main idea is to implement the native iob using Nuttx's multi-level memory pool. This allows us to reuse the trace functionality of the memory pool, including PID and backtrace. Additionally, we can protect the iob using KASAN and monitor memory overflows. I will submit it around next week. At that time, please provide your review suggestions based on my patch. We can compare your current implementation and collectively assess which solution is the best. The overall aim is to enable faster identification of the iob issue. @PetervdPerk-NXP |
Summary
This adds a Kconfig option to optionally track the allocating context of each IOB (task PID or ISR). The extra metadata makes it much easier to diagnose IOB leaks, starvation, and ownership/lifetime bugs across task/ISR boundaries without changing the IOB API seen by callers.
Example
Impact
Kconfig debug toggle
Testing
IMXRT