We observed that ClickHouse can spend upwards of 80% of its CPU time in pthread_setname_np. This happens because (1) ClickHouse constantly renames its threads for debugging purposes, and (2) thread renaming is relatively expensive on illumos. Since we don't make use of this debugging path, we can work around the performance issue by bailing out of ClickHouse's setThreadName helper early. Note: this patch can't be upstreamed, so a proper long-term fix would involve adding a faster thread rename facility in illumos. Fixes oxidecomputer/customer-support#1101. h/t @wfchandler and @JustinAzoff, who found the bug and did the research.
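For readers unfamiliar with the workaround, here is a minimal sketch of the "bail out early" idea in plain C. It is not the actual patch: ClickHouse's setThreadName is C++ and its real signature is not shown here, so set_thread_name, skip_thread_rename, and current_thread_name_is are hypothetical names used only for illustration. The sketch assumes the two-argument pthread_setname_np and three-argument pthread_getname_np found on illumos and Linux.

```c
#define _GNU_SOURCE /* needed on glibc to expose the *_np calls; assumed harmless elsewhere */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical switch: when true, rename requests are silently dropped. */
static bool skip_thread_rename = true;

/*
 * Hypothetical stand-in for a thread-naming helper. Returning before the
 * pthread_setname_np call means the expensive rename path is never entered.
 */
int set_thread_name(const char *name)
{
    assert(name != NULL);

    if (skip_thread_rename)
        return 0; /* bail out early: report success, do no work */

    return pthread_setname_np(pthread_self(), name);
}

/* Returns true if the calling thread's current name equals `expected`. */
bool current_thread_name_is(const char *expected)
{
    char buf[64];

    if (pthread_getname_np(pthread_self(), buf, sizeof buf) != 0)
        return false;

    return strcmp(buf, expected) == 0;
}
```

With the switch enabled, callers still see a successful return, but the thread keeps whatever name it already had. That is the trade-off described above: the debugging names are lost in exchange for skipping the rename cost.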
bnaecker
left a comment
I think this seems OK for us, since we don't really use the name at this point. It's probably worth filing a host OS issue for the underlying problem: setting the thread name appears to be much slower than on other OSes.
Thanks to you, @wfchandler, and @JustinAzoff for tracking this down!
I wonder if you could avoid the patch entirely by preloading a small shared object that overloads the,
Yes, but that requires another patch. It turns out that ClickHouse unsets LD_PRELOAD on start, so we'd also have to patch it for that approach anyway.
Oh, the irony. :-/