Skip to content

[ENHANCEMENT]: Robin Hood Hashing performance improvement - would a PR be welcome? #817

@aterenin

Description

@aterenin

Is your feature request related to a problem? Please describe.

See below.

Describe the solution you'd like

I recently decided, mostly for fun, to challenge myself to re-implement a concurrent hashing table based on Robin Hood hashing that I designed and built in 2016 as part of a project on GPU-accelerated Latent Dirichlet Allocation, but never actually got to work.

This time, I did get it to work - and according to my benchmarks, the resulting table has higher get throughput than cuCollections' static_map with either linear probing or double hashing, by quite a bit at sufficiently high load factors.

Performance

Repo for this comparison: https://github.com/aterenin/GPURobinHoodHashing. I've also got a blog post around some of the ideas and why I found this exercise interesting: https://avt.im/blog/sculpting-fragile-glass/.

I don't know what Nvidia's policy on external contributions to this library is, but if there is interest, I'd love to contribute the core algorithmic ideas to cuCollections so that other people can use them. I'd image this would look like a third option like cuco::double_hashing or cuco::linear_probing but which would implement the Robin Hood strategy. It'd probably improve performance in static_map and possibly other data structures depending on how advanced your templating logic is.

Would a PR on this be welcome? If yes, I'll put one together - I'd appreciate pointers on the cleanest way to add this structure-wise in case you have any: I value high-quality code and will work to do everything the right way. If not, no worries, feel free to just close the issue. Please let me know either way.

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions