Conversation
Depends on VectorDB-NTU/RaBitQ-Library#36
void HnswRabitqQueryAlgorithm::expand_neighbors_by_group(
    TopkHeap &topk, HnswRabitqContext *ctx) const {
  // if (!ctx->group_by().is_valid()) {
Should this return with some message when the group setup is invalid?
Implemented group-by search; will test it after the framework supports group by.
    continue;
  }
} else {
  // Candidate cand{ResultRecord(candest.est_dist, candest.low_dist),
No need to check ex bits here? Can it be checked earlier?
It's necessary to check ex_bits here:
- First, get_bin_est uses a single bit to estimate the score.
- If ex_bits > 0: yes, check whether a full estimate with the extra bits is needed.
- If ex_bits == 0: nothing to be done, since we already used all the information.
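The two-step estimate described above can be sketched as follows. This is an illustrative stand-in, not the actual implementation: `EstimateRecord`'s layout, the `refine` callback (standing in for `get_full_est`), and the `worst` threshold are assumptions for the sketch.

```cpp
// Illustrative stand-in for the real HNSW-RaBitQ estimate record: a coarse
// 1-bit distance estimate plus the lower bound derived from it.
struct EstimateRecord {
    float est_dist;  // current best distance estimate
    float low_dist;  // lower bound on the true distance
};

// Decide whether a candidate deserves the expensive extra-bit estimate.
// `worst` is the worst full estimate currently in the top-k heap (treat it
// as +infinity while the heap is not yet full). `refine` stands in for
// get_full_est and is only called when the cheap bound is promising.
template <typename RefineFn>
bool try_candidate(EstimateRecord &rec, int ex_bits, float worst,
                   RefineFn refine) {
    if (ex_bits > 0) {
        if (rec.low_dist >= worst) {
            return false;  // lower bound already too large: prune cheaply
        }
        refine(rec);  // full estimate using the extra bits
    }
    // ex_bits == 0: the 1-bit estimate already used all the information
    return true;
}
```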
// TODO: check loop type

// if ((!topk.full()) || cur_dist < topk[0].second) {
Legacy code here and elsewhere can be cleaned up.
query_wrapper.set_g_add(q_to_centroids[cluster_id]);
float est_dist = rabitqlib::split_distance_boosting(
    ex_data, ip_func_, query_wrapper, padded_dim_, ex_bits_, res.ip_x0_qr);
float low_dist = est_dist - (res.est_dist - res.low_dist) / (1 << ex_bits_);
A parameter can be predefined to convert the division into a multiplication:
double inv_divisor = 1.0 / (1 << ex_bits_);
auto result = (res.est_dist - res.low_dist) * inv_divisor;
get_ex_est is not used; deleted.
// limitations under the License.

#include "rabitq_converter.h"
#include <strings.h>
<strings.h> is a POSIX-specific header (providing strncasecmp) that is not available on MSVC. The VLA issues flagged in earlier review rounds were fixed to ensure MSVC compatibility — this is the same class of problem. On Windows you would need _strnicmp instead.
Consider using a portable helper or a conditional compilation guard:
-#include <strings.h>
+#include <cstring>
Then replace the strncasecmp calls (lines 86, 88) with a portable wrapper, e.g.:
#ifdef _MSC_VER
#  define strncasecmp _strnicmp
#endif

or simply convert to lowercase first with std::transform.
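A header-free portable wrapper along the lines suggested above could look like this; the function name is illustrative, and the real patch would adapt rabitq_converter.cc to call it in place of strncasecmp.

```cpp
#include <cctype>
#include <cstddef>

// Portable replacement for the POSIX strncasecmp: compares at most n
// characters case-insensitively, stopping early at a shared terminator.
// Avoids <strings.h> entirely, so it builds on MSVC as well.
static int portable_strncasecmp(const char *a, const char *b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        // The unsigned char cast keeps std::tolower in defined behavior
        // for characters with the high bit set.
        int ca = std::tolower(static_cast<unsigned char>(a[i]));
        int cb = std::tolower(static_cast<unsigned char>(b[i]));
        if (ca != cb) return ca - cb;
        if (ca == '\0') return 0;  // both strings ended together
    }
    return 0;  // first n characters match
}
```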
for (uint32_t i = 0; i < size; ++i) {
  node_id_t node = neighbor_ids[i];
  EstimateRecord candest;
  auto *cand_vector = entity_.get_vector(node);
  ailego_prefetch(cand_vector);
  get_bin_est(cand_vector, candest, *query_entity);

  if (ex_bits_ > 0) {
    // Check preliminary score against current worst full estimate.
    bool flag_update_KNNs =
        (!topk.full()) || candest.low_dist < topk[0].second.est_dist;

    if (flag_update_KNNs) {
      // Compute the full estimate if promising.
      get_full_est(cand_vector, candest, *query_entity);
    } else {
      continue;
    }
  }
  candidates.emplace(node, ResultRecord(candest));
  // update entry_point for next level scan
  if (candest < *dist) {
    *entry_point = node;
    *dist = candest;
  }
  if (!filter(node)) {
    topk.emplace(node, ResultRecord(candest));
  }
}  // end for
Missing topk guard for the ex_bits_ == 0 case
When ex_bits_ > 0 the code correctly gates adding a node to candidates behind a preliminary lower-bound check:
bool flag_update_KNNs =
(!topk.full()) || candest.low_dist < topk[0].second.est_dist;
if (!flag_update_KNNs) continue;However, when ex_bits_ == 0 (i.e. total_bits = 1 quantization) that entire block is skipped and every non-visited neighbor is unconditionally inserted into candidates. With ex_bits_ == 0, the binary estimate from get_bin_est is already the best available estimate (low_dist ≈ est_dist), so the same guard can be applied without any change in correctness. Without it, the search degrades toward exhaustive BFS for the 1-bit case, defeating the beam-search pruning of HNSW.
Suggested fix:
for (uint32_t i = 0; i < size; ++i) {
node_id_t node = neighbor_ids[i];
EstimateRecord candest;
auto *cand_vector = entity_.get_vector(node);
ailego_prefetch(cand_vector);
get_bin_est(cand_vector, candest, *query_entity);
if (ex_bits_ > 0) {
bool flag_update_KNNs =
(!topk.full()) || candest.low_dist < topk[0].second.est_dist;
if (flag_update_KNNs) {
get_full_est(cand_vector, candest, *query_entity);
} else {
continue;
}
} else {
// ex_bits_ == 0: est_dist is already the best estimate
if (topk.full() && candest.est_dist >= topk[0].second.est_dist) {
continue;
}
}
candidates.emplace(node, ResultRecord(candest));
...
resolve #42
Greptile Summary
This PR adds full HNSW-RaBitQ support — an approximate nearest-neighbor algorithm that combines HNSW graph traversal with RaBitQ binary quantization — across the core algorithm layer, streaming/searching/building interfaces, Python bindings, DB integration, and test coverage. It is a large feature addition (~14,500 lines) that mirrors the structure of the existing HNSW implementation.
Previous review rounds have addressed the most critical issues (VLA portability, null-pointer dereferences in `cleanup`, a silent `search_bf_by_p_keys_impl` no-op, a `cluster->mount()` return value being ignored, and memory-leak TODOs). The remaining concerns are:

- `<strings.h>` in `rabitq_converter.cc`: `strncasecmp` is not available on MSVC, the same portability class as the VLA issues that were already fixed.
- The `ex_bits_ == 0` path in `HnswRabitqQueryAlgorithm::search_neighbors`: when `total_bits = 1` is configured, every non-visited neighbor is unconditionally pushed into the candidates heap without the topk-worst pruning that guards the `ex_bits_ > 0` path, causing unnecessary work and degraded latency for this quantization setting.

Confidence Score: 3/5
The open issues are in `src/core/algorithm/hnsw_rabitq/rabitq_converter.cc` (POSIX header) and `src/core/algorithm/hnsw_rabitq/hnsw_rabitq_query_algorithm.cc` (missing candidate guard for `ex_bits_ == 0`).

Important Files Changed
- With `ex_bits_ == 0`, all neighbor nodes are unconditionally added to the candidates heap without topk-based pruning, causing performance degradation in the 1-bit quantization case.
- The `<strings.h>` header used for `strncasecmp` is not available on MSVC; same portability class as the VLA issues already fixed in this PR.
- Previously flagged issues (`cleanup`, dead null-check, `rand()` thread-safety) have been addressed; logic looks correct.
- `search_bf_by_p_keys_impl` has been properly implemented (it was previously fully commented out, returning 0); all three search paths (ANN, brute-force, p-keys brute-force) look correct.
- `transform_to_entity` properly sizes `q_to_centroids` for both metric types.
- `open()` correctly handles both fresh and existing storage; add paths convert vectors via the reformer before insertion.
- VLAs replaced with `std::vector`.
- `dump_rabitq_centroids` writes the header, rotated centroids, original centroids, rotator data, and padding to a dumper; CRC is computed and appended correctly.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Caller
    participant HnswRabitqBuilder
    participant RabitqConverter
    participant RabitqReformer
    participant HnswRabitqEntity
    participant HnswRabitqAlgorithm
    Caller->>HnswRabitqBuilder: init(meta, params)
    HnswRabitqBuilder->>RabitqConverter: init(meta, params)
    HnswRabitqBuilder->>HnswRabitqEntity: init()
    HnswRabitqBuilder->>HnswRabitqAlgorithm: init()
    Caller->>HnswRabitqBuilder: train(holder)
    HnswRabitqBuilder->>RabitqConverter: train(holder)
    Note over RabitqConverter: KMeans clustering → centroids
    HnswRabitqBuilder->>RabitqConverter: dump(MemoryDumper)
    HnswRabitqBuilder->>RabitqReformer: init() + load(MemoryReadStorage)
    Note over RabitqReformer: Loads centroids + rotator
    Caller->>HnswRabitqBuilder: build(threads, holder)
    loop For each vector
        HnswRabitqBuilder->>RabitqReformer: convert(vec) → quantized_vec
        HnswRabitqBuilder->>HnswRabitqEntity: add_vector(level, key, quantized_vec)
    end
    par Parallel build threads
        HnswRabitqBuilder->>HnswRabitqAlgorithm: add_node(id, level, ctx)
        Note over HnswRabitqAlgorithm: HNSW graph construction
    end
    Caller->>HnswRabitqBuilder: dump(dumper)
    HnswRabitqBuilder->>RabitqConverter: dump(dumper)
    HnswRabitqBuilder->>HnswRabitqEntity: dump(dumper)
    Caller->>HnswRabitqSearcher: load(storage, metric)
    HnswRabitqSearcher->>RabitqReformer: load(storage)
    HnswRabitqSearcher->>HnswRabitqEntity: load(storage)
    HnswRabitqSearcher->>HnswRabitqQueryAlgorithm: create(entity, num_clusters, metric_type)
    Caller->>HnswRabitqSearcher: search_impl(query, qmeta, count, ctx)
    HnswRabitqSearcher->>RabitqReformer: transform_to_entity(query) → QueryEntity
    Note over RabitqReformer: rotate query, quantize, compute q_to_centroids
    HnswRabitqSearcher->>HnswRabitqQueryAlgorithm: search(entity, ctx)
    Note over HnswRabitqQueryAlgorithm: HNSW beam search with RaBitQ distance estimates
```

Last reviewed commit: 3fb3261