Refactor search post-processing to default-processor traits #817
narendatha wants to merge 17 commits into main
Pull request overview
This PR refactors DiskANN graph search post-processing by removing the post-processor associated type from SearchStrategy and replacing it with a default-processor delegation model (HasDefaultProcessor) plus an explicit, call-site selectable bridge trait (PostProcess).
Changes:
- Introduce PostProcess, HasDefaultProcessor, and DefaultPostProcess in diskann/src/graph/glue.rs, including a helper macro delegate_default_post_process!.
- Update KNN/multihop/range/diverse search flows to use post_process_with (and add KnnWith<PP> for caller-supplied post-processing).
- Migrate providers/decorators and async wrappers/benchmarks to the new HasDefaultProcessor bounds; update inplace delete to construct a delete-search processor explicitly.
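The delegation model described above can be pictured with a small synchronous sketch. The names PostProcess, HasDefaultProcessor, DefaultPostProcess, and post_process_with come from this PR; everything else (the signatures, the Candidate alias, default_processor, TopK, DenyList, MyStrategy, and knn_with) is hypothetical and far simpler than the real code in diskann/src/graph/glue.rs, which this sketch does not reproduce.

```rust
// Simplified model: a search candidate is (id, distance).
type Candidate = (u32, f32);

// Bridge trait: strategy `Self` can be post-processed by processor `P`.
trait PostProcess<P> {
    fn post_process_with(&self, processor: &P, candidates: Vec<Candidate>, k: usize) -> Vec<u32>;
}

// A strategy that can construct its own default processor.
trait HasDefaultProcessor {
    type Processor;
    fn default_processor(&self) -> Self::Processor; // hypothetical method name
}

// Marker meaning "use the strategy's default processor".
struct DefaultPostProcess;

// Hypothetical processors.
struct TopK; // keep the k closest ids
struct DenyList(Vec<u32>); // drop the listed ids, then keep the k closest

// Hypothetical strategy.
struct MyStrategy;

impl PostProcess<TopK> for MyStrategy {
    fn post_process_with(&self, _: &TopK, mut candidates: Vec<Candidate>, k: usize) -> Vec<u32> {
        candidates.sort_by(|a, b| a.1.total_cmp(&b.1));
        candidates.into_iter().take(k).map(|(id, _)| id).collect()
    }
}

impl PostProcess<DenyList> for MyStrategy {
    fn post_process_with(&self, deny: &DenyList, candidates: Vec<Candidate>, k: usize) -> Vec<u32> {
        let kept: Vec<Candidate> = candidates
            .into_iter()
            .filter(|(id, _)| !deny.0.contains(id))
            .collect();
        <Self as PostProcess<TopK>>::post_process_with(self, &TopK, kept, k)
    }
}

impl HasDefaultProcessor for MyStrategy {
    type Processor = TopK;
    fn default_processor(&self) -> TopK {
        TopK
    }
}

// Blanket delegation: passing `DefaultPostProcess` forwards to whatever
// processor the strategy names as its default.
impl<S> PostProcess<DefaultPostProcess> for S
where
    S: HasDefaultProcessor + PostProcess<<S as HasDefaultProcessor>::Processor>,
{
    fn post_process_with(&self, _: &DefaultPostProcess, candidates: Vec<Candidate>, k: usize) -> Vec<u32> {
        let processor = self.default_processor();
        <S as PostProcess<S::Processor>>::post_process_with(self, &processor, candidates, k)
    }
}

// Call-site selection: the caller decides which processor runs.
fn knn_with<S, P>(strategy: &S, processor: &P, candidates: Vec<Candidate>, k: usize) -> Vec<u32>
where
    S: PostProcess<P>,
{
    strategy.post_process_with(processor, candidates, k)
}
```

In this sketch, calling knn_with(&MyStrategy, &DefaultPostProcess, ...) takes the default path, while passing &DenyList(...) swaps in a caller-supplied processor without touching the strategy type.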
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| diskann/src/graph/test/provider.rs | Updates test strategy to implement HasDefaultProcessor and wires delete-search post-processor. |
| diskann/src/graph/search/range_search.rs | Switches range search to post_process_with(&DefaultPostProcess, ...). |
| diskann/src/graph/search/multihop_search.rs | Switches multihop search to post_process_with(&DefaultPostProcess, ...). |
| diskann/src/graph/search/mod.rs | Re-exports KnnWith for explicit post-processing usage. |
| diskann/src/graph/search/knn_search.rs | Adds KnnWith<PP> and centralizes shared KNN logic in search_core. |
| diskann/src/graph/search/diverse_search.rs | Updates diverse search to PostProcess-based post-processing. |
| diskann/src/graph/index.rs | Adapts flat search and delete-search flows to HasDefaultProcessor / PostProcess. |
| diskann/src/graph/glue.rs | Introduces the new post-processing traits, macro, and DefaultPostProcess marker. |
| diskann-providers/src/model/graph/provider/layers/betafilter.rs | Delegates default post-processor construction via HasDefaultProcessor + Pipeline. |
| diskann-providers/src/model/graph/provider/async_/inmem/test.rs | Updates test strategies to HasDefaultProcessor. |
| diskann-providers/src/model/graph/provider/async_/inmem/spherical.rs | Migrates in-mem spherical strategies to HasDefaultProcessor. |
| diskann-providers/src/model/graph/provider/async_/inmem/scalar.rs | Migrates in-mem scalar strategies to HasDefaultProcessor. |
| diskann-providers/src/model/graph/provider/async_/inmem/product.rs | Migrates product-quantized strategies and delete-search processor typing. |
| diskann-providers/src/model/graph/provider/async_/inmem/full_precision.rs | Migrates full-precision strategies and delete-search processor typing. |
| diskann-providers/src/model/graph/provider/async_/debug_provider.rs | Migrates debug provider strategies and delete-search processor typing. |
| diskann-providers/src/model/graph/provider/async_/caching/provider.rs | Delegates default processor through caching layer and propagates delete-search processor. |
| diskann-providers/src/model/graph/provider/async_/bf_tree/provider.rs | Migrates BF-tree strategies and delete-search processor typing. |
| diskann-providers/src/index/wrapped_async.rs | Propagates HasDefaultProcessor bound through async wrapper APIs. |
| diskann-providers/src/index/diskann_async.rs | Updates async index tests/constraints to require HasDefaultProcessor. |
| diskann-label-filter/src/inline_beta_search/inline_beta_filter.rs | Delegates default processor via HasDefaultProcessor for inline beta strategy. |
| diskann-disk/src/search/provider/disk_provider.rs | Implements HasDefaultProcessor for disk search strategy (e.g., RerankAndFilter). |
| diskann-benchmark/src/backend/index/benchmarks.rs | Updates benchmark generic bounds to include HasDefaultProcessor. |
| diskann-benchmark-core/src/search/graph/range.rs | Updates benchmark-core range search bounds to include HasDefaultProcessor. |
| diskann-benchmark-core/src/search/graph/multihop.rs | Updates benchmark-core multihop bounds to include HasDefaultProcessor. |
| diskann-benchmark-core/src/search/graph/knn.rs | Updates benchmark-core KNN bounds to include HasDefaultProcessor. |
hildebrandmw left a comment
Thanks @narendatha - this is headed in the right direction, but I found some pretty large structural issues. Basically, you're helping to untangle a pretty large knot here (thank you so much!) and we should lean into the simplification that this approach can allow.
diskann-providers/src/model/graph/provider/async_/inmem/full_precision.rs (outdated, resolved)
diskann-providers/src/model/graph/provider/async_/determinant_diversity_post_process.rs (resolved)
- Simplify Search trait: move processor/output buffer to method-level generics
- Remove Internal<FullPrecision> strategy split; use RemoveDeletedIdsAndCopy for delete ops
- Add DefaultSearchStrategy aggregate trait combining SearchStrategy + HasDefaultProcessor
- Update benchmark-core helpers to use aggregate trait (reduce recurring bounds)
- Wire range search output buffer through to caller (support dynamic output handling)
- Add no-op SearchOutputBuffer impl for () type to preserve compatibility
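A rough illustration of two items from the commit above: an aggregate trait that collapses a recurring pair of bounds, and a no-op output buffer for (). The trait names DefaultSearchStrategy, SearchStrategy, HasDefaultProcessor, and SearchOutputBuffer are taken from the commit message; the methods nearest, keep, and push, the Demo type, and run_search are hypothetical stand-ins, not the crate's actual API.

```rust
// Hypothetical, heavily simplified stand-ins for the real traits.
trait SearchStrategy {
    fn nearest(&self) -> (u32, f32);
}

trait HasDefaultProcessor {
    fn keep(&self, distance: f32) -> bool;
}

// Aggregate trait: one bound instead of two at every call site.
trait DefaultSearchStrategy: SearchStrategy + HasDefaultProcessor {}

// Blanket impl: anything satisfying both parts gets the aggregate for free.
impl<T: SearchStrategy + HasDefaultProcessor> DefaultSearchStrategy for T {}

// Output sink; the `push` signature is an assumption for this sketch.
trait SearchOutputBuffer {
    fn push(&mut self, id: u32, distance: f32);
}

impl SearchOutputBuffer for Vec<(u32, f32)> {
    fn push(&mut self, id: u32, distance: f32) {
        Vec::push(self, (id, distance));
    }
}

// No-op buffer: callers that do not need results can pass `&mut ()`.
impl SearchOutputBuffer for () {
    fn push(&mut self, _id: u32, _distance: f32) {}
}

// A benchmark-style helper with a single strategy bound.
// Returns the number of results offered to the buffer.
fn run_search<S, B>(strategy: &S, out: &mut B) -> usize
where
    S: DefaultSearchStrategy,
    B: SearchOutputBuffer,
{
    let (id, distance) = strategy.nearest();
    if strategy.keep(distance) {
        out.push(id, distance);
        1
    } else {
        0
    }
}

// Hypothetical strategy used only to exercise the helper.
struct Demo;

impl SearchStrategy for Demo {
    fn nearest(&self) -> (u32, f32) {
        (7, 0.25)
    }
}

impl HasDefaultProcessor for Demo {
    fn keep(&self, distance: f32) -> bool {
        distance < 1.0
    }
}
```

The blanket impl means no type ever implements the aggregate trait by hand, so the shorter bound cannot drift out of sync with its two parts.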
…roviders

This commit moves the determinant_diversity_post_process module from diskann to diskann-providers, as it does not depend on diskann internals and logically belongs with other post-processing logic in the providers layer.

Changes:
- Move determinant_diversity_post_process.rs from diskann/src/graph/search/ to diskann-providers/src/model/graph/provider/async_/
- Update all imports across workspace to use diskann_providers location
- Add diskann-providers dependency to diskann-benchmark-core (required for DeterminantDiversitySearchParams access)
- Remove old module reference from diskann/src/graph/search/mod.rs
- Update diskann-benchmark, diskann-disk imports to use new location

Validated with:
- cargo clippy --workspace --all-targets -- -D warnings
- cargo fmt --all

This results in cleaner architectural separation where determinant-diversity search parameters stay with the provider infrastructure that implements them.
…uce clones

- Add DeterminantDiversityError enum for parameter validation
- Convert DeterminantDiversitySearchParams::new() to return Result<Self, Error>
- Validate top_k > 0, eta >= 0.0 and finite, power > 0.0 and finite
- Optimize post_process_with_eta_f32: precompute projections to eliminate vector clones
- Optimize post_process_greedy_orthogonalization_f32: single r_star_copy before projection loop
- Expand test suite from 3 to 11 tests (7 validation + 4 algorithm tests)
- Update callsites in disk_index/search.rs and index/search/knn.rs for error handling
- Add early validation checks in main router function
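The validation rules listed in the commit above (top_k > 0, eta finite and >= 0.0, power finite and > 0.0) could look roughly like the following. The type names DeterminantDiversityError and DeterminantDiversitySearchParams come from the commit message; the variant names, field types (f32 here), and error messages are assumptions made for illustration.

```rust
use std::fmt;

// Variant names are hypothetical; the commit only names the enum itself.
#[derive(Debug, Clone, PartialEq)]
enum DeterminantDiversityError {
    ZeroTopK,
    InvalidEta(f32),
    InvalidPower(f32),
}

impl fmt::Display for DeterminantDiversityError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::ZeroTopK => write!(f, "top_k must be greater than 0"),
            Self::InvalidEta(v) => write!(f, "eta must be finite and >= 0.0, got {v}"),
            Self::InvalidPower(v) => write!(f, "power must be finite and > 0.0, got {v}"),
        }
    }
}

impl std::error::Error for DeterminantDiversityError {}

// Field types are assumed; the real struct may differ.
#[derive(Debug, Clone, PartialEq)]
struct DeterminantDiversitySearchParams {
    top_k: usize,
    eta: f32,
    power: f32,
}

impl DeterminantDiversitySearchParams {
    // Fallible constructor: invalid parameters are rejected up front
    // instead of surfacing later inside the post-processing loop.
    fn new(top_k: usize, eta: f32, power: f32) -> Result<Self, DeterminantDiversityError> {
        if top_k == 0 {
            return Err(DeterminantDiversityError::ZeroTopK);
        }
        // `is_finite` rejects NaN and both infinities.
        if !eta.is_finite() || eta < 0.0 {
            return Err(DeterminantDiversityError::InvalidEta(eta));
        }
        if !power.is_finite() || power <= 0.0 {
            return Err(DeterminantDiversityError::InvalidPower(power));
        }
        Ok(Self { top_k, eta, power })
    }
}
```

Checking is_finite before the range comparison matters because every comparison with NaN is false, so a bare `eta < 0.0` test would let NaN through.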

- Extract shared run-loop logic into reusable helpers
- Route both knn and determinant-diversity through closure-based parameter builders
- Preserve determinant-diversity parameter validation/error propagation
- Reduce duplicated benchmark orchestration code

- Promote DelegateDefaultPostProcessor as the canonical trait in glue
- Remove compatibility alias layer for HasDefaultProcessor
- Rename all trait bounds/impls/usages across diskann, providers, disk, benchmark, and label-filter
- Keep delegate_default_post_process! macro usage aligned with trait naming

- Add runtime filter_start_points flag to RemoveDeletedIdsAndCopy and Rerank
- Route default search through runtime-configurable processors (no FilterStartPoints pipeline)
- Set inplace-delete search processors to filter_start_points=false
- Remove Internal<T> strategy indirection and update async providers accordingly
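To make the runtime flag concrete, here is a minimal sketch of a processor whose start-point filtering is switched by a field rather than by composing a separate pipeline stage. RemoveDeletedIdsAndCopy and filter_start_points are named in the commit above; the process method, its arguments, and the use of HashSet are assumptions for this example only.

```rust
use std::collections::HashSet;

// One processor type; behavior is selected at runtime by the flag.
struct RemoveDeletedIdsAndCopy {
    filter_start_points: bool,
}

impl RemoveDeletedIdsAndCopy {
    // Hypothetical signature: copies surviving ids into a new Vec, preserving order.
    fn process(
        &self,
        candidates: &[u32],
        deleted: &HashSet<u32>,
        start_points: &HashSet<u32>,
    ) -> Vec<u32> {
        candidates
            .iter()
            .copied()
            // Deleted ids are always removed.
            .filter(|id| !deleted.contains(id))
            // Start points are removed only when the flag is set.
            .filter(|id| !(self.filter_start_points && start_points.contains(id)))
            .collect()
    }
}
```

Under this reading, normal search would construct the processor with filter_start_points set to true, and the inplace-delete path would set it to false, so both paths share one processor type.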

- Preserve inner search_post_processor for Cached<S> inplace-delete path
- Add CachedPostProcess wrapper to avoid PostProcess impl overlap
- Keep default post-processing delegation unchanged for normal search
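The commit above does not spell out which impls would overlap, so the following is only one plausible reading: a fully generic forwarding impl on Cached<S> would collide with the default-delegation impl, and wrapping the processor in a newtype gives the forwarding impl a distinct trait parameter. Cached, CachedPostProcess, PostProcess, and DefaultPostProcess are names from this PR; the run method, Inner, and KeepFirst are hypothetical.

```rust
// Simplified bridge trait; `run` is a hypothetical method.
trait PostProcess<P> {
    fn run(&self, processor: &P, ids: Vec<u32>) -> Vec<u32>;
}

struct DefaultPostProcess;

// Caching layer around an inner strategy.
struct Cached<S>(S);

// Newtype wrapper around a processor meant for the inner strategy.
struct CachedPostProcess<P>(P);

// Default path: delegate to the inner strategy's default behavior.
impl<S> PostProcess<DefaultPostProcess> for Cached<S>
where
    S: PostProcess<DefaultPostProcess>,
{
    fn run(&self, processor: &DefaultPostProcess, ids: Vec<u32>) -> Vec<u32> {
        self.0.run(processor, ids)
    }
}

// A fully generic forwarding impl would overlap with the one above
// whenever P = DefaultPostProcess, and the compiler rejects it:
//
//     impl<S, P> PostProcess<P> for Cached<S> where S: PostProcess<P> { ... }
//
// Keying the impl on the wrapper type keeps the two impls disjoint.
impl<S, P> PostProcess<CachedPostProcess<P>> for Cached<S>
where
    S: PostProcess<P>,
{
    fn run(&self, processor: &CachedPostProcess<P>, ids: Vec<u32>) -> Vec<u32> {
        self.0.run(&processor.0, ids)
    }
}

// Hypothetical inner strategy and explicit processor.
struct Inner;
struct KeepFirst(usize);

impl PostProcess<DefaultPostProcess> for Inner {
    fn run(&self, _: &DefaultPostProcess, ids: Vec<u32>) -> Vec<u32> {
        ids
    }
}

impl PostProcess<KeepFirst> for Inner {
    fn run(&self, processor: &KeepFirst, ids: Vec<u32>) -> Vec<u32> {
        ids.into_iter().take(processor.0).collect()
    }
}
```

The wrapper costs one extra constructor call at the call site, but it lets the default path and the explicit path coexist on the same Cached<S> type.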
Summary
This PR refactors search post-processing to an explicit processor model, adds determinant-diversity reranking support in benchmark paths, and extends disk strategy support with end-to-end benchmark wiring and inputs.
What
1) Search post-processing architecture refactor
- Move post-processing out of SearchStrategy into explicit processor contracts.
- Introduce PostProcess, HasDefaultProcessor, and default delegation mechanisms.
- Update search flows (knn, multihop, range, and related paths) to use explicit post-process dispatch.

2) Search API updates
- … search_with(...).
- Preserve search(...) behavior via default processor delegation for compatibility.

3) Determinant-diversity support (core + benchmark)

4) Disk strategy integration
- … DeterminantDiversitySearchParams.
- … determinant_diversity_eta, determinant_diversity_power, determinant_diversity_results_k
- … (ndims(), npoints()).

Credits
All credit goes to Mark for brainstorming and proposing the best design, one that resolves many tradeoffs. Thanks to his detailed design template, Copilot was able to do most of the heavy lifting.