Add BuildTreesParallel() method to BKTree that parallelizes tree construction
using a level-order (BFS) approach instead of the existing depth-first recursive
method. At each level of the tree, all sibling nodes are processed in parallel
using OpenMP, with each thread running independent k-means clustering. The tree
structure is then assembled sequentially to maintain correctness.
This is controlled by a new ParallelBKTBuild parameter (default: false) that
can be enabled in both BKT index and SPANN select-head configurations.
Benchmark on SIFT 50M (128-dim, L2) with 32 threads on Azure L32s_v2:
- Select Head (BKT build): 16.6 hours -> 1.2 hours (13.6x speedup)
- Build Head graph (RefineGraph): unchanged (~10 hours, memory-bound)
- Total end-to-end build: ~30 hours -> ~15 hours
- Recall@1: 91% -> 94% (slight improvement)
- Query latency: comparable (P50 ~40ms)
Summary
Add a parallel BKT (Balanced K-means Tree) build option that dramatically accelerates tree construction for large datasets. The existing BuildTrees() method uses a depth-first recursive approach where each tree level is built sequentially. The new BuildTreesParallel() method uses a level-order (BFS) approach with OpenMP to process all sibling nodes at each level in parallel.
Motivation
For large-scale SPANN indexes (e.g., 50M+ vectors), the Select Head phase — where BKT construction contributes a large portion of the build time — becomes a major bottleneck. On a 50M SIFT dataset, the sequential BKT build takes ~16.6 hours with 16 threads because only 1 thread is active at deeper tree levels. The parallel approach reduces this to ~1.2 hours with 32 threads, a 13.6x speedup.
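To make the contrast with the recursive build concrete, here is a minimal sketch of a level-order build. It is not the code in this PR: Node, SplitAroundMean, and BuildLevelOrder are hypothetical names, the data is 1-D, and a two-way split around the mean stands in for per-node k-means. The shape is the point: every node on a level owns a disjoint range of ids, so the clustering step can run under an OpenMP parallel for, and the node array is then extended sequentially.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical node layout for illustration; not SPTAG's actual node type.
struct Node {
    std::size_t first, last;  // half-open range [first, last) into the id array
    int childStart;           // index of the first child in the node array, -1 for a leaf
    int childCount;
};

// Stand-in for per-node k-means: partitions ids[first, last) around the mean
// value and returns the split position. It touches only its own range.
static std::size_t SplitAroundMean(const std::vector<float>& data, std::vector<int>& ids,
                                   std::size_t first, std::size_t last) {
    double sum = 0.0;
    for (std::size_t i = first; i < last; ++i) sum += data[ids[i]];
    const float mean = static_cast<float>(sum / static_cast<double>(last - first));
    auto mid = std::partition(ids.begin() + first, ids.begin() + last,
                              [&](int id) { return data[id] < mean; });
    return static_cast<std::size_t>(mid - ids.begin());
}

std::vector<Node> BuildLevelOrder(const std::vector<float>& data, std::vector<int>& ids,
                                  std::size_t leafSize) {
    assert(leafSize > 0);  // a leaf must be allowed to hold at least one point
    ids.resize(data.size());
    std::iota(ids.begin(), ids.end(), 0);

    std::vector<Node> nodes{Node{0, ids.size(), -1, 0}};  // root covers all points
    std::vector<int> level{0};                            // node indices on the current level
    while (!level.empty()) {
        const int count = static_cast<int>(level.size());
        std::vector<std::size_t> splits(level.size());

        // Phase 1 (parallel): siblings own disjoint id ranges and the node
        // array is only read, so the iterations are independent.
        #pragma omp parallel for schedule(dynamic)
        for (int i = 0; i < count; ++i) {
            const Node& n = nodes[level[i]];
            splits[i] = (n.last - n.first > leafSize)
                            ? SplitAroundMean(data, ids, n.first, n.last)
                            : n.first;  // small enough: keep as a leaf
        }

        // Phase 2 (sequential): append children and collect the next level.
        std::vector<int> next;
        for (int i = 0; i < count; ++i) {
            const Node n = nodes[level[i]];  // copy, since push_back may reallocate
            if (splits[i] == n.first || splits[i] == n.last) continue;  // leaf or degenerate split
            const int child = static_cast<int>(nodes.size());
            nodes[level[i]].childStart = child;
            nodes[level[i]].childCount = 2;
            nodes.push_back(Node{n.first, splits[i], -1, 0});
            nodes.push_back(Node{splits[i], n.last, -1, 0});
            next.push_back(child);
            next.push_back(child + 1);
        }
        level.swap(next);
    }
    return nodes;
}
```

Compile with OpenMP enabled (for example -fopenmp on GCC or Clang) to run phase 1 in parallel. Without it the pragma is ignored and the loop runs serially with the same result. The number of nodes per level grows with depth, which is where a recursive build that keeps only one thread busy loses the most time.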
Design
The parallel build works by:
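- Processing all sibling nodes at the current level with #pragma omp parallel for
- Giving each node its own KmeansArgs with 1 thread (parallelism is at the node level, not within k-means)
- Assembling the tree structure sequentially after each level to maintain correctness
This is an opt-in feature controlled by a new ParallelBKTBuild parameter (default: false), so there is zero impact on existing behavior.
Changes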
- BKTree.h: BuildTreesParallel() method + m_parallelBuild member
- BKT/ParameterDefinitionList.h: ParallelBKTBuild parameter for BKT index
- SPANN/Options.h: m_parallelBKTBuild option
- SPANN/ParameterDefinitionList.h: ParallelBKTBuild parameter for SPANN select head
- BKTIndex.cpp, SPANNIndex.cpp: check the ParallelBKTBuild flag and dispatch accordingly
Benchmark Results
Dataset: SIFT 50M vectors, 128-dim float, L2 distance
Hardware: Azure Standard_L32s_v2 (32 vCPUs, AMD EPYC 7551, 256 GiB RAM)
BKT Build Time Comparison
Usage
Enable in config:
Or for BKT index directly: