Revise README for DiskANN3 #1046
Updated README to reflect changes in DiskANN3 and added details about the Provider API and getting started guide.
Corrected formatting and improved clarity in the README.
Pull request overview
Updates the top-level README to describe DiskANN3 (Rust main branch) and its Provider API, plus a short “Getting Started” section pointing to benchmarking and provider integration entry points.
Changes:
- Replaces the prior badge/paper-heavy intro with a DiskANN3 overview and feature list.
- Adds Provider API context and a “Getting Started” section with links to benchmarks and the provider contract.
- Moves badges and paper links into the “Legacy C++ Code” section.
Updated the README to clarify the DiskANN3 library's purpose and usage, including changes to the description of the API and algorithmic features.
Updated project name and description in README.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1046      +/-   ##
==========================================
- Coverage   90.60%   89.47%   -1.13%
==========================================
  Files         461      461
  Lines       85494    85494
==========================================
- Hits        77462    76498     -964
- Misses       8032     8996     +964
| [](https://harsha-simhadri.org/pubs/Filtered-DiskANN23.pdf) | ||
| To use DiskANN3 in your system, implement the `DataProvider` trait for your store to describe how index terms such as vectors and adjacency lists should be stored and retrieved. DiskANN3 exposes vector update and query APIs to users and internally uses your `DataProvider` implementation to serve these requests. | ||
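The provider pattern described in this README line can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: `ToyProvider`, `InMemoryStore`, and `nearest_neighbor_of` are hypothetical names, and the trait shown is not the actual DiskANN3 `DataProvider` definition, which should be taken from the provider contract linked in the README. The sketch only shows the general idea of a store that hands vectors and adjacency lists to generic index code.

```rust
use std::collections::HashMap;

// Hypothetical trait for illustration only. It is NOT the real DiskANN3
// `DataProvider`; it just models "how vectors and adjacency lists are
// stored and retrieved".
trait ToyProvider {
    fn set_vector(&mut self, id: u32, vector: Vec<f32>);
    fn get_vector(&self, id: u32) -> Option<&[f32]>;
    fn set_neighbors(&mut self, id: u32, neighbors: Vec<u32>);
    fn get_neighbors(&self, id: u32) -> Option<&[u32]>;
}

// A toy backing store that keeps everything in memory.
#[derive(Default)]
struct InMemoryStore {
    vectors: HashMap<u32, Vec<f32>>,
    adjacency: HashMap<u32, Vec<u32>>,
}

impl ToyProvider for InMemoryStore {
    fn set_vector(&mut self, id: u32, vector: Vec<f32>) {
        self.vectors.insert(id, vector);
    }

    fn get_vector(&self, id: u32) -> Option<&[f32]> {
        self.vectors.get(&id).map(|v| v.as_slice())
    }

    fn set_neighbors(&mut self, id: u32, neighbors: Vec<u32>) {
        self.adjacency.insert(id, neighbors);
    }

    fn get_neighbors(&self, id: u32) -> Option<&[u32]> {
        self.adjacency.get(&id).map(|n| n.as_slice())
    }
}

// Squared Euclidean distance between two vectors of equal length.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

// Index-side code written against the trait, not the store: one greedy step
// that returns the neighbor of `start` closest to `query`, or `None` if
// `start` has no adjacency list or none of its neighbors has a vector.
fn nearest_neighbor_of<P: ToyProvider>(provider: &P, start: u32, query: &[f32]) -> Option<u32> {
    provider
        .get_neighbors(start)?
        .iter()
        .copied()
        .filter_map(|n| provider.get_vector(n).map(|v| (n, squared_l2(v, query))))
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(n, _)| n)
}
```

Because `nearest_neighbor_of` depends only on the trait, swapping `InMemoryStore` for a disk-backed or database-backed store would not change the search code, which is the separation the README text describes.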
| This repo offers the following Provider implementations as illustrative examples: |
Do we want to future-proof these claims with dates or version tags of competitors?
| The provider for [Cosmos DB NoSQL Vector Search](https://learn.microsoft.com/en-us/azure/cosmos-db/vector-search) is not included here but documented in the [VLDB'25 paper](https://www.vldb.org/pvldb/vol18/p5166-upreti.pdf). | ||
| The library supports the following algorithmic features: | ||
| - Real-time updates (using [IP-DiskANN](https://arxiv.org/abs/2502.13826)) that support stable recall under long update streams -- no merges, rebuilds, or patches needed. |
We also support the FreshDiskANN version, right? IP-DiskANN is just the default.
Do we support the whole tier-merge architecture in the main branch?
| The provider for [Cosmos DB NoSQL Vector Search](https://learn.microsoft.com/en-us/azure/cosmos-db/vector-search) is not included here but documented in the [VLDB'25 paper](https://www.vldb.org/pvldb/vol18/p5166-upreti.pdf). | ||
| The library supports the following algorithmic features: |
Maybe we should mention special types of search, such as range and diverse search?
| - Real-time updates (using [IP-DiskANN](https://arxiv.org/abs/2502.13826)) that support stable recall under long update streams -- no merges, rebuilds, or patches needed. | ||
| - A diverse set of distance functions and quantizers (PQ, MinMax, Scalar, Spherical) implemented for x86 and aarch64. | ||
| - Choice of memory tiers to allow operation at different price-performance points. | ||
| - Hooks to allow attribute filter (predicate) processing along with vector search. |
| @@ -24,7 +32,18 @@ See [guidelines](CONTRIBUTING.md) for contributing to this project. | |||
| ## Legacy C++ Code | |||
Mention that this is DiskANN2?