Reindexing and Incremental Indexing
Reindexing updates indexes after new data arrives so queries continue to perform well. For ANN search, new data always appears in results, but if it is not indexed yet LanceDB falls back to brute-force search for the new segments, which increases latency.
Both LanceDB OSS and Cloud support reindexing, but the workflows differ depending on the index type.
While a reindex job runs, LanceDB combines results from the existing index with exhaustive search on new data. The more unindexed data accumulates, the higher the latency cost.
Incremental Indexing in LanceDB Cloud
LanceDB Cloud & Enterprise support incremental reindexing automatically. When new data lands, a background job rebuilds indexes so they stay in sync with the data.
During rebuilding, queries still cover all data, but brute-force search handles the new rows. Set fast_search=True to limit searches to indexed data only; this avoids the brute-force cost, but rows that are not indexed yet are left out of the results.
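The trade-off can be illustrated with a toy model in plain Python. This is a sketch of the idea only, not LanceDB's implementation: the `search` function, its `fast_search` flag, and the two dictionaries are hypothetical stand-ins, and the "index" here is emulated with an exact scan rather than a real ANN structure.

```python
import heapq
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query, indexed, unindexed, k, fast_search=False):
    """Return the k nearest (distance, row_id) pairs.

    `indexed` and `unindexed` map row_id -> vector. Both are
    hypothetical stand-ins for indexed and not-yet-indexed rows.
    """
    # Stand-in for the index lookup: top-k among indexed rows.
    candidates = heapq.nsmallest(
        k, ((l2(query, vec), rid) for rid, vec in indexed.items())
    )
    if not fast_search:
        # Flat scan over every unindexed row; the cost of this step
        # grows linearly with the number of unindexed rows.
        candidates += [(l2(query, vec), rid) for rid, vec in unindexed.items()]
    # Merge both candidate sets and keep the overall top-k.
    return heapq.nsmallest(k, candidates)


indexed = {"a": [0.0, 0.0], "b": [5.0, 5.0]}
unindexed = {"c": [1.0, 1.0]}  # appended after the index was built

full = search([1.0, 1.0], indexed, unindexed, k=2)
fast = search([1.0, 1.0], indexed, unindexed, k=2, fast_search=True)

print([rid for _, rid in full])  # ['c', 'a']: the new row is found
print([rid for _, rid in fast])  # ['a', 'b']: the new row is skipped
```

In this model the default search pays for the extra scan but returns the newly added row, while the indexed-only search is cheaper and misses it until indexing catches up.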
Use index_stats() to monitor the number of unindexed rows. The value drops to zero once background indexing completes.
Incremental Indexing in LanceDB OSS
LanceDB OSS also supports incremental indexing: you can append records without rebuilding the entire index, then optimize when convenient.
New data added after building an index still appears in search results, but searches require a flat scan over the unindexed portion until reindexing completes. Cloud & Enterprise automate this merging process to minimize the impact.
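For OSS, one way to decide when "convenient" has arrived is to optimize once the unindexed backlog passes a threshold. The helper below is a minimal sketch, not an official recipe. It assumes the table object provides `add()`, `index_stats(index_name)`, and `optimize()`, and that the returned statistics expose a `num_unindexed_rows` attribute; the exact shape of the statistics may differ between LanceDB versions. The function name and the `max_unindexed_rows` threshold are hypothetical.

```python
def append_and_maybe_optimize(table, rows, index_name, max_unindexed_rows=10_000):
    """Append rows, then optimize if too many rows are unindexed.

    Returns True if an optimize pass was triggered, False otherwise.
    """
    # Appended rows are searchable right away via a flat scan.
    table.add(rows)

    # Assumption: the stats object has a `num_unindexed_rows` attribute.
    stats = table.index_stats(index_name)
    if stats is not None and stats.num_unindexed_rows > max_unindexed_rows:
        # Fold the unindexed rows into the existing index.
        table.optimize()
        return True
    return False
```

With a real table, a call might look like `append_and_maybe_optimize(table, new_rows, "vector_idx")`, where `"vector_idx"` is a placeholder for the name of your index. A larger threshold means fewer optimize passes but a longer flat scan per query in the meantime.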