Storage

LanceDB provides flexible storage backends that support both cloud object storage and local high-performance storage for different deployment scenarios.
Feature | Description | OSS | Cloud | Enterprise
Object, File, Block Storage | Support for AWS, GCS, Azure, and S3-compatible vendors.
Local SSD/NVMe Storage | Support for storage on the customer’s own servers.

Tables

LanceDB’s table abstraction provides ACID-compliant data management with schema evolution, versioning, and consistency guarantees for vector and scalar data.
Feature | Description | OSS | Cloud | Enterprise
Tables - CRUD Operations | Basic API to create, read, update, and drop tables.
Tables - Data Evolution | Alter column schema and data types; backfill and merge data.
Tables - Versioning | Append, overwrite, check versions, and tag them.
Tables - Consistency | Synchronize the database with underlying storage.
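
The versioning row can be pictured as a log of snapshots: every append or overwrite produces a new version, and a tag is a name for one of them. The sketch below is a toy model in plain Python written for illustration. The `VersionedTable` class and its methods are hypothetical and are not LanceDB's API or storage layout.

```python
class VersionedTable:
    """Toy table that keeps one snapshot per write (hypothetical, not LanceDB's API)."""

    def __init__(self, rows):
        self._snapshots = [list(rows)]  # version 1 holds the initial data
        self.tags = {}                  # tag name -> version number

    @property
    def version(self):
        # The latest version number equals the number of snapshots taken.
        return len(self._snapshots)

    def append(self, rows):
        # New version = previous rows plus the appended rows.
        self._snapshots.append(self._snapshots[-1] + list(rows))

    def overwrite(self, rows):
        # New version replaces the contents; older versions stay readable.
        self._snapshots.append(list(rows))

    def tag(self, name, version=None):
        # Attach a name to a version (the latest one by default).
        self.tags[name] = version if version is not None else self.version

    def read(self, version=None):
        # Accept a version number, a tag name, or None for the latest version.
        if isinstance(version, str):
            version = self.tags[version]
        if version is None:
            version = self.version
        return list(self._snapshots[version - 1])


table = VersionedTable([{"id": 1}])
table.append([{"id": 2}])
table.tag("before-overwrite")
table.overwrite([{"id": 3}])
```

Reading `table.read("before-overwrite")` still returns the two rows that existed before the overwrite, which is the property that makes rollback and reproducible reads possible.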

Ingestion

LanceDB’s ingestion pipeline handles both vector embedding generation and data loading with support for multiple formats and efficient batch operations.
Feature | Description | OSS | Cloud | Enterprise
Embedding - Text Data | Generate vector embeddings from text data using various embedding models.
Embedding - Multimodal Data | Generate embeddings from images, audio, and other multimodal content.
Embedding - CPU & GPU Device Configuration | Configure CPU or GPU acceleration for embedding generation.
Embedding - Environment Variables | Manage API keys and configuration for embedding model access.
Data Ingestion - Default | Formerly called Adding Data to a Table.
Data Ingestion - Formats | Pandas, Polars, PyArrow, Pydantic.
Data Ingestion - Upsert | Update existing records or insert new ones based on a key.
Data Ingestion - Merge Insert | Combine data from multiple sources into a single table.
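
The upsert and merge-insert rows share one idea: incoming rows are matched against existing rows on a key, matches are updated, and non-matches are inserted. The following is a minimal sketch of that semantics over plain Python lists of dicts. The `merge_insert` function here is a hypothetical helper for illustration, not the LanceDB call of the same name.

```python
def merge_insert(table, new_rows, key):
    """Return a new row list: rows matching on `key` are replaced, others appended."""
    # Map each existing key value to its position so matches are O(1) lookups.
    position = {row[key]: i for i, row in enumerate(table)}
    merged = list(table)
    for row in new_rows:
        if row[key] in position:
            merged[position[row[key]]] = row   # matched: update in place
        else:
            position[row[key]] = len(merged)
            merged.append(row)                 # not matched: insert
    return merged


existing = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
incoming = [{"id": 2, "name": "B"}, {"id": 3, "name": "c"}]
result = merge_insert(existing, incoming, key="id")
```

The input list is left untouched and a new list is returned, so the example can be rerun without side effects.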

Indexing

LanceDB’s indexing system provides multiple vector and scalar index types with automated optimization for fast similarity search and retrieval operations.
Feature | Description | OSS | Cloud | Enterprise
Vector Index - IVF_FLAT | Minimal index that searches IVF partitions instead of brute-forcing all vectors.
Vector Index - IVF_PQ | Default vector index using Euclidean distance.
Vector Index - IVF_SQ | IVF index built using scalar-quantized vectors.
Vector Index - IVF_HNSW_SQ | HNSW built on IVF partitions, with scalar-quantized vectors.
Vector Index - Binary | IVF_FLAT with Hamming distance for binary vectors.
Scalar Index | BTREE, BITMAP, LABEL_LIST.
Automated Indexing | Indexing happens in the background with no configuration.
Bypass Automated Indexing | For when you want to search over all available vectors.
Reindexing - Manual | User needs to request reindexing explicitly.
Reindexing - Automated | Reindexing happens in the background with no configuration.
GPU Indexing - Manual | User needs to specify which indexing device to use.
GPU Indexing - Automated | Indexing device is set automatically for the user.
Full Text Search Index | Inverted index.
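
To make the IVF_FLAT row concrete, the sketch below shows the general inverted-file idea in plain Python: vectors are grouped by their nearest centroid, and a query scans only the partitions whose centroids are closest to it. This is a simplified illustration, not LanceDB's implementation. The centroids are fixed by hand here (a real index would typically train them, for example with k-means), and `build_ivf`, `ivf_search`, and `nprobes` are names chosen for this example.

```python
import math


def build_ivf(vectors, centroids):
    """Assign each vector id to the partition of its nearest centroid."""
    partitions = {c: [] for c in range(len(centroids))}
    for vec_id, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: math.dist(vec, centroids[c]))
        partitions[nearest].append(vec_id)
    return partitions


def ivf_search(query, vectors, centroids, partitions, k=1, nprobes=1):
    """Scan only the `nprobes` partitions closest to the query, then rank by L2 distance."""
    probe = sorted(range(len(centroids)), key=lambda c: math.dist(query, centroids[c]))[:nprobes]
    candidates = [vec_id for c in probe for vec_id in partitions[c]]
    return sorted(candidates, key=lambda i: math.dist(query, vectors[i]))[:k]


vectors = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]]
centroids = [[0.0, 0.0], [5.0, 5.0]]  # assumption: hand-picked instead of trained
partitions = build_ivf(vectors, centroids)
nearest_ids = ivf_search([4.9, 5.1], vectors, centroids, partitions, k=2, nprobes=1)
```

With `nprobes=1` only two of the four vectors are compared against the query. Raising `nprobes` scans more partitions, trading speed for recall.
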

Search

LanceDB’s search capabilities combine vector similarity search, full-text search, and hybrid approaches to provide comprehensive retrieval functionality across different data types.
Feature | Description | OSS | Cloud | Enterprise
Vector Search - No Index | Scans all available vectors.
Vector Search - ANN Index | Retrieves the top K most similar vectors.
Vector Search - Multivectors | Late-interaction vector search.
Vector Search - Distance Range | Search for vectors within a specific distance threshold.
Vector Search - Binary Vectors | Search using binary vector representations for efficiency.
Vector Search - Filtering | Apply scalar filters during vector search operations.
Vector Search - Batch API | Process multiple search queries in a single request.
Vector Search - Async Indexing | Brute-force fallback for fast performance.
Full Text Search - FTS Index | Inverted index.
Full Text Search - Tokenizer | N-gram and other common methods of splitting text data.
Full Text Search - Scalar Index | BTREE, BITMAP, LABEL_LIST for non-vector data.
Full Text Search - Fuzzy Search | Search that tolerates typos in the query.
Full Text Search - Prefix Matching | Search for text that starts with specific characters.
Full Text Search - Score Boosting | Increase relevance scores for specific terms or fields.
Full Text Search - Boolean Logic | Use AND, OR, NOT operators in text search queries.
Full Text Search - Array Fields | Search within array or list data types.
Hybrid Search - FTS Index | Combine vector and full-text search in a single query.
Hybrid Search - Reranking | Reorder search results using additional ranking models.
SQL Queries | Execute standard SQL queries on LanceDB tables.
Query Optimization | Explain query plan, analyze query plan, optimization config settings.
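
Hybrid search has to merge two ranked lists, one from vector search and one from full-text search, into a single ordering. One common fusion method is reciprocal rank fusion (RRF), sketched below in plain Python. It is shown only to illustrate the reranking step; the function name, the constant `k=60`, and the sample result ids are assumptions for this example rather than details taken from this page.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per id."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]  # hypothetical ids ranked by vector similarity
fts_hits = ["c", "a", "d"]     # hypothetical ids ranked by full-text score
fused = reciprocal_rank_fusion([vector_hits, fts_hits])
```

Because RRF uses only ranks, it does not need the two searches to produce comparable scores, which is why it is a popular default before applying heavier model-based rerankers.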

Filtering

LanceDB’s filtering system provides flexible query capabilities that can be applied independently or in combination with vector and full-text search operations.
Feature | Description | OSS | Cloud | Enterprise
Filtering - No Vector Search | Apply filters without vector search operations.
Filtering - Vector Search | Apply filters during vector search operations.
Filtering - Full Text Search | Apply filters during full-text search operations.
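
When a filter is combined with vector search, the order of the two steps matters. The sketch below contrasts filtering before the top-k selection (pre-filtering) with filtering after it (post-filtering), using brute-force search over a tiny in-memory list. It is a conceptual illustration in plain Python; the helper names and the sample rows are made up for this example and do not reflect LanceDB's API.

```python
import math


def top_k(query, rows, k):
    """Brute-force nearest neighbours by Euclidean distance."""
    return sorted(rows, key=lambda r: math.dist(query, r["vector"]))[:k]


def prefilter_search(query, rows, predicate, k):
    # Filter first, then rank: returns k rows whenever k rows match the filter.
    return top_k(query, [r for r in rows if predicate(r)], k)


def postfilter_search(query, rows, predicate, k):
    # Rank first, then filter: may return fewer than k rows.
    return [r for r in top_k(query, rows, k) if predicate(r)]


rows = [
    {"id": 1, "vector": [0.0, 0.0], "price": 50},
    {"id": 2, "vector": [0.1, 0.0], "price": 5},
    {"id": 3, "vector": [1.0, 1.0], "price": 8},
    {"id": 4, "vector": [3.0, 3.0], "price": 70},
]


def is_cheap(row):
    return row["price"] < 10


pre_ids = [r["id"] for r in prefilter_search([0.0, 0.0], rows, is_cheap, k=2)]
post_ids = [r["id"] for r in postfilter_search([0.0, 0.0], rows, is_cheap, k=2)]
```

Here pre-filtering returns both cheap rows, while post-filtering returns only one, because the closest row overall fails the filter and is dropped after the top-k cut.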