With LanceDB’s GPU-powered indexing you can build vector indexes over billions of rows in just a few hours, dramatically reducing ingestion time. In internal tests, GPU indexing processed billions of vectors in under four hours.

Automatic GPU Indexing in LanceDB Enterprise

Automatic GPU indexing is currently available only in LanceDB Enterprise. Contact us to enable the feature.
Whenever you call create_index, Enterprise automatically selects GPU resources to build IVF or HNSW indexes. Indexing is asynchronous; call wait_for_index() to block until completion.
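A minimal sketch of the asynchronous flow described above. The column name and the `"<column>_idx"` index-naming convention are assumptions for illustration; `tbl` is an already-open Enterprise table handle.

```python
def create_and_wait(tbl, column: str = "vector"):
    """Kick off an asynchronous GPU index build and block until it is ready.

    Assumes the default "<column>_idx" index-name convention; adjust the
    name if your deployment uses a different one.
    """
    # create_index returns immediately; the GPU build runs in the background.
    tbl.create_index(metric="cosine", vector_column_name=column)
    # Block until the index is fully built before issuing queries.
    tbl.wait_for_index([f"{column}_idx"])
```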

Manual GPU Indexing in LanceDB OSS

Use the Python SDK with PyTorch ≥ 2.0 to manually create IVF_PQ indexes on GPUs. GPU indexing currently requires the synchronous SDK. Specify the device via the accelerator parameter ("cuda" on Linux/NVIDIA, "mps" on Apple Silicon).

GPU Indexing on Linux
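A minimal sketch of building an IVF_PQ index on an NVIDIA GPU with the synchronous Python SDK. The database path, table name, and index parameter values are illustrative; `accelerator="cuda"` is the only GPU-specific argument.

```python
def build_ivf_pq_cuda(db_path: str, table_name: str):
    """Build an IVF_PQ index on an NVIDIA GPU.

    Requires lancedb and a CUDA-enabled PyTorch >= 2.0 build.
    The parameter values below are illustrative, not requirements.
    """
    import lancedb

    db = lancedb.connect(db_path)
    tbl = db.open_table(table_name)
    tbl.create_index(
        metric="l2",          # or "cosine"
        num_partitions=1024,  # IVF cells; scales GPU memory use
        num_sub_vectors=96,   # PQ sub-vectors; must divide the vector dim
        accelerator="cuda",   # run k-means training on the GPU
    )
```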

GPU Indexing on macOS (Apple Silicon)
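On Apple Silicon the only change is the accelerator argument: pass `"mps"` to train on the Metal Performance Shaders backend. Path, table name, and parameter values are again illustrative.

```python
def build_ivf_pq_mps(db_path: str, table_name: str):
    """Build an IVF_PQ index on Apple Silicon via the MPS backend.

    Requires lancedb and PyTorch >= 2.0 with MPS support (macOS).
    """
    import lancedb

    db = lancedb.connect(db_path)
    tbl = db.open_table(table_name)
    tbl.create_index(
        metric="l2",
        num_partitions=1024,
        num_sub_vectors=96,
        accelerator="mps",  # use the Apple GPU instead of CUDA
    )
```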

Performance Considerations

  • GPU memory usage scales with num_partitions and vector dimension.
  • Ensure GPU memory comfortably exceeds the dataset you’re indexing.
  • Batch size is tuned automatically based on available GPU memory; more free memory allows larger batches and higher throughput.
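To see why memory scales with `num_partitions` and vector dimension, note that IVF training keeps one centroid per partition on the GPU. The helper below is a rough back-of-the-envelope estimator (the centroid term only, not the full working set), written for illustration.

```python
def centroid_memory_bytes(num_partitions: int, dim: int, dtype_bytes: int = 4) -> int:
    """Estimate GPU memory for IVF centroids alone.

    IVF k-means keeps num_partitions centroids of `dim` float32 values
    resident on the GPU; training batches add to this, so treat the
    result as a lower bound.
    """
    return num_partitions * dim * dtype_bytes

# Example: 4096 partitions of 768-dim float32 centroids -> 12 MiB
centroid_memory_bytes(4096, 768)  # 12582912 bytes
```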

Troubleshooting

If you encounter AssertionError: Torch not compiled with CUDA enabled, install a PyTorch build that includes CUDA support.
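Before starting a long index build, you can verify that your PyTorch build actually includes CUDA support and can see a GPU. A quick check, assuming torch is installed:

```python
def cuda_ready() -> bool:
    """True only if this PyTorch build was compiled with CUDA
    and at least one GPU is visible to it."""
    import torch
    return torch.backends.cuda.is_built() and torch.cuda.is_available()
```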