1. Install LanceDB
Install LanceDB in your client SDK.
2. Connect to a LanceDB database
Using LanceDB’s open source version is as simple as running the following import statement, with no servers needed!
Connect to a database
Once you import LanceDB as a library, you can connect to a LanceDB database by specifying a local file path.
LanceDB Cloud or Enterprise versions
If you want a fully managed solution, you can opt for LanceDB Cloud, which provides managed infrastructure, security, and automatic backups. Simply replace the local path with a remote URI that points to where your data is stored.
3. Obtain some data
LanceDB uses the notion of tables, where each row represents a record, and each column represents a field and/or its metadata. The simplest way to begin is to define a list of objects, where each object contains a vector field (list of floats) and optional fields for metadata.
Let’s look at an example. We have the following records of characters in an adventure board game. The vector field is a list of floats. In the real world, these would contain hundreds of floating-point values and be generated via an embedding model, but the example below shows a simple version with just 3 values.
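A minimal sketch of such records as a list of Python dictionaries. The knight, ranger, and priest rows reuse the values that appear in the result tables later in this guide; treating these three as the initial batch, and storing id as an integer, are assumptions.

```python
# Each dictionary is one row: an id, a text label, and a 3-value vector.
# Which records belong to the initial batch is an assumption for illustration.
data = [
    {"id": 1, "text": "knight", "vector": [0.9, 0.4, 0.8]},
    {"id": 2, "text": "ranger", "vector": [0.8, 0.4, 0.7]},
    {"id": 9, "text": "priest", "vector": [0.6, 0.2, 0.6]},
]

# Every vector in a table must have the same number of dimensions.
dims = {len(row["vector"]) for row in data}
print(dims)  # prints {3}
```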
5. Create a table
Next, let’s create a Table in LanceDB and ingest the data into it.
If not provided explicitly, the table infers the schema from the data you provide.
If the table already exists, you’ll get an error message. To replace the existing table instead, pass the mode="overwrite" parameter.
6. Vector search
Now, let’s perform a vector similarity search. The query vector should have the same dimensionality as your data vectors. The search returns the most similar vectors based on Euclidean distance. Our query is a vector that represents a warrior, which isn’t in the data we ingested.
We’ll find the result that’s most similar to it!
knight is the most similar adventurer to the warrior from our query!
| id | text | vector | _distance |
|---|---|---|---|
| 1 | knight | [0.9, 0.4, 0.8] | 0.02 |
| 2 | ranger | [0.8, 0.4, 0.7] | 0.02 |
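To see where distance values like these can come from, the sketch below computes squared Euclidean distances in plain Python. The query vector [0.8, 0.3, 0.8] is an assumption (the query itself is not shown here), and so is the reading of _distance as the squared rather than the plain Euclidean distance; under both assumptions the results round to the 0.02 shown in the table.

```python
def squared_l2(a, b):
    """Sum of squared element-wise differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

query = [0.8, 0.3, 0.8]  # assumed "warrior" query vector

rows = {
    "knight": [0.9, 0.4, 0.8],
    "ranger": [0.8, 0.4, 0.7],
}

distances = {name: squared_l2(query, vec) for name, vec in rows.items()}
for name, dist in distances.items():
    print(name, round(dist, 2))
# knight 0.02
# ranger 0.02
```

The two rows tie after rounding, so their relative order depends on floating-point precision rather than on a meaningful difference in similarity.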
Use the to_pandas() method to display the results as a Pandas DataFrame.
7. Add data and run more queries
If you obtain more data, it’s simple to add it to an existing table. In the same script or a new one, you can connect to the LanceDB database, open an existing table, and use the table.add command.
Now let’s run another search, this time with a query vector that represents a wizard.
| id | text | vector | _distance |
|---|---|---|---|
| 7 | mage | [0.6, 0.3, 0.4] | 0.02 |
| 9 | priest | [0.6, 0.2, 0.6] | 0.03 |
mage is the most magical of all our characters!