Lesson 4.3: Vector Databases
Duration: 75 minutes
Learning Objectives
By the end of this lesson, you will be able to:
- Explain what vector databases are and why they are needed
- Compare popular vector database options
- Use Chroma for local development
- Use Pinecone for production deployments
- Implement basic CRUD operations with vector stores
Why Vector Databases
In the previous lesson, we built a simple in-memory vector store. This approach has limitations:
- Memory Limits: Cannot handle millions of documents
- No Persistence: Data is lost when the application restarts
- Slow Search: Linear scan through all vectors is inefficient
- No Scaling: Cannot distribute across multiple machines
Vector databases solve these problems with:
- Efficient Indexing: Specialized algorithms for fast similarity search
- Persistence: Data survives application restarts
- Scalability: Handle billions of vectors
- Filtering: Combine vector search with metadata filters
How Vector Search Works
Traditional databases use exact matching (SQL WHERE clauses). Vector databases use Approximate Nearest Neighbor (ANN) algorithms:
Exact Search (Brute Force):
- Compare query to every vector
- O(n) complexity
- Slow for large datasets
ANN Search (Vector DB):
- Build an index structure ahead of time
- Search only the most promising regions of the index
- Roughly O(log n) query complexity
- Trades a small amount of recall for large speed gains
- Fast even for billions of vectors
Common ANN algorithms include:
- HNSW (Hierarchical Navigable Small World): Used by Chroma, Qdrant, and Weaviate
- IVF (Inverted File Index): Used by Faiss
- ScaNN (Scalable Nearest Neighbors): Used by Google
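To make the brute-force baseline concrete, here is a minimal linear-scan search in TypeScript, the O(n) approach that an ANN index replaces. The `cosineSimilarity` helper and the in-memory `vectors` array are illustrative, not part of any vector database library:

```typescript
// Brute-force nearest-neighbor search: compare the query to every stored vector.
// This is the O(n) baseline that ANN indexes are designed to avoid.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function bruteForceSearch(
  query: number[],
  vectors: { id: string; values: number[] }[],
  topK: number
): { id: string; score: number }[] {
  return vectors
    .map((v) => ({ id: v.id, score: cosineSimilarity(query, v.values) }))
    .sort((a, b) => b.score - a.score) // highest similarity first
    .slice(0, topK);
}
```

This works fine for a few thousand vectors; the databases below exist because it does not scale to millions.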
Popular Vector Databases
| Database | Type | Best For |
|---|---|---|
| Chroma | Local/Embedded | Development, small projects |
| Pinecone | Cloud | Production, managed service |
| Weaviate | Self-hosted/Cloud | Full-text + vector search |
| Qdrant | Self-hosted/Cloud | High performance, Rust-based |
| Milvus | Self-hosted | Enterprise, large scale |
| pgvector | Postgres extension | Existing Postgres users |
For this course, we focus on:
- Chroma: Easy local development
- Pinecone: Production-ready cloud service
Setting Up Chroma
Chroma is a lightweight vector database perfect for development and small applications.
Installation
```bash
npm install chromadb
```
Basic Usage
```typescript
import { ChromaClient } from 'chromadb';

async function main() {
  // Create a client (connects to a local Chroma server)
  const client = new ChromaClient();

  // Create or get a collection
  const collection = await client.getOrCreateCollection({
    name: 'my_documents',
    metadata: { description: 'Product documentation' },
  });

  console.log(`Collection: ${collection.name}`);
}

main().catch(console.error);
```
Adding Documents
Chroma can generate embeddings automatically using its default embedding function:
```typescript
import { ChromaClient } from 'chromadb';

const client = new ChromaClient();
const collection = await client.getOrCreateCollection({
  name: 'documents',
});

// Add documents with automatic embedding
await collection.add({
  ids: ['doc1', 'doc2', 'doc3'],
  documents: [
    'How to reset your password in the settings menu',
    'Enable two-factor authentication for security',
    'Contact support for billing questions',
  ],
  metadatas: [
    { category: 'security', priority: 'high' },
    { category: 'security', priority: 'medium' },
    { category: 'billing', priority: 'low' },
  ],
});

console.log(`Added ${await collection.count()} documents`);
```
Querying Documents
```typescript
// Search for similar documents
const results = await collection.query({
  queryTexts: ['How do I change my password?'],
  nResults: 3,
});

console.log('Search Results:');
for (let i = 0; i < results.ids[0].length; i++) {
  console.log(`\nID: ${results.ids[0][i]}`);
  console.log(`Document: ${results.documents[0][i]}`);
  console.log(`Distance: ${results.distances?.[0][i]}`);
  console.log(`Metadata: ${JSON.stringify(results.metadatas?.[0][i])}`);
}
```
Filtering with Metadata
```typescript
// Search with a metadata filter
const securityDocs = await collection.query({
  queryTexts: ['authentication help'],
  nResults: 5,
  where: { category: 'security' },
});

// Complex filters
const highPriorityDocs = await collection.query({
  queryTexts: ['help'],
  nResults: 5,
  where: {
    $and: [{ category: 'security' }, { priority: 'high' }],
  },
});
```
Complete Chroma Example
Here is a complete example of a document Q&A system using Chroma:
```typescript
import { ChromaClient, Collection } from 'chromadb';

interface DocumentResult {
  id: string;
  content: string;
  score: number;
  metadata: Record<string, string>;
}

class DocumentStore {
  private client: ChromaClient;
  private collection: Collection | null = null;

  constructor() {
    this.client = new ChromaClient();
  }

  async initialize(collectionName: string): Promise<void> {
    this.collection = await this.client.getOrCreateCollection({
      name: collectionName,
    });
  }

  async addDocuments(
    documents: { id: string; content: string; metadata?: Record<string, string> }[]
  ): Promise<void> {
    if (!this.collection) throw new Error('Not initialized');
    await this.collection.add({
      ids: documents.map((d) => d.id),
      documents: documents.map((d) => d.content),
      metadatas: documents.map((d) => d.metadata || {}),
    });
  }

  async search(
    query: string,
    limit: number = 5,
    filter?: Record<string, string>
  ): Promise<DocumentResult[]> {
    if (!this.collection) throw new Error('Not initialized');
    const results = await this.collection.query({
      queryTexts: [query],
      nResults: limit,
      where: filter,
    });
    return results.ids[0].map((id, index) => ({
      id,
      content: results.documents[0][index] || '',
      // Convert distance to similarity (assumes cosine distance, which lies in [0, 2])
      score: 1 - (results.distances?.[0][index] || 0),
      metadata: (results.metadatas?.[0][index] as Record<string, string>) || {},
    }));
  }

  async deleteDocument(id: string): Promise<void> {
    if (!this.collection) throw new Error('Not initialized');
    await this.collection.delete({ ids: [id] });
  }

  async updateDocument(
    id: string,
    content: string,
    metadata?: Record<string, string>
  ): Promise<void> {
    if (!this.collection) throw new Error('Not initialized');
    await this.collection.update({
      ids: [id],
      documents: [content],
      metadatas: metadata ? [metadata] : undefined,
    });
  }
}
```
```typescript
// Usage
async function main() {
  const store = new DocumentStore();
  await store.initialize('product_docs');

  // Add documents
  await store.addDocuments([
    {
      id: 'password-reset',
      content:
        'To reset your password: 1. Go to Settings 2. Click Security 3. Select Reset Password 4. Follow the email instructions',
      metadata: { category: 'security', version: '2.0' },
    },
    {
      id: '2fa-setup',
      content:
        'Enable two-factor authentication: 1. Go to Settings 2. Click Security 3. Enable 2FA 4. Scan QR code with authenticator app',
      metadata: { category: 'security', version: '2.0' },
    },
    {
      id: 'billing-contact',
      content:
        'For billing inquiries, email billing@example.com or call 1-800-EXAMPLE during business hours (9 AM - 5 PM EST)',
      metadata: { category: 'billing', version: '1.0' },
    },
  ]);

  // Search
  console.log('=== General Search ===');
  const results = await store.search('How do I secure my account?');
  for (const result of results) {
    console.log(`\n[${result.id}] Score: ${result.score.toFixed(3)}`);
    console.log(`Category: ${result.metadata.category}`);
    console.log(`Content: ${result.content.substring(0, 100)}...`);
  }

  // Filtered search
  console.log('\n=== Security Category Only ===');
  const securityResults = await store.search('help', 5, {
    category: 'security',
  });
  for (const result of securityResults) {
    console.log(`[${result.id}] ${result.metadata.category}`);
  }
}

main().catch(console.error);
```
Setting Up Pinecone
Pinecone is a managed cloud vector database for production applications.
Installation
```bash
npm install @pinecone-database/pinecone
```
Creating an Index
First, create an index in the Pinecone console or via API:
```typescript
import { Pinecone } from '@pinecone-database/pinecone';

const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY!,
});

// Create an index (do this once)
async function createIndex() {
  await pinecone.createIndex({
    name: 'documents',
    dimension: 1536, // Must match your embedding model
    metric: 'cosine',
    spec: {
      serverless: {
        cloud: 'aws',
        region: 'us-east-1',
      },
    },
  });
  console.log('Index created. Waiting for it to be ready...');

  // Crude fixed wait; production code should poll the index status instead
  await new Promise((resolve) => setTimeout(resolve, 60_000));
}
```
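Instead of sleeping for a fixed 60 seconds, you can poll until the index reports ready. Below is a minimal sketch; the generic `waitUntilReady` helper is hypothetical, and the commented usage assumes your SDK version's `describeIndex` response exposes a `status.ready` flag (check the SDK's types for your version):

```typescript
// Poll a readiness check until it returns true, or give up after maxAttempts.
async function waitUntilReady(
  isReady: () => Promise<boolean>,
  intervalMs: number = 5000,
  maxAttempts: number = 24
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (await isReady()) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Usage with Pinecone (assumes describeIndex returns { status: { ready } }):
// const ready = await waitUntilReady(async () => {
//   const description = await pinecone.describeIndex('documents');
//   return description.status?.ready === true;
// });
```

Polling bounds the wait to what is actually needed and fails loudly (returns `false`) if the index never becomes ready.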
Connecting to an Index
```typescript
import { Pinecone } from '@pinecone-database/pinecone';
import OpenAI from 'openai';

const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY!,
});
const openai = new OpenAI();

const index = pinecone.index('documents');

// Generate an embedding for a piece of text
async function getEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return response.data[0].embedding;
}
```
Upserting Vectors
```typescript
interface Document {
  id: string;
  content: string;
  metadata: Record<string, string>;
}

async function upsertDocuments(documents: Document[]): Promise<void> {
  // Generate embeddings for all documents
  const embeddings = await Promise.all(documents.map((doc) => getEmbedding(doc.content)));

  // Prepare vectors for upsert
  const vectors = documents.map((doc, i) => ({
    id: doc.id,
    values: embeddings[i],
    metadata: {
      ...doc.metadata,
      content: doc.content, // Store content in metadata for retrieval
    },
  }));

  // Upsert in batches of 100
  const batchSize = 100;
  for (let i = 0; i < vectors.length; i += batchSize) {
    const batch = vectors.slice(i, i + batchSize);
    await index.upsert(batch);
    console.log(`Upserted batch ${Math.floor(i / batchSize) + 1}`);
  }
}
```
Querying Vectors
```typescript
interface SearchResult {
  id: string;
  score: number;
  content: string;
  metadata: Record<string, string>;
}

async function searchDocuments(
  query: string,
  topK: number = 5,
  filter?: Record<string, string>
): Promise<SearchResult[]> {
  const queryEmbedding = await getEmbedding(query);

  const results = await index.query({
    vector: queryEmbedding,
    topK,
    includeMetadata: true,
    filter,
  });

  return results.matches.map((match) => ({
    id: match.id,
    score: match.score || 0,
    content: (match.metadata?.content as string) || '',
    metadata: match.metadata as Record<string, string>,
  }));
}
```
Complete Pinecone Example
```typescript
import { Pinecone } from '@pinecone-database/pinecone';
import OpenAI from 'openai';

const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY!,
});
const openai = new OpenAI();

class PineconeDocumentStore {
  private index;

  constructor(indexName: string) {
    this.index = pinecone.index(indexName);
  }

  private async getEmbedding(text: string): Promise<number[]> {
    const response = await openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: text,
    });
    return response.data[0].embedding;
  }

  private async getEmbeddings(texts: string[]): Promise<number[][]> {
    const response = await openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: texts,
    });
    // Sort by index so each embedding lines up with its input text
    return response.data.sort((a, b) => a.index - b.index).map((item) => item.embedding);
  }

  async addDocuments(
    documents: { id: string; content: string; metadata?: Record<string, string> }[]
  ): Promise<void> {
    const embeddings = await this.getEmbeddings(documents.map((d) => d.content));
    const vectors = documents.map((doc, i) => ({
      id: doc.id,
      values: embeddings[i],
      metadata: {
        content: doc.content,
        ...doc.metadata,
      },
    }));

    // Batch upsert
    const batchSize = 100;
    for (let i = 0; i < vectors.length; i += batchSize) {
      await this.index.upsert(vectors.slice(i, i + batchSize));
    }
    console.log(`Added ${documents.length} documents`);
  }

  async search(
    query: string,
    limit: number = 5,
    filter?: Record<string, any>
  ): Promise<{ id: string; content: string; score: number }[]> {
    const queryEmbedding = await this.getEmbedding(query);
    const results = await this.index.query({
      vector: queryEmbedding,
      topK: limit,
      includeMetadata: true,
      filter,
    });
    return results.matches.map((match) => ({
      id: match.id,
      content: (match.metadata?.content as string) || '',
      score: match.score || 0,
    }));
  }

  async deleteDocument(id: string): Promise<void> {
    await this.index.deleteOne(id);
  }

  async deleteAll(): Promise<void> {
    await this.index.deleteAll();
  }
}
```
```typescript
// Usage
async function main() {
  const store = new PineconeDocumentStore('documents');

  await store.addDocuments([
    {
      id: 'doc1',
      content: 'Reset password in Settings > Security > Reset Password',
      metadata: { category: 'security' },
    },
    {
      id: 'doc2',
      content: 'Enable 2FA for enhanced account protection',
      metadata: { category: 'security' },
    },
  ]);

  const results = await store.search('How to change password?');
  console.log('Results:', results);
}

main().catch(console.error);
```
Chroma vs Pinecone Comparison
| Feature | Chroma | Pinecone |
|---|---|---|
| Setup | npm install | npm install + account setup |
| Hosting | Local/self-hosted | Cloud managed |
| Cost | Free | Free tier, then pay-per-use |
| Scaling | Limited | Automatic |
| Persistence | Local files | Cloud storage |
| Built-in Embeddings | Yes | No (bring your own) |
| Best For | Development, small projects | Production, enterprise |
When to Use Chroma
- Local development and testing
- Small datasets (< 100k documents)
- Privacy-sensitive applications
- Quick prototyping
When to Use Pinecone
- Production deployments
- Large datasets (millions of vectors)
- Need for high availability
- Team collaboration
Namespace and Filtering
Both databases support organizing data:
Pinecone Namespaces
```typescript
// Use namespaces to separate data within one index
const index = pinecone.index('documents');
const namespace = index.namespace('product-docs');

await namespace.upsert(vectors);

const results = await namespace.query({ vector: queryEmbedding, topK: 5 });
```
Metadata Filtering
```typescript
// Pinecone filter syntax
const pineconeResults = await index.query({
  vector: queryEmbedding,
  topK: 5,
  filter: {
    category: { $eq: 'security' },
    priority: { $in: ['high', 'medium'] },
  },
});

// Chroma filter syntax
const chromaResults = await collection.query({
  queryTexts: ['query'],
  nResults: 5,
  where: {
    $and: [
      { category: { $eq: 'security' } },
      { priority: { $in: ['high', 'medium'] } },
    ],
  },
});
```
Key Takeaways
- Vector databases enable efficient similarity search at scale with ANN algorithms
- Chroma is ideal for development with its simple setup and built-in embeddings
- Pinecone provides production-ready infrastructure with automatic scaling
- Metadata filtering combines semantic search with traditional filters
- Choose based on your needs: local development vs production, scale requirements
Resources
| Resource | Type | Level |
|---|---|---|
| Chroma Documentation | Documentation | Beginner |
| Pinecone Documentation | Documentation | Beginner |
| Vector Database Comparison | Article | Intermediate |
Next Lesson
In the next lesson, you will learn about chunking strategies: how to split documents effectively for optimal RAG performance.