From Zero to AI

Lesson 4.3: Vector Databases

Duration: 75 minutes

Learning Objectives

By the end of this lesson, you will be able to:

  1. Explain what vector databases are and why they are needed
  2. Compare popular vector database options
  3. Use Chroma for local development
  4. Use Pinecone for production deployments
  5. Implement basic CRUD operations with vector stores

Why Vector Databases

In the previous lesson, we built a simple in-memory vector store. This approach has limitations:

  1. Memory Limits: Cannot handle millions of documents
  2. No Persistence: Data is lost when the application restarts
  3. Slow Search: Linear scan through all vectors is inefficient
  4. No Scaling: Cannot distribute across multiple machines

Vector databases solve these problems with:

  • Efficient Indexing: Specialized algorithms for fast similarity search
  • Persistence: Data survives application restarts
  • Scalability: Handle billions of vectors
  • Filtering: Combine vector search with metadata filters

How Vector Search Works

Traditional databases use exact matching (SQL WHERE clauses). Vector databases use Approximate Nearest Neighbor (ANN) algorithms:

Exact Search (Brute Force):
- Compare the query to every stored vector
- O(n) complexity
- Slow for large datasets

ANN Search (Vector DB):
- Build an index structure up front
- Search only nearby regions of the index
- Approximately O(log n) complexity
- Fast even for billions of vectors
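
To make the difference concrete, here is a minimal brute-force search sketch in TypeScript. It scores the query against every stored vector with cosine similarity, which is exactly the O(n) scan that an ANN index avoids. The helper names are illustrative, not from any library:

// Brute-force (exact) nearest-neighbor search: the O(n) linear scan
// that ANN indexes are built to avoid.
interface StoredVector {
  id: string;
  values: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function bruteForceSearch(
  query: number[],
  vectors: StoredVector[],
  topK: number
): { id: string; score: number }[] {
  return vectors
    .map((v) => ({ id: v.id, score: cosineSimilarity(query, v.values) })) // compare against every vector
    .sort((a, b) => b.score - a.score) // rank by similarity, highest first
    .slice(0, topK);
}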

Common ANN algorithms include:

  • HNSW (Hierarchical Navigable Small World): Used by Chroma, Qdrant, and Weaviate
  • IVF (Inverted File Index): Used by Faiss
  • ScaNN (Scalable Nearest Neighbors): Used by Google

Database | Type | Best For
Chroma | Local/Embedded | Development, small projects
Pinecone | Cloud | Production, managed service
Weaviate | Self-hosted/Cloud | Full-text + vector search
Qdrant | Self-hosted/Cloud | High performance, Rust-based
Milvus | Self-hosted | Enterprise, large scale
pgvector | Postgres extension | Existing Postgres users

For this course, we focus on:

  • Chroma: Easy local development
  • Pinecone: Production-ready cloud service

Setting Up Chroma

Chroma is a lightweight vector database perfect for development and small applications.

Installation

npm install chromadb

Basic Usage

import { ChromaClient } from 'chromadb';

async function main() {
  // Create client (connects to a local Chroma server)
  const client = new ChromaClient();

  // Create or get a collection
  const collection = await client.getOrCreateCollection({
    name: 'my_documents',
    metadata: { description: 'Product documentation' },
  });

  console.log(`Collection: ${collection.name}`);
}

main();

Adding Documents

Chroma can generate embeddings automatically using its default embedding function (depending on your client version, this may require installing the chromadb-default-embed package alongside chromadb):

import { ChromaClient } from 'chromadb';

const client = new ChromaClient();
const collection = await client.getOrCreateCollection({
  name: 'documents',
});

// Add documents with automatic embedding
await collection.add({
  ids: ['doc1', 'doc2', 'doc3'],
  documents: [
    'How to reset your password in the settings menu',
    'Enable two-factor authentication for security',
    'Contact support for billing questions',
  ],
  metadatas: [
    { category: 'security', priority: 'high' },
    { category: 'security', priority: 'medium' },
    { category: 'billing', priority: 'low' },
  ],
});

console.log(`Added ${await collection.count()} documents`);
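
If you want control over the embedding model, Chroma also accepts precomputed vectors via the embeddings field on add. Here is a minimal sketch, assuming the OpenAI client used later in this lesson; note that every vector in a collection must share one dimensionality, so stick to a single embedding source per collection:

import OpenAI from 'openai';

const openai = new OpenAI();

// Illustrative helper: embed texts with OpenAI instead of Chroma's default embedder
async function embedTexts(texts: string[]): Promise<number[][]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: texts,
  });
  return response.data.map((item) => item.embedding);
}

const texts = ['Export your data from the account settings page'];

// Add a document with a precomputed embedding
await collection.add({
  ids: ['doc4'],
  embeddings: await embedTexts(texts),
  documents: texts,
  metadatas: [{ category: 'account', priority: 'low' }],
});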

Querying Documents

// Search for similar documents
const results = await collection.query({
  queryTexts: ['How do I change my password?'],
  nResults: 3,
});

console.log('Search Results:');
for (let i = 0; i < results.ids[0].length; i++) {
  console.log(`\nID: ${results.ids[0][i]}`);
  console.log(`Document: ${results.documents[0][i]}`);
  console.log(`Distance: ${results.distances?.[0][i]}`);
  console.log(`Metadata: ${JSON.stringify(results.metadatas?.[0][i])}`);
}
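
Besides similarity queries, Chroma's get lets you read records directly by ID or metadata filter without running a search. A short sketch using the collection from above:

// Fetch specific records by ID (no similarity search involved)
const fetched = await collection.get({
  ids: ['doc1', 'doc2'],
});
console.log(fetched.documents);

// Fetch by metadata filter instead of by ID
const billingDocs = await collection.get({
  where: { category: 'billing' },
});
console.log(billingDocs.ids);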

Filtering with Metadata

// Search with metadata filter
const securityDocs = await collection.query({
  queryTexts: ['authentication help'],
  nResults: 5,
  where: { category: 'security' },
});

// Complex filters
const highPriorityDocs = await collection.query({
  queryTexts: ['help'],
  nResults: 5,
  where: {
    $and: [{ category: 'security' }, { priority: 'high' }],
  },
});

Complete Chroma Example

Here is a complete example of a document Q&A system using Chroma:

import { ChromaClient, Collection } from 'chromadb';

interface DocumentResult {
  id: string;
  content: string;
  score: number;
  metadata: Record<string, string>;
}

class DocumentStore {
  private client: ChromaClient;
  private collection: Collection | null = null;

  constructor() {
    this.client = new ChromaClient();
  }

  async initialize(collectionName: string): Promise<void> {
    this.collection = await this.client.getOrCreateCollection({
      name: collectionName,
    });
  }

  async addDocuments(
    documents: { id: string; content: string; metadata?: Record<string, string> }[]
  ): Promise<void> {
    if (!this.collection) throw new Error('Not initialized');

    await this.collection.add({
      ids: documents.map((d) => d.id),
      documents: documents.map((d) => d.content),
      metadatas: documents.map((d) => d.metadata || {}),
    });
  }

  async search(
    query: string,
    limit: number = 5,
    filter?: Record<string, string>
  ): Promise<DocumentResult[]> {
    if (!this.collection) throw new Error('Not initialized');

    const results = await this.collection.query({
      queryTexts: [query],
      nResults: limit,
      where: filter,
    });

    return results.ids[0].map((id, index) => ({
      id,
      content: results.documents[0][index] || '',
      score: 1 - (results.distances?.[0][index] || 0), // Rough conversion of distance to a similarity-style score
      metadata: (results.metadatas?.[0][index] as Record<string, string>) || {},
    }));
  }

  async deleteDocument(id: string): Promise<void> {
    if (!this.collection) throw new Error('Not initialized');
    await this.collection.delete({ ids: [id] });
  }

  async updateDocument(
    id: string,
    content: string,
    metadata?: Record<string, string>
  ): Promise<void> {
    if (!this.collection) throw new Error('Not initialized');

    await this.collection.update({
      ids: [id],
      documents: [content],
      metadatas: metadata ? [metadata] : undefined,
    });
  }
}

// Usage
async function main() {
  const store = new DocumentStore();
  await store.initialize('product_docs');

  // Add documents
  await store.addDocuments([
    {
      id: 'password-reset',
      content:
        'To reset your password: 1. Go to Settings 2. Click Security 3. Select Reset Password 4. Follow the email instructions',
      metadata: { category: 'security', version: '2.0' },
    },
    {
      id: '2fa-setup',
      content:
        'Enable two-factor authentication: 1. Go to Settings 2. Click Security 3. Enable 2FA 4. Scan QR code with authenticator app',
      metadata: { category: 'security', version: '2.0' },
    },
    {
      id: 'billing-contact',
      content:
        'For billing inquiries, email billing@example.com or call 1-800-EXAMPLE during business hours (9 AM - 5 PM EST)',
      metadata: { category: 'billing', version: '1.0' },
    },
  ]);

  // Search
  console.log('=== General Search ===');
  const results = await store.search('How do I secure my account?');
  for (const result of results) {
    console.log(`\n[${result.id}] Score: ${result.score.toFixed(3)}`);
    console.log(`Category: ${result.metadata.category}`);
    console.log(`Content: ${result.content.substring(0, 100)}...`);
  }

  // Filtered search
  console.log('\n=== Security Category Only ===');
  const securityResults = await store.search('help', 5, {
    category: 'security',
  });
  for (const result of securityResults) {
    console.log(`[${result.id}] ${result.metadata.category}`);
  }
}

main();

Setting Up Pinecone

Pinecone is a managed cloud vector database for production applications.

Installation

npm install @pinecone-database/pinecone

Creating an Index

First, create an index in the Pinecone console or via API:

import { Pinecone } from '@pinecone-database/pinecone';

const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY!,
});

// Create an index (do this once)
async function createIndex() {
  await pinecone.createIndex({
    name: 'documents',
    dimension: 1536, // Must match your embedding model
    metric: 'cosine',
    spec: {
      serverless: {
        cloud: 'aws',
        region: 'us-east-1',
      },
    },
  });

  console.log('Index created. Waiting for it to be ready...');

  // Wait for index to be ready
  await new Promise((resolve) => setTimeout(resolve, 60000));
}
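
A fixed sleep works but is fragile. A sturdier sketch polls the index status instead, assuming the Node SDK's describeIndex call (which reports a status.ready flag in current SDK versions):

// Poll Pinecone until the new index reports itself as ready
async function waitForIndexReady(name: string): Promise<void> {
  while (true) {
    const description = await pinecone.describeIndex(name);
    if (description.status?.ready) {
      return;
    }
    // Not ready yet: wait a few seconds before checking again
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
}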

Connecting to an Index

import { Pinecone } from '@pinecone-database/pinecone';
import OpenAI from 'openai';

const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY!,
});

const openai = new OpenAI();
const index = pinecone.index('documents');

// Generate embedding
async function getEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return response.data[0].embedding;
}

Upserting Vectors

interface Document {
  id: string;
  content: string;
  metadata: Record<string, string>;
}

async function upsertDocuments(documents: Document[]): Promise<void> {
  // Generate embeddings for all documents
  const embeddings = await Promise.all(documents.map((doc) => getEmbedding(doc.content)));

  // Prepare vectors for upsert
  const vectors = documents.map((doc, i) => ({
    id: doc.id,
    values: embeddings[i],
    metadata: {
      ...doc.metadata,
      content: doc.content, // Store content in metadata for retrieval
    },
  }));

  // Upsert in batches of 100
  const batchSize = 100;
  for (let i = 0; i < vectors.length; i += batchSize) {
    const batch = vectors.slice(i, i + batchSize);
    await index.upsert(batch);
    console.log(`Upserted batch ${Math.floor(i / batchSize) + 1}`);
  }
}

Querying Vectors

interface SearchResult {
  id: string;
  score: number;
  content: string;
  metadata: Record<string, string>;
}

async function searchDocuments(
  query: string,
  topK: number = 5,
  filter?: Record<string, string>
): Promise<SearchResult[]> {
  const queryEmbedding = await getEmbedding(query);

  const results = await index.query({
    vector: queryEmbedding,
    topK,
    includeMetadata: true,
    filter,
  });

  return results.matches.map((match) => ({
    id: match.id,
    score: match.score || 0,
    content: (match.metadata?.content as string) || '',
    metadata: match.metadata as Record<string, string>,
  }));
}

Complete Pinecone Example

import { Pinecone } from '@pinecone-database/pinecone';
import OpenAI from 'openai';

const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY!,
});

const openai = new OpenAI();

class PineconeDocumentStore {
  private index;

  constructor(indexName: string) {
    this.index = pinecone.index(indexName);
  }

  private async getEmbedding(text: string): Promise<number[]> {
    const response = await openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: text,
    });
    return response.data[0].embedding;
  }

  private async getEmbeddings(texts: string[]): Promise<number[][]> {
    const response = await openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: texts,
    });
    return response.data.sort((a, b) => a.index - b.index).map((item) => item.embedding);
  }

  async addDocuments(
    documents: { id: string; content: string; metadata?: Record<string, string> }[]
  ): Promise<void> {
    const embeddings = await this.getEmbeddings(documents.map((d) => d.content));

    const vectors = documents.map((doc, i) => ({
      id: doc.id,
      values: embeddings[i],
      metadata: {
        content: doc.content,
        ...doc.metadata,
      },
    }));

    // Batch upsert
    const batchSize = 100;
    for (let i = 0; i < vectors.length; i += batchSize) {
      await this.index.upsert(vectors.slice(i, i + batchSize));
    }

    console.log(`Added ${documents.length} documents`);
  }

  async search(
    query: string,
    limit: number = 5,
    filter?: Record<string, any>
  ): Promise<{ id: string; content: string; score: number }[]> {
    const queryEmbedding = await this.getEmbedding(query);

    const results = await this.index.query({
      vector: queryEmbedding,
      topK: limit,
      includeMetadata: true,
      filter,
    });

    return results.matches.map((match) => ({
      id: match.id,
      content: (match.metadata?.content as string) || '',
      score: match.score || 0,
    }));
  }

  async deleteDocument(id: string): Promise<void> {
    await this.index.deleteOne(id);
  }

  async deleteAll(): Promise<void> {
    await this.index.deleteAll();
  }
}

// Usage
async function main() {
  const store = new PineconeDocumentStore('documents');

  await store.addDocuments([
    {
      id: 'doc1',
      content: 'Reset password in Settings > Security > Reset Password',
      metadata: { category: 'security' },
    },
    {
      id: 'doc2',
      content: 'Enable 2FA for enhanced account protection',
      metadata: { category: 'security' },
    },
  ]);

  const results = await store.search('How to change password?');
  console.log('Results:', results);
}

main();

Chroma vs Pinecone Comparison

Feature | Chroma | Pinecone
Setup | npm install | npm install + account setup
Hosting | Local/self-hosted | Cloud managed
Cost | Free | Free tier, then pay-per-use
Scaling | Limited | Automatic
Persistence | Local files | Cloud storage
Built-in Embeddings | Yes | No (bring your own)
Best For | Development, small projects | Production, enterprise

When to Use Chroma

  • Local development and testing
  • Small datasets (< 100k documents)
  • Privacy-sensitive applications
  • Quick prototyping

When to Use Pinecone

  • Production deployments
  • Large datasets (millions of vectors)
  • Need for high availability
  • Team collaboration

Namespace and Filtering

Both databases support organizing data:

Pinecone Namespaces

// Use namespaces to separate data
const index = pinecone.index('documents');
const namespace = index.namespace('product-docs');

await namespace.upsert([...vectors]);
const results = await namespace.query({ vector, topK: 5 });
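
Chroma does not have Pinecone-style namespaces; the usual way to separate datasets is to give each one its own collection. A minimal sketch, reusing the ChromaClient from earlier:

// In Chroma, separate datasets typically live in separate collections
const productDocs = await client.getOrCreateCollection({ name: 'product_docs' });
const supportDocs = await client.getOrCreateCollection({ name: 'support_docs' });

await productDocs.add({
  ids: ['guide-1'],
  documents: ['Step-by-step product setup guide'],
});

const setupResults = await productDocs.query({
  queryTexts: ['how do I set this up?'],
  nResults: 5,
});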

Metadata Filtering

// Pinecone filter syntax
const results = await index.query({
  vector: queryEmbedding,
  topK: 5,
  filter: {
    category: { $eq: "security" },
    priority: { $in: ["high", "medium"] },
  },
});

// Chroma filter syntax
const results = await collection.query({
  queryTexts: ["query"],
  where: {
    $and: [
      { category: { $eq: "security" } },
      { priority: { $in: ["high", "medium"] } },
    ],
  },
});

Key Takeaways

  1. Vector databases enable efficient similarity search at scale with ANN algorithms
  2. Chroma is ideal for development with its simple setup and built-in embeddings
  3. Pinecone provides production-ready infrastructure with automatic scaling
  4. Metadata filtering combines semantic search with traditional filters
  5. Choose based on your needs: local development vs production, scale requirements

Resources

Resource | Type | Level
Chroma Documentation | Documentation | Beginner
Pinecone Documentation | Documentation | Beginner
Vector Database Comparison | Article | Intermediate

Next Lesson

In the next lesson, you will learn about chunking strategies: how to split documents effectively for optimal RAG performance.

Continue to Lesson 4.4: Chunking Strategies