Lesson 5.5: Practice - Research Agent

Duration: 90 minutes

Learning Objectives

By the end of this lesson, you will be able to:

Design and build a complete research agent from scratch
Implement web search, note-taking, and synthesis tools
Create planning and execution logic for multi-step research
Handle real-world challenges like rate limiting and error recovery

Project Overview

You will build a research agent that can:

Accept a research topic from the user
Plan a research strategy
Search the web for information
Take structured notes
Synthesize findings into a comprehensive report
Cite sources properly

This project combines everything you have learned about agents, planning, tools, and LangChain.js.

Project Setup

Create the project structure:

mkdir research-agent
cd research-agent
npm init -y

Install dependencies:

npm install typescript tsx @types/node openai @langchain/openai @langchain/core langchain dotenv zod

Create tsconfig.json:

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "outDir": "dist"
  },
  "include": ["src/**/*"]
}

Create .env:

OPENAI_API_KEY=sk-proj-your-key-here

Project structure:

research-agent/
├── src/
│   ├── tools/
│   │   ├── search.ts
│   │   ├── notes.ts
│   │   └── index.ts
│   ├── agent/
│   │   ├── planner.ts
│   │   ├── researcher.ts
│   │   └── synthesizer.ts
│   ├── types.ts
│   └── index.ts
├── .env
├── .gitignore
├── package.json
└── tsconfig.json
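
The tree above lists a .gitignore that the setup steps never create. A minimal sketch, assuming you want to keep the API key in .env, the installed packages, and the compiled output (the dist folder named by outDir in tsconfig.json) out of version control:

```
# .gitignore (suggested starting point, adjust to your setup)
node_modules/
dist/
.env
```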

Step 1: Define Types

Create src/types.ts:

export interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

export interface Note {
  id: string;
  topic: string;
  content: string;
  source?: string;
  timestamp: Date;
}

export interface ResearchPlan {
  topic: string;
  questions: string[];
  searchQueries: string[];
}

export interface ResearchReport {
  topic: string;
  summary: string;
  keyFindings: string[];
  sources: string[];
  fullReport: string;
}

export interface AgentState {
  topic: string;
  plan: ResearchPlan | null;
  notes: Note[];
  searchResults: SearchResult[];
  currentStep: string;
  errors: string[];
}
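
AgentState is declared here, but the later steps never construct one. The sketch below shows one way it could be initialized and updated. The helper names createInitialState and recordError, and the 'planning' step label, are assumptions for illustration and do not appear elsewhere in the lesson.

```typescript
// Local copies of the lesson's types so this sketch runs on its own.
interface SearchResult { title: string; url: string; snippet: string; }
interface Note { id: string; topic: string; content: string; source?: string; timestamp: Date; }
interface ResearchPlan { topic: string; questions: string[]; searchQueries: string[]; }
interface AgentState {
  topic: string;
  plan: ResearchPlan | null;
  notes: Note[];
  searchResults: SearchResult[];
  currentStep: string;
  errors: string[];
}

// Hypothetical helper: builds the empty state for a new research run.
function createInitialState(topic: string): AgentState {
  return { topic, plan: null, notes: [], searchResults: [], currentStep: 'planning', errors: [] };
}

// Hypothetical helper: returns a new state with the error appended,
// leaving the original state object untouched.
function recordError(state: AgentState, message: string): AgentState {
  return { ...state, errors: [...state.errors, message] };
}

const initialState = createInitialState('quantum computing');
const failedState = recordError(initialState, 'search timed out');
console.log(failedState.currentStep, failedState.errors.length);
```

Returning a new object instead of mutating the old one makes it easier to log or compare the state between steps.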

Step 2: Implement the Search Tool

Create src/tools/search.ts:

import { tool } from '@langchain/core/tools';
import { z } from 'zod';

import { SearchResult } from '../types.js';

// In production, use a real search API like SerpAPI, Tavily, or Brave Search
// This is a simulated search for demonstration
async function performSearch(query: string): Promise<SearchResult[]> {
  // Simulate API delay
  await new Promise((resolve) => setTimeout(resolve, 500));

  // Simulated results - replace with real API call
  const simulatedResults: SearchResult[] = [
    {
      title: `Understanding ${query} - Comprehensive Guide`,
      url: `https://example.com/guide/${encodeURIComponent(query)}`,
      snippet: `A detailed exploration of ${query}, covering key concepts, recent developments, and practical applications in various fields.`,
    },
    {
      title: `${query}: Latest Research and Findings`,
      url: `https://research.example.com/${encodeURIComponent(query)}`,
      snippet: `Recent studies have shown significant progress in ${query}. Researchers highlight important breakthroughs and their implications.`,
    },
    {
      title: `The Future of ${query}`,
      url: `https://future.example.com/${encodeURIComponent(query)}`,
      snippet: `Experts predict major changes in ${query} over the next decade. Key trends and challenges are examined in detail.`,
    },
  ];

  return simulatedResults;
}

export const searchTool = tool(
  async ({ query, maxResults = 5 }) => {
    try {
      const results = await performSearch(query);
      const limitedResults = results.slice(0, maxResults);

      const formatted = limitedResults
        .map((r, i) => `[${i + 1}] ${r.title}\nURL: ${r.url}\n${r.snippet}`)
        .join('\n\n');

      return formatted || 'No results found.';
    } catch (error) {
      return `Search error: ${error instanceof Error ? error.message : 'Unknown error'}`;
    }
  },
  {
    name: 'web_search',
    description:
      'Searches the web for information on a given query. Returns titles, URLs, and snippets.',
    schema: z.object({
      query: z.string().describe('The search query'),
      maxResults: z.number().default(5).describe('Maximum number of results to return'),
    }),
  }
);

// Tool for more targeted searches
export const academicSearchTool = tool(
  async ({ query, year }) => {
    // Simulated academic search
    await new Promise((resolve) => setTimeout(resolve, 500));

    const yearFilter = year ? ` from ${year}` : '';
    return `Academic results for "${query}"${yearFilter}:
    
[1] "Research Advances in ${query}" - Journal of Science, 2024
Key findings include improved methodologies and novel applications.

[2] "A Systematic Review of ${query}" - Academic Press, 2023
Comprehensive analysis of 50+ studies with meta-analysis results.`;
  },
  {
    name: 'academic_search',
    description: 'Searches academic papers and research publications',
    schema: z.object({
      query: z.string().describe('The academic search query'),
      year: z.number().optional().describe('Filter by publication year'),
    }),
  }
);
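
When performSearch is swapped for a real API call, the response has to be converted into SearchResult objects. The sketch below assumes a hypothetical response shape of { results: [{ title, url, content }] }; real providers name these fields differently, so treat the field names and the normalizeResults helper as placeholders.

```typescript
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

// Converts untrusted provider JSON into SearchResult[].
// Malformed items and repeated URLs are skipped rather than throwing.
function normalizeResults(raw: unknown, maxResults = 5): SearchResult[] {
  // Assumed shape: an object with a `results` array (hypothetical).
  const list = (raw as { results?: unknown } | null | undefined)?.results;
  const items: unknown[] = Array.isArray(list) ? list : [];
  const seen = new Set<string>();
  const normalized: SearchResult[] = [];

  for (const item of items) {
    if (normalized.length >= maxResults) break;
    if (typeof item !== 'object' || item === null) continue;
    const { title, url, content } = item as Record<string, unknown>;
    if (typeof title !== 'string' || typeof url !== 'string') continue;
    if (seen.has(url)) continue;
    seen.add(url);
    normalized.push({ title, url, snippet: typeof content === 'string' ? content : '' });
  }
  return normalized;
}

const sampleResponse = {
  results: [
    { title: 'A', url: 'https://example.com/a', content: 'first' },
    { title: 'A again', url: 'https://example.com/a', content: 'duplicate URL' },
    { title: 42, url: 'https://example.com/b' },
    { title: 'C', url: 'https://example.com/c' },
  ],
};
console.log(normalizeResults(sampleResponse));
```

Validating at this boundary keeps the rest of the agent working with one known shape, whichever provider is used.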

Step 3: Implement the Notes Tool

Create src/tools/notes.ts:

import { tool } from '@langchain/core/tools';
import { z } from 'zod';

import { Note } from '../types.js';

// In-memory note storage
const noteStore: Map<string, Note> = new Map();

function generateId(): string {
  return Math.random().toString(36).substring(2, 9);
}

export const addNoteTool = tool(
  async ({ topic, content, source }) => {
    const id = generateId();
    const note: Note = {
      id,
      topic,
      content,
      source,
      timestamp: new Date(),
    };

    noteStore.set(id, note);

    return `Note saved with ID: ${id}\nTopic: ${topic}\nContent preview: ${content.substring(0, 100)}...`;
  },
  {
    name: 'add_note',
    description: 'Saves a research note with topic, content, and optional source',
    schema: z.object({
      topic: z.string().describe('The topic or category of the note'),
      content: z.string().describe('The note content'),
      source: z.string().optional().describe('Source URL or reference'),
    }),
  }
);

export const listNotesTool = tool(
  async ({ topic }) => {
    const notes = Array.from(noteStore.values());

    const filtered = topic
      ? notes.filter((n) => n.topic.toLowerCase().includes(topic.toLowerCase()))
      : notes;

    if (filtered.length === 0) {
      return topic ? `No notes found for topic: ${topic}` : 'No notes saved yet.';
    }

    const formatted = filtered
      .map((n) => `[${n.id}] ${n.topic}\n${n.content}\n${n.source ? `Source: ${n.source}` : ''}`)
      .join('\n\n---\n\n');

    return `Found ${filtered.length} note(s):\n\n${formatted}`;
  },
  {
    name: 'list_notes',
    description: 'Lists all saved notes, optionally filtered by topic',
    schema: z.object({
      topic: z.string().optional().describe('Filter notes by topic'),
    }),
  }
);

export const searchNotesTool = tool(
  async ({ query }) => {
    const notes = Array.from(noteStore.values());

    const matching = notes.filter(
      (n) =>
        n.content.toLowerCase().includes(query.toLowerCase()) ||
        n.topic.toLowerCase().includes(query.toLowerCase())
    );

    if (matching.length === 0) {
      return `No notes matching: ${query}`;
    }

    const formatted = matching
      .map((n) => `[${n.id}] ${n.topic}: ${n.content.substring(0, 150)}...`)
      .join('\n\n');

    return `Found ${matching.length} matching note(s):\n\n${formatted}`;
  },
  {
    name: 'search_notes',
    description: 'Searches through saved notes for specific content',
    schema: z.object({
      query: z.string().describe('Search query for notes'),
    }),
  }
);

export const clearNotesTool = tool(
  async () => {
    const count = noteStore.size;
    noteStore.clear();
    return `Cleared ${count} note(s).`;
  },
  {
    name: 'clear_notes',
    description: 'Clears all saved notes',
    schema: z.object({}),
  }
);

// Export all notes for report generation
export function getAllNotes(): Note[] {
  return Array.from(noteStore.values());
}

export function clearNoteStore(): void {
  noteStore.clear();
}
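
Because noteStore lives in memory, notes disappear when the process exits. If you want to keep them between runs, one option is to serialize them to JSON. The sketch below is an assumption about how that could look, not part of the lesson's code; serializeNotes and deserializeNotes are hypothetical helpers. The main detail to handle is that JSON turns the Date timestamp into a string, which has to be converted back on load.

```typescript
interface Note {
  id: string;
  topic: string;
  content: string;
  source?: string;
  timestamp: Date;
}

// Serializes notes to JSON. Date values become ISO strings automatically.
function serializeNotes(notes: Note[]): string {
  return JSON.stringify(notes, null, 2);
}

// Parses JSON produced by serializeNotes and restores each timestamp as a Date.
function deserializeNotes(json: string): Note[] {
  const raw = JSON.parse(json) as Array<Omit<Note, 'timestamp'> & { timestamp: string }>;
  return raw.map((n) => ({ ...n, timestamp: new Date(n.timestamp) }));
}

const originalNotes: Note[] = [
  {
    id: 'abc1234',
    topic: 'basics',
    content: 'Example finding',
    source: 'https://example.com',
    timestamp: new Date('2024-01-01T00:00:00Z'),
  },
];
const restoredNotes = deserializeNotes(serializeNotes(originalNotes));
console.log(restoredNotes[0].timestamp instanceof Date);
```

The JSON string could then be written to disk and read back at startup, for example with writeFile and readFile from node:fs/promises.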

Step 4: Export Tools

Create src/tools/index.ts:

export { searchTool, academicSearchTool } from './search.js';
export {
  addNoteTool,
  listNotesTool,
  searchNotesTool,
  clearNotesTool,
  getAllNotes,
  clearNoteStore,
} from './notes.js';

Step 5: Implement the Planner

Create src/agent/planner.ts:

import { ChatOpenAI } from '@langchain/openai';

import { ResearchPlan } from '../types.js';

export class ResearchPlanner {
  private model: ChatOpenAI;

  constructor(model: ChatOpenAI) {
    this.model = model;
  }

  async createPlan(topic: string): Promise<ResearchPlan> {
    const response = await this.model.invoke([
      {
        role: 'system',
        content: `You are a research planning assistant. Given a topic, create a research plan.

Output a JSON object with:
{
  "topic": "the research topic",
  "questions": ["key questions to answer", ...],
  "searchQueries": ["specific search queries to run", ...]
}

Create 3-5 questions and 3-5 search queries. Focus on:
- Understanding the basics
- Recent developments
- Different perspectives
- Practical applications`,
      },
      {
        role: 'user',
        content: `Create a research plan for: ${topic}`,
      },
    ]);

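    try {
      const content = response.content as string;
      // Extract the JSON object from the response (the model may wrap it in prose)
      const jsonMatch = content.match(/\{[\s\S]*\}/);
      if (jsonMatch) {
        const parsed = JSON.parse(jsonMatch[0]);
        // Only accept output that has the expected arrays; otherwise use the fallback
        if (Array.isArray(parsed.questions) && Array.isArray(parsed.searchQueries)) {
          return parsed as ResearchPlan;
        }
      }
      throw new Error('No valid plan found in response');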
    } catch {
      // Fallback plan
      return {
        topic,
        questions: [
          `What is ${topic}?`,
          `What are the key aspects of ${topic}?`,
          `What are recent developments in ${topic}?`,
        ],
        searchQueries: [topic, `${topic} overview`, `${topic} latest developments`],
      };
    }
  }

  async refinePlan(plan: ResearchPlan, findingsSoFar: string): Promise<ResearchPlan> {
    const response = await this.model.invoke([
      {
        role: 'system',
        content: `You are a research planning assistant. Refine the research plan based on findings so far.

Current plan:
${JSON.stringify(plan, null, 2)}

Add new questions or search queries to fill gaps in the research.
Output a JSON object with the same structure.`,
      },
      {
        role: 'user',
        content: `Findings so far:\n${findingsSoFar}\n\nRefine the plan to address any gaps.`,
      },
    ]);

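    try {
      const content = response.content as string;
      const jsonMatch = content.match(/\{[\s\S]*\}/);
      if (jsonMatch) {
        const parsed = JSON.parse(jsonMatch[0]);
        // Keep the current plan unless the refined one has the expected arrays
        if (Array.isArray(parsed.questions) && Array.isArray(parsed.searchQueries)) {
          return parsed as ResearchPlan;
        }
      }
      return plan;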
    } catch {
      return plan;
    }
  }
}
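
refinePlan is defined above but not called in the later steps. If you do call it after a first research pass, the refined plan may repeat questions and queries that have already been run. The sketch below shows one possible way to merge the two plans and find only the new queries. mergePlans, newQueries, and uniqueByText are hypothetical helpers, and the case-insensitive comparison is an assumption about what counts as a duplicate.

```typescript
interface ResearchPlan {
  topic: string;
  questions: string[];
  searchQueries: string[];
}

// Removes blanks and duplicates (compared as trimmed lowercase text),
// keeping the first occurrence of each item.
function uniqueByText(items: string[]): string[] {
  const seen = new Set<string>();
  return items.filter((item) => {
    const key = item.trim().toLowerCase();
    if (key === '' || seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Combines the original plan with a refined one without repeating items.
function mergePlans(original: ResearchPlan, refined: ResearchPlan): ResearchPlan {
  return {
    topic: original.topic,
    questions: uniqueByText([...original.questions, ...refined.questions]),
    searchQueries: uniqueByText([...original.searchQueries, ...refined.searchQueries]),
  };
}

// Returns the queries in the merged plan that the original plan did not contain.
function newQueries(original: ResearchPlan, merged: ResearchPlan): string[] {
  const done = new Set(original.searchQueries.map((q) => q.trim().toLowerCase()));
  return merged.searchQueries.filter((q) => !done.has(q.trim().toLowerCase()));
}

const basePlan: ResearchPlan = {
  topic: 'AI in healthcare',
  questions: ['What is it?'],
  searchQueries: ['AI healthcare', 'AI diagnosis'],
};
const refinedPlan: ResearchPlan = {
  topic: 'AI in healthcare',
  questions: ['What is it?', 'How is it regulated?'],
  searchQueries: ['ai healthcare ', 'AI regulation'],
};
const mergedPlan = mergePlans(basePlan, refinedPlan);
console.log(newQueries(basePlan, mergedPlan));
```

Running only the new queries avoids paying for searches and model calls that repeat the first pass.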

Step 6: Implement the Researcher Agent

Create src/agent/researcher.ts:

import { ChatPromptTemplate } from '@langchain/core/prompts';
import { ChatOpenAI } from '@langchain/openai';
import { AgentExecutor, createToolCallingAgent } from 'langchain/agents';

import {
  academicSearchTool,
  addNoteTool,
  listNotesTool,
  searchNotesTool,
  searchTool,
} from '../tools/index.js';
import { ResearchPlan } from '../types.js';

export class ResearcherAgent {
  private executor: AgentExecutor;
  private model: ChatOpenAI;

  constructor(model: ChatOpenAI) {
    this.model = model;
    this.executor = this.createExecutor();
  }

  private createExecutor(): AgentExecutor {
    const tools = [searchTool, academicSearchTool, addNoteTool, listNotesTool, searchNotesTool];

    const prompt = ChatPromptTemplate.fromMessages([
      [
        'system',
        `You are a thorough research assistant. Your job is to:

1. Search for information using the provided tools
2. Take detailed notes on important findings
3. Always cite your sources
4. Be systematic and comprehensive

When researching:
- Use web_search for general information
- Use academic_search for scholarly sources
- Use add_note to save important findings (always include the source)
- Use list_notes to review what you've gathered
- Use search_notes to find specific information in your notes

Be thorough but focused. Quality over quantity.`,
      ],
      ['human', '{input}'],
      ['placeholder', '{agent_scratchpad}'],
    ]);

    const agent = createToolCallingAgent({
      llm: this.model,
      tools,
      prompt,
    });

    return new AgentExecutor({
      agent,
      tools,
      verbose: false,
      maxIterations: 10,
    });
  }

  async research(plan: ResearchPlan): Promise<void> {
    console.log(`\nStarting research on: ${plan.topic}`);
    console.log(`Questions to answer: ${plan.questions.length}`);
    console.log(`Search queries planned: ${plan.searchQueries.length}\n`);

    // Execute search queries
    for (const query of plan.searchQueries) {
      console.log(`Searching: "${query}"...`);

      await this.executor.invoke({
        input: `Search for: "${query}"
        
After getting results, save the most relevant and important findings as notes.
Include the source URL in each note.`,
      });
    }

    // Answer research questions
    for (const question of plan.questions) {
      console.log(`Researching: "${question}"...`);

      await this.executor.invoke({
        input: `Research this question: "${question}"

1. First, check your existing notes to see if you have relevant information
2. If needed, search for additional information
3. Save any new important findings as notes with sources`,
      });
    }

    console.log('\nResearch phase complete.');
  }

  async answerQuestion(question: string): Promise<string> {
    const result = await this.executor.invoke({
      input: `Answer this question based on your research notes and additional searching if needed:

${question}

First check your notes, then search if you need more information.
Provide a well-sourced answer.`,
    });

    return result.output;
  }
}
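
In research above, a single rejected executor.invoke call ends the whole run, because the error propagates out of the loop. One way to make the loop more tolerant is to record failures and continue. The sketch below illustrates that idea with a stand-in function instead of the real executor; runSteps, StepOutcome, and fakeInvoke are hypothetical names.

```typescript
interface StepOutcome {
  input: string;
  ok: boolean;
  error?: string;
}

// Runs each step in order. A failing step is recorded and skipped
// instead of aborting the remaining steps.
async function runSteps(
  inputs: string[],
  run: (input: string) => Promise<unknown>
): Promise<StepOutcome[]> {
  const outcomes: StepOutcome[] = [];
  for (const input of inputs) {
    try {
      await run(input);
      outcomes.push({ input, ok: true });
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      outcomes.push({ input, ok: false, error: message });
    }
  }
  return outcomes;
}

// Stand-in for executor.invoke that fails on one input (for demonstration only).
async function fakeInvoke(input: string): Promise<string> {
  if (input.includes('bad')) throw new Error('simulated failure');
  return 'ok';
}

runSteps(['first query', 'bad query', 'third query'], fakeInvoke).then((outcomes) => {
  const failed = outcomes.filter((o) => !o.ok);
  console.log(`${outcomes.length - failed.length} succeeded, ${failed.length} failed`);
});
```

Inside ResearcherAgent, the run argument could wrap this.executor.invoke, and the failed outcomes could be appended to the errors array of AgentState.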

Step 7: Implement the Synthesizer

Create src/agent/synthesizer.ts:

import { ChatOpenAI } from '@langchain/openai';

import { getAllNotes } from '../tools/index.js';
import { Note, ResearchReport } from '../types.js';

export class ReportSynthesizer {
  private model: ChatOpenAI;

  constructor(model: ChatOpenAI) {
    this.model = model;
  }

  async synthesize(topic: string): Promise<ResearchReport> {
    const notes = getAllNotes();

    if (notes.length === 0) {
      return {
        topic,
        summary: 'No research notes available.',
        keyFindings: [],
        sources: [],
        fullReport: 'Unable to generate report: No research data collected.',
      };
    }

    const notesText = notes
      .map((n) => `Topic: ${n.topic}\nContent: ${n.content}\nSource: ${n.source || 'N/A'}`)
      .join('\n\n---\n\n');

    const response = await this.model.invoke([
      {
        role: 'system',
        content: `You are a research report writer. Create a comprehensive report from research notes.

Your report should include:
1. An executive summary (2-3 paragraphs)
2. Key findings (bullet points)
3. Detailed analysis organized by theme
4. Conclusions and implications
5. Sources cited

Write in a clear, professional style. Be objective and evidence-based.`,
      },
      {
        role: 'user',
        content: `Create a research report on: ${topic}

Research Notes:
${notesText}

Generate a comprehensive report with all sections.`,
      },
    ]);

    const fullReport = response.content as string;

    // Extract key findings
    const keyFindings = await this.extractKeyFindings(fullReport);

    // Extract sources
    const sources = this.extractSources(notes);

    // Generate summary
    const summary = await this.generateSummary(fullReport);

    return {
      topic,
      summary,
      keyFindings,
      sources,
      fullReport,
    };
  }

  private async extractKeyFindings(report: string): Promise<string[]> {
    const response = await this.model.invoke([
      {
        role: 'system',
        content: 'Extract 5-7 key findings from this report as a JSON array of strings.',
      },
      {
        role: 'user',
        content: report,
      },
    ]);

    try {
      const content = response.content as string;
      const arrayMatch = content.match(/\[[\s\S]*\]/);
      if (arrayMatch) {
        return JSON.parse(arrayMatch[0]);
      }
      return ['Key findings extraction failed'];
    } catch {
      return ['Key findings extraction failed'];
    }
  }

  private extractSources(notes: Note[]): string[] {
    const sources = new Set<string>();

    for (const note of notes) {
      if (note.source) {
        sources.add(note.source);
      }
    }

    return Array.from(sources);
  }

  private async generateSummary(report: string): Promise<string> {
    const response = await this.model.invoke([
      {
        role: 'system',
        content: 'Generate a 2-3 sentence executive summary of this report.',
      },
      {
        role: 'user',
        content: report,
      },
    ]);

    return response.content as string;
  }
}
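
The synthesizer passes raw URLs to the model, while formatReport in the next step numbers the sources list separately, so citation numbers inside the generated text are not guaranteed to match that list. One possible approach is to number the sources before building the prompt and tag each note with its number. The sketch below assumes this approach; buildCitationIndex and formatNotesWithCitations are hypothetical helpers. It assigns numbers in first-seen order, which is the same order extractSources produces from its Set.

```typescript
interface Note {
  id: string;
  topic: string;
  content: string;
  source?: string;
  timestamp: Date;
}

// Assigns each distinct source a number in first-seen order (1-based).
function buildCitationIndex(notes: Note[]): Map<string, number> {
  const index = new Map<string, number>();
  for (const note of notes) {
    if (note.source && !index.has(note.source)) {
      index.set(note.source, index.size + 1);
    }
  }
  return index;
}

// Formats one line per note, ending with its citation marker.
function formatNotesWithCitations(notes: Note[], index: Map<string, number>): string {
  return notes
    .map((n) => {
      const ref = n.source ? index.get(n.source) : undefined;
      return ref ? `${n.content} [${ref}]` : `${n.content} [no source]`;
    })
    .join('\n');
}

const now = new Date();
const sampleNotes: Note[] = [
  { id: '1', topic: 't', content: 'First finding', source: 'https://example.com/a', timestamp: now },
  { id: '2', topic: 't', content: 'Second finding', source: 'https://example.com/b', timestamp: now },
  { id: '3', topic: 't', content: 'Third finding', source: 'https://example.com/a', timestamp: now },
  { id: '4', topic: 't', content: 'Unsourced finding', timestamp: now },
];
const citationIndex = buildCitationIndex(sampleNotes);
console.log(formatNotesWithCitations(sampleNotes, citationIndex));
```

The system prompt could then instruct the model to reuse these markers, so that a marker in the report body and the entry with the same number in the sources list refer to the same URL.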

Step 8: Create the Main Application

Create src/index.ts:

import { ChatOpenAI } from '@langchain/openai';
import * as dotenv from 'dotenv';
import * as readline from 'readline';

import { ResearchPlanner } from './agent/planner.js';
import { ResearcherAgent } from './agent/researcher.js';
import { ReportSynthesizer } from './agent/synthesizer.js';
import { clearNoteStore } from './tools/index.js';

dotenv.config();

class ResearchAssistant {
  private planner: ResearchPlanner;
  private researcher: ResearcherAgent;
  private synthesizer: ReportSynthesizer;

  constructor() {
    const model = new ChatOpenAI({
      model: 'gpt-4o',
      temperature: 0,
    });

    this.planner = new ResearchPlanner(model);
    this.researcher = new ResearcherAgent(model);
    this.synthesizer = new ReportSynthesizer(model);
  }

  async research(topic: string): Promise<string> {
    console.log('='.repeat(60));
    console.log(`Research Assistant: Investigating "${topic}"`);
    console.log('='.repeat(60));

    // Clear previous notes
    clearNoteStore();

    // Step 1: Create research plan
    console.log('\n[Step 1] Creating research plan...');
    const plan = await this.planner.createPlan(topic);
    console.log('Plan created:');
    console.log(`  Questions: ${plan.questions.length}`);
    console.log(`  Search queries: ${plan.searchQueries.length}`);

    // Step 2: Execute research
    console.log('\n[Step 2] Executing research...');
    await this.researcher.research(plan);

    // Step 3: Synthesize report
    console.log('\n[Step 3] Synthesizing report...');
    const report = await this.synthesizer.synthesize(topic);

    // Format output
    const output = this.formatReport(report);

    console.log('\n[Complete] Research finished.');
    console.log('='.repeat(60));

    return output;
  }

  private formatReport(report: {
    topic: string;
    summary: string;
    keyFindings: string[];
    sources: string[];
    fullReport: string;
  }): string {
    return `
# Research Report: ${report.topic}

## Executive Summary
${report.summary}

## Key Findings
${report.keyFindings.map((f, i) => `${i + 1}. ${f}`).join('\n')}

## Full Report
${report.fullReport}

## Sources
${report.sources.map((s, i) => `[${i + 1}] ${s}`).join('\n')}
`;
  }
}

// Interactive CLI
async function main() {
  const assistant = new ResearchAssistant();

  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
  });

  console.log('\n=== Research Assistant ===');
  console.log("Enter a topic to research, or 'quit' to exit.\n");

  const prompt = () => {
    rl.question('Research topic: ', async (input) => {
      const topic = input.trim();

      if (topic.toLowerCase() === 'quit') {
        console.log('Goodbye!');
        rl.close();
        return;
      }

      if (!topic) {
        prompt();
        return;
      }

      try {
        const report = await assistant.research(topic);
        console.log(report);
      } catch (error) {
        console.error('Research failed:', error);
      }

      prompt();
    });
  };

  prompt();
}

main().catch(console.error);

Step 9: Run the Agent

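Add "type": "module" and the following scripts to package.json. The module type is needed because the project uses ES module imports with .js extensions: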

{
  "type": "module",
  "scripts": {
    "start": "tsx src/index.ts",
    "build": "tsc",
    "dev": "tsx watch src/index.ts"
  }
}

Run the agent:

npm start

Example interaction:

=== Research Assistant ===
Enter a topic to research, or 'quit' to exit.

Research topic: impact of artificial intelligence on healthcare

============================================================
Research Assistant: Investigating "impact of artificial intelligence on healthcare"
============================================================

[Step 1] Creating research plan...
Plan created:
  Questions: 4
  Search queries: 4

[Step 2] Executing research...

Starting research on: impact of artificial intelligence on healthcare
Questions to answer: 4
Search queries planned: 4

Searching: "artificial intelligence healthcare applications"...
Searching: "AI medical diagnosis accuracy"...
Searching: "healthcare AI ethical concerns"...
Searching: "future of AI in medicine"...
Researching: "What are the main applications of AI in healthcare?"...
Researching: "How accurate is AI in medical diagnosis?"...
Researching: "What ethical concerns exist around AI in healthcare?"...
Researching: "What does the future hold for AI in medicine?"...

Research phase complete.

[Step 3] Synthesizing report...

[Complete] Research finished.
============================================================

# Research Report: impact of artificial intelligence on healthcare

## Executive Summary
Artificial intelligence is transforming healthcare through improved diagnostics,
personalized treatment, and operational efficiency...

## Key Findings
1. AI diagnostic systems show 85-95% accuracy in imaging analysis
2. Machine learning enables personalized treatment recommendations
3. Natural language processing streamlines clinical documentation
4. Ethical concerns include data privacy and algorithmic bias
5. Cost reduction of 30-50% projected in administrative tasks

## Full Report
...

## Sources
[1] https://example.com/guide/artificial%20intelligence%20healthcare
[2] https://research.example.com/AI%20medical%20diagnosis
...

Extending the Agent

Add Real Search APIs

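Replace the simulated search with a real search API. The example below uses Tavily through the @langchain/community package (install it with npm install @langchain/community) and reads the key from a TAVILY_API_KEY entry in .env: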

// Using Tavily API
import { TavilySearchResults } from '@langchain/community/tools/tavily_search';

const searchTool = new TavilySearchResults({
  maxResults: 5,
  apiKey: process.env.TAVILY_API_KEY,
});

Add Document Processing

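Load PDFs and other documents so the agent can take notes from them. In recent LangChain.js releases, PDFLoader is imported from @langchain/community and relies on the pdf-parse package:

import { PDFLoader } from '@langchain/community/document_loaders/fs/pdf';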

const loader = new PDFLoader('research-paper.pdf');
const docs = await loader.load();

Add Vector Storage

Store notes in a vector database for semantic search:

import { OpenAIEmbeddings } from '@langchain/openai';
import { MemoryVectorStore } from 'langchain/vectorstores/memory';

const vectorStore = await MemoryVectorStore.fromTexts(
  notes.map((n) => n.content),
  notes.map((n) => ({ source: n.source })),
  new OpenAIEmbeddings()
);

const relevantNotes = await vectorStore.similaritySearch(query, 3);
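
Conceptually, a similarity search embeds the query as a vector and ranks the stored vectors by a similarity measure such as cosine similarity. The sketch below illustrates that ranking with hand-written three-dimensional toy vectors in place of real embeddings. cosineSimilarity and topK are written for this example and are not the vector store's internal code.

```typescript
// Cosine similarity of two equal-length vectors:
// 1 means same direction, 0 means orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the indices of the k stored vectors most similar to the query.
function topK(queryVector: number[], stored: number[][], k: number): number[] {
  return stored
    .map((vector, index) => ({ index, score: cosineSimilarity(queryVector, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.index);
}

// Toy "embeddings": notes 0 and 2 point in nearly the same direction.
const storedVectors = [
  [1, 0, 0],
  [0, 1, 0],
  [0.9, 0.1, 0],
];
console.log(topK([1, 0, 0], storedVectors, 2));
```

Real embeddings have hundreds or thousands of dimensions, but the ranking step works the same way.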

Error Handling and Robustness

Add retry logic and error handling:

async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries: number = 3,
  delay: number = 1000
): Promise<T> {
  let lastError: Error = new Error('withRetry: no attempts were made');

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error instanceof Error ? error : new Error(String(error));
      // Do not wait after the final attempt; fall through and throw
      if (attempt === maxRetries) break;
      console.log(`Attempt ${attempt} failed, retrying in ${delay}ms...`);
      await new Promise((resolve) => setTimeout(resolve, delay));
      delay *= 2; // Exponential backoff
    }
  }

  throw lastError;
}

// Usage
const results = await withRetry(() => searchTool.invoke({ query }));
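
Rate limiting usually surfaces as an HTTP 429 response, and not every error is worth retrying. The sketch below shows two small helpers that could sit alongside a retry wrapper: a capped backoff schedule and a check for retryable errors. Both are assumptions for illustration. In particular, the status field on the error is a hypothetical shape, so check what your search client actually throws.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
function backoffDelay(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Hypothetical error shape: assumes the client attaches an HTTP status code.
interface HttpLikeError {
  status?: number;
}

// Treats rate limits (429) and server errors (5xx) as retryable.
// Other HTTP errors, such as 400 or 401, fail fast because retrying will not help.
function isRetryable(error: unknown): boolean {
  const status = (error as HttpLikeError | null | undefined)?.status;
  // Errors without a status (for example network failures) are retried here;
  // this is a design choice, not a rule.
  if (typeof status !== 'number') return true;
  return status === 429 || status >= 500;
}

console.log([0, 1, 2, 3].map((attempt) => backoffDelay(attempt)));
console.log(isRetryable({ status: 429 }), isRetryable({ status: 400 }));
```

The cap keeps the delay bounded when many attempts are allowed, and failing fast on client errors avoids spending retries on requests that cannot succeed.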

Key Takeaways

Research agents combine planning, searching, note-taking, and synthesis
Modular design separates concerns into planner, researcher, and synthesizer
Tools provide capabilities for search, notes, and data management
Error handling is essential for production-ready agents
The agent pattern can be extended with real APIs and vector storage

Challenges to Try

Add a fact-checker: Verify claims against multiple sources
Implement source ranking: Prioritize authoritative sources
Add conversation memory: Let users ask follow-up questions
Create a web interface: Build a simple frontend for the agent
Add export options: Export reports as PDF or Markdown files

Resources

Resource              Type            Level
LangChain.js Agents   Documentation   Intermediate
Tavily Search API     Tool            Beginner
LangGraph.js          Documentation   Advanced
AutoGPT               Repository      Advanced

Module Summary

Congratulations! You have completed the AI Agents module. You now understand:

What AI agents are and how they differ from chatbots
Planning and reasoning patterns for complex tasks
Multi-agent systems and coordination strategies
LangChain.js for building production agents
How to build a complete research agent from scratch

Continue to Module 6: Capstone Project to put all your skills together in a comprehensive AI application.