From Zero to AI

Lesson 3.5: Practice - Prompt Optimization

Duration: 60 minutes

Learning Objectives

By the end of this lesson, you will be able to:

  • Apply all prompt engineering techniques in practical scenarios
  • Identify and fix common prompt problems
  • Optimize prompts for consistency, accuracy, and cost
  • Build a systematic approach to prompt development
  • Create production-ready prompts for real applications

Introduction

You have learned the building blocks of prompt engineering: clear instructions, system prompts, few-shot learning, and chain-of-thought reasoning. Now it is time to put these techniques together.

In this lesson, you will work through real-world scenarios, optimizing prompts from basic to production-ready. You will learn to diagnose prompt issues and systematically improve them.


The Prompt Optimization Workflow

Before diving into exercises, let us establish a systematic approach:

┌─────────────────────────────────────────────────────────┐
│              Prompt Optimization Workflow                │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  1. DEFINE THE TASK                                     │
│     └─► What exactly should the prompt accomplish?      │
│                                                         │
│  2. WRITE THE BASELINE PROMPT                           │
│     └─► Start simple, get something working             │
│                                                         │
│  3. TEST WITH DIVERSE INPUTS                            │
│     └─► Edge cases, typical cases, adversarial inputs   │
│                                                         │
│  4. IDENTIFY FAILURE MODES                              │
│     └─► Where does the prompt fail or produce           │
│         inconsistent results?                           │
│                                                         │
│  5. APPLY TECHNIQUES                                    │
│     └─► System prompts, few-shot, CoT, constraints      │
│                                                         │
│  6. TEST AGAIN                                          │
│     └─► Verify improvements, check for regressions      │
│                                                         │
│  7. OPTIMIZE FOR COST                                   │
│     └─► Reduce tokens while maintaining quality         │
│                                                         │
└─────────────────────────────────────────────────────────┘

Scenario 1: Customer Intent Classification

The Task

Build a prompt that classifies customer support messages into categories: billing, technical, account, shipping, or general.

Step 1: Baseline Prompt

const baselinePrompt = `
Classify this customer message: "${message}"
`;

Problems:

  • No defined categories
  • No output format
  • Inconsistent results

Step 2: Add Structure

const structuredPrompt = `
Classify this customer message into one of these categories:
- billing
- technical
- account
- shipping
- general

Message: "${message}"

Category:`;

Improvement: Categories defined, but still inconsistent format.

Step 3: Add Few-Shot Examples

const fewShotPrompt = `
Classify customer messages into categories.

Message: "I was charged twice for my subscription"
Category: billing

Message: "The app crashes when I try to upload photos"
Category: technical

Message: "How do I change my password?"
Category: account

Message: "My order hasn't arrived after 2 weeks"
Category: shipping

Message: "Do you have any recommendations for beginners?"
Category: general

Message: "${message}"
Category:`;

Improvement: Examples guide classification, more consistent.

Step 4: Add System Prompt and Edge Cases

import OpenAI from 'openai';

const openai = new OpenAI();

async function classifyIntent(message: string): Promise<string> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      {
        role: 'system',
        content: `You are a customer support classifier. Classify messages into exactly one category.
        
Categories:
- billing: Payment issues, charges, refunds, subscription
- technical: Bugs, errors, app problems, feature issues
- account: Login, password, profile, settings
- shipping: Delivery, tracking, address, returns
- general: Everything else, questions, feedback

Rules:
- Output ONLY the category name, nothing else
- If a message fits multiple categories, choose the primary concern
- When uncertain, classify as "general"`,
      },
      {
        role: 'user',
        content: `Message: "${message}"

Examples:
"I was charged twice" → billing
"App crashes on startup" → technical
"Can't log in" → account
"Where is my package?" → shipping
"Love your product!" → general

Category:`,
      },
    ],
    temperature: 0,
    max_tokens: 20,
  });

  return response.choices[0].message.content?.trim().toLowerCase() || 'general';
}

Step 5: Test and Validate

const testCases = [
  { message: 'I need a refund', expected: 'billing' },
  { message: "The button doesn't work", expected: 'technical' },
  { message: 'How do I update my email?', expected: 'account' },
  { message: 'Package arrived damaged', expected: 'shipping' },
  { message: 'Great service!', expected: 'general' },
  // Edge cases
  { message: 'I paid for shipping but order is late', expected: 'shipping' }, // Could be billing
  { message: "Can't login after password reset", expected: 'account' }, // Could be technical
  { message: '', expected: 'general' }, // Empty input
];

async function runTests() {
  let passed = 0;
  for (const test of testCases) {
    const result = await classifyIntent(test.message);
    const success = result === test.expected;
    console.log(
      `${success ? '✓' : '✗'} "${test.message}" → ${result} (expected: ${test.expected})`
    );
    if (success) passed++;
  }
  console.log(`\nPassed: ${passed}/${testCases.length}`);
}

Scenario 2: Code Review Assistant

The Task

Build a prompt that reviews code and provides actionable feedback.

Step 1: Baseline

const baselineReview = `
Review this code:
${code}
`;

Problems: Vague, no focus, inconsistent output format.

Step 2: Define Scope and Format

const scopedReview = `
Review this TypeScript code for:
1. Security issues
2. Bugs and logic errors
3. Type safety

Code:
\`\`\`typescript
${code}
\`\`\`

For each issue, provide:
- Line number
- Issue description
- Suggested fix
`;

Step 3: Add System Prompt with Expertise

const systemPrompt = `You are an expert TypeScript code reviewer with 10 years of experience.

Your review priorities (in order):
1. Security vulnerabilities (critical)
2. Runtime bugs and errors (high)
3. Type safety issues (medium)
4. Performance concerns (low)

Review guidelines:
- Be specific: reference line numbers and variable names
- Be actionable: provide concrete fixes, not vague suggestions
- Be concise: focus on real issues, not style preferences
- Acknowledge good practices when present

Output format for each issue:
[SEVERITY] Line X: Brief description
  Problem: What is wrong
  Fix: How to fix it

If no issues found, respond with "No issues found" and briefly praise what was done well.`;

Step 4: Full Implementation with Examples

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

async function reviewCode(code: string): Promise<string> {
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1500,
    system: `You are an expert TypeScript code reviewer. Focus on security, bugs, and type safety.

Output format for each issue:
[SEVERITY] Line X: Brief title
  Problem: Detailed explanation
  Fix: Concrete solution with code

Severity levels: CRITICAL, HIGH, MEDIUM, LOW

If the code is good, say "No issues found" and note what was done well.`,
    messages: [
      {
        role: 'user',
        content: `Review this code:

\`\`\`typescript
${code}
\`\`\`

Provide your review following the specified format.`,
      },
    ],
  });

  return response.content[0].text;
}

// Test with code that has issues
const testCode = `
async function getUser(id: string) {
  const query = "SELECT * FROM users WHERE id = " + id;
  const result = await db.query(query);
  return result[0];
}

function calculateDiscount(price, discount) {
  return price - (price * discount);
}
`;

const review = await reviewCode(testCode);
console.log(review);

Expected output:

[CRITICAL] Line 2: SQL Injection Vulnerability
  Problem: User input is concatenated directly into SQL query, allowing attackers to execute arbitrary SQL
  Fix: Use parameterized queries:
    const query = "SELECT * FROM users WHERE id = $1";
    const result = await db.query(query, [id]);

[MEDIUM] Lines 6-8: Missing TypeScript types
  Problem: Function parameters have implicit 'any' type, losing type safety benefits
  Fix: Add explicit types:
    function calculateDiscount(price: number, discount: number): number {
      return price - (price * discount);
    }

[LOW] Line 7: No input validation
  Problem: Discount could be negative or greater than 1, causing unexpected results
  Fix: Add validation:
    if (discount < 0 || discount > 1) {
      throw new Error("Discount must be between 0 and 1");
    }

Scenario 3: Data Extraction

The Task

Extract structured data from unstructured text reliably.

Step 1: Start Simple

const basicExtraction = `
Extract the name, email, and company from this text:
"${text}"
`;

Problems: Inconsistent format, fails on missing data.

Step 2: Define Schema

const schemaExtraction = `
Extract information and return as JSON:

{
  "name": "string or null",
  "email": "string or null",
  "company": "string or null"
}

Text: "${text}"
`;

Step 3: Few-Shot with Edge Cases

const robustExtraction = `
Extract contact information from text. Return valid JSON only.

Text: "Hi, I'm Sarah Chen from TechCorp. Reach me at sarah@techcorp.com"
Output: {"name": "Sarah Chen", "email": "sarah@techcorp.com", "company": "TechCorp"}

Text: "Contact John at john.doe@email.com"
Output: {"name": "John", "email": "john.doe@email.com", "company": null}

Text: "Questions? Visit our website."
Output: {"name": null, "email": null, "company": null}

Text: "Mike Johnson, Engineering Lead"
Output: {"name": "Mike Johnson", "email": null, "company": null}

Text: "${text}"
Output:`;

Step 4: Production Implementation

import OpenAI from 'openai';

const openai = new OpenAI();

interface ContactInfo {
  name: string | null;
  email: string | null;
  company: string | null;
  confidence: number;
}

async function extractContact(text: string): Promise<ContactInfo> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      {
        role: 'system',
        content: `You extract contact information from text. Always return valid JSON.

Schema:
{
  "name": string or null,
  "email": string or null,
  "company": string or null,
  "confidence": number between 0 and 1
}

Rules:
- Set confidence based on how certain you are (1.0 = explicit, 0.5 = inferred, 0 = not found)
- Only extract if reasonably confident
- Return null for fields not found
- ONLY output JSON, no explanation`,
      },
      {
        role: 'user',
        content: `Extract contact info:

"${text}"

JSON:`,
      },
    ],
    temperature: 0,
    max_tokens: 200,
  });

  try {
    const content = response.choices[0].message.content || '{}';
    return JSON.parse(content);
  } catch {
    return { name: null, email: null, company: null, confidence: 0 };
  }
}

// Test cases
const tests = [
  "Hi, I'm Alex Rivera from DataFlow Inc. Email me at alex@dataflow.io",
  'Send questions to support@company.com',
  'Best regards, Dr. Emily Watson',
  'No contact info here, just a random message.',
];

for (const test of tests) {
  const result = await extractContact(test);
  console.log(`Input: "${test}"`);
  console.log(`Output:`, result);
  console.log('---');
}

Scenario 4: Multi-Step Analysis

The Task

Analyze a business problem and provide recommendations.

Step 1: Combine Techniques

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

async function analyzeBusinessProblem(context: string): Promise<string> {
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 2000,
    system: `You are a business analyst helping startups make strategic decisions.

Approach:
1. Understand the situation fully before analyzing
2. Consider multiple perspectives
3. Provide concrete, actionable recommendations
4. Be honest about trade-offs and risks

Communication style:
- Clear and structured
- Data-driven when possible
- Practical over theoretical`,
    messages: [
      {
        role: 'user',
        content: `Analyze this business situation and provide recommendations:

${context}

Please structure your analysis as follows:

## Understanding the Problem
Summarize the key challenge and constraints.

## Analysis
Think through this step by step:
- What are the main factors to consider?
- What are the options?
- What are the trade-offs?

## Recommendation
Provide a clear recommendation with reasoning.

## Action Items
List 3-5 specific next steps.

## Risks to Monitor
What could go wrong? What should they watch for?`,
      },
    ],
  });

  return response.content[0].text;
}

// Example usage
const businessContext = `
Our SaaS startup has 500 paying customers and $50k MRR. We're debating whether to:
A) Focus on getting more customers (growth)
B) Build more features for existing customers (retention/expansion)

Current metrics:
- Monthly churn: 5%
- Customer acquisition cost: $200
- Average revenue per user: $100/month
- NPS score: 35

We have 6 months of runway and a team of 5 engineers.
`;

const analysis = await analyzeBusinessProblem(businessContext);
console.log(analysis);

Optimization Techniques

Reducing Token Usage

// Verbose prompt (many tokens)
const verbosePrompt = `
I would like you to please help me by analyzing the following 
piece of code and identifying any potential issues that might 
exist within it. Please be thorough in your analysis and 
provide detailed explanations for each issue you find.
`;

// Optimized prompt (fewer tokens, same result)
const optimizedPrompt = `
Analyze this code for issues. For each issue, state:
- The problem
- How to fix it
`;

Using Smaller Models When Possible

// Route based on complexity
async function smartRoute(task: string, complexity: 'simple' | 'complex') {
  const model = complexity === 'simple' ? 'gpt-4o-mini' : 'gpt-4o';

  return await openai.chat.completions.create({
    model,
    messages: [{ role: 'user', content: task }],
  });
}

Caching Repeated Operations

const promptCache = new Map<string, string>();

async function cachedClassify(message: string): Promise<string> {
  const cacheKey = message.toLowerCase().trim();

  if (promptCache.has(cacheKey)) {
    return promptCache.get(cacheKey)!;
  }

  const result = await classifyIntent(message);
  promptCache.set(cacheKey, result);
  return result;
}

Debugging Prompts

Common Issues and Solutions

Problem Symptom Solution
Inconsistent output Different format each time Add few-shot examples
Wrong answers Logical errors Add chain-of-thought
Off-topic responses Ignores constraints Strengthen system prompt
Too verbose Long responses Add length constraints
Missing information Incomplete answers Ask explicitly for all parts

Debugging Workflow

async function debugPrompt(prompt: string, inputs: string[]): Promise<void> {
  console.log('=== Prompt Debug Report ===\n');
  console.log('Prompt:', prompt.substring(0, 100) + '...\n');

  for (const input of inputs) {
    console.log(`\nInput: "${input}"`);

    // Run multiple times to check consistency
    const results: string[] = [];
    for (let i = 0; i < 3; i++) {
      const result = await runPrompt(prompt, input);
      results.push(result);
    }

    const unique = [...new Set(results)];
    console.log(`Results (${unique.length} unique):`, results);
    console.log(`Consistency: ${unique.length === 1 ? 'GOOD' : 'POOR'}`);
  }
}

Exercises

Exercise 1: Fix the Prompt

This prompt produces inconsistent results. Fix it.

const brokenPrompt = `
Tell me if this review is good or bad:
${review}
`;
Solution
const fixedPrompt = `
Classify this product review as positive, negative, or neutral.

Examples:
Review: "Love it! Works perfectly."
Classification: positive

Review: "Terrible quality, broke after one day."
Classification: negative

Review: "It's okay, nothing special."
Classification: neutral

Review: "${review}"
Classification:`;

Fixes applied:

  • Defined clear categories (positive, negative, neutral)
  • Added few-shot examples
  • Specified output format (single word)

Exercise 2: Optimize for Cost

This prompt is too expensive. Reduce token usage while maintaining quality.

const expensivePrompt = `
I need your help with a task. I'm working on a project where I need to 
analyze customer feedback messages. For each message, I would like you 
to carefully read through the text and determine what the customer is 
primarily asking about or complaining about. This could be related to 
our billing system, technical issues with our product, their account 
settings, shipping and delivery concerns, or just general questions 
and comments that don't fit into any of those categories.

Please analyze the following message and tell me which of these five 
categories it belongs to: billing, technical, account, shipping, or 
general. Please just respond with the category name and nothing else.

Here is the customer message to analyze:
"${message}"

What category does this message belong to?
`;
Solution
const optimizedPrompt = `
Classify into: billing, technical, account, shipping, general

"${message}"

Category:`;

Token reduction:

  • Removed verbose explanation
  • Trust the model to understand common categories
  • Direct instruction
  • From ~150 tokens to ~20 tokens

For better accuracy while still being efficient:

const balancedPrompt = `
Classify customer message.
Categories: billing, technical, account, shipping, general

"${message}"

Reply with category only:`;

Exercise 3: Build a Complete Solution

Build a production-ready prompt system for summarizing meeting notes that:

  • Extracts action items
  • Identifies key decisions
  • Notes attendees mentioned
  • Handles various meeting formats
Solution
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

interface MeetingSummary {
  summary: string;
  attendees: string[];
  keyDecisions: string[];
  actionItems: {
    task: string;
    owner: string | null;
    deadline: string | null;
  }[];
  nextSteps: string[];
}

async function summarizeMeeting(notes: string): Promise<MeetingSummary> {
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1500,
    system: `You summarize meeting notes into structured data.

Output valid JSON matching this schema:
{
  "summary": "2-3 sentence overview",
  "attendees": ["names mentioned"],
  "keyDecisions": ["decisions made"],
  "actionItems": [
    {"task": "description", "owner": "name or null", "deadline": "date or null"}
  ],
  "nextSteps": ["follow-up items"]
}

Rules:
- Extract explicit information only
- Use null for missing owners/deadlines
- Keep summary concise
- Output JSON only`,
    messages: [
      {
        role: 'user',
        content: `Summarize these meeting notes:

${notes}

JSON:`,
      },
    ],
  });

  try {
    return JSON.parse(response.content[0].text);
  } catch {
    return {
      summary: 'Failed to parse meeting notes',
      attendees: [],
      keyDecisions: [],
      actionItems: [],
      nextSteps: [],
    };
  }
}

// Test
const testNotes = `
Team sync - March 15, 2024
Present: Sarah, Mike, Lisa

Discussed Q2 roadmap. Decided to prioritize the mobile app redesign over 
the API v2 work. Sarah will lead the design phase, targeting end of April.

Mike raised concerns about server costs. Lisa to review AWS spending and 
report back by Friday.

Action items:
- Sarah: Create mobile redesign mockups (by March 25)
- Mike: Document current API usage patterns
- Lisa: AWS cost analysis (by March 22)

Next sync: March 22
`;

const summary = await summarizeMeeting(testNotes);
console.log(JSON.stringify(summary, null, 2));

Prompt Engineering Checklist

Before deploying a prompt to production, verify:

Clarity

  • Task is explicitly defined
  • Output format is specified
  • Constraints are documented

Robustness

  • Tested with typical inputs
  • Tested with edge cases
  • Tested with adversarial inputs
  • Handles missing/malformed data

Consistency

  • Multiple runs produce similar results
  • Format is predictable and parseable

Performance

  • Appropriate model selected
  • Token usage optimized
  • Response time acceptable

Safety

  • Cannot produce harmful outputs
  • Handles prompt injection attempts
  • Fails gracefully on errors

Key Takeaways

  1. Start simple, iterate - Begin with basic prompts and add complexity as needed
  2. Test thoroughly - Use diverse inputs including edge cases
  3. Combine techniques - System prompts + few-shot + CoT as needed
  4. Optimize for production - Balance accuracy, cost, and latency
  5. Document your prompts - Version control and maintain them like code
  6. Monitor in production - Track failures and continuously improve

Resources

Resource Type Description
OpenAI Playground Tool Interactive prompt testing
Anthropic Workbench Tool Claude prompt experimentation
PromptPerfect Tool Automatic prompt optimization
LangSmith Tool Prompt testing and monitoring

Module Complete

Congratulations! You have completed the Prompt Engineering module. You now have the skills to:

  • Write clear, effective prompts
  • Use system prompts to define AI behavior
  • Apply few-shot learning for consistent outputs
  • Use chain-of-thought for complex reasoning
  • Optimize and debug prompts for production

Next module: Module 4: AI Providers - Learn about different AI providers and their APIs.