Lesson 3.4: Chain of Thought

Duration: 55 minutes

Learning Objectives

By the end of this lesson, you will be able to:

Understand what chain-of-thought (CoT) prompting is and why it works
Apply CoT to improve reasoning in complex tasks
Implement zero-shot and few-shot CoT techniques
Use CoT for math, logic, and multi-step problems
Combine CoT with self-verification for more reliable outputs

Introduction

Have you ever solved a math problem by writing out each step? Or talked through a decision out loud? This process of "thinking step by step" helps humans avoid mistakes and reach better conclusions.

Chain-of-thought prompting applies this same principle to AI. Instead of asking for just the answer, you ask the model to show its reasoning. This simple change can substantially improve accuracy on complex tasks such as math, logic, and multi-step reasoning.

What is Chain of Thought?

Chain of thought (CoT) prompting encourages the model to break down problems into intermediate steps before reaching a conclusion.

Without Chain of Thought

Question: A store sells apples for $2 each. If someone buys 5 apples
and pays with a $20 bill, how much change do they receive?

Answer: $10

The model jumps directly to the answer. It might be right, but we cannot verify its reasoning.

With Chain of Thought

Question: A store sells apples for $2 each. If someone buys 5 apples
and pays with a $20 bill, how much change do they receive?

Let me think through this step by step:
1. Cost per apple: $2
2. Number of apples: 5
3. Total cost: $2 × 5 = $10
4. Amount paid: $20
5. Change: $20 - $10 = $10

Answer: $10

Now we can see each step and verify the reasoning. More importantly, the model is less likely to make mistakes when it thinks through the problem.

Why Chain of Thought Works

The Problem with Direct Answers

LLMs predict the most likely next token based on patterns. For simple questions, this works well. But for complex problems, the "most likely" answer based on surface patterns might be wrong.

The CoT Advantage

When you ask for step-by-step reasoning, the model generates intermediate tokens that become part of its own context and guide it toward the correct answer.

Each intermediate step adds context that makes the next step more likely to be correct.
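The idea that every finished step becomes context for the next one can be illustrated with a toy sketch. No model is called here; `contextBeforeEachStep` is a hypothetical helper and the steps are hard-coded, so this only shows how the context grows, not how a model actually reasons.

```typescript
// Toy illustration only: no model is called, and the steps are hard-coded.
function contextBeforeEachStep(question: string, steps: string[]): string[] {
  const contexts: string[] = [];
  let context = question;
  for (const step of steps) {
    contexts.push(context); // what a model would condition on when writing this step
    context += `\n${step}`; // the finished step becomes part of the context
  }
  return contexts;
}

const appleContexts = contextBeforeEachStep(
  'How much change from $20 for 5 apples at $2 each?',
  ['Total cost: $2 x 5 = $10', 'Change: $20 - $10 = $10']
);
// The context for the second step already contains the computed total cost
console.log(appleContexts[1]);
```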

Implementing Chain of Thought

Zero-Shot CoT: The Magic Phrase

The simplest way to enable chain-of-thought reasoning is to add "Let's think step by step" to your prompt:

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

// Without CoT
const directPrompt = `
If a train travels at 60 mph for 2.5 hours, then at 80 mph for 1.5 hours, 
what is the total distance traveled?
`;

// With Zero-Shot CoT
const cotPrompt = `
If a train travels at 60 mph for 2.5 hours, then at 80 mph for 1.5 hours, 
what is the total distance traveled?

Let's think through this step by step.
`;

const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 500,
  messages: [{ role: 'user', content: cotPrompt }],
});

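// Content blocks are a union type, so check for a text block before reading it
const firstBlock = response.content[0];
if (firstBlock?.type === 'text') {
  console.log(firstBlock.text);
}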

Typical output with CoT (exact wording varies between runs):

Let's think through this step by step.

1. First leg of the journey:
   - Speed: 60 mph
   - Time: 2.5 hours
   - Distance = Speed × Time = 60 × 2.5 = 150 miles

2. Second leg of the journey:
   - Speed: 80 mph
   - Time: 1.5 hours
   - Distance = Speed × Time = 80 × 1.5 = 120 miles

3. Total distance:
   - 150 miles + 120 miles = 270 miles

The train traveled a total of 270 miles.
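Because the reasoning is visible, each step can be checked independently of the model. The sketch below recomputes the distance for the train problem; `Leg` and `totalDistance` are illustrative names introduced here, not part of any SDK.

```typescript
interface Leg {
  speedMph: number;
  hours: number;
}

// Distance of each leg is speed x time; the total is the sum over all legs
function totalDistance(legs: Leg[]): number {
  return legs.reduce((sum, leg) => sum + leg.speedMph * leg.hours, 0);
}

const trainLegs: Leg[] = [
  { speedMph: 60, hours: 2.5 },
  { speedMph: 80, hours: 1.5 },
];

console.log(totalDistance(trainLegs)); // prints 270
```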

Few-Shot CoT: Teaching by Example

For more consistent results, show the model examples of step-by-step reasoning:

const fewShotCoT = `
Solve the following problems step by step.

Problem: A restaurant bill is $85. If you want to leave a 20% tip, 
how much should you pay in total?

Solution:
Step 1: Calculate the tip amount
- Tip = Bill × Tip percentage
- Tip = $85 × 0.20 = $17

Step 2: Calculate total payment
- Total = Bill + Tip
- Total = $85 + $17 = $102

Answer: $102

---

Problem: A store offers 30% off on a $60 item. If tax is 8%, 
what is the final price?

Solution:
Step 1: Calculate the discount
- Discount = $60 × 0.30 = $18

Step 2: Calculate discounted price
- Discounted price = $60 - $18 = $42

Step 3: Calculate tax on discounted price
- Tax = $42 × 0.08 = $3.36

Step 4: Calculate final price
- Final price = $42 + $3.36 = $45.36

Answer: $45.36

---

Problem: ${userProblem}

Solution:
`;
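If the worked examples live in data rather than in one long string, the same prompt can be assembled programmatically. This is a minimal sketch under that assumption; `WorkedExample` and `buildFewShotPrompt` are hypothetical names, and the layout simply mirrors the template above.

```typescript
interface WorkedExample {
  problem: string;
  solution: string; // step-by-step reasoning, ending with an "Answer:" line
}

function buildFewShotPrompt(examples: WorkedExample[], userProblem: string): string {
  const header = 'Solve the following problems step by step.';
  const shots = examples.map(
    (ex) => `Problem: ${ex.problem}\n\nSolution:\n${ex.solution}`
  );
  // The final block leaves "Solution:" open for the model to complete
  const body = [...shots, `Problem: ${userProblem}\n\nSolution:`].join('\n\n---\n\n');
  return `${header}\n\n${body}\n`;
}
```

Keeping examples as data makes it easier to swap them per task type or to trim them when the prompt gets too long.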

Effective CoT Phrases

Different phrases can trigger chain-of-thought reasoning:

const cotPhrases = [
  "Let's think step by step.",
  "Let's work through this problem.",
  "Let's break this down into steps.",
  'Let me reason through this carefully.',
  "I'll solve this systematically.",
  "Let's analyze this step by step.",
  'First, let me understand the problem, then solve it step by step.',
];

// Using structured prompts
const structuredCoT = `
${problem}

Please solve this by:
1. First, identify what we're trying to find
2. List the relevant information
3. Show each calculation step
4. State the final answer
`;

Use Cases for Chain of Thought

Mathematical Reasoning

const mathPrompt = `
A farmer has chickens and cows. There are 50 heads and 140 legs in total.
How many chickens and how many cows are there?

Let's solve this step by step:
1. Define variables
2. Set up equations
3. Solve the system
4. Verify the answer
`;
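To check the model's answer to this prompt, the system of equations can be solved directly. The sketch below assumes chickens have 2 legs and cows have 4; `solveHeadsAndLegs` is an illustrative helper name.

```typescript
// chickens + cows = heads and 2 * chickens + 4 * cows = legs
// Substituting chickens = heads - cows gives cows = (legs - 2 * heads) / 2
function solveHeadsAndLegs(
  heads: number,
  legs: number
): { chickens: number; cows: number } | null {
  const cows = (legs - 2 * heads) / 2;
  const chickens = heads - cows;
  const valid = Number.isInteger(cows) && cows >= 0 && chickens >= 0;
  return valid ? { chickens, cows } : null; // null when no whole-animal solution exists
}

console.log(solveHeadsAndLegs(50, 140)); // chickens: 30, cows: 20
```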

Logical Deduction

const logicPrompt = `
In a room of 5 people: Alice, Bob, Carol, David, and Eve.
- Alice is taller than Bob
- Carol is shorter than Bob
- David is taller than Alice
- Eve is shorter than Carol

Who is the tallest person? Work through this step by step.
`;
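The expected answer to this puzzle can be confirmed mechanically. The sketch below encodes each clue as a `[taller, shorter]` pair and looks for the one person who is never the shorter side; it assumes the clues are consistent (no cycles), and `findTallest` is a hypothetical helper.

```typescript
// Returns the tallest person only when exactly one candidate is never "shorter"
function findTallest(people: string[], facts: Array<[string, string]>): string | null {
  const shorterThanSomeone = new Set(facts.map((fact) => fact[1]));
  const candidates = people.filter((person) => !shorterThanSomeone.has(person));
  return candidates.length === 1 ? candidates[0] ?? null : null;
}

const heightFacts: Array<[string, string]> = [
  ['Alice', 'Bob'], // Alice is taller than Bob
  ['Bob', 'Carol'], // Carol is shorter than Bob
  ['David', 'Alice'], // David is taller than Alice
  ['Carol', 'Eve'], // Eve is shorter than Carol
];

console.log(findTallest(['Alice', 'Bob', 'Carol', 'David', 'Eve'], heightFacts)); // prints David
```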

Code Debugging

const debugPrompt = `
This function should return the sum of even numbers, but it has a bug:

\`\`\`typescript
function sumEven(numbers: number[]): number {
  let sum = 0;
  for (let i = 1; i <= numbers.length; i++) {
    if (numbers[i] % 2 === 0) {
      sum += numbers[i];
    }
  }
  return sum;
}
\`\`\`

Think through this step by step:
1. What is the function supposed to do?
2. Trace through the code with an example input
3. Identify where the bug occurs
4. Explain the fix
`;
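For reference when judging the model's explanation: the loop in the prompt starts at index 1 (skipping the first element) and runs to `numbers.length` (reading one past the end), so on `[2, 3, 4]` it returns 4 instead of 6. One possible corrected version, under the illustrative name `sumEvenFixed`:

```typescript
function sumEvenFixed(numbers: number[]): number {
  let sum = 0;
  // Start at index 0 and stop before numbers.length
  for (let i = 0; i < numbers.length; i++) {
    const n = numbers[i];
    if (n !== undefined && n % 2 === 0) {
      sum += n;
    }
  }
  return sum;
}

console.log(sumEvenFixed([2, 3, 4])); // prints 6
```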

Decision Analysis

const decisionPrompt = `
Should our startup use a monolithic or microservices architecture?

Context:
- Team size: 5 developers
- Expected users: 10,000 in year 1
- Budget: Limited
- Timeline: MVP in 3 months

Please analyze this step by step:
1. List pros and cons of each approach
2. Consider each constraint
3. Make a recommendation with reasoning
`;

Advanced CoT Techniques

Self-Consistency

Run the same CoT prompt multiple times and pick the most common answer:

import OpenAI from 'openai';

const openai = new OpenAI();

async function selfConsistentCoT(
  problem: string,
  runs: number = 5
): Promise<{ answer: string; confidence: number }> {
  const answers: string[] = [];

  for (let i = 0; i < runs; i++) {
    const response = await openai.chat.completions.create({
      model: 'gpt-4o',
      messages: [
        {
          role: 'user',
          content: `${problem}\n\nLet's think step by step. At the end, clearly state your final answer after "ANSWER:"`,
        },
      ],
      temperature: 0.7, // Allow variation
    });

    const text = response.choices[0].message.content || '';
    const match = text.match(/ANSWER:\s*(.+)/i);
    if (match) {
      answers.push(match[1].trim());
    }
  }

  // Find most common answer
  const counts = answers.reduce(
    (acc, answer) => {
      acc[answer] = (acc[answer] || 0) + 1;
      return acc;
    },
    {} as Record<string, number>
  );

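  const mostCommon = Object.entries(counts).sort((a, b) => b[1] - a[1])[0];

  // If no run produced a parsable "ANSWER:" line, report zero confidence
  if (!mostCommon) {
    return { answer: '', confidence: 0 };
  }

  return {
    answer: mostCommon[0],
    confidence: mostCommon[1] / runs,
  };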
}

// Usage
const result = await selfConsistentCoT('What is 23 × 47?', 5);
console.log(`Answer: ${result.answer} (confidence: ${(result.confidence * 100).toFixed(0)}%)`);
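Exact string matching can split votes between answers that differ only in formatting, such as "1,081" and "1081". The sketch below separates the voting step into a pure function and normalizes answers first. `normalizeAnswer` and `majorityVote` are hypothetical helpers, and the normalization rules are an assumption that suits short numeric answers. Unlike the function above, confidence here is measured against the number of parsed answers rather than the number of runs.

```typescript
// Strip commas, whitespace, case differences, and a trailing period
function normalizeAnswer(raw: string): string {
  return raw.trim().toLowerCase().replace(/[,\s]/g, '').replace(/\.$/, '');
}

function majorityVote(answers: string[]): { answer: string; confidence: number } | null {
  if (answers.length === 0) return null;

  const counts = new Map<string, number>();
  for (const raw of answers) {
    const key = normalizeAnswer(raw);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  let best = '';
  let bestCount = 0;
  counts.forEach((count, key) => {
    if (count > bestCount) {
      best = key;
      bestCount = count;
    }
  });

  return { answer: best, confidence: bestCount / answers.length };
}

console.log(majorityVote(['1,081', '1081', '1081.', '1071', ' 1081 ']));
// answer: '1081', confidence: 0.8
```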

Verification Step

Ask the model to verify its own answer. The example below shows the solve-then-verify pattern on a worked problem:

const verificationPrompt = `
Problem: In a class of 30 students, 40% are girls. If 5 more boys join, 
what percentage of the class are girls now?

Step 1: Solve the problem
Let me think step by step:
- Original number of girls: 30 × 0.40 = 12 girls
- Original number of boys: 30 - 12 = 18 boys
- After 5 boys join: 18 + 5 = 23 boys
- Total students now: 12 + 23 = 35 students
- Percentage of girls: 12/35 ≈ 0.343 = 34.3%

Step 2: Verify the answer
Let me check:
- 12 girls out of 35 total
- 12 ÷ 35 = 0.3428...
- As a percentage: 34.28% ≈ 34.3%
- The original class was 40% girls (12/30 = 0.4 ✓)
- We added 5 boys (18 + 5 = 23 ✓)
- Total is now 35 (12 + 23 = 35 ✓)

The answer is verified: approximately 34.3% of the class are girls.
`;
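Self-verification by the model is helpful but not a guarantee, so for numeric problems it can be worth recomputing the result in code as well. A minimal sketch for the class example, using the illustrative name `girlsPercentageAfterBoysJoin`:

```typescript
function girlsPercentageAfterBoysJoin(
  classSize: number,
  girlShare: number, // fraction of the original class, e.g. 0.4 for 40%
  newBoys: number
): number {
  const girls = classSize * girlShare; // the number of girls does not change
  return (girls / (classSize + newBoys)) * 100;
}

console.log(girlsPercentageAfterBoysJoin(30, 0.4, 5).toFixed(1)); // prints 34.3
```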

Structured Output with CoT

Combine CoT with structured output for reliable parsing:

const structuredCoTPrompt = `
Analyze whether this code has any security vulnerabilities.

\`\`\`typescript
app.get('/user/:id', async (req, res) => {
  const query = \`SELECT * FROM users WHERE id = \${req.params.id}\`;
  const user = await db.query(query);
  res.json(user);
});
\`\`\`

Think through this step by step, then provide your analysis in JSON format:

{
  "reasoning": "Your step-by-step analysis",
  "vulnerabilities": [
    {
      "type": "vulnerability type",
      "severity": "critical|high|medium|low",
      "description": "what the vulnerability is",
      "fix": "how to fix it"
    }
  ],
  "safe": true/false
}
`;
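Models often wrap the JSON in extra prose, so the reply needs defensive parsing. The sketch below extracts the outermost `{...}` span and checks the top-level fields named in the prompt. `parseSecurityAnalysis` and the interfaces are hypothetical, and the checks are deliberately shallow (individual vulnerability entries are not validated).

```typescript
interface Vulnerability {
  type: string;
  severity: 'critical' | 'high' | 'medium' | 'low';
  description: string;
  fix: string;
}

interface SecurityAnalysis {
  reasoning: string;
  vulnerabilities: Vulnerability[];
  safe: boolean;
}

function parseSecurityAnalysis(responseText: string): SecurityAnalysis | null {
  // Take everything between the first "{" and the last "}"
  const start = responseText.indexOf('{');
  const end = responseText.lastIndexOf('}');
  if (start === -1 || end <= start) return null;

  try {
    const parsed: unknown = JSON.parse(responseText.slice(start, end + 1));
    if (typeof parsed !== 'object' || parsed === null) return null;

    const candidate = parsed as Partial<SecurityAnalysis>;
    if (
      typeof candidate.reasoning !== 'string' ||
      !Array.isArray(candidate.vulnerabilities) ||
      typeof candidate.safe !== 'boolean'
    ) {
      return null;
    }
    return candidate as SecurityAnalysis;
  } catch {
    return null; // the span was not valid JSON
  }
}
```

Returning `null` instead of throwing lets the caller decide whether to retry the request or fall back to manual review.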

Implementing a CoT Helper

import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic();

interface CoTResult {
  reasoning: string;
  answer: string;
  fullResponse: string;
}

async function solveWithCoT(
  problem: string,
  options: {
    model?: string;
    steps?: string[];
    verifyAnswer?: boolean;
  } = {}
): Promise<CoTResult> {
  const { model = 'claude-sonnet-4-20250514', steps, verifyAnswer = false } = options;

  let prompt = problem + '\n\n';

  if (steps) {
    prompt += 'Please solve this by following these steps:\n';
    steps.forEach((step, i) => {
      prompt += `${i + 1}. ${step}\n`;
    });
  } else {
    prompt += "Let's think through this step by step.\n";
  }

  if (verifyAnswer) {
    prompt += '\nAfter reaching your answer, verify it by checking your work.\n';
  }

  prompt += "\nClearly mark your final answer with 'FINAL ANSWER:'";

  const response = await anthropic.messages.create({
    model,
    max_tokens: 1500,
    messages: [{ role: 'user', content: prompt }],
  });

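  // Content blocks are a union type; only text blocks carry a `text` field
  const firstBlock = response.content[0];
  const fullResponse = firstBlock?.type === 'text' ? firstBlock.text : '';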

  // Extract answer
  const answerMatch = fullResponse.match(/FINAL ANSWER:\s*(.+?)(?:\n|$)/i);
  const answer = answerMatch ? answerMatch[1].trim() : '';

  // Extract reasoning (everything before FINAL ANSWER)
  const reasoning = fullResponse.split(/FINAL ANSWER:/i)[0].trim();

  return {
    reasoning,
    answer,
    fullResponse,
  };
}

// Usage
const result = await solveWithCoT(
  'A car travels 120 miles in 2 hours, then 180 miles in 3 hours. What was the average speed for the entire trip?',
  {
    steps: [
      'Calculate total distance',
      'Calculate total time',
      'Calculate average speed (total distance / total time)',
      'State the answer with units',
    ],
    verifyAnswer: true,
  }
);

console.log('Reasoning:', result.reasoning);
console.log('Answer:', result.answer);

When to Use Chain of Thought

Good Use Cases

Task Type	Example	Why CoT Helps
Math problems	Word problems, calculations	Breaks down computation steps
Logic puzzles	Deduction, ordering	Shows logical connections
Code analysis	Bug finding, optimization	Traces execution mentally
Decision making	Pros/cons analysis	Considers multiple factors
Multi-step tasks	Complex queries	Handles dependencies

When CoT May Not Help

Task Type	Example	Why
Simple lookups	"What is the capital of France?"	No reasoning needed
Creative tasks	"Write a poem"	Not logic-based
Classification	"Is this spam?"	Often pattern matching
Translation	"Translate to Spanish"	Different skill

Common Pitfalls

Pitfall 1: Overthinking Simple Problems

Problem: Using CoT for trivial questions adds latency and cost

Solution: Only use CoT for problems that benefit from reasoning

// No CoT needed
const simple = 'What is 2 + 2?';

// CoT beneficial
const complex =
  'What is the total cost of 3 items at $19.99 each, with 8.5% tax and a $5 discount?';

Pitfall 2: Stopping at the Wrong Step

Problem: The model gets the intermediate steps right but states the wrong final answer

Solution: Add an explicit verification step

const withVerification = `
${problem}

Think step by step, then:
1. State your preliminary answer
2. Check your work by verifying each step
3. Confirm or correct your final answer
`;

Pitfall 3: Inconsistent Formatting

Problem: Answers buried in reasoning, hard to extract

Solution: Use clear answer markers

const clearFormat = `
${problem}

Show your step-by-step reasoning, then clearly format your response as:

REASONING:
[Your step-by-step work]

FINAL ANSWER:
[Just the answer]
`;
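Once the response follows this two-section layout, extraction becomes a small parsing step. A minimal sketch, assuming the model reproduces both markers exactly; `parseMarkedResponse` is a hypothetical helper.

```typescript
function parseMarkedResponse(text: string): { reasoning: string; answer: string } | null {
  // Capture everything between the two markers, then everything after the second
  const match = text.match(/REASONING:\s*([\s\S]*?)\s*FINAL ANSWER:\s*([\s\S]*)/i);
  if (!match) return null;

  const reasoning = (match[1] ?? '').trim();
  const answer = (match[2] ?? '').trim();
  return answer === '' ? null : { reasoning, answer };
}
```

A `null` result signals that the model ignored the format, which is a reasonable trigger for a retry.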

Exercises

Exercise 1: Write a CoT Prompt

Create a CoT prompt for this problem: "Three friends split a restaurant bill. The bill is $87.50 and they want to leave a 20% tip. They agree to split everything equally. How much does each person pay?"

Solution

const billSplitPrompt = `
Three friends split a restaurant bill. The bill is $87.50 and they want to 
leave a 20% tip. They agree to split everything equally. 
How much does each person pay?

Let's solve this step by step:

Step 1: Calculate the tip
- Tip = Bill × Tip percentage
- Tip = $87.50 × 0.20

Step 2: Calculate the total amount (bill + tip)
- Total = Bill + Tip

Step 3: Split equally among 3 people
- Per person = Total ÷ 3

Step 4: State the final answer with appropriate rounding

FINAL ANSWER:
`;
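To check whatever the model returns for this prompt, the arithmetic can be reproduced in code. The sketch below works in cents to avoid floating-point rounding and rounds each share up so the shares always cover the total; both choices, and the name `splitBillCents`, are assumptions made for illustration.

```typescript
function splitBillCents(billCents: number, tipPercent: number, people: number): number {
  const tipCents = Math.round((billCents * tipPercent) / 100);
  const totalCents = billCents + tipCents;
  return Math.ceil(totalCents / people); // each person's share, in cents
}

const shareCents = splitBillCents(8750, 20, 3);
console.log((shareCents / 100).toFixed(2)); // prints 35.00
```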

Exercise 2: Debug with CoT

Use chain-of-thought to find the bug in this code:

function findMax(arr: number[]): number {
  let max = 0;
  for (const num of arr) {
    if (num > max) max = num;
  }
  return max;
}

Solution

const debugPrompt = `
Find the bug in this function:

\`\`\`typescript
function findMax(arr: number[]): number {
  let max = 0;
  for (const num of arr) {
    if (num > max) max = num;
  }
  return max;
}
\`\`\`

Let me trace through this step by step:

Step 1: Understand the function's purpose
- It should return the maximum value in an array

Step 2: Trace with a normal case: [1, 5, 3]
- max starts at 0
- 1 > 0? Yes, max = 1
- 5 > 1? Yes, max = 5
- 3 > 5? No
- Returns 5 ✓

Step 3: Trace with edge cases
- Empty array []: Returns 0 (questionable, since an empty array has no maximum)
- All negative numbers [-5, -3, -1]:
  - max starts at 0
  - -5 > 0? No
  - -3 > 0? No
  - -1 > 0? No
  - Returns 0 ✗ (should return -1!)

Step 4: Identify the bug
- Initial value of max = 0 is wrong
- Fails when all numbers are negative

Step 5: Correct solution
\`\`\`typescript
function findMax(arr: number[]): number {
  if (arr.length === 0) throw new Error("Array is empty");
  let max = arr[0]; // Start with first element
  for (const num of arr) {
    if (num > max) max = num;
  }
  return max;
}
\`\`\`

FINAL ANSWER: The bug is initializing max to 0. This fails for arrays with all negative numbers. Initialize max to arr[0] instead.
`;

Exercise 3: Multi-Step Reasoning

Create a CoT prompt for this logic puzzle: "Five houses in a row are painted different colors. The English person lives in the red house. The Spanish person lives to the left of the green house. The green house is immediately to the right of the white house. What color is the house at each position?"

Solution

const logicPuzzlePrompt = `
Solve this logic puzzle step by step:

Five houses in a row are painted different colors.
- The English person lives in the red house.
- The Spanish person lives to the left of the green house.
- The green house is immediately to the right of the white house.

Determine possible arrangements for the house colors.

Step 1: List what we know
- 5 houses, positions 1-5 (left to right)
- 5 different colors (we know: red, green, white - need to identify others)
- Constraints about positions

Step 2: Analyze the "immediately to the right" constraint
- Green is immediately right of white
- So we have a "white-green" pair
- Possible positions: (1,2), (2,3), (3,4), or (4,5)

Step 3: Analyze the "to the left of" constraint
- Spanish person lives LEFT of green house
- This means Spanish is not in the green house
- Spanish could be in positions 1, 2, 3, or 4 (depending on where green is)

Step 4: Consider each possibility for white-green pair
- If white-green is at (4,5): Spanish could be 1,2,3,4
- If white-green is at (3,4): Spanish could be 1,2,3
- And so on...

Step 5: Determine what additional information we need
- We don't have enough constraints to uniquely determine all positions
- Red house position depends on where English person lives
- We need to list possible valid arrangements

FINAL ANSWER: Multiple arrangements are possible. One valid arrangement:
Position 1: (other color) - Spanish
Position 2: White
Position 3: Green  
Position 4: Red - English
Position 5: (other color)

The puzzle as stated is underconstrained for a unique solution.
`;
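The claim that the puzzle is underconstrained can be checked by brute force. The sketch below enumerates color orders and keeps those where green sits immediately to the right of white. The puzzle names only three colors, so "blue" and "yellow" are placeholders assumed here. The nationality clues do not rule out any of these orders: the Spanish person can always live in the white house, which is directly left of green, and the English person in the red house.

```typescript
// All orderings of the given items
function permutations<T>(items: T[]): T[][] {
  if (items.length <= 1) return [items.slice()];
  const result: T[][] = [];
  items.forEach((item, i) => {
    const rest = [...items.slice(0, i), ...items.slice(i + 1)];
    for (const perm of permutations(rest)) {
      result.push([item, ...perm]);
    }
  });
  return result;
}

// Keep only orders where green is immediately to the right of white
function validColorOrders(colors: string[]): string[][] {
  return permutations(colors).filter((order) => {
    const white = order.indexOf('white');
    return white !== -1 && order[white + 1] === 'green';
  });
}

const orders = validColorOrders(['red', 'green', 'white', 'blue', 'yellow']);
console.log(orders.length); // prints 24
```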

Key Takeaways

Chain of thought improves reasoning by generating intermediate steps
"Let's think step by step" is often enough to trigger CoT (zero-shot)
Few-shot CoT with examples provides more consistent results
Verification steps catch errors in reasoning
Self-consistency (multiple runs) increases reliability
Use CoT for complex tasks - math, logic, debugging, multi-step problems
Skip CoT for simple tasks - it adds latency and cost without benefit

Resources

Resource	Type	Description
Chain-of-Thought Paper	Paper	Original research on CoT prompting
Self-Consistency Paper	Paper	Multiple reasoning paths improve accuracy
Prompting Guide - CoT	Tutorial	Practical CoT examples and techniques
OpenAI Best Practices	Documentation	Official guidance on reasoning prompts

Next Lesson

You have learned how to make AI think step by step. In the next lesson, you will put all these techniques together in a hands-on practice session, optimizing prompts for real-world scenarios.

Continue to Lesson 3.5: Practice - Prompt Optimization