Lesson 3.4: Chain of Thought
Duration: 55 minutes
Learning Objectives
By the end of this lesson, you will be able to:
- Understand what chain-of-thought (CoT) prompting is and why it works
- Apply CoT to improve reasoning in complex tasks
- Implement zero-shot and few-shot CoT techniques
- Use CoT for math, logic, and multi-step problems
- Combine CoT with self-verification for more reliable outputs
Introduction
Have you ever solved a math problem by writing out each step? Or talked through a decision out loud? This process of "thinking step by step" helps humans avoid mistakes and reach better conclusions.
Chain-of-thought prompting applies the same principle to AI. Instead of asking only for the answer, you ask the model to show its reasoning first. This small change often improves accuracy substantially on complex tasks such as math, logic, and multi-step reasoning.
What is Chain of Thought?
Chain of thought (CoT) prompting encourages the model to break down problems into intermediate steps before reaching a conclusion.
Without Chain of Thought
Question: A store sells apples for $2 each. If someone buys 5 apples
and pays with a $20 bill, how much change do they receive?
Answer: $10
The model jumps directly to the answer. It might be right, but we cannot verify its reasoning.
With Chain of Thought
Question: A store sells apples for $2 each. If someone buys 5 apples
and pays with a $20 bill, how much change do they receive?
Let me think through this step by step:
1. Cost per apple: $2
2. Number of apples: 5
3. Total cost: $2 × 5 = $10
4. Amount paid: $20
5. Change: $20 - $10 = $10
Answer: $10
Now we can see each step and verify the reasoning. More importantly, the model is less likely to make mistakes when it thinks through the problem.
Why Chain of Thought Works
The Problem with Direct Answers
LLMs predict the most likely next token based on patterns. For simple questions, this works well. But for complex problems, the "most likely" answer based on surface patterns might be wrong.
┌─────────────────────────────────────────────────────────┐
│ Without CoT: Pattern Matching │
├─────────────────────────────────────────────────────────┤
│ │
│ "What is 17 × 24?" │
│ │ │
│ ▼ │
│ Model sees "multiplication question" │
│ │ │
│ ▼ │
│ Outputs likely-looking number: "398" ❌ │
│ (Actual answer: 408) │
│ │
└─────────────────────────────────────────────────────────┘
The CoT Advantage
When you ask for step-by-step reasoning, the model generates intermediate tokens that guide it toward the correct answer:
┌─────────────────────────────────────────────────────────┐
│ With CoT: Step-by-Step Reasoning │
├─────────────────────────────────────────────────────────┤
│ │
│ "What is 17 × 24? Think step by step." │
│ │ │
│ ▼ │
│ "Let me break this down: │
│ 17 × 24 = 17 × (20 + 4) │
│ = (17 × 20) + (17 × 4) │
│ = 340 + 68 │
│ = 408" ✓ │
│ │
└─────────────────────────────────────────────────────────┘
Each intermediate step adds context that makes the next step more likely to be correct.
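The decomposition in the diagram can be mirrored in ordinary code. The sketch below is illustrative only: `multiplyByParts` is a hypothetical helper (not something the model runs internally), and it assumes non-negative integers. It records each partial result the way a CoT response writes them out, so every line can be checked on its own.

```typescript
// Hypothetical helper: multiply a × b by splitting b into tens and ones,
// recording each intermediate step like a chain-of-thought trace.
// Assumes a and b are non-negative integers.
function multiplyByParts(a: number, b: number): { steps: string[]; result: number } {
  const tens = Math.floor(b / 10) * 10;
  const ones = b - tens;
  const partTens = a * tens;
  const partOnes = a * ones;
  const result = partTens + partOnes;
  const steps = [
    `${a} × ${b} = ${a} × (${tens} + ${ones})`,
    `= (${a} × ${tens}) + (${a} × ${ones})`,
    `= ${partTens} + ${partOnes}`,
    `= ${result}`,
  ];
  return { steps, result };
}

const multiplicationTrace = multiplyByParts(17, 24);
console.log(multiplicationTrace.steps.join('\n'));
```

Because each step is small, an error in any one line is easy to spot, which is the same property that makes CoT output easier to review.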
Implementing Chain of Thought
Zero-Shot CoT: The Magic Phrase
The simplest way to enable chain-of-thought reasoning is to add "Let's think step by step" to your prompt:
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic();
// Without CoT
const directPrompt = `
If a train travels at 60 mph for 2.5 hours, then at 80 mph for 1.5 hours,
what is the total distance traveled?
`;
// With Zero-Shot CoT
const cotPrompt = `
If a train travels at 60 mph for 2.5 hours, then at 80 mph for 1.5 hours,
what is the total distance traveled?
Let's think through this step by step.
`;
const response = await anthropic.messages.create({
model: 'claude-sonnet-4-20250514',
max_tokens: 500,
messages: [{ role: 'user', content: cotPrompt }],
});
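// Content blocks are a union type, so narrow to a text block before reading .text
const firstBlock = response.content[0];
if (firstBlock?.type === 'text') {
console.log(firstBlock.text);
}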
Example output with CoT (the exact wording will vary between runs):
Let's think through this step by step.
1. First leg of the journey:
- Speed: 60 mph
- Time: 2.5 hours
- Distance = Speed × Time = 60 × 2.5 = 150 miles
2. Second leg of the journey:
- Speed: 80 mph
- Time: 1.5 hours
- Distance = Speed × Time = 80 × 1.5 = 120 miles
3. Total distance:
- 150 miles + 120 miles = 270 miles
The train traveled a total of 270 miles.
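If you apply zero-shot CoT in many places, it can help to wrap the trigger phrase in a small function. The following is a minimal sketch; `toZeroShotCoT` and `DEFAULT_TRIGGER` are hypothetical names introduced here, not part of any SDK. The last line recomputes the train example by hand so the model's answer can be compared against a known value.

```typescript
// Hypothetical helper: append a zero-shot CoT trigger to any question.
const DEFAULT_TRIGGER = "Let's think through this step by step.";

function toZeroShotCoT(question: string, trigger: string = DEFAULT_TRIGGER): string {
  // Trim so the trigger always sits in its own paragraph after the question
  return `${question.trim()}\n\n${trigger}`;
}

const trainQuestion =
  'If a train travels at 60 mph for 2.5 hours, then at 80 mph for 1.5 hours, ' +
  'what is the total distance traveled?';
console.log(toZeroShotCoT(trainQuestion));

// Independent arithmetic check of the example: 60 × 2.5 + 80 × 1.5
const expectedMiles = 60 * 2.5 + 80 * 1.5;
console.log(expectedMiles); // 270
```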
Few-Shot CoT: Teaching by Example
For more consistent results, show the model examples of step-by-step reasoning:
const fewShotCoT = `
Solve the following problems step by step.
Problem: A restaurant bill is $85. If you want to leave a 20% tip,
how much should you pay in total?
Solution:
Step 1: Calculate the tip amount
- Tip = Bill × Tip percentage
- Tip = $85 × 0.20 = $17
Step 2: Calculate total payment
- Total = Bill + Tip
- Total = $85 + $17 = $102
Answer: $102
---
Problem: A store offers 30% off on a $60 item. If tax is 8%,
what is the final price?
Solution:
Step 1: Calculate the discount
- Discount = $60 × 0.30 = $18
Step 2: Calculate discounted price
- Discounted price = $60 - $18 = $42
Step 3: Calculate tax on discounted price
- Tax = $42 × 0.08 = $3.36
Step 4: Calculate final price
- Final price = $42 + $3.36 = $45.36
Answer: $45.36
---
Problem: ${userProblem}
Solution:
`;
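Writing few-shot prompts by hand gets repetitive once you have more than a couple of examples. One possible approach, sketched below, is to store the worked examples as data and assemble the prompt from them. The `WorkedExample` shape and `buildFewShotCoT` function are assumptions made for this illustration; the layout follows the prompt above.

```typescript
// Hypothetical shape for one worked example
interface WorkedExample {
  problem: string;
  steps: string[]; // one entry per reasoning step
  answer: string;
}

function buildFewShotCoT(examples: WorkedExample[], userProblem: string): string {
  const blocks = examples.map((ex) => {
    const steps = ex.steps.map((s, i) => `Step ${i + 1}: ${s}`).join('\n');
    return `Problem: ${ex.problem}\nSolution:\n${steps}\nAnswer: ${ex.answer}`;
  });
  // End with an open "Solution:" so the model continues in the same format
  blocks.push(`Problem: ${userProblem}\nSolution:`);
  return 'Solve the following problems step by step.\n\n' + blocks.join('\n---\n');
}

const tipExample: WorkedExample = {
  problem: 'A restaurant bill is $85. If you want to leave a 20% tip, how much should you pay in total?',
  steps: ['Tip = $85 × 0.20 = $17', 'Total = $85 + $17 = $102'],
  answer: '$102',
};
console.log(buildFewShotCoT([tipExample], 'A $40 shirt is 25% off. What is the sale price?'));
```

Keeping every example in the same shape also makes it harder to accidentally mix formats, which is one common reason few-shot prompts give inconsistent output.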
Effective CoT Phrases
Different phrases can trigger chain-of-thought reasoning:
const cotPhrases = [
"Let's think step by step.",
"Let's work through this problem.",
"Let's break this down into steps.",
'Let me reason through this carefully.',
"I'll solve this systematically.",
"Let's analyze this step by step.",
'First, let me understand the problem, then solve it step by step.',
];
// Using structured prompts
const structuredCoT = `
${problem}
Please solve this by:
1. First, identify what we're trying to find
2. List the relevant information
3. Show each calculation step
4. State the final answer
`;
Use Cases for Chain of Thought
Mathematical Reasoning
const mathPrompt = `
A farmer has chickens and cows. There are 50 heads and 140 legs in total.
How many chickens and how many cows are there?
Let's solve this step by step:
1. Define variables
2. Set up equations
3. Solve the system
4. Verify the answer
`;
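To judge whether the model's response is right, it helps to know the expected result. The sketch below solves the same system directly, assuming 2 legs per chicken and 4 per cow; `solveHeadsAndLegs` is a hypothetical helper written for this check.

```typescript
// Solve: chickens + cows = heads, 2·chickens + 4·cows = legs
function solveHeadsAndLegs(
  heads: number,
  legs: number
): { chickens: number; cows: number } | null {
  // Subtract 2 × (first equation) from the second: 2·cows = legs − 2·heads
  const cows = (legs - 2 * heads) / 2;
  const chickens = heads - cows;
  // Reject inputs that would need fractional or negative animals
  const valid = Number.isInteger(cows) && cows >= 0 && chickens >= 0;
  return valid ? { chickens, cows } : null;
}

console.log(solveHeadsAndLegs(50, 140)); // { chickens: 30, cows: 20 }
```

This mirrors the four steps in the prompt: the variables and equations are in the comments, the solve is two lines, and the validity check stands in for the verification step.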
Logical Deduction
const logicPrompt = `
In a room of 5 people: Alice, Bob, Carol, David, and Eve.
- Alice is taller than Bob
- Carol is shorter than Bob
- David is taller than Alice
- Eve is shorter than Carol
Who is the tallest person? Work through this step by step.
`;
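The expected answer to this puzzle can be derived mechanically, which gives you something to compare the model's reasoning against. The sketch below encodes each statement as a `[taller, shorter]` pair and looks for anyone who never appears on the shorter side. `findTallest` is a hypothetical helper; it returns a single name only when the statements form a complete chain, as they do here.

```typescript
// Each pair is [taller, shorter], taken from the statements in the prompt
const tallerThan: Array<[string, string]> = [
  ['Alice', 'Bob'],
  ['Bob', 'Carol'],
  ['David', 'Alice'],
  ['Carol', 'Eve'],
];

// The tallest person never appears on the "shorter" side of any statement.
function findTallest(pairs: Array<[string, string]>): string[] {
  const people = new Set<string>();
  const shorter = new Set<string>();
  pairs.forEach(([taller, shorterPerson]) => {
    people.add(taller);
    people.add(shorterPerson);
    shorter.add(shorterPerson);
  });
  return Array.from(people).filter((p) => !shorter.has(p));
}

console.log(findTallest(tallerThan)); // [ 'David' ]
```

The full ordering implied by the statements is David > Alice > Bob > Carol > Eve.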
Code Debugging
const debugPrompt = `
This function should return the sum of even numbers, but it has a bug:
\`\`\`typescript
function sumEven(numbers: number[]): number {
let sum = 0;
for (let i = 1; i <= numbers.length; i++) {
if (numbers[i] % 2 === 0) {
sum += numbers[i];
}
}
return sum;
}
\`\`\`
Think through this step by step:
1. What is the function supposed to do?
2. Trace through the code with an example input
3. Identify where the bug occurs
4. Explain the fix
`;
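For reference when reviewing the model's response: the loop starts at index 1 and runs through `numbers.length` inclusive, so it skips the first element and reads one slot past the end of the array. One possible corrected version is shown below.

```typescript
function sumEven(numbers: number[]): number {
  let sum = 0;
  // Start at index 0 and stop before numbers.length to stay within bounds
  for (let i = 0; i < numbers.length; i++) {
    if (numbers[i] % 2 === 0) {
      sum += numbers[i];
    }
  }
  return sum;
}

console.log(sumEven([2, 3, 4])); // 6
```

With the original loop bounds, the same input returns 4 because the leading 2 is never visited. A good CoT response should surface this when it traces an example input in step 2.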
Decision Analysis
const decisionPrompt = `
Should our startup use a monolithic or microservices architecture?
Context:
- Team size: 5 developers
- Expected users: 10,000 in year 1
- Budget: Limited
- Timeline: MVP in 3 months
Please analyze this step by step:
1. List pros and cons of each approach
2. Consider each constraint
3. Make a recommendation with reasoning
`;
Advanced CoT Techniques
Self-Consistency
Run the same CoT prompt multiple times and pick the most common answer:
import OpenAI from 'openai';
const openai = new OpenAI();
async function selfConsistentCoT(
problem: string,
runs: number = 5
): Promise<{ answer: string; confidence: number }> {
const answers: string[] = [];
for (let i = 0; i < runs; i++) {
const response = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'user',
content: `${problem}\n\nLet's think step by step. At the end, clearly state your final answer after "ANSWER:"`,
},
],
temperature: 0.7, // Allow variation
});
const text = response.choices[0].message.content || '';
const match = text.match(/ANSWER:\s*(.+)/i);
if (match) {
answers.push(match[1].trim());
}
}
// Find most common answer
const counts = answers.reduce(
(acc, answer) => {
acc[answer] = (acc[answer] || 0) + 1;
return acc;
},
{} as Record<string, number>
);
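const mostCommon = Object.entries(counts).sort((a, b) => b[1] - a[1])[0];
// Guard against the case where no run produced a parsable "ANSWER:" line
if (!mostCommon) {
return { answer: '', confidence: 0 };
}
return {
answer: mostCommon[0],
confidence: mostCommon[1] / runs,
};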
}
// Usage
const result = await selfConsistentCoT('What is 23 × 47?', 5);
console.log(`Answer: ${result.answer} (confidence: ${result.confidence * 100}%)`);
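Exact string matching can split votes between answers that mean the same thing, such as "1081" and "1,081.". A minimal sketch of a voting step that normalizes answers first is shown below. `normalizeAnswer` and `majorityVote` are hypothetical helpers, and the normalization rules are assumptions that suit numeric answers; other answer types would need their own rules.

```typescript
// Hypothetical normalizer: trim whitespace, drop trailing punctuation, and
// remove thousands separators so "1,081." and "1081" count as the same vote.
function normalizeAnswer(raw: string): string {
  return raw
    .trim()
    .replace(/[.!]+$/, '')
    .replace(/(\d),(?=\d{3}\b)/g, '$1')
    .toLowerCase();
}

function majorityVote(answers: string[]): { answer: string; share: number } | null {
  if (answers.length === 0) return null; // nothing to vote on
  const counts = new Map<string, number>();
  for (const a of answers) {
    const key = normalizeAnswer(a);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  let best = '';
  let bestCount = 0;
  counts.forEach((count, key) => {
    if (count > bestCount) {
      best = key;
      bestCount = count;
    }
  });
  // share is the fraction of parsed answers that agree with the winner
  return { answer: best, share: bestCount / answers.length };
}

console.log(majorityVote(['1081', '1,081.', '1071', ' 1081 ']));
// { answer: '1081', share: 0.75 }
```

For reference, 23 × 47 = 1081, so a majority on any other value in the usage example would indicate a systematic reasoning error rather than random variation.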
Verification Step
Ask the model to verify its own answer:
const verificationPrompt = `
Problem: In a class of 30 students, 40% are girls. If 5 more boys join,
what percentage of the class are girls now?
Step 1: Solve the problem
Let me think step by step:
- Original number of girls: 30 × 0.40 = 12 girls
- Original number of boys: 30 - 12 = 18 boys
- After 5 boys join: 18 + 5 = 23 boys
- Total students now: 12 + 23 = 35 students
- Percentage of girls: 12/35 ≈ 0.343 = 34.3%
Step 2: Verify the answer
Let me check:
- 12 girls out of 35 total
- 12 ÷ 35 = 0.3428...
- As a percentage: 34.28% ≈ 34.3%
- The original class was 40% girls (12/30 = 0.4 ✓)
- We added 5 boys (18 + 5 = 23 ✓)
- Total is now 35 (12 + 23 = 35 ✓)
The answer is verified: approximately 34.3% of the class are girls.
`;
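Self-verification by the model is useful, but when the problem is numeric you can also recompute the result in code and compare. The sketch below does that for the class example; `girlsPercentAfterJoin` is a hypothetical helper introduced for this check.

```typescript
// Recompute the percentage of girls after extra boys join the class
function girlsPercentAfterJoin(total: number, girlsShare: number, newBoys: number): number {
  const girls = total * girlsShare; // number of girls stays fixed
  const newTotal = total + newBoys; // only the class size grows
  return (girls / newTotal) * 100;
}

const girlsPercent = girlsPercentAfterJoin(30, 0.4, 5);
console.log(girlsPercent.toFixed(1)); // "34.3"
```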
Structured Output with CoT
Combine CoT with structured output for reliable parsing:
const structuredCoTPrompt = `
Analyze whether this code has any security vulnerabilities.
\`\`\`typescript
app.get('/user/:id', async (req, res) => {
const query = \`SELECT * FROM users WHERE id = \${req.params.id}\`;
const user = await db.query(query);
res.json(user);
});
\`\`\`
Think through this step by step, then provide your analysis in JSON format:
{
"reasoning": "Your step-by-step analysis",
"vulnerabilities": [
{
"type": "vulnerability type",
"severity": "critical|high|medium|low",
"description": "what the vulnerability is",
"fix": "how to fix it"
}
],
"safe": true/false
}
`;
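Models sometimes wrap the JSON in extra prose, so the response usually needs defensive parsing. The following is a minimal sketch under that assumption: `parseAnalysis` and the two interfaces are hypothetical names that mirror the schema in the prompt above, and the validation only checks the top-level fields.

```typescript
interface Vulnerability {
  type: string;
  severity: 'critical' | 'high' | 'medium' | 'low';
  description: string;
  fix: string;
}

interface SecurityAnalysis {
  reasoning: string;
  vulnerabilities: Vulnerability[];
  safe: boolean;
}

// Pull the outermost {...} span out of a response that may have prose around it
function parseAnalysis(responseText: string): SecurityAnalysis | null {
  const start = responseText.indexOf('{');
  const end = responseText.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  try {
    const parsed: unknown = JSON.parse(responseText.slice(start, end + 1));
    if (typeof parsed !== 'object' || parsed === null) return null;
    const candidate = parsed as Partial<SecurityAnalysis>;
    if (
      typeof candidate.reasoning !== 'string' ||
      !Array.isArray(candidate.vulnerabilities) ||
      typeof candidate.safe !== 'boolean'
    ) {
      return null; // shape does not match the requested schema
    }
    return candidate as SecurityAnalysis;
  } catch {
    return null; // not valid JSON
  }
}

const sampleResponse =
  'Here is my analysis:\n{"reasoning":"The id is interpolated into SQL.",' +
  '"vulnerabilities":[{"type":"SQL injection","severity":"critical",' +
  '"description":"Unsanitized input in query","fix":"Use parameterized queries"}],' +
  '"safe":false}';
console.log(parseAnalysis(sampleResponse)?.vulnerabilities[0].type); // SQL injection
```

Returning `null` instead of throwing lets the caller decide whether to retry the request or fall back to the raw text.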
Implementing a CoT Helper
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic();
interface CoTResult {
reasoning: string;
answer: string;
fullResponse: string;
}
async function solveWithCoT(
problem: string,
options: {
model?: string;
steps?: string[];
verifyAnswer?: boolean;
} = {}
): Promise<CoTResult> {
const { model = 'claude-sonnet-4-20250514', steps, verifyAnswer = false } = options;
let prompt = problem + '\n\n';
if (steps) {
prompt += 'Please solve this by following these steps:\n';
steps.forEach((step, i) => {
prompt += `${i + 1}. ${step}\n`;
});
} else {
prompt += "Let's think through this step by step.\n";
}
if (verifyAnswer) {
prompt += '\nAfter reaching your answer, verify it by checking your work.\n';
}
prompt += "\nClearly mark your final answer with 'FINAL ANSWER:'";
const response = await anthropic.messages.create({
model,
max_tokens: 1500,
messages: [{ role: 'user', content: prompt }],
});
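// Content blocks are a union type, so narrow to a text block before reading .text
const firstBlock = response.content[0];
const fullResponse = firstBlock?.type === 'text' ? firstBlock.text : '';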
// Extract answer
const answerMatch = fullResponse.match(/FINAL ANSWER:\s*(.+?)(?:\n|$)/i);
const answer = answerMatch ? answerMatch[1].trim() : '';
// Extract reasoning (everything before FINAL ANSWER)
const reasoning = fullResponse.split(/FINAL ANSWER:/i)[0].trim();
return {
reasoning,
answer,
fullResponse,
};
}
// Usage
const result = await solveWithCoT(
'A car travels 120 miles in 2 hours, then 180 miles in 3 hours. What was the average speed for the entire trip?',
{
steps: [
'Calculate total distance',
'Calculate total time',
'Calculate average speed (total distance / total time)',
'State the answer with units',
],
verifyAnswer: true,
}
);
console.log('Reasoning:', result.reasoning);
console.log('Answer:', result.answer);
When to Use Chain of Thought
Good Use Cases
| Task Type | Example | Why CoT Helps |
|---|---|---|
| Math problems | Word problems, calculations | Breaks down computation steps |
| Logic puzzles | Deduction, ordering | Shows logical connections |
| Code analysis | Bug finding, optimization | Traces execution mentally |
| Decision making | Pros/cons analysis | Considers multiple factors |
| Multi-step tasks | Complex queries | Handles dependencies |
When CoT May Not Help
| Task Type | Example | Why |
|---|---|---|
| Simple lookups | "What is the capital of France?" | No reasoning needed |
| Creative tasks | "Write a poem" | Not logic-based |
| Classification | "Is this spam?" | Often pattern matching |
| Translation | "Translate to Spanish" | Different skill |
Common Pitfalls
Pitfall 1: Overthinking Simple Problems
Problem: Using CoT for trivial questions adds latency and cost
Solution: Only use CoT for problems that benefit from reasoning
// No CoT needed
const simple = 'What is 2 + 2?';
// CoT beneficial
const complex =
'What is the total cost of 3 items at $19.99 each, with 8.5% tax and a $5 discount?';
Pitfall 2: Stopping at the Wrong Step
Problem: The model reaches a correct intermediate step but reports the wrong final answer
Solution: Add an explicit verification step
const withVerification = `
${problem}
Think step by step, then:
1. State your preliminary answer
2. Check your work by verifying each step
3. Confirm or correct your final answer
`;
Pitfall 3: Inconsistent Formatting
Problem: Answers buried in reasoning, hard to extract
Solution: Use clear answer markers
const clearFormat = `
${problem}
Show your step-by-step reasoning, then clearly format your response as:
REASONING:
[Your step-by-step work]
FINAL ANSWER:
[Just the answer]
`;
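Once the response follows these markers, extracting the two parts is a short regular expression. The sketch below assumes the model reproduces the `REASONING:` and `FINAL ANSWER:` labels as requested; `parseMarkedResponse` is a hypothetical helper and returns `null` when either marker is missing.

```typescript
function parseMarkedResponse(text: string): { reasoning: string; answer: string } | null {
  // [\s\S] matches across newlines; the lazy quantifier stops at the first FINAL ANSWER:
  const match = text.match(/REASONING:\s*([\s\S]*?)\s*FINAL ANSWER:\s*([\s\S]*)/i);
  if (!match) return null;
  return { reasoning: match[1].trim(), answer: match[2].trim() };
}

const markedSample = 'REASONING:\n3 items × $2 = $6\n\nFINAL ANSWER:\n$6\n';
console.log(parseMarkedResponse(markedSample));
// { reasoning: '3 items × $2 = $6', answer: '$6' }
```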
Exercises
Exercise 1: Write a CoT Prompt
Create a CoT prompt for this problem: "Three friends split a restaurant bill. The bill is $87.50 and they want to leave a 20% tip. They agree to split everything equally. How much does each person pay?"
Solution
const billSplitPrompt = `
Three friends split a restaurant bill. The bill is $87.50 and they want to
leave a 20% tip. They agree to split everything equally.
How much does each person pay?
Let's solve this step by step:
Step 1: Calculate the tip
- Tip = Bill × Tip percentage
- Tip = $87.50 × 0.20
Step 2: Calculate the total amount (bill + tip)
- Total = Bill + Tip
Step 3: Split equally among 3 people
- Per person = Total ÷ 3
Step 4: State the final answer with appropriate rounding
FINAL ANSWER:
`;
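To check the model's answer to this exercise, the same four steps can be computed directly. The sketch below works in cents to avoid floating-point drift; `splitBill` is a hypothetical helper, and rounding each share to the nearest cent is an assumption (with other inputs the rounded shares may not sum exactly to the total).

```typescript
function splitBill(bill: number, tipRate: number, people: number): number {
  const billCents = Math.round(bill * 100);
  const tipCents = Math.round(billCents * tipRate); // Step 1: tip
  const totalCents = billCents + tipCents; // Step 2: bill + tip
  return Math.round(totalCents / people) / 100; // Step 3: equal split, back to dollars
}

console.log(splitBill(87.5, 0.2, 3)); // 35
```

Here the tip is $17.50, the total is $105.00, and each person pays $35.00.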
Exercise 2: Debug with CoT
Use chain-of-thought to find the bug in this code:
function findMax(arr: number[]): number {
let max = 0;
for (const num of arr) {
if (num > max) max = num;
}
return max;
}
Solution
const debugPrompt = `
Find the bug in this function:
\`\`\`typescript
function findMax(arr: number[]): number {
let max = 0;
for (const num of arr) {
if (num > max) max = num;
}
return max;
}
\`\`\`
Let me trace through this step by step:
Step 1: Understand the function's purpose
- It should return the maximum value in an array
Step 2: Trace with a normal case: [1, 5, 3]
- max starts at 0
- 1 > 0? Yes, max = 1
- 5 > 1? Yes, max = 5
- 3 > 5? No
- Returns 5 ✓
Step 3: Trace with edge cases
- Empty array []: Returns 0, even though an empty array has no maximum
- All negative numbers [-5, -3, -1]:
- max starts at 0
- -5 > 0? No
- -3 > 0? No
- -1 > 0? No
- Returns 0 ✗ (should return -1!)
Step 4: Identify the bug
- Initial value of max = 0 is wrong
- Fails when all numbers are negative
Step 5: Correct solution
\`\`\`typescript
function findMax(arr: number[]): number {
if (arr.length === 0) throw new Error("Array is empty");
let max = arr[0]; // Start with first element
for (const num of arr) {
if (num > max) max = num;
}
return max;
}
\`\`\`
FINAL ANSWER: The bug is initializing max to 0. This fails for arrays with all negative numbers. Initialize max to arr[0] instead.
`;
Exercise 3: Multi-Step Reasoning
Create a CoT prompt for this logic puzzle: "Five houses in a row are painted different colors. The English person lives in the red house. The Spanish person lives to the left of the green house. The green house is immediately to the right of the white house. What color is the house at each position?"
Solution
const logicPuzzlePrompt = `
Solve this logic puzzle step by step:
Five houses in a row are painted different colors.
- The English person lives in the red house.
- The Spanish person lives to the left of the green house.
- The green house is immediately to the right of the white house.
Determine possible arrangements for the house colors.
Step 1: List what we know
- 5 houses, positions 1-5 (left to right)
- 5 different colors (we know: red, green, white - need to identify others)
- Constraints about positions
Step 2: Analyze the "immediately to the right" constraint
- Green is immediately right of white
- So we have a "white-green" pair
- Possible positions: (1,2), (2,3), (3,4), or (4,5)
Step 3: Analyze the "to the left of" constraint
- Spanish person lives LEFT of green house
- This means Spanish is not in the green house
- Spanish could be in positions 1, 2, 3, or 4 (depending on where green is)
Step 4: Consider each possibility for white-green pair
- If white-green is at (4,5): Spanish could be 1,2,3,4
- If white-green is at (3,4): Spanish could be 1,2,3
- And so on...
Step 5: Determine what additional information we need
- We don't have enough constraints to uniquely determine all positions
- Red house position depends on where English person lives
- We need to list possible valid arrangements
FINAL ANSWER: Multiple arrangements are possible. One valid arrangement:
Position 1: (other color) - Spanish
Position 2: White
Position 3: Green
Position 4: Red - English
Position 5: (other color)
The puzzle as stated is underconstrained for a unique solution.
`;
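The claim that the puzzle is underconstrained can be confirmed by brute force. The sketch below enumerates every ordering of five colors and keeps those where green sits immediately to the right of white. The colors 'blue' and 'yellow' are assumed placeholders for the two colors the puzzle never names, and `permutations` is a small helper written for this example.

```typescript
// 'blue' and 'yellow' are assumed placeholders for the two unnamed colors
const colors = ['red', 'green', 'white', 'blue', 'yellow'];

// Generate every ordering of the given items
function permutations<T>(items: T[]): T[][] {
  if (items.length <= 1) return [items];
  const result: T[][] = [];
  items.forEach((item, i) => {
    const rest = [...items.slice(0, i), ...items.slice(i + 1)];
    for (const perm of permutations(rest)) {
      result.push([item, ...perm]);
    }
  });
  return result;
}

// Keep orderings where green is immediately to the right of white
const validColorOrders = permutations(colors).filter(
  (order) => order.indexOf('green') === order.indexOf('white') + 1
);

console.log(validColorOrders.length); // 24
```

All 24 orderings also admit a valid placement for the Spanish person, because the white house is always to the left of green and is not the English person's red house. So the nationality clues do not narrow the colors further.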
Key Takeaways
- Chain of thought improves reasoning by generating intermediate steps
- "Let's think step by step" is often enough to trigger CoT (zero-shot)
- Few-shot CoT with examples provides more consistent results
- Verification steps catch errors in reasoning
- Self-consistency (multiple runs) increases reliability
- Use CoT for complex tasks - math, logic, debugging, multi-step problems
- Skip CoT for simple tasks - it adds latency and cost without benefit
Resources
| Resource | Type | Description |
|---|---|---|
| Chain-of-Thought Paper | Paper | Original research on CoT prompting |
| Self-Consistency Paper | Paper | Multiple reasoning paths improve accuracy |
| Prompting Guide - CoT | Tutorial | Practical CoT examples and techniques |
| OpenAI Best Practices | Documentation | Official guidance on reasoning prompts |
Next Lesson
You have learned how to make AI think step by step. In the next lesson, you will put all these techniques together in a hands-on practice session, optimizing prompts for real-world scenarios.