From Zero to AI

Lesson 4.1: OpenAI (GPT-4, GPT-4o)

Duration: 50 minutes

Learning Objectives

By the end of this lesson, you will be able to:

  • Understand OpenAI's model lineup and their differences
  • Set up and authenticate with the OpenAI API
  • Make chat completion requests using the TypeScript SDK
  • Handle responses and errors properly
  • Understand OpenAI-specific features like function calling and vision

Introduction

OpenAI is the company behind ChatGPT and the GPT family of models. They pioneered the commercial LLM API market and remain one of the most widely used providers. In this lesson, you will learn how to work with OpenAI's API, understand their model offerings, and build your first integration.


OpenAI Model Lineup

OpenAI offers several model families, each optimized for different use cases:

┌─────────────────────────────────────────────────────────────────┐
│                    OpenAI Model Families                         │
├─────────────────────────────────────────────────────────────────┤
│  GPT-4o        │ Flagship multimodal model                      │
│                │ Text, vision, audio input/output               │
│                │ Best for: Complex reasoning, vision tasks      │
├────────────────┼────────────────────────────────────────────────┤
│  GPT-4o-mini   │ Faster, cheaper version of GPT-4o              │
│                │ Good balance of speed and capability           │
│                │ Best for: Most production applications         │
├────────────────┼────────────────────────────────────────────────┤
│  GPT-4 Turbo   │ Previous generation flagship                   │
│                │ 128K context window                            │
│                │ Best for: Long document processing             │
├────────────────┼────────────────────────────────────────────────┤
│  o1 / o1-mini  │ Reasoning-focused models                       │
│                │ Extended "thinking" before responding          │
│                │ Best for: Math, coding, complex logic          │
└─────────────────────────────────────────────────────────────────┘

Model Selection Guidelines

Use Case                    Recommended Model   Why
General chat applications   gpt-4o-mini         Fast, cheap, capable enough
Complex reasoning tasks     gpt-4o              Best overall intelligence
Math and logic problems     o1-mini             Specialized for reasoning
Vision/image analysis       gpt-4o              Native multimodal support
High-volume production      gpt-4o-mini         Best cost/performance ratio
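
The guidelines above can be sketched as a small helper. The use-case labels here are this lesson's own shorthand, not an OpenAI concept:

```typescript
// Map a rough use-case label to a model, following the selection table above.
// The labels are illustrative; adjust the mapping to your own application.
type UseCase = 'chat' | 'reasoning' | 'math' | 'vision' | 'high-volume';

function pickModel(useCase: UseCase): string {
  switch (useCase) {
    case 'reasoning':
    case 'vision':
      return 'gpt-4o';
    case 'math':
      return 'o1-mini';
    case 'chat':
    case 'high-volume':
    default:
      return 'gpt-4o-mini';
  }
}
```

Centralizing this choice in one function also makes it easy to swap models later without touching every call site.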

Setting Up the OpenAI SDK

Installation

npm install openai

Authentication

OpenAI uses API keys for authentication. Get your key from the OpenAI Platform.

import OpenAI from 'openai';

// The SDK automatically reads OPENAI_API_KEY from environment
const openai = new OpenAI();

// Or pass the key explicitly (equivalent; never hardcode the literal key)
const openaiWithKey = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

Environment setup:

# .env file
OPENAI_API_KEY=sk-proj-your-api-key-here
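
A missing .env otherwise surfaces later as a confusing 401. One fail-fast pattern (this helper is not part of the SDK) is to verify the variable at startup, before constructing the client:

```typescript
// Fail fast if a required environment variable is missing,
// so a misconfigured deployment errors at startup, not mid-request.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// const apiKey = requireEnv('OPENAI_API_KEY'); // throws immediately if absent
```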

Making Your First Request

The core API for text generation is chat.completions.create:

import OpenAI from 'openai';

const openai = new OpenAI();

async function chat(userMessage: string): Promise<string> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: userMessage }],
  });

  return response.choices[0].message.content || '';
}

// Usage
const answer = await chat('What is TypeScript?');
console.log(answer);

Understanding the Response Structure

interface ChatCompletion {
  id: string; // Unique identifier
  object: 'chat.completion';
  created: number; // Unix timestamp
  model: string; // Model used
  choices: Array<{
    index: number;
    message: {
      role: 'assistant';
      content: string | null;
    };
    finish_reason: 'stop' | 'length' | 'content_filter' | 'tool_calls';
  }>;
  usage: {
    prompt_tokens: number; // Input tokens
    completion_tokens: number; // Output tokens
    total_tokens: number;
  };
}
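
As one illustration of reading this structure, here is a small helper (not part of the SDK) that pulls out the text and flags when the reply was cut off by the token limit:

```typescript
// Minimal slice of the response shape this helper actually reads.
interface CompletionLike {
  choices: Array<{
    message: { content: string | null };
    finish_reason: string;
  }>;
}

function extractText(completion: CompletionLike): { text: string; truncated: boolean } {
  const choice = completion.choices[0];
  return {
    text: choice?.message.content ?? '',
    // 'length' means generation stopped because the token limit was reached
    truncated: choice?.finish_reason === 'length',
  };
}
```

Checking finish_reason is easy to forget; a silently truncated answer often looks like a model failure when it is really a max_tokens setting.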

Chat Completions with System Prompts

System prompts define the AI's behavior throughout a conversation:

import OpenAI from 'openai';

const openai = new OpenAI();

async function codeReviewer(code: string): Promise<string> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      {
        role: 'system',
        content: `You are an expert code reviewer specializing in TypeScript.
Review code for:
- Type safety issues
- Potential bugs
- Performance problems
- Best practice violations

Be concise and actionable in your feedback.`,
      },
      {
        role: 'user',
        content: `Review this code:\n\n\`\`\`typescript\n${code}\n\`\`\``,
      },
    ],
  });

  return response.choices[0].message.content || '';
}

Managing Conversations

For multi-turn conversations, you need to maintain message history:

import OpenAI from 'openai';

type Message = OpenAI.Chat.ChatCompletionMessageParam;

class Conversation {
  private openai: OpenAI;
  private messages: Message[];
  private model: string;

  constructor(systemPrompt: string, model: string = 'gpt-4o-mini') {
    this.openai = new OpenAI();
    this.model = model;
    this.messages = [{ role: 'system', content: systemPrompt }];
  }

  async send(userMessage: string): Promise<string> {
    this.messages.push({ role: 'user', content: userMessage });

    const response = await this.openai.chat.completions.create({
      model: this.model,
      messages: this.messages,
    });

    const assistantMessage = response.choices[0].message.content || '';
    this.messages.push({ role: 'assistant', content: assistantMessage });

    return assistantMessage;
  }

  getHistory(): Message[] {
    return [...this.messages];
  }

  clearHistory(): void {
    this.messages = [this.messages[0]]; // Keep system prompt
  }
}

// Usage
const conversation = new Conversation(
  'You are a helpful TypeScript tutor. Explain concepts clearly with examples.'
);

const response1 = await conversation.send('What are generics?');
console.log(response1);

const response2 = await conversation.send('Can you show me an example?');
console.log(response2);
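
Note that the history grows with every turn, and every past message is re-sent (and billed) on each request. One rough cap, assuming a simple "keep the system prompt plus the last N messages" policy, might look like this:

```typescript
// Keep the system prompt plus the most recent `maxTurnMessages` messages.
// A crude cap on context size; real apps might trim by token count instead.
interface SimpleMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

function trimHistory(messages: SimpleMessage[], maxTurnMessages: number): SimpleMessage[] {
  const [system, ...rest] = messages;
  const tail = maxTurnMessages > 0 ? rest.slice(-maxTurnMessages) : [];
  return [system, ...tail];
}
```

Dropping old turns loses context the model could otherwise use, so the right N is an application-specific trade-off between cost and continuity.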

Configuring Model Parameters

Fine-tune the model's behavior with these parameters:

const response = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Write a haiku about coding' }],

  // Temperature: 0-2, controls randomness
  // Lower = more deterministic, Higher = more creative
  temperature: 0.7,

  // Max tokens: limit response length
  max_tokens: 500,

  // Top P: alternative to temperature (nucleus sampling)
  // Use one or the other, not both
  top_p: 1,

  // Frequency penalty: -2 to 2
  // Reduces repetition of frequent tokens
  frequency_penalty: 0,

  // Presence penalty: -2 to 2
  // Encourages talking about new topics
  presence_penalty: 0,

  // Stop sequences: stop generation at these strings
  stop: ['\n\n', 'END'],

  // Number of completions to generate
  n: 1,
});

When to Adjust Parameters

Parameter           When to Increase                  When to Decrease
temperature         Creative writing, brainstorming   Code generation, factual Q&A
max_tokens          Long-form content                 Short answers, cost control
frequency_penalty   Avoiding repetition               Technical content
presence_penalty    Exploring diverse topics          Focused discussions
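
One way to encode these trade-offs is as reusable presets that get spread into each request. The values here are illustrative starting points, not official recommendations:

```typescript
// Illustrative parameter presets; tune per application.
const PRESETS = {
  // Factual Q&A, code generation: prioritize consistency
  precise: { temperature: 0.2, top_p: 1, frequency_penalty: 0, presence_penalty: 0 },
  // Brainstorming, creative writing: prioritize variety
  creative: { temperature: 1.0, top_p: 1, frequency_penalty: 0.3, presence_penalty: 0.5 },
} as const;

// Usage: spread a preset into the request options, e.g.
// await openai.chat.completions.create({ model: 'gpt-4o-mini', messages, ...PRESETS.precise });
```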

Vision Capabilities

GPT-4o can analyze images alongside text:

import OpenAI from 'openai';

const openai = new OpenAI();

async function analyzeImage(imageUrl: string, question: string): Promise<string> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      {
        role: 'user',
        content: [
          {
            type: 'text',
            text: question,
          },
          {
            type: 'image_url',
            image_url: {
              url: imageUrl,
              detail: 'auto', // "low", "high", or "auto"
            },
          },
        ],
      },
    ],
    max_tokens: 500,
  });

  return response.choices[0].message.content || '';
}

// Usage
const analysis = await analyzeImage(
  'https://example.com/screenshot.png',
  'Describe what you see in this UI screenshot. Identify any usability issues.'
);

Using Base64 Images

import fs from 'fs';

async function analyzeLocalImage(filePath: string): Promise<string> {
  const imageBuffer = fs.readFileSync(filePath);
  const base64Image = imageBuffer.toString('base64');
  const mimeType = 'image/png'; // or image/jpeg, image/gif, image/webp

  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'What is in this image?' },
          {
            type: 'image_url',
            image_url: {
              url: `data:${mimeType};base64,${base64Image}`,
            },
          },
        ],
      },
    ],
  });

  return response.choices[0].message.content || '';
}
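
The mimeType above is hardcoded; a small helper can derive it from the file extension. This sketch assumes the four image formats listed in the comment are what the vision endpoint accepts, which is worth re-checking against the current docs:

```typescript
// Map a file extension to a MIME type for the data URL.
function mimeTypeFor(filePath: string): string {
  const ext = filePath.split('.').pop()?.toLowerCase();
  switch (ext) {
    case 'png':
      return 'image/png';
    case 'jpg':
    case 'jpeg':
      return 'image/jpeg';
    case 'gif':
      return 'image/gif';
    case 'webp':
      return 'image/webp';
    default:
      throw new Error(`Unsupported image extension: ${ext}`);
  }
}
```

Throwing on unknown extensions is deliberate: sending a mislabeled data URL tends to produce an opaque API error instead.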

Error Handling

The OpenAI SDK throws specific errors you should handle:

import OpenAI from 'openai';

const openai = new OpenAI();

async function safeChat(message: string): Promise<string> {
  try {
    const response = await openai.chat.completions.create({
      model: 'gpt-4o-mini',
      messages: [{ role: 'user', content: message }],
    });
    return response.choices[0].message.content || '';
  } catch (error) {
    // Order matters: RateLimitError and APIConnectionError are subclasses of
    // APIError, so the specific classes must be checked first or the generic
    // APIError branch would swallow them.
    if (error instanceof OpenAI.RateLimitError) {
      console.error('Rate limit hit - implement backoff');
    } else if (error instanceof OpenAI.APIConnectionError) {
      console.error('Network error - check your connection');
    } else if (error instanceof OpenAI.APIError) {
      switch (error.status) {
        case 400:
          console.error('Bad request:', error.message);
          break;
        case 401:
          console.error('Invalid API key');
          break;
        case 403:
          console.error('Access forbidden - check your permissions');
          break;
        // 429 is handled above by the RateLimitError branch
        case 500:
          console.error('OpenAI server error - try again later');
          break;
        default:
          console.error(`API error ${error.status}:`, error.message);
      }
    } else {
      throw error;
    }
    return '';
  }
}

Implementing Retry Logic

async function chatWithRetry(message: string, maxRetries: number = 3): Promise<string> {
  let lastError: Error | null = null;

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const response = await openai.chat.completions.create({
        model: 'gpt-4o-mini',
        messages: [{ role: 'user', content: message }],
      });
      return response.choices[0].message.content || '';
    } catch (error) {
      lastError = error as Error;

      if (error instanceof OpenAI.RateLimitError) {
        // Exponential backoff: 1s, 2s, 4s
        const delay = Math.pow(2, attempt - 1) * 1000;
        console.log(`Rate limited. Retrying in ${delay}ms...`);
        await new Promise((resolve) => setTimeout(resolve, delay));
      } else if (error instanceof OpenAI.APIError && (error.status ?? 0) >= 500) {
        // Server error, retry with backoff
        const delay = attempt * 1000;
        console.log(`Server error. Retrying in ${delay}ms...`);
        await new Promise((resolve) => setTimeout(resolve, delay));
      } else {
        // Don't retry client errors
        throw error;
      }
    }
  }

  throw lastError;
}
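
Production code often adds random jitter so that many clients rate-limited at the same moment do not all retry in lockstep. A sketch of just the delay calculation ("full jitter" backoff, a common pattern rather than anything OpenAI-specific):

```typescript
// Exponential backoff with full jitter: a random delay in [0, base * 2^(attempt-1)),
// capped at maxMs so late attempts don't wait unboundedly long.
function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 30_000): number {
  const cap = Math.min(maxMs, baseMs * Math.pow(2, attempt - 1));
  return Math.random() * cap;
}
```

Swapping this in for the fixed delays in chatWithRetry is a one-line change to each `delay` assignment.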

Understanding Pricing

OpenAI charges based on tokens processed:

┌─────────────────────────────────────────────────────────────────┐
│              OpenAI Pricing (as of 2024)                        │
├─────────────────┬──────────────────┬────────────────────────────┤
│  Model          │  Input (per 1M)  │  Output (per 1M)           │
├─────────────────┼──────────────────┼────────────────────────────┤
│  gpt-4o         │  $2.50           │  $10.00                    │
│  gpt-4o-mini    │  $0.15           │  $0.60                     │
│  gpt-4-turbo    │  $10.00          │  $30.00                    │
│  o1             │  $15.00          │  $60.00                    │
│  o1-mini        │  $3.00           │  $12.00                    │
└─────────────────┴──────────────────┴────────────────────────────┘

Tracking Usage

async function chatWithCostTracking(message: string): Promise<{
  response: string;
  cost: number;
  tokens: { input: number; output: number };
}> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: message }],
  });

  const usage = response.usage!;

  // gpt-4o-mini pricing per 1M tokens (as of 2024)
  const inputCostPer1M = 0.15;
  const outputCostPer1M = 0.6;

  const inputCost = (usage.prompt_tokens / 1_000_000) * inputCostPer1M;
  const outputCost = (usage.completion_tokens / 1_000_000) * outputCostPer1M;

  return {
    response: response.choices[0].message.content || '',
    cost: inputCost + outputCost,
    tokens: {
      input: usage.prompt_tokens,
      output: usage.completion_tokens,
    },
  };
}

// Usage
const result = await chatWithCostTracking('Explain async/await in TypeScript');
console.log(`Response: ${result.response}`);
console.log(`Tokens: ${result.tokens.input} in, ${result.tokens.output} out`);
console.log(`Cost: $${result.cost.toFixed(6)}`);

Structured Outputs with JSON Mode

Request JSON output for easier parsing:

interface UserProfile {
  name: string;
  age: number;
  interests: string[];
  summary: string;
}

async function extractProfile(text: string): Promise<UserProfile> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      {
        role: 'system',
        content: 'Extract user information and return as JSON.',
      },
      {
        role: 'user',
        content: `Extract profile from: "${text}"
        
Return JSON with: name (string), age (number), interests (string array), summary (string)`,
      },
    ],
    response_format: { type: 'json_object' },
  });

  const content = response.choices[0].message.content || '{}';
  return JSON.parse(content) as UserProfile;
}

// Usage
const profile = await extractProfile(
  "Hi, I'm Sarah, 28 years old. I love hiking, photography, and cooking Italian food."
);
console.log(profile);
// { name: "Sarah", age: 28, interests: ["hiking", "photography", "cooking Italian food"], ... }
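
JSON mode guarantees syntactically valid JSON, not that the fields match your interface, so the cast in extractProfile is a leap of faith. A minimal runtime check might look like the following (hand-rolled here; schema libraries like zod are a common alternative):

```typescript
interface UserProfile {
  name: string;
  age: number;
  interests: string[];
  summary: string;
}

// Narrow an unknown parsed value to UserProfile, throwing on a shape mismatch.
function parseUserProfile(json: string): UserProfile {
  const data = JSON.parse(json) as Record<string, unknown>;
  if (
    typeof data.name !== 'string' ||
    typeof data.age !== 'number' ||
    !Array.isArray(data.interests) ||
    !data.interests.every((i) => typeof i === 'string') ||
    typeof data.summary !== 'string'
  ) {
    throw new Error('Model returned JSON that does not match UserProfile');
  }
  return data as unknown as UserProfile;
}
```

Failing loudly here is usually better than letting a malformed profile propagate into the rest of the application.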

Exercises

Exercise 1: Basic Chat Application

Create a function that acts as a TypeScript expert assistant:

// Your implementation here
async function typescriptExpert(question: string): Promise<string> {
  // TODO: Implement using OpenAI API
  // - Use gpt-4o-mini
  // - Add a system prompt that defines expertise in TypeScript
  // - Return helpful, accurate responses
}
Solution
import OpenAI from 'openai';

const openai = new OpenAI();

async function typescriptExpert(question: string): Promise<string> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      {
        role: 'system',
        content: `You are an expert TypeScript developer with deep knowledge of:
- TypeScript type system (generics, utility types, conditional types)
- Best practices and design patterns
- Common pitfalls and how to avoid them
- Integration with popular frameworks

Provide clear, accurate answers with code examples when helpful.
Keep explanations concise but thorough.`,
      },
      {
        role: 'user',
        content: question,
      },
    ],
    temperature: 0.3, // Lower for more consistent technical answers
  });

  return response.choices[0].message.content || '';
}

// Test
const answer = await typescriptExpert(
  'How do I type a function that accepts any object with an id property?'
);
console.log(answer);

Exercise 2: Token Counter

Create a utility that estimates the cost of a conversation:

// Your implementation here
interface CostEstimate {
  inputTokens: number;
  outputTokens: number;
  totalCost: number;
  model: string;
}

async function estimateConversationCost(
  messages: Array<{ role: string; content: string }>,
  model: string
): Promise<CostEstimate> {
  // TODO: Make a request and calculate actual token usage and cost
}
Solution
import OpenAI from 'openai';

const openai = new OpenAI();

interface CostEstimate {
  inputTokens: number;
  outputTokens: number;
  totalCost: number;
  model: string;
}

const PRICING: Record<string, { input: number; output: number }> = {
  'gpt-4o': { input: 2.5, output: 10 },
  'gpt-4o-mini': { input: 0.15, output: 0.6 },
  'gpt-4-turbo': { input: 10, output: 30 },
};

async function estimateConversationCost(
  messages: Array<{ role: 'user' | 'system' | 'assistant'; content: string }>,
  model: string = 'gpt-4o-mini'
): Promise<CostEstimate> {
  const response = await openai.chat.completions.create({
    model,
    messages,
    max_tokens: 100, // Limit for estimation
  });

  const usage = response.usage!;
  const pricing = PRICING[model] || PRICING['gpt-4o-mini'];

  const inputCost = (usage.prompt_tokens / 1_000_000) * pricing.input;
  const outputCost = (usage.completion_tokens / 1_000_000) * pricing.output;

  return {
    inputTokens: usage.prompt_tokens,
    outputTokens: usage.completion_tokens,
    totalCost: inputCost + outputCost,
    model,
  };
}

// Test
const estimate = await estimateConversationCost(
  [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is TypeScript?' },
  ],
  'gpt-4o-mini'
);

console.log(`Input tokens: ${estimate.inputTokens}`);
console.log(`Output tokens: ${estimate.outputTokens}`);
console.log(`Total cost: $${estimate.totalCost.toFixed(6)}`);

Exercise 3: Image Analyzer

Create a function that analyzes code screenshots:

// Your implementation here
async function analyzeCodeScreenshot(imageUrl: string): Promise<{
  language: string;
  description: string;
  issues: string[];
  suggestions: string[];
}> {
  // TODO: Use GPT-4o vision to analyze a code screenshot
}
Solution
import OpenAI from 'openai';

const openai = new OpenAI();

interface CodeAnalysis {
  language: string;
  description: string;
  issues: string[];
  suggestions: string[];
}

async function analyzeCodeScreenshot(imageUrl: string): Promise<CodeAnalysis> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      {
        role: 'user',
        content: [
          {
            type: 'text',
            text: `Analyze this code screenshot and return a JSON object with:
- language: the programming language
- description: what the code does (1-2 sentences)
- issues: array of potential bugs or problems
- suggestions: array of improvement suggestions

Return ONLY valid JSON, no markdown or explanation.`,
          },
          {
            type: 'image_url',
            image_url: { url: imageUrl },
          },
        ],
      },
    ],
    response_format: { type: 'json_object' },
    max_tokens: 500,
  });

  const content = response.choices[0].message.content || '{}';
  return JSON.parse(content) as CodeAnalysis;
}

// Test with a code screenshot URL
const analysis = await analyzeCodeScreenshot('https://example.com/code-screenshot.png');
console.log(analysis);

Key Takeaways

  1. Model Selection: Use gpt-4o-mini for most tasks, gpt-4o for complex reasoning or vision
  2. Authentication: Store API keys in environment variables, never in code
  3. Conversation Management: Maintain message history for multi-turn conversations
  4. Parameters: Adjust temperature for creativity vs. consistency trade-off
  5. Error Handling: Implement retry logic for rate limits and server errors
  6. Cost Awareness: Track token usage, especially with high-volume applications
  7. JSON Mode: Use response_format: { type: "json_object" } for structured outputs

Resources

Resource                Type            Description
OpenAI API Reference    Documentation   Complete API reference
OpenAI Cookbook         Tutorial        Practical examples and recipes
OpenAI Pricing          Reference       Current pricing information
OpenAI TypeScript SDK   Repository      Official SDK source code

Next Lesson

You have learned how to work with OpenAI's API. In the next lesson, you will explore Anthropic's Claude models, which offer a different approach to AI with a focus on safety and helpfulness.

Continue to Lesson 4.2: Anthropic (Claude)