These examples show concrete before-and-after code for each of the four principles. Each one captures a pattern that appears repeatedly in real LLM-assisted coding sessions — use them to recognize when a model is about to go off the rails, and to calibrate what “correct” behavior looks like before you get there.
1. Think before coding
When a request is ambiguous, LLMs tend to pick an interpretation silently and implement it. The result looks complete but is built on assumptions the user never validated.
Hidden assumptions: exporting user data
Request: “Add a feature to export user data”
**Wrong approach — silent assumptions**

```python
import csv
import json

def export_users(format='json'):
    """Export all users to a file."""
    users = User.query.all()  # User is the app's ORM model
    if format == 'json':
        with open('users.json', 'w') as f:
            json.dump([u.to_dict() for u in users], f)
    elif format == 'csv':
        with open('users.csv', 'w') as f:
            writer = csv.DictWriter(f, fieldnames=['id', 'email', 'name'])
            writer.writeheader()
            writer.writerows([u.to_dict() for u in users])
    return f"Exported {len(users)} users"
```
This code assumes: export ALL users (ignoring privacy and pagination), write to a local file, include specific fields, use hardcoded fieldnames — none of which were specified.
**Correct approach — surface assumptions first**

Before implementing, I need to clarify:

1. **Scope**: Export all users or a filtered subset? (privacy implications)
2. **Format**: You mentioned "export" - did you mean:
   - Download a file in the browser?
   - Background job with email notification?
   - API endpoint returning data?
3. **Fields**: Which user fields? (some might be sensitive)
4. **Volume**: How many users typically? (affects the approach)

Simplest approach: add an API endpoint that returns paginated JSON. File-based exports would need more information.

What's your preference?
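If the user picks the paginated-JSON option, a minimal sketch might look like the following. The function name, `SAFE_FIELDS` whitelist, and page size are all illustrative assumptions, not part of the original request — the point is that scope and fields are now explicit choices rather than silent defaults.

```python
# Hypothetical sketch of the "paginated JSON" option.
SAFE_FIELDS = ('id', 'email', 'name')  # assumed whitelist; excludes sensitive fields

def paginate_users(users, page=1, per_page=50):
    """Return one page of user dicts, restricted to whitelisted fields."""
    start = (page - 1) * per_page
    items = [
        {k: u[k] for k in SAFE_FIELDS if k in u}
        for u in users[start:start + per_page]
    ]
    return {'items': items, 'page': page, 'per_page': per_page, 'total': len(users)}
```

Note that sensitive keys such as password hashes never reach the output, because fields are opted in rather than opted out.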
2. Simplicity first
LLMs frequently reach for abstractions and design patterns before any complexity is warranted. The result is code that follows best practices but is harder to read, harder to test, and solves a problem that doesn’t exist yet.
Over-abstraction: discount calculator
Request: “Add a function to calculate discount”
**Wrong approach — overengineered for a single use case**

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Protocol, Union
from dataclasses import dataclass

class DiscountStrategy(ABC):
    @abstractmethod
    def calculate(self, amount: float) -> float:
        pass

class PercentageDiscount(DiscountStrategy):
    def __init__(self, percentage: float):
        self.percentage = percentage

    def calculate(self, amount: float) -> float:
        return amount * (self.percentage / 100)

class FixedDiscount(DiscountStrategy):
    def __init__(self, fixed_amount: float):
        self.fixed_amount = fixed_amount

    def calculate(self, amount: float) -> float:
        return min(self.fixed_amount, amount)

@dataclass
class DiscountConfig:
    strategy: DiscountStrategy
    min_purchase: float = 0.0
    max_discount: float = float('inf')

class DiscountCalculator:
    def __init__(self, config: DiscountConfig):
        self.config = config

    def apply_discount(self, amount: float) -> float:
        if amount < self.config.min_purchase:
            return 0
        discount = self.config.strategy.calculate(amount)
        return min(discount, self.config.max_discount)

# Usage requires 30+ lines of setup for a simple calculation
```
**Correct approach — minimum code that solves the problem**

```python
def calculate_discount(amount: float, percent: float) -> float:
    """Calculate discount amount. percent should be 0-100."""
    return amount * (percent / 100)

# Usage
discount = calculate_discount(100.0, 10.0)  # $10 off
```
Add complexity only when you actually need multiple discount types. If that requirement comes later, refactor then.
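Even when that later requirement arrives, the refactor can often stay small. As a hypothetical sketch, suppose a fixed-amount discount is eventually requested — a keyword argument still beats the strategy-pattern machinery until a third or fourth variant forces a real redesign:

```python
# Hypothetical later refactor: a second discount type arrives,
# and a plain function still suffices (no strategy classes yet).
def calculate_discount(amount: float, percent: float = 0.0,
                       fixed: float = 0.0) -> float:
    """Percentage discount plus an optional fixed amount, capped at amount."""
    discount = amount * (percent / 100) + fixed
    return min(discount, amount)
```

The cap mirrors the `min(self.fixed_amount, amount)` guard from the overengineered version, in one line instead of a class.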
3. Surgical changes
When fixing a bug, LLMs often “improve” unrelated code in the same function — reformatting, renaming, adding type hints, strengthening validation. Each individual change seems reasonable, but together they produce a diff that is hard to review and risky to ship.
Drive-by refactoring: fixing an email validation bug
Request: “Fix the bug where empty emails crash the validator”
**Wrong approach — changes far beyond the bug fix**

```diff
 def validate_user(user_data):
-    # Check email format
-    if not user_data.get('email'):
+    """Validate user data."""
+    email = user_data.get('email', '').strip()
+
+    # Validate email
+    if not email:
         raise ValueError("Email required")
-
-    # Basic email validation
-    if '@' not in user_data['email']:
+    if '@' not in email or '.' not in email.split('@')[1]:
         raise ValueError("Invalid email")

-    # Check username
-    if not user_data.get('username'):
+    # Validate username
+    username = user_data.get('username', '').strip()
+    if not username:
         raise ValueError("Username required")
+    if len(username) < 3:
+        raise ValueError("Username too short")
+    if not username.isalnum():
+        raise ValueError("Username must be alphanumeric")
     return True
```
Problems: strengthened email validation beyond the reported bug, added username length and format rules nobody asked for, changed comments, added a docstring.
**Correct approach — only the lines that fix empty email handling**

```diff
 def validate_user(user_data):
     # Check email format
-    if not user_data.get('email'):
+    email = user_data.get('email', '')
+    if not email or not email.strip():
         raise ValueError("Email required")

     # Basic email validation
-    if '@' not in user_data['email']:
+    if '@' not in email:
         raise ValueError("Invalid email")

     # Check username
     if not user_data.get('username'):
         raise ValueError("Username required")

     return True
```
Every changed line traces directly to fixing the empty email crash.
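Applied, the surgical diff leaves the function looking like this — same comments, same username logic, only the email handling changed. A quick runnable sketch of the result:

```python
def validate_user(user_data):
    # Check email format
    email = user_data.get('email', '')
    if not email or not email.strip():
        raise ValueError("Email required")

    # Basic email validation
    if '@' not in email:
        raise ValueError("Invalid email")

    # Check username
    if not user_data.get('username'):
        raise ValueError("Username required")

    return True
```

Empty and whitespace-only emails now raise a clean `ValueError("Email required")` instead of slipping past the first check, and valid input still returns `True`.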
4. Goal-driven execution
Vague instructions produce vague plans. When a task has no explicit success criteria, the model improvises — and you have no way to verify the result is correct until something breaks in production.
Multi-step with verification: adding rate limiting
Request: “Add rate limiting to the API”
**Wrong approach — all at once, no verification**

```python
# Implements full rate limiting with Redis, multiple strategies,
# a configuration system, and monitoring in one 300-line commit.
# No clear verification steps.
```
**Correct approach — incremental steps, each independently verifiable**

Plan for rate limiting:

1. Add basic in-memory rate limiting (single endpoint)
   Verify:
   - Test: 100 requests → first 10 succeed, rest get 429
   - Manual: curl the endpoint 11 times, see the rate limit error
2. Extract to middleware (apply to all endpoints)
   Verify:
   - Test: rate limits apply to /users and /posts
   - Existing endpoint tests still pass
3. Add Redis backend (for multi-server deployments)
   Verify:
   - Test: rate limit persists across app restarts
   - Test: two app instances share a rate limit counter
4. Add configuration (rates per endpoint)
   Verify:
   - Test: /search allows 10/min, /users allows 100/min
   - Config file parsed correctly

Each step is independently verifiable and deployable.

Start with step 1?
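Step 1 might be sketched as below. The class name, limits, and the injectable `now` parameter are illustrative assumptions, but note how the design makes the step's verification criterion ("first 10 succeed, rest get 429") directly testable without a running server:

```python
import time
from collections import deque

class InMemoryRateLimiter:
    """Step-1 sketch: a sliding-window limiter for a single process.
    Deliberately not multi-server safe; that is step 3's job."""

    def __init__(self, max_requests=10, window_seconds=60.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self._timestamps = deque()

    def allow(self, now=None):
        """Return True if the request is allowed; False means respond 429."""
        now = time.monotonic() if now is None else now
        # Evict timestamps that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False
        self._timestamps.append(now)
        return True
```

Passing `now` explicitly in tests avoids sleeping through real time windows; production callers just call `allow()`.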
Anti-patterns summary
| Principle | Anti-pattern | Fix |
|---|---|---|
| Think before coding | Silently assumes file format, fields, scope | List assumptions explicitly, ask for clarification |
| Simplicity first | Strategy pattern for a single discount calculation | One function until complexity is actually needed |
| Surgical changes | Reformats quotes, adds type hints while fixing a bug | Only change lines that fix the reported issue |
| Goal-driven execution | "I'll review and improve the code" | "Write test for bug X → make it pass → verify no regressions" |
Key insight
The “overcomplicated” examples are not obviously wrong — they follow design patterns and best practices. The problem is timing: they add complexity before it is needed, which makes code harder to understand, introduces more bugs, takes longer to implement, and is harder to test.
Good code is code that solves today’s problem simply, not tomorrow’s problem prematurely.
The simple versions are easier to understand, faster to implement, easier to test, and can be refactored when complexity is actually needed.