> ## Documentation Index
> Fetch the complete documentation index at: https://docs.karpathy.phronesisagent.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Think Before Coding: surface LLM assumptions first

> Think Before Coding stops Claude from silently assuming ambiguous requirements. It surfaces tradeoffs, asks clarifying questions, and pushes back when needed.

When you give an LLM an ambiguous request, it doesn't pause to ask — it picks an interpretation and runs with it. The result is code that confidently solves the wrong problem. Think Before Coding is the principle that forces explicit reasoning before any implementation begins: surface assumptions, present alternatives, push back when the approach is wrong, and stop when something is genuinely unclear.

## The four rules

<Steps>
  <Step title="State assumptions explicitly">
    If the request is ambiguous, name what you're assuming before writing a single line. If uncertain, ask rather than guess.
  </Step>

  <Step title="Present multiple interpretations">
    When more than one valid reading exists, list them with tradeoffs. Don't pick silently and hope for the best.
  </Step>

  <Step title="Push back when warranted">
    If a simpler approach exists, say so. If the requested approach will cause problems, name them.
  </Step>

  <Step title="Stop when confused">
    Name what's unclear and ask for clarification. Proceeding while confused produces expensive mistakes.
  </Step>
</Steps>

## Hidden assumptions example

**Request:** "Add a feature to export user data"

<Warning>
  **What LLMs do wrong — silently assuming scope, format, and fields**

  ```python theme={null}
  def export_users(format='json'):
      """Export all users to a file."""
      users = User.query.all()

      if format == 'json':
          with open('users.json', 'w') as f:
              json.dump([u.to_dict() for u in users], f)
      elif format == 'csv':
          with open('users.csv', 'w') as f:
              writer = csv.DictWriter(f, fieldnames=['id', 'email', 'name'])
              writer.writeheader()
              writer.writerows([u.to_dict() for u in users])

      return f"Exported {len(users)} users"
  ```

  This code assumed: export all users (what about pagination, privacy?), a local file destination, specific fields to include, and CSV fieldnames without checking the actual data structure.
</Warning>

<Check>
  **The correct approach — surface every assumption before writing code**

  ```text theme={null}
  Before implementing, I need to clarify:

  1. Scope: Export all users or a filtered subset? (privacy implications)
  2. Format: You mentioned "export" — did you mean:
     - Download file in browser?
     - Background job with email notification?
     - API endpoint returning data?
  3. Fields: Which user fields? (some might be sensitive)
  4. Volume: How many users typically? (affects approach)

  Simplest approach: Add an API endpoint that returns paginated JSON.
  Would need more info for file-based exports.

  What's your preference?
  ```
</Check>

## Multiple interpretations example

**Request:** "Make the search faster"

<Warning>
  **What LLMs do wrong — pick one meaning silently and over-optimize**

  ```python theme={null}
  # Adds caching, database indexes, and async processing
  # without asking which "faster" matters
  @lru_cache(maxsize=1000)
  async def search(query: str) -> list[Result]:
      # ... 200 lines of optimization code
  ```
</Warning>

<Check>
  **The correct approach — present each interpretation with effort estimates**

  ```text theme={null}
  "Make the search faster" could mean:

  1. Faster response time (< 100ms instead of 500ms)
     - Add database indexes
     - Cache frequent queries
     - Estimated effort: 2 hours

  2. Handle more concurrent searches (throughput)
     - Add async/await
     - Connection pooling
     - Estimated effort: 4 hours

  3. Faster perceived speed (UX)
     - Show partial results immediately
     - Progressive loading
     - Estimated effort: 3 hours

  Current search takes ~500ms for typical queries.
  Which aspect matters most for your use case?
  ```
</Check>

## When this principle is working

You'll know Think Before Coding is active when you see:

* Clarifying questions arriving **before** implementation, not after a wrong-direction commit
* Multiple approaches presented with tradeoffs rather than a single confident solution
* Explicit "I'm assuming X — is that right?" notes in responses
* The LLM declining to proceed when the request is genuinely ambiguous

<Note>
  These guidelines bias toward caution over speed. For trivial tasks — obvious one-liners, simple typo fixes — use judgment. Not every change requires the full clarification ritual.
</Note>
