Documentation Index
Fetch the complete documentation index at: https://docs.karpathy.phronesisagent.com/llms.txt
Use this file to discover all available pages before exploring further.
When you ask an LLM to fix a bug, you expect a focused diff. What you often get instead is a sweeping “improvement” that touches comments, reformats whitespace, adds type hints, upgrades validation logic, and restructures boolean returns — none of which you asked for. Surgical Changes enforces a simple discipline: every changed line must trace directly to the user’s request, nothing more.
Two rules
Don’t touch what you shouldn’t
When editing existing code:
- Don’t “improve” adjacent code, comments, or formatting
- Don’t refactor things that aren’t broken
- Match existing style, even if you’d do it differently
- If you notice unrelated dead code, mention it — don’t delete it
Clean up your own orphans
When your changes create unused artifacts:
- Remove imports, variables, and functions that your changes made unused
- Don’t remove pre-existing dead code unless explicitly asked
The test
Every changed line should trace directly to the user’s request. If you can’t answer “why did this line change?” with a direct reference to the task, the change shouldn’t be there.
Drive-by refactoring example
Request: “Fix the bug where empty emails crash the validator”
What LLMs do wrong — “fixing” adjacent username logic, adding a docstring, improving email validation beyond the bug def validate_user(user_data):
- # Check email format
- if not user_data.get('email'):
+ """Validate user data."""
+ email = user_data.get('email', '').strip()
+
+ # Validate email
+ if not email:
raise ValueError("Email required")
-
- # Basic email validation
- if '@' not in user_data['email']:
+ if '@' not in email or '.' not in email.split('@')[1]:
raise ValueError("Invalid email")
- # Check username
- if not user_data.get('username'):
+ # Validate username
+ username = user_data.get('username', '').strip()
+ if not username:
raise ValueError("Username required")
+ if len(username) < 3:
+ raise ValueError("Username too short")
+ if not username.isalnum():
+ raise ValueError("Username must be alphanumeric")
return True
Unasked-for changes: improved email validation, added username length and alphanumeric checks, changed comments, added a docstring.
The correct approach — only the lines that fix the empty email crash def validate_user(user_data):
# Check email format
- if not user_data.get('email'):
+ email = user_data.get('email', '')
+ if not email or not email.strip():
raise ValueError("Email required")
# Basic email validation
- if '@' not in user_data['email']:
+ if '@' not in email:
raise ValueError("Invalid email")
# Check username
if not user_data.get('username'):
raise ValueError("Username required")
return True
Only changed: the specific lines that fix empty email handling.
Style drift example
Request: “Add logging to the upload function”
What LLMs do wrong — changing quote style, adding type hints, adding a docstring, reformatting whitespace- def upload_file(file_path, destination):
+ def upload_file(file_path: str, destination: str) -> bool:
+ """Upload file to destination with logging."""
+ logger.info(f"Uploading {file_path} to {destination}")
+
try:
- with open(file_path, 'rb') as f:
- data = f.read()
-
- response = requests.post(destination, files={'file': data})
-
- if response.status_code == 200:
- return True
- else:
- return False
+ with open(file_path, "rb") as f:
+ data = f.read()
+ response = requests.post(destination, files={"file": data})
+ success = response.status_code == 200
+
+ if success:
+ logger.info(f"Upload successful: {file_path}")
+ else:
+ logger.error(f"Upload failed: {file_path}, status={response.status_code}")
+
+ return success
except Exception as e:
- print(f"Error: {e}")
+ logger.exception(f"Upload error for {file_path}: {e}")
return False
Unasked-for changes: type hints, docstring, quote style changed from ' to ", whitespace reformatted, boolean return logic restructured.
The correct approach — match existing style, add only the logging lines+ import logging
+
+ logger = logging.getLogger(__name__)
+
def upload_file(file_path, destination):
+ logger.info(f'Starting upload: {file_path}')
try:
with open(file_path, 'rb') as f:
data = f.read()
response = requests.post(destination, files={'file': data})
if response.status_code == 200:
+ logger.info(f'Upload successful: {file_path}')
return True
else:
+ logger.error(f'Upload failed: {file_path}, status={response.status_code}')
return False
except Exception as e:
- print(f"Error: {e}")
+ logger.exception(f'Upload error: {file_path}')
return False
Matched: single quotes, no type hints, existing boolean pattern, original spacing style.
When this principle is working
- Diffs are small and focused — only requested changes appear
- PRs are clean with no drive-by refactoring or “improvements”
- Code style is consistent within each file, even across contributors
- Every line in the diff has an obvious reason for changing