@mattnigh/python-coding — Agent Skill

---
name: python-coding
description: Best practices for writing high quality production grade Python code
---

# Python Coding

This skill applies when writing production-intent Python code.

## Core Principles

These principles apply to production code. They apply less strictly to scripts: use as much as makes sense, weighing cleanliness and simplicity against the extra complexity and code they add.

1. Prioritize the KISS, SOLID, and DRY principles.
2. Use `uv` for all project management: virtual environments, package installation, builds, execution, and so on.
3. Always use current Python practices (the current version is 3.14) and up-to-date packages.
4. Write module and function docstrings, aiming for 100% coverage (verified automatically by `interrogate` in pre-commit).
5. Annotate types as fully as possible and always run type checking (with `pyrefly`).
6. Use logging (with `structlog`).
7. **Fail Fast**: Validate inputs and data (with `pydantic`) early, at function boundaries.
8. Write tests and aim for at least 70% coverage (with `pytest` and `coverage`).
9. Rely on `ruff` for formatting, import sorting, and linting.
10. **Resource Management**: Use context managers (`with` statements) for files, connections, and locks.

Details on each below:

### In-Line Documentation

- Follow the NumPy docstring format for function docstrings, but omit parameter types: the type annotations already carry them, and repeating them would violate DRY and invite inconsistencies.
- Module docstrings follow this format:

```python
"""
Title:

Author: < AI Name >

Description:

Usage: <include basic examples>

Notes:

"""
```

- Be explicit and complete without being overly verbose.
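
As a rough illustration of the function docstring rule above, the sketch below shows a NumPy-style docstring whose `Parameters` entries carry no types, since the annotations already state them. The function `clamp` and its parameter names are hypothetical, and naming the return value instead of repeating its type is one possible reading of the rule, not something the rule itself specifies.

```python
def clamp(value: float, lower: float, upper: float) -> float:
    """
    Restrict a value to a closed interval.

    Parameters
    ----------
    value
        The number to restrict.
    lower
        Smallest value that may be returned.
    upper
        Largest value that may be returned.

    Returns
    -------
    clamped
        `value` if it lies within [lower, upper], otherwise the nearest bound.

    Raises
    ------
    ValueError
        If `lower` is greater than `upper`.
    """
    if lower > upper:
        raise ValueError(f"Invalid bounds: lower={lower} is greater than upper={upper}")
    return max(lower, min(value, upper))
```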

### Typing

- Maximize type annotations for all functions/methods
- Use `pyrefly` for type checking

### Logging

- Use structured logging with `structlog`
- Centralize logging configuration with the `dictConfig` pattern
- Output to both file and terminal
- Follow the basic format: `"%(asctime)s - %(levelname)s - %(message)s"`
- Color the level name in outputs
  - Critical: Deep Red
  - Error: Light Red
  - Warning: Yellow
  - Info: Blue
  - Debug: Grey
- Rotate log files by time at the appropriate cadence
- Logger naming: create a hierarchy of loggers using dots in the names
- **Logging Security**
  - Never log sensitive data (passwords, tokens, PII)
  - Sanitize or redact sensitive fields before logging
  - Be careful with debug logs in production
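
The configuration points above can be sketched with the standard library alone. This is a minimal sketch, not a complete setup: it assumes `structlog` is configured separately to route through stdlib logging, and it shows only the `dictConfig` side. The names `build_logging_config`, `ColorFormatter`, `RedactFilter`, the `app.log` file name, the midnight rotation cadence, the ANSI color codes, and the redaction pattern are all assumptions made for the example.

```python
import logging
import logging.config
import re
from pathlib import Path

LOG_FORMAT = "%(asctime)s - %(levelname)s - %(message)s"

# Assumed ANSI codes for the level colors listed above.
LEVEL_COLORS = {
    "CRITICAL": "\033[31m",  # deep red
    "ERROR": "\033[91m",  # light red
    "WARNING": "\033[33m",  # yellow
    "INFO": "\033[34m",  # blue
    "DEBUG": "\033[90m",  # grey
}
RESET = "\033[0m"

# Hypothetical pattern: key=value pairs whose key looks sensitive.
SENSITIVE = re.compile(r"(password|token|api_key)=\S+", re.IGNORECASE)


class ColorFormatter(logging.Formatter):
    """Color only the level name, for terminal output."""

    def format(self, record: logging.LogRecord) -> str:
        original = record.levelname
        color = LEVEL_COLORS.get(original, "")
        record.levelname = f"{color}{original}{RESET}" if color else original
        try:
            return super().format(record)
        finally:
            record.levelname = original  # keep the file output uncolored


class RedactFilter(logging.Filter):
    """Mask sensitive key=value pairs before a handler emits the record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = SENSITIVE.sub(r"\1=***", record.getMessage())
        record.args = ()  # the message is already fully formatted
        return True


def build_logging_config(log_dir: Path) -> dict[str, object]:
    """Return a dictConfig mapping that logs to the terminal and a rotated file."""
    log_dir.mkdir(parents=True, exist_ok=True)
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "filters": {"redact": {"()": RedactFilter}},
        "formatters": {
            "plain": {"format": LOG_FORMAT},
            "color": {"()": ColorFormatter, "fmt": LOG_FORMAT},
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "formatter": "color",
                "filters": ["redact"],
            },
            "file": {
                "class": "logging.handlers.TimedRotatingFileHandler",
                "filename": str(log_dir / "app.log"),
                "when": "midnight",  # assumed cadence: rotate once per day
                "backupCount": 7,
                "encoding": "utf-8",
                "formatter": "plain",
                "filters": ["redact"],
            },
        },
        "root": {"level": "INFO", "handlers": ["console", "file"]},
    }
```

Calling `logging.config.dictConfig(build_logging_config(Path("logs")))` once at startup applies the configuration. Dotted logger names such as `logging.getLogger("app.feeds.fetcher")` then inherit the root handlers through the logger hierarchy.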

### Data Validation and Input Sanitization

- Treat all external input as untrusted (API requests, file uploads, user input)
- Use Pydantic models to validate and coerce types at system boundaries
- Sanitize inputs before using in:
  - SQL queries (use parameterized queries, ORMs)
  - Shell commands (avoid `shell=True`, use lists)
  - File paths (validate, use `pathlib.Path.resolve()`)
  - HTML/JavaScript (escape properly)
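
For the file path item, one possible check is to resolve the untrusted path against a base directory and reject anything that lands outside it. The helper name `resolve_inside` and the idea of a single allowed base directory are assumptions for this sketch.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, untrusted: str) -> Path:
    """Resolve `untrusted` under `base_dir`, rejecting escapes such as '../'."""
    base = base_dir.resolve()
    # resolve() collapses '..' segments and follows symlinks before the check.
    candidate = (base / untrusted).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"Path '{untrusted}' escapes base directory '{base}'")
    return candidate
```

An absolute `untrusted` value replaces the base when joined, so it also fails the `is_relative_to` check unless it already points inside the base directory.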

- Aim for early validation
- Validate at system boundaries (API endpoints, file inputs, user inputs). Don't let invalid data propagate through the system
- Make validators reusable and composable
- Fail fast with clear error messages
- Use assertions for internal consistency checks (developer errors, not user errors)
- Don'ts: validating after processing or storage, mixing validation with transformation, and over-validating trusted internal data
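
A minimal sketch of boundary validation, assuming Pydantic v2. The model `ArticleIn`, its fields, and the helper `parse_article` are hypothetical and exist only to show the shape of the pattern.

```python
from typing import Literal

from pydantic import BaseModel, Field


class ArticleIn(BaseModel):
    """Hypothetical payload accepted at an API boundary."""

    title: str = Field(min_length=1, max_length=200)
    word_count: int = Field(ge=0)
    status: Literal["draft", "published"]


def parse_article(raw: dict[str, object]) -> ArticleIn:
    """Validate once at the boundary; downstream code receives a typed model."""
    return ArticleIn.model_validate(raw)
```

Invalid input raises `pydantic.ValidationError` naming each failing field, so code past the boundary can rely on the model without re-checking it.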

### Error Handling

**Use Standard Exceptions First**

- Prefer built-in exceptions: `ValueError`, `TypeError`, `KeyError`, `FileNotFoundError`
- Let third-party exceptions bubble up: `httpx.HTTPError`, `pydantic.ValidationError`

**Custom Exceptions: Only When Needed**

Create custom exceptions only when:

- You need different handling logic at call sites (different catch blocks → different actions)
- The error represents a specific domain concept (e.g., `DuplicateArticleError`, `RateLimitError`)
- You're building a public API or library where users shouldn't catch internal exceptions
- Keep it simple - avoid deep hierarchies

**When to Catch vs Propagate**

- **Catch** when you can handle the error meaningfully (retry, fallback, user-friendly message)
- **Propagate** when the calling code is better positioned to handle it
- **Re-raise with context** using `raise NewException(...) from original_exception` so the original exception and its traceback stay attached as the cause
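
Re-raising with context might look like the sketch below. `ConfigError` and `load_settings` are hypothetical names; the point is only that `from exc` keeps the original exception attached as `__cause__`.

```python
import json


class ConfigError(Exception):
    """Hypothetical domain error raised when settings cannot be loaded."""


def load_settings(raw: str) -> dict[str, object]:
    """Parse settings from JSON text, translating parse failures into ConfigError."""
    try:
        settings = json.loads(raw)
    except json.JSONDecodeError as exc:
        # 'from exc' chains the low-level error so its traceback is not lost.
        raise ConfigError(f"Settings are not valid JSON (line {exc.lineno})") from exc
    if not isinstance(settings, dict):
        raise ConfigError("Settings must be a JSON object")
    return settings
```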

**Error Messages**

- Include specific field names and invalid values
- Provide actionable guidance when possible
- Example: `ValueError(f"Invalid status '{status}'. Must be one of: {VALID_STATUSES}")`

**Error Context with Structured Logging**

- Use structured logging to capture rich context instead of encoding it in exception types
- Include relevant fields: `url`, `status_code`, `article_id`, `feed_name`, etc.
- Set `exc_info=True` to capture full tracebacks

### Testing

- Write tests as early as possible: ideally before the code (if following TDD), otherwise right after it is written.
- Name test files with the convention `test_*.py`
- Where appropriate, prefer pytest with parametrized inputs (`@pytest.mark.parametrize`) and randomized inputs (with `hypothesis`) as complementary techniques:
  - When to use each
    - **Use parametrize for:**
      - Known edge cases you must handle
      - Regression tests (specific bugs)
      - Documentation of expected behavior
      - Business rule examples
      - Quick debugging (run specific case)
    - **Use hypothesis for:**
      - Finding unknown edge cases
      - Testing invariants (properties that always hold)
      - Stress testing with many inputs
      - Discovering missing validation
  - Practical workflow
    1. Start with parametrize for obvious cases
    2. Add hypothesis to find what you missed
    3. When hypothesis finds a bug, add it as a parametrized test for regression
    4. Keep both: parametrize documents intent, hypothesis keeps hunting
- Aim for a minimum of 70% coverage (a soft guideline), measured with `pytest-cov`
- Test file naming: `test_<module>.py`
- Test both valid and invalid cases
- Test boundary conditions explicitly

### Security Best Practices

**Secret Management**

- Never hardcode secrets (API keys, passwords, tokens) in source code
- Use environment variables for configuration: `os.getenv("API_KEY")`
- Use `.env` files for local development (add to `.gitignore`)
- For production, use secret management services (Google Secret Manager, AWS Secrets Manager etc.)
- Validate that required secrets are present at startup
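
A startup check for required secrets could be as small as the sketch below. The variable names in `REQUIRED_SECRETS` and the helper `load_required_secrets` are hypothetical. The error message lists only the missing names, never any values.

```python
import os

REQUIRED_SECRETS = ("API_KEY", "DATABASE_URL")  # hypothetical names


def load_required_secrets(names: tuple[str, ...] = REQUIRED_SECRETS) -> dict[str, str]:
    """Return the named environment variables, failing fast if any are missing or empty."""
    missing = [name for name in names if not os.getenv(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return {name: os.environ[name] for name in names}
```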

**SQL Injection Prevention**

- Always use parameterized queries or ORMs
- Never concatenate user input into SQL strings
- Example with parameterization:

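  ```python
  # UNSAFE: user input is interpolated into the SQL text itself
  cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")

  # SAFE: the driver binds the value separately from the SQL text
  # ("?" is the sqlite3 placeholder style; other drivers may use "%s")
  cursor.execute("SELECT * FROM users WHERE id = ?", (user_id,))
  ```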

**Command Injection Prevention**

- Avoid `subprocess.run(shell=True)`
- Pass commands as lists: `subprocess.run(["ls", "-l", directory])`
- If `shell=True` is unavoidable, sanitize inputs rigorously (for example, quote each argument with `shlex.quote`)
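
To illustrate why the list form is safer, the sketch below passes a hostile-looking string as a single argument. No shell parses it, so the `;` has no special meaning. The helper `echo_argument` is hypothetical, and it launches the current Python interpreter only to keep the example portable.

```python
import subprocess
import sys


def echo_argument(untrusted: str) -> str:
    """Run a child process with `untrusted` as one argv entry and return its output."""
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout.rstrip("\n")
```

Calling `echo_argument("docs; echo injected")` returns the string unchanged, because the child process receives it verbatim as `sys.argv[1]`.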

## Development Tools

Prefer the following tools:

| Tool | Purpose |
| ------ | ----- |
| uv | Project management, Python management, Package installation, Virtual Environments, Executing modules/scripts |
| pytest | Testing |
| pyrefly | Type Checking |
| structlog | Logging |
| coverage | Testing Coverage |
| pydantic | Data Validation |
| ruff | Linting, Formatting, Import Sorting |
| prek | Pre-commit checks |

Project management should be done declaratively through uv commands (which in turn will write into pyproject.toml and the uv lock file).

## Command Reference

Available at `.claude/skills/python-coding/references/commands.md`

## Package Preferences

Prefer:

- `httpx` over `requests`