APM

>Agent Skill

@sendaifun/zz-code-recon

skilldevelopment

Deep architectural context building for security audits. Use when conducting security reviews, building codebase understanding, mapping trust boundaries, or preparing for vulnerability analysis. Inspired by Trail of Bits methodology.

apm::install
$apm install @sendaifun/zz-code-recon
apm::skill.md
---
name: zz-code-recon
description: Deep architectural context building for security audits. Use when conducting security reviews, building codebase understanding, mapping trust boundaries, or preparing for vulnerability analysis. Inspired by Trail of Bits methodology.
---

# CodeRecon - Deep Architectural Context Building

Build comprehensive architectural understanding through ultra-granular code analysis. Designed for security auditors, code reviewers, and developers who need to rapidly understand unfamiliar codebases before diving deep.

## Overview

CodeRecon is a systematic approach to codebase reconnaissance that builds layered understanding from high-level architecture down to implementation details. Inspired by Trail of Bits' audit-context-building methodology.

### Why CodeRecon?

Before you can find vulnerabilities, you need to understand:
- How the system is architected
- Where data flows
- What the trust boundaries are
- Where security-critical logic lives

This skill provides a structured methodology for building that context efficiently.

## The Recon Pyramid

```
                    ┌─────────────┐
                    │   DETAILS   │  ← Implementation specifics
                   ─┼─────────────┼─
                  / │  FUNCTIONS  │  ← Key function analysis
                 /  ─┼─────────────┼─
                /   │   MODULES   │  ← Component relationships
               /    ─┼─────────────┼─
              /     │ ARCHITECTURE│  ← System structure
             /      ─┼─────────────┼─
            /       │   OVERVIEW  │  ← High-level understanding
           ─────────┴─────────────┴─────────
```

Start broad, go deep systematically.

## Phase 1: Overview Reconnaissance

### 1.1 Project Identification

Gather basic project information:

```bash
# Check for documentation
ls -la README* ARCHITECTURE* SECURITY* CHANGELOG* docs/

# Identify build system
ls package.json Cargo.toml go.mod pyproject.toml Makefile

# Check for tests
ls -la test* spec* *_test* __tests__/

# Identify CI/CD
ls -la .github/workflows/ .gitlab-ci.yml Jenkinsfile .circleci/
```

### 1.2 Technology Stack Detection

```bash
# Language distribution
find . -type f -name "*.py" | wc -l
find . -type f -name "*.js" -o -name "*.ts" | wc -l
find . -type f -name "*.go" | wc -l
find . -type f -name "*.rs" | wc -l
find . -type f -name "*.sol" | wc -l

# Framework indicators
grep -r "from flask" --include="*.py" | head -1
grep -r "from django" --include="*.py" | head -1
grep -r "express\|fastify" --include="*.js" | head -1
grep -r "anchor_lang" --include="*.rs" | head -1
```

### 1.3 Dependency Analysis

```bash
# Python dependencies
cat requirements.txt pyproject.toml setup.py 2>/dev/null | grep -E "^\s*[a-zA-Z]"

# Node.js dependencies
cat package.json | jq '.dependencies, .devDependencies'

# Rust dependencies
cat Cargo.toml | grep -A 100 "\[dependencies\]"

# Go dependencies
cat go.mod | grep -E "^\s+[a-z]"
```

### 1.4 Create Technology Map

```markdown
## Technology Map: [PROJECT NAME]

### Languages
| Language | Files | Lines | Primary Use |
|----------|-------|-------|-------------|
| Python | 150 | 25K | Backend API |
| TypeScript | 80 | 12K | Frontend |
| Solidity | 12 | 2K | Smart Contracts |

### Key Dependencies
| Package | Version | Purpose | Security Notes |
|---------|---------|---------|----------------|
| fastapi | 0.100.0 | Web framework | Recent CVEs: None |
| web3.py | 6.0.0 | Blockchain client | Check signing |
| pyjwt | 2.8.0 | JWT handling | Verify alg checks |

### Infrastructure
- Database: PostgreSQL 15
- Cache: Redis 7
- Message Queue: RabbitMQ
- Container: Docker + K8s
```

## Phase 2: Architecture Mapping

### 2.1 Directory Structure Analysis

```bash
# Top-level structure
tree -L 2 -d

# Identify entry points
find . -name "main.py" -o -name "app.py" -o -name "index.ts" -o -name "main.go"

# Identify config
find . -name "config*" -o -name "settings*" -o -name ".env*"
```

### 2.2 Component Identification

Look for common patterns:

```
project/
├── api/           # HTTP endpoints
├── auth/          # Authentication
├── core/          # Business logic
├── db/            # Database layer
├── models/        # Data models
├── services/      # External services
├── utils/         # Utilities
├── workers/       # Background jobs
└── tests/         # Test suite
```

### 2.3 Create Architecture Diagram

```
┌─────────────────────────────────────────────────────────────┐
│                        CLIENTS                              │
│              (Web, Mobile, API Consumers)                   │
└─────────────────────────┬───────────────────────────────────┘
                          │ HTTPS

┌─────────────────────────────────────────────────────────────┐
│                      API GATEWAY                            │
│                   (Rate Limiting, Auth)                     │
└─────────────────────────┬───────────────────────────────────┘

          ┌───────────────┼───────────────┐
          ▼               ▼               ▼
    ┌──────────┐   ┌──────────┐   ┌──────────┐
    │  Auth    │   │  Core    │   │  Admin   │
    │ Service  │   │  API     │   │  API     │
    └────┬─────┘   └────┬─────┘   └────┬─────┘
         │              │              │
         └──────────────┼──────────────┘

          ┌─────────────┼─────────────┐
          ▼             ▼             ▼
    ┌──────────┐  ┌──────────┐  ┌──────────┐
    │ Database │  │  Cache   │  │ External │
    │ (Postgres)│  │ (Redis)  │  │  APIs    │
    └──────────┘  └──────────┘  └──────────┘
```

### 2.4 Trust Boundary Identification

Map where trust levels change:

```markdown
## Trust Boundaries

### Boundary 1: Internet → API Gateway
- **Type:** Network boundary
- **Controls:** TLS, Rate limiting, WAF
- **Risks:** DDoS, Injection, Auth bypass

### Boundary 2: API Gateway → Services
- **Type:** Authentication boundary
- **Controls:** JWT validation, Role checks
- **Risks:** Token forgery, Privilege escalation

### Boundary 3: Services → Database
- **Type:** Data access boundary
- **Controls:** Query parameterization, Connection pooling
- **Risks:** SQL injection, Data leakage

### Boundary 4: Services → External APIs
- **Type:** Third-party integration
- **Controls:** API keys, Request signing
- **Risks:** SSRF, Secret exposure
```

## Phase 3: Module Deep Dive

### 3.1 Entry Point Analysis

For each entry point type:

```python
# HTTP Routes - map all endpoints
grep -rn "@app.route\|@router\|@api_view" --include="*.py"
grep -rn "app.(get|post|put|delete)\|router.(get|post)" --include="*.ts"

# CLI Commands
grep -rn "@click.command\|argparse\|clap" --include="*.py" --include="*.rs"

# Event Handlers
grep -rn "@consumer\|@handler\|on_message" --include="*.py"
```

### 3.2 Create Entry Point Map

```markdown
## Entry Points

### HTTP API
| Method | Path | Handler | Auth | Input |
|--------|------|---------|------|-------|
| POST | /api/login | auth.login | None | JSON body |
| GET | /api/users | users.list | JWT | Query params |
| POST | /api/transfer | tx.transfer | JWT + 2FA | JSON body |
| GET | /admin/logs | admin.logs | Admin JWT | Query params |

### WebSocket
| Event | Handler | Auth | Data |
|-------|---------|------|------|
| connect | ws.connect | JWT | None |
| message | ws.message | Session | JSON |

### Background Jobs
| Queue | Handler | Trigger | Data Source |
|-------|---------|---------|-------------|
| emails | email.send | API call | Database |
| reports | report.gen | Cron | Database |
```

### 3.3 Data Flow Tracing

For each critical endpoint, trace data flow:

```
POST /api/transfer


┌──────────────────┐
│ Request Parser   │ ← Validate JSON schema
│ (validation.py)  │
└────────┬─────────┘
         │ TransferRequest

┌──────────────────┐
│ Auth Middleware  │ ← Verify JWT, extract user
│ (middleware.py)  │
└────────┬─────────┘
         │ User context

┌──────────────────┐
│ Transfer Service │ ← Business logic
│ (transfer.py)    │
└────────┬─────────┘

    ┌────┴────┐
    ▼         ▼
┌────────┐ ┌────────┐
│ DB     │ │External│
│ Write  │ │ API    │
└────────┘ └────────┘
```

## Phase 4: Function-Level Analysis

### 4.1 Security-Critical Function Identification

Search for security-sensitive operations:

```bash
# Authentication
grep -rn "def login\|def authenticate\|def verify_token" --include="*.py"
grep -rn "function login\|authenticate\|verifyToken" --include="*.ts"

# Authorization
grep -rn "def is_authorized\|def check_permission\|@requires_role" --include="*.py"

# Cryptography
grep -rn "encrypt\|decrypt\|hash\|sign\|verify" --include="*.py"
grep -rn "crypto\.\|bcrypt\|argon2" --include="*.py"

# Database
grep -rn "execute\|query\|cursor" --include="*.py"
grep -rn "\.query\|\.execute\|\.raw" --include="*.ts"

# File Operations
grep -rn "open\(.*\)\|read\|write\|unlink" --include="*.py"
```

### 4.2 Function Documentation Template

For each critical function:

```markdown
### Function: `transfer_funds()`

**Location:** `services/transfer.py:45`

**Purpose:** Execute fund transfer between accounts

**Parameters:**
| Name | Type | Source | Validation |
|------|------|--------|------------|
| from_account | str | JWT claim | UUID format |
| to_account | str | Request body | UUID format, exists check |
| amount | Decimal | Request body | > 0, <= balance |

**Returns:** TransferResult

**Side Effects:**
- Writes to `transactions` table
- Calls external payment API
- Emits `transfer_completed` event

**Security Considerations:**
- Requires authenticated user
- Rate limited to 10/minute
- Amount validated against balance
- Audit logged

**Potential Risks:**
- Race condition if concurrent transfers?
- What if external API fails mid-transfer?
```

### 4.3 Call Graph Analysis

```
transfer_funds()
├── validate_request()
│   └── check_uuid_format()
├── get_user_balance()
│   └── db.query()
├── check_rate_limit()
│   └── redis.get()
├── execute_transfer()     ← CRITICAL
│   ├── db.begin_transaction()
│   ├── update_balance()   ← State change
│   ├── external_api.send() ← External call
│   └── db.commit()
└── emit_event()
```

## Phase 5: Detail Reconnaissance

### 5.1 Configuration Analysis

```bash
# Find all config loading
grep -rn "os.environ\|getenv\|config\." --include="*.py"
grep -rn "process.env\|config\." --include="*.ts"

# Check for hardcoded secrets
grep -rn "password\s*=\|secret\s*=\|api_key\s*=" --include="*.py"
grep -rn "-----BEGIN\|sk-\|pk_live_" .
```

### 5.2 Error Handling Review

```bash
# Find exception handling
grep -rn "except.*:" --include="*.py" -A 2
grep -rn "catch\s*(" --include="*.ts" -A 2

# Find error responses
grep -rn "return.*error\|raise.*Error" --include="*.py"
```

### 5.3 Logging Analysis

```bash
# Find logging statements
grep -rn "logger\.\|logging\.\|console\.log" --include="*.py" --include="*.ts"

# Check what's being logged
grep -rn "log.*password\|log.*token\|log.*secret" --include="*.py"
```

## Output: Context Document

### Template

```markdown
# [PROJECT NAME] - Security Context Document

## Executive Summary
[2-3 sentences on what this system does]

## Technology Stack
[From Phase 1]

## Architecture
[Diagram from Phase 2]

## Trust Boundaries
[From Phase 2.4]

## Entry Points
[Table from Phase 3.2]

## Critical Functions
[Analysis from Phase 4]

## Data Flows
[Diagrams from Phase 3.3]

## Security Controls
| Control | Implementation | Location | Notes |
|---------|----------------|----------|-------|
| Authentication | JWT | middleware/auth.py | RS256 signing |
| Authorization | RBAC | decorators/auth.py | Role-based |
| Input Validation | Pydantic | schemas/*.py | Type checking |
| Encryption | AES-256-GCM | utils/crypto.py | At-rest |

## Areas Requiring Focus
1. [High-risk area 1]
2. [High-risk area 2]
3. [High-risk area 3]

## Open Questions
- [ ] How is X handled when Y?
- [ ] What happens if Z fails?
```

## Quick Start Commands

```bash
# Full recon script
./scripts/recon.sh /path/to/project

# Generate entry point map
./scripts/map-endpoints.sh /path/to/project

# Create call graph
./scripts/callgraph.sh /path/to/project
```

## Skill Files

```
code-recon/
├── SKILL.md                        # This file
├── resources/
│   ├── recon-checklist.md          # Comprehensive checklist
│   └── question-bank.md            # Questions to answer
├── examples/
│   ├── web-app-recon/              # Web application example
│   └── smart-contract-recon/       # Smart contract example
├── templates/
│   └── context-document.md         # Output template
└── docs/
    └── advanced-techniques.md      # Deep dive techniques
```

## Guidelines

1. **Top-down approach** - Start broad, go narrow
2. **Document everything** - Your notes are the deliverable
3. **Question assumptions** - Verify what docs say vs. what code does
4. **Focus on trust boundaries** - That's where bugs live
5. **Time-box phases** - Don't get stuck in the weeds early
6. **Iterate** - Revisit earlier phases as you learn more