Implement agent evaluation and safety gates using MLflow 3.x. Use for creating LLM-as-Judge scorers, evaluation datasets, quality gates, tracing, and continuous evaluation. Triggers on "evaluate agent", "MLflow scorer", "LLM judge", "safety evaluation", "quality gate", "agent testing", "hallucination detection", or when implementing spec/010-agent-evaluation.md requirements.
Analyze email messages and mailbox data for forensic investigation. Use when investigating phishing attacks, business email compromise, insider threats, or any scenario requiring email evidence analysis. Supports PST, OST, MBOX, EML, and MSG formats.
Issue tracking with Beads (bd CLI). Use when commands need to create, query, update, or close issues, or manage dependencies.