Prepare for the next agent or session by documenting work and leaving codebase in good state
Use for running and editing notebooks efficiently via jtool/Jupyter; prefers uv for deps and headless execution.
Use for Vite+React+TanStack Query frontends on Bun/Cloudflare Workers with shadcn/Tailwind—provides dev console bridge, tmux layout, dense/no-motion UI defaults, and Justfile/CI parity.
Use when aligning with Georgios’s default Git/GitHub conventions—cloning with gh, Conventional Commits, atomic staging, and guarded GitHub Actions for Rust/Python/TypeScript—so work matches his standard workflows.
Use for PR/code reviews and any task that benefits from a dedicated tmux sub-agent with per-task git worktrees; default path for reviewing diffs (read diff → summarize → run checks/tests) with automated monitoring.
Help address review/issue comments on the open GitHub PR for the current branch using gh CLI; verify gh auth first and prompt the user to authenticate if not logged in.
Practical guide for building safe, syntax-aware srgn CLI commands for source-code search and transformation. Use when users ask for srgn commands, scoped refactors (comments/docstrings/imports/functions), multi-file rewrites with --glob, custom tree-sitter query usage, or CI-style checks with --fail-any/--fail-none.
Formal evaluation framework for Claude Code sessions implementing eval-driven development (EDD) principles
Inspect GitHub PR checks with gh, pull failing GitHub Actions logs, summarize failure context, then create a fix plan and implement after user approval. Use when a user asks to debug or fix failing PR CI/CD checks on GitHub Actions and wants a plan + code changes; for external checks (e.g., Buildkite), only report the details URL and mark them out of scope.
Effective code search, analysis, and refactoring using ast-grep (sg). Use this skill for precise AST-based code modifications, structural search, and linting.
```skill
no-config skill
{what this skill teaches agents}
A simple tool.
Structured decision critic that systematically stress-tests reasoning before commitment surfacing hidden assumptions verifying claims and generating adversarial perspectives to improve decision quality.
```skill
Generate format-controlled research reports with evidence tracking, citations, and iterative review. This skill should be used when users request a research report, literature review, market or industry analysis, competitive landscape, policy or technical brief, or require a strict report template and section formatting that a single deepresearch pass cannot reliably enforce.
**Skill Path:** `plugins/meta/claude-dev-sandbox/skills/example-skill/SKILL.md`
**WORKFLOW SKILL** - Evaluate AI agent skills using structured benchmarks with YAML specs, fixture isolation, and pluggable validators. USE FOR: run waza, waza help, run eval, run benchmark, evaluate skill, test agent, generate eval suite, init eval, compare results, score agent, agent evaluation, skill testing, cross-model comparison. DO NOT USE FOR: improving skill frontmatter (use waza dev), creating new skills from scratch (use skill-creator), token counting or budget checks (use waza tokens). INVOKES: Copilot SDK executor, mock engine, code/regex validators. FOR SINGLE OPERATIONS: use waza run directly for a single benchmark.
Transform academic papers into in-depth technical articles with multiple writing style options. Use the MinerU Cloud API for high-precision PDF parsing, automatically extracting images, tables, and formulas. Optional formula explanations and GitHub code analysis, generating Markdown and HTML formats.