Adds schema tests and data quality validation to dbt models. Use when working with dbt tests for: (1) Adding or modifying tests in schema.yml files (2) Task mentions "test", "validate", "data quality", "unique", "not_null", or "accepted_values" (3) Ensuring data integrity - primary keys, foreign keys, relationships (4) Debugging test failures or understanding why dbt test failed Matches existing project test patterns and YAML style before adding new tests.
Creates dbt models following project conventions. Use when working with dbt models for: (1) Creating new models (any layer - discovers project's naming conventions first) (2) Task mentions "create", "build", "add", "write", "new", or "implement" with model, table, or SQL (3) Modifying existing model logic, columns, joins, or transformations (4) Implementing a model from schema.yml specs or expected output requirements Discovers project conventions before writing. Runs dbt build (not just compile) to verify.
Debugs and fixes dbt errors systematically. Use when working with dbt errors for: (1) Task mentions "fix", "error", "broken", "failing", "debug", "wrong", or "not working" (2) Compilation Error, Database Error, or test failures occur (3) Model produces incorrect output or unexpected results (4) Need to troubleshoot why a dbt command failed Reads full error, checks upstream first, runs dbt build (not just compile) to verify fix.
Converts legacy SQL to modular dbt models. Use when migrating SQL to dbt for: (1) Converting stored procedures, views, or raw SQL files to dbt models (2) Task mentions "migrate", "convert", "legacy SQL", "transform to dbt", or "modernize" (3) Breaking monolithic queries into modular layers (discovers project conventions first) (4) Porting existing data pipelines or ETL to dbt patterns Checks for existing models/sources, builds and validates layer by layer.
Safely refactors dbt models with downstream impact analysis. Use when restructuring dbt models for: (1) Task mentions "refactor", "restructure", "extract", "split", "break into", or "reorganize" (2) Extracting CTEs to intermediate models or creating macros (3) Modifying model logic that has downstream consumers (4) Renaming columns, changing types, or reorganizing model dependencies Analyzes all downstream dependencies BEFORE making changes.
Documents dbt models and columns in schema.yml. Use when working with dbt documentation for: (1) Adding model descriptions or column definitions to schema.yml (2) Task mentions "document", "describe", "description", "dbt docs", or "schema.yml" (3) Explaining business context, grain, meaning of data, or business rules (4) Preparing dbt docs generate or improving model discoverability Matches existing project documentation style and conventions before writing.
Optimizes Snowflake SQL query performance from provided query text. Use when optimizing Snowflake SQL for: (1) User provides or pastes a SQL query and asks to optimize, tune, or improve it (2) Task mentions "slow query", "make faster", "improve performance", "optimize SQL", or "query tuning" (3) Reviewing SQL for performance anti-patterns (function on filter column, implicit joins, etc.) (4) User asks why a query is slow or how to speed it up
Optimizes Snowflake query performance using query ID from history. Use when optimizing Snowflake queries for: (1) User provides a Snowflake query_id (UUID format) to analyze or optimize (2) Task mentions "slow query", "optimize", "query history", or "query profile" with a query ID (3) Analyzing query performance metrics - bytes scanned, spillage, partition pruning (4) User references a previously run query that needs optimization Fetches query profile, identifies bottlenecks, returns optimized SQL with expected improvements.
Develops and troubleshoots dbt incremental models. Use when working with incremental materialization for: (1) Creating new incremental models (choosing strategy, unique_key, partition) (2) Task mentions "incremental", "append", "merge", "upsert", or "late arriving data" (3) Troubleshooting incremental failures (merge errors, partition pruning, schema drift) (4) Optimizing incremental performance or deciding table vs incremental Guides through strategy selection, handles common incremental gotchas.
Finds and ranks expensive Snowflake queries by cost, time, or data scanned. Use when: (1) User asks to find slow, expensive, or problematic queries (2) Task mentions "query history", "top queries", "most expensive", or "slowest queries" (3) Analyzing warehouse costs or identifying optimization candidates (4) Finding queries that scan the most data or have the most spillage Returns ranked list of queries with metrics and optimization recommendations.
What should this skill be called?