adk-deploy-guide
skill✓MUST READ before deploying any ADK agent. ADK deployment guide — Agent Engine, Cloud Run, GKE, CI/CD pipelines, secrets, observability, and production workflows. Use when deploying agents to Google Cloud or troubleshooting deployments. Do NOT use for API code patterns (use adk-cheatsheet), evaluation (use adk-eval-guide), or project scaffolding (use adk-scaffold).
apm::install
apm install @google/adk-deploy-guideapm::skill.md
---
name: adk-deploy-guide
description: >
MUST READ before deploying any ADK agent.
ADK deployment guide — Agent Engine, Cloud Run, GKE, CI/CD pipelines,
secrets, observability, and production workflows.
Use when deploying agents to Google Cloud or troubleshooting deployments.
Do NOT use for API code patterns (use adk-cheatsheet), evaluation
(use adk-eval-guide), or project scaffolding (use adk-scaffold).
metadata:
license: Apache-2.0
author: Google
---
# ADK Deployment Guide
> **Scaffolded project?** Use the `make` commands throughout this guide — they wrap Terraform, Docker, and deployment into a tested pipeline.
>
> **No scaffold?** See [Quick Deploy](#quick-deploy-adk-cli) below, or the [ADK deployment docs](https://google.github.io/adk-docs/deploy/).
> For production infrastructure, scaffold with `/adk-scaffold`.
### Reference Files
For deeper details, consult these reference files in `references/`:
- **`cloud-run.md`** — Scaling defaults, Dockerfile, session types, networking
- **`agent-engine.md`** — deploy.py CLI, AdkApp pattern, Terraform resource, deployment metadata, CI/CD differences
- **`terraform-patterns.md`** — Custom infrastructure, IAM, state management, importing resources
- **`event-driven.md`** — Pub/Sub, Eventarc, BigQuery Remote Function triggers via custom `fast_api_app.py` endpoints
> **Observability:** See the **adk-observability-guide** skill for Cloud Trace, prompt-response logging, BigQuery Analytics, and third-party integrations.
---
## Deployment Target Decision Matrix
Choose the right deployment target based on your requirements:
| Criteria | Agent Engine | Cloud Run | GKE |
|----------|-------------|-----------|-----|
| **Languages** | Python | Python | Python (+ others via custom containers) |
| **Scaling** | Managed auto-scaling (configurable min/max, concurrency) | Fully configurable (min/max instances, concurrency, CPU allocation) | Full Kubernetes scaling (HPA, VPA, node auto-provisioning) |
| **Networking** | VPC-SC and PSC supported | Full VPC support, direct VPC egress, IAP, ingress rules | Full Kubernetes networking|
| **Session state** | Native `VertexAiSessionService` (persistent, managed) | In-memory (dev), Cloud SQL, or Agent Engine session backend | Custom (any Kubernetes-compatible store) |
| **Batch/event processing** | Not supported | `/invoke` endpoint for Pub/Sub, Eventarc, BigQuery | Custom (Kubernetes Jobs, Pub/Sub) |
| **Cost model** | vCPU-hours + memory-hours (not billed when idle) | Per-instance-second + min instance costs | Node pool costs (always-on or auto-provisioned) |
| **Setup complexity** | Lower (managed, purpose-built for agents) | Medium (Dockerfile, Terraform, networking) | Higher (Kubernetes expertise required) |
| **Best for** | Managed infrastructure, minimal ops | Custom infra, event-driven workloads | Full control, open models, GPU workloads |
**Ask the user** which deployment target fits their needs. Each is a valid production choice with different trade-offs.
---
## Quick Deploy (ADK CLI)
For projects without Agent Starter Pack scaffolding. No Makefile, Terraform, or Dockerfile required.
```bash
# Cloud Run
adk deploy cloud_run --project=PROJECT --region=REGION path/to/agent/
# Agent Engine
adk deploy agent_engine --project=PROJECT --region=REGION path/to/agent/
# GKE (requires existing cluster)
adk deploy gke --project=PROJECT --cluster_name=CLUSTER --region=REGION path/to/agent/
```
All commands support `--with_ui` to deploy the ADK dev UI. Cloud Run also accepts extra `gcloud` flags after `--` (e.g., `-- --no-allow-unauthenticated`).
See `adk deploy --help` or the [ADK deployment docs](https://google.github.io/adk-docs/deploy/) for full flag reference.
> For CI/CD, observability, or production infrastructure, scaffold with `/adk-scaffold` and use the sections below.
---
## Dev Environment Setup & Deploy (Scaffolded Projects)
### Setting Up Dev Infrastructure (Optional)
`make setup-dev-env` runs `terraform apply` in `deployment/terraform/dev/`. This provisions supporting infrastructure:
- Service accounts (`app_sa` for the agent, used for runtime permissions)
- Artifact Registry repository (for container images)
- IAM bindings (granting the app SA necessary roles)
- Telemetry resources (Cloud Logging bucket, BigQuery dataset)
- Any custom resources defined in `deployment/terraform/dev/`
This step is **optional** — `make deploy` works without it (Cloud Run creates the service on the fly via `gcloud run deploy --source .`). However, running it gives you proper service accounts, observability, and IAM setup.
```bash
make setup-dev-env
```
> **Note:** `make deploy` doesn't automatically use the Terraform-created `app_sa`. Pass `--service-account` explicitly or update the Makefile.
### Deploying
1. **Notify the human**: "Eval scores meet thresholds and tests pass. Ready to deploy to dev?"
2. **Wait for explicit approval**
3. Once approved: `make deploy`
**IMPORTANT**: Never run `make deploy` without explicit human approval.
---
## Production Deployment — CI/CD Pipeline
**Best for:** Production applications, teams requiring staging → production promotion.
**Prerequisites:**
1. Project must NOT be in a gitignored folder
2. User must provide staging and production GCP project IDs
3. GitHub repository name and owner
**Steps:**
1. If prototype, first add Terraform/CI-CD files using the Agent Starter Pack CLI (see `/adk-scaffold` for full options):
```bash
uvx agent-starter-pack enhance . --cicd-runner github_actions -y -s
```
2. Ensure you're logged in to GitHub CLI:
```bash
gh auth login # (skip if already authenticated)
```
3. Run setup-cicd:
```bash
uvx agent-starter-pack setup-cicd \
--staging-project YOUR_STAGING_PROJECT \
--prod-project YOUR_PROD_PROJECT \
--repository-name YOUR_REPO_NAME \
--repository-owner YOUR_GITHUB_USERNAME \
--auto-approve \
--create-repository
```
4. Push code to trigger deployments
#### Key `setup-cicd` Flags
| Flag | Description |
|------|-------------|
| `--staging-project` | GCP project ID for staging environment |
| `--prod-project` | GCP project ID for production environment |
| `--repository-name` / `--repository-owner` | GitHub repository name and owner |
| `--auto-approve` | Skip Terraform plan confirmation prompts |
| `--create-repository` | Create the GitHub repo if it doesn't exist |
| `--cicd-project` | Separate GCP project for CI/CD infrastructure. Defaults to prod project |
| `--local-state` | Store Terraform state locally instead of in GCS (see `references/terraform-patterns.md`) |
Run `uvx agent-starter-pack setup-cicd --help` for the full flag reference (Cloud Build options, dev project, region, etc.).
### Choosing a CI/CD Runner
| Runner | Pros | Cons |
|--------|------|------|
| **github_actions** (Default) | No PAT needed, uses `gh auth`, WIF-based, fully automated | Requires GitHub CLI authentication |
| **google_cloud_build** | Native GCP integration | Requires interactive browser authorization (or PAT + app installation ID for programmatic mode) |
### How Authentication Works (WIF)
Both runners use **Workload Identity Federation (WIF)** — GitHub/Cloud Build OIDC tokens are trusted by a GCP Workload Identity Pool, which grants `cicd_runner_sa` impersonation. No long-lived service account keys needed. Terraform in `setup-cicd` creates the pool, provider, and SA bindings automatically. If auth fails, re-run `terraform apply` in the CI/CD Terraform directory.
### CI/CD Pipeline Stages
The pipeline has three stages:
1. **CI (PR checks)** — Triggered on pull request. Runs unit and integration tests.
2. **Staging CD** — Triggered on merge to `main`. Builds container, deploys to staging, runs load tests.
> **Path filter:** Staging CD uses `paths: ['app/**']` — it only triggers when files under `app/` change. The first push after `setup-cicd` won't trigger staging CD unless you modify something in `app/`. If nothing happens after pushing, this is why.
3. **Production CD** — Triggered after successful staging deploy via `workflow_run`. Might require **manual approval** before deploying to production.
> **Approving:** Go to GitHub Actions → the production workflow run → click "Review deployments" → approve the pending `production` environment. This is GitHub's environment protection rules, not a custom mechanism.
**IMPORTANT**: `setup-cicd` creates infrastructure but doesn't deploy automatically. Terraform configures all required GitHub secrets and variables (WIF credentials, project IDs, service accounts). Push code to trigger the pipeline:
```bash
git add . && git commit -m "Initial agent implementation"
git push origin main
```
To approve production deployment:
```bash
# GitHub Actions: Approve via repository Actions tab (environment protection rules)
# Cloud Build: Find pending build and approve
gcloud builds list --project=PROD_PROJECT --region=REGION --filter="status=PENDING"
gcloud builds approve BUILD_ID --project=PROD_PROJECT
```
---
## Cloud Run Specifics
For detailed infrastructure configuration (scaling defaults, Dockerfile, FastAPI endpoints, session types, networking), see `references/cloud-run.md`. For ADK docs on Cloud Run deployment, fetch `https://google.github.io/adk-docs/deploy/cloud-run/index.md`.
---
## Agent Engine Specifics
Agent Engine is a managed Vertex AI service for deploying Python ADK agents. Uses source-based deployment (no Dockerfile) via `deploy.py` and the `AdkApp` class.
> **No `gcloud` CLI exists for Agent Engine.** Deploy via `deploy.py` or `adk deploy agent_engine`. Query via the Python `vertexai.Client` SDK.
Deployments can take 5-10 minutes. If `make deploy` times out, check if the engine was created and manually populate `deployment_metadata.json` with the engine resource ID (see reference for details).
For detailed infrastructure configuration (deploy.py flags, AdkApp pattern, Terraform resource, deployment metadata, session/artifact services, CI/CD differences), see `references/agent-engine.md`. For ADK docs on Agent Engine deployment, fetch `https://google.github.io/adk-docs/deploy/agent-engine/index.md`.
---
## Service Account Architecture
Scaffolded projects use two service accounts:
- **`app_sa`** (per environment) — Runtime identity for the deployed agent. Roles defined in `deployment/terraform/iam.tf`.
- **`cicd_runner_sa`** (CI/CD project) — CI/CD pipeline identity (GitHub Actions / Cloud Build). Lives in the CI/CD project (defaults to prod project), needs permissions in **both** staging and prod projects.
Check `deployment/terraform/iam.tf` for exact role bindings. Cross-project permissions (Cloud Run service agents, artifact registry access) are also configured there.
**Common 403 errors:**
- "Permission denied on Cloud Run" → `cicd_runner_sa` missing deployment role in the target project
- "Cannot act as service account" → Missing `iam.serviceAccountUser` binding on `app_sa`
- "Secret access denied" → `app_sa` missing `secretmanager.secretAccessor`
- "Artifact Registry read denied" → Cloud Run service agent missing read access in CI/CD project
---
## Secret Manager (for API Credentials)
Instead of passing sensitive keys as environment variables, use GCP Secret Manager.
```bash
# Create a secret
echo -n "YOUR_API_KEY" | gcloud secrets create MY_SECRET_NAME --data-file=-
# Update an existing secret
echo -n "NEW_API_KEY" | gcloud secrets versions add MY_SECRET_NAME --data-file=-
```
**Grant access:** For Cloud Run, grant `secretmanager.secretAccessor` to `app_sa`. For Agent Engine, grant it to the platform-managed SA (`service-PROJECT_NUMBER@gcp-sa-aiplatform-re.iam.gserviceaccount.com`).
**Pass secrets at deploy time (Agent Engine):**
```bash
make deploy SECRETS="API_KEY=my-api-key,DB_PASS=db-password:2"
```
Format: `ENV_VAR=SECRET_ID` or `ENV_VAR=SECRET_ID:VERSION` (defaults to latest). Access in code via `os.environ.get("API_KEY")`.
---
## Observability
See the **adk-observability-guide** skill for observability configuration (Cloud Trace, prompt-response logging, BigQuery Analytics, third-party integrations).
---
## Testing Your Deployed Agent
### Agent Engine Deployment
**Option 1: Testing Notebook**
```bash
jupyter notebook notebooks/adk_app_testing.ipynb
```
**Option 2: Python Script**
```python
import json
import vertexai
with open("deployment_metadata.json") as f:
engine_id = json.load(f)["remote_agent_engine_id"]
client = vertexai.Client(location="us-central1")
agent = client.agent_engines.get(name=engine_id)
async for event in agent.async_stream_query(message="Hello!", user_id="test"):
print(event)
```
**Option 3: Playground**
```bash
make playground
```
### Cloud Run Deployment
> **Auth required by default.** Cloud Run deploys with `--no-allow-unauthenticated`, so all requests need an `Authorization: Bearer` header with an identity token. Getting a 403? You're likely missing this header. To allow public access, redeploy with `--allow-unauthenticated`.
```bash
SERVICE_URL="https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app"
AUTH="Authorization: Bearer $(gcloud auth print-identity-token)"
# Test health endpoint
curl -H "$AUTH" "$SERVICE_URL/"
# Step 1: Create a session (required before sending messages)
curl -X POST "$SERVICE_URL/apps/app/users/test-user/sessions" \
-H "Content-Type: application/json" \
-H "$AUTH" \
-d '{}'
# → returns JSON with "id" — use this as SESSION_ID below
# Step 2: Send a message via SSE streaming
curl -X POST "$SERVICE_URL/run_sse" \
-H "Content-Type: application/json" \
-H "$AUTH" \
-d '{
"app_name": "app",
"user_id": "test-user",
"session_id": "SESSION_ID",
"new_message": {"role": "user", "parts": [{"text": "Hello!"}]}
}'
```
> **Common mistake:** Using `{"message": "Hello!", "user_id": "...", "session_id": "..."}` returns `422 Field required`. The ADK HTTP server expects the `new_message` / `parts` schema shown above, and the session must already exist.
### Load Tests
```bash
make load-test
```
See `tests/load_test/README.md` for configuration, default settings, and CI/CD integration details.
---
## Deploying with a UI (IAP)
To expose your agent with a web UI protected by Google identity authentication:
```bash
# Deploy with IAP (built-in framework UI)
make deploy IAP=true
# Deploy with custom frontend on a different port
make deploy IAP=true PORT=5173
```
IAP (Identity-Aware Proxy) secures the Cloud Run service — only authorized Google accounts can access it. After deploying, grant user access via the [Cloud Console IAP settings](https://cloud.google.com/run/docs/securing/identity-aware-proxy-cloud-run#manage_user_or_group_access).
For Agent Engine with a custom frontend, use a **decoupled deployment** — deploy the frontend separately to Cloud Run or Cloud Storage, connecting to the Agent Engine backend API.
---
## Rollback & Recovery
The primary rollback mechanism is **git-based**: fix the issue, commit, and push to `main`. The CI/CD pipeline will automatically build and deploy the new version through staging → production.
For immediate Cloud Run rollback without a new commit, use revision traffic shifting:
```bash
gcloud run revisions list --service=SERVICE_NAME --region=REGION
gcloud run services update-traffic SERVICE_NAME \
--to-revisions=REVISION_NAME=100 --region=REGION
```
Agent Engine doesn't support revision-based rollback — fix and redeploy via `make deploy`.
---
## Custom Infrastructure (Terraform)
For custom infrastructure patterns (Pub/Sub, BigQuery, Eventarc, Cloud SQL, IAM), consult `references/terraform-patterns.md` for:
- Where to put custom Terraform files (dev vs CI/CD)
- Resource examples (Pub/Sub, BigQuery, Eventarc triggers)
- IAM bindings for custom resources
- Terraform state management (remote vs local, importing resources)
- Common infrastructure patterns
---
## Troubleshooting
| Issue | Solution |
|-------|----------|
| Terraform state locked | `terraform force-unlock -force LOCK_ID` in deployment/terraform/ |
| GitHub Actions auth failed | Re-run `terraform apply` in CI/CD terraform dir; verify WIF pool/provider |
| Cloud Build authorization pending | Use `github_actions` runner instead |
| Resource already exists | `terraform import` (see `references/terraform-patterns.md`) |
| Agent Engine deploy timeout / hangs | Deployments take 5-10 min; check if engine was created (see Agent Engine Specifics) |
| Secret not available | Verify `secretAccessor` granted to `app_sa` (not the default compute SA) |
| 403 on deploy | Check `deployment/terraform/iam.tf` — `cicd_runner_sa` needs deployment + SA impersonation roles in the target project |
| 403 when testing Cloud Run | Default is `--no-allow-unauthenticated`; include `Authorization: Bearer $(gcloud auth print-identity-token)` header |
| Cold starts too slow | Set `min_instance_count > 0` in Cloud Run Terraform config |
| Cloud Run 503 errors | Check resource limits (memory/CPU), increase `max_instance_count`, or check container crash logs |
| 403 right after granting IAM role | IAM propagation is not instant — wait a couple of minutes before retrying. Don't keep re-granting the same role |
| Resource seems missing but Terraform created it | Run `terraform state list` to check what Terraform actually manages. Resources created via `null_resource` + `local-exec` (e.g., BQ linked datasets) won't appear in `gcloud` CLI output |