
Deployment

This runbook covers how Curaway's backend, frontend, and documentation are deployed. Since the GCP cutover (2026-06) all three live on Google Cloud: the backend on Cloud Run and both the frontend and the docs site on Firebase Hosting. (Railway and Vercel were retired.)


Architecture Overview

| Component | Platform | Trigger | URL |
| --- | --- | --- | --- |
| Backend API | Cloud Run (curaway-backend, project curaway-dev, region asia-south1) | Push to main on curaway-ai/curaway-backend → CI and Deploy (GCP) workflow | https://api.curaway.ai |
| Frontend | Firebase Hosting (project curaway-dev, site curaway-dev-patient) | Push to main on curaway-ai/curaway-frontend → GitHub Actions deploy | https://app.curaway.ai |
| Documentation | Firebase Hosting (project curaway-dev, site curaway-docs) | Push to main on curaway-ai/curaway-backend (docs/** or mkdocs.yml change); built and deployed by .github/workflows/docs.yml | https://docs.curaway.ai |

Backend -- Cloud Run

The Curaway backend is deployed on Google Cloud Run with automatic deploys triggered by pushes to the main branch.

Auto-Deploy

Every push to main runs the CI and Deploy (GCP) workflow (.github/workflows/ci-deploy-gcp.yml). After the fast + slow CI lanes pass, the Build, Push, Deploy job runs docker build on the runner, pushes the image to Artifact Registry, then creates a Google Cloud Deploy release on the curaway-backend-pipeline delivery pipeline (skaffold + service.yaml pulled from curaway-ai/curaway-gcp-infra), which rolls out a new Cloud Run revision. A stuck or failed prod deploy surfaces in the Cloud Deploy pipeline's release/rollout status — not in Cloud Build (the CI path doesn't use Cloud Build). The container's entrypoint is the Dockerfile CMD (python -m app.main) — there is no platform-side start command anymore.
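Artifact Registry Docker references follow the pattern LOCATION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG. The sketch below composes such a reference from a commit SHA. The helper name image_ref, the repository name "backend", the asia-south1 registry location, and the 12-character SHA tag are all assumptions for illustration; the real workflow may name and tag images differently.

```shell
# Compose an Artifact Registry image reference (hypothetical helper).
# Usage: image_ref LOCATION PROJECT REPOSITORY IMAGE GIT_SHA
image_ref() {
  # Use the first 12 characters of the commit SHA as the tag (an assumption).
  short_sha=$(printf '%s' "$5" | cut -c1-12)
  printf '%s-docker.pkg.dev/%s/%s/%s:%s\n' "$1" "$2" "$3" "$4" "$short_sha"
}

# Example (the repository name "backend" is hypothetical):
#   image_ref asia-south1 curaway-dev backend curaway-backend "$(git rev-parse HEAD)"
```

Tagging by commit SHA makes it easy to trace a Cloud Run revision back to the commit that produced it.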

Health Check

Cloud Run probes the /ready endpoint to determine if a revision is healthy before routing traffic to it. This endpoint performs lightweight checks:

  • Database connectivity (PostgreSQL ping)
  • Application startup complete

Use /ready, Not /health

The /health endpoint performs deep checks (Neo4j, Qdrant, Redis) and may time out during the startup-probe window. Always use /ready for deployment health checks. See the troubleshooting runbook for details.
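When checking a fresh revision by hand, it can help to poll /ready instead of issuing a single request. The following is a minimal sketch, assuming /ready answers with HTTP 200 once the service is healthy; the function name wait_for_ready and its defaults (10 attempts, 3 seconds apart) are hypothetical.

```shell
# Poll BASE_URL/ready until it returns HTTP 200 or the attempts run out.
# Usage: wait_for_ready BASE_URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_ready() {
  base_url=$1
  attempts=${2:-10}
  delay=${3:-3}
  i=1
  code=000
  while [ "$i" -le "$attempts" ]; do
    # -w prints only the status code; the response body is discarded.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$base_url/ready") || code=000
    if [ "$code" = "200" ]; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s), last status $code" >&2
  return 1
}
```

For example, wait_for_ready https://api.curaway.ai 20 5 would wait up to roughly 100 seconds before giving up.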

Environment Variables

Runtime config is set on the Cloud Run service; secrets live in Secret Manager and are attached to the service. Never commit secrets to the repository. See the configuration reference for the full list.

# Plain (non-secret) env var:
gcloud run services update curaway-backend \
  --project=curaway-dev --region=asia-south1 \
  --update-env-vars=DEFAULT_TENANT_ID=tenant-apollo-001

# Secret (stored in Secret Manager, referenced by the service):
gcloud run services update curaway-backend \
  --project=curaway-dev --region=asia-south1 \
  --update-secrets=CLERK_WEBHOOK_SECRET=CLERK_WEBHOOK_SECRET:latest

Recently added (v1.38, 2026-04-28)

| Var | Required for | Notes |
| --- | --- | --- |
| FLAGSMITH_ADMIN_TOKEN | /api/v1/admin/flags/* proxy | Admin-scope token from Flagsmith → Account → API tokens. Distinct from the runtime SDK key; keeps admin write access server-side. Without it, requests return 502 with a clear error. |
| FLAGSMITH_PROJECT_ID | /api/v1/admin/flags/* | Numeric project id (visible in Flagsmith dashboard URL) |
| FLAGSMITH_ENVIRONMENT_KEY | /api/v1/admin/flags/* | Server-side environment key, format ser.… |
| FLAGSMITH_ADMIN_API_URL | optional | Defaults to https://api.flagsmith.com/api/v1/. Override for self-hosted Flagsmith. |
| CLERK_WEBHOOK_SECRET | /api/v1/webhooks/clerk | Svix signing secret (whsec_…) from Clerk dashboard → Webhooks. Required for signature verification; missing/wrong = 401 on every event. |
| DEFAULT_TENANT_ID | optional | Defaults to tenant-apollo-001. Lets us flip the platform default tenant without grepping the codebase. |
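A small pre-flight check can catch a missing variable before it shows up as a 502 or 401. The sketch below inspects the current shell environment only (for example a local run or a CI step), not the Cloud Run service itself; the helper name missing_vars is hypothetical.

```shell
# Print the name of every listed variable that is unset or empty.
# Returns non-zero if anything is missing.
missing_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable named in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Example: the variables the table above marks as required.
#   missing_vars FLAGSMITH_ADMIN_TOKEN FLAGSMITH_PROJECT_ID \
#     FLAGSMITH_ENVIRONMENT_KEY CLERK_WEBHOOK_SECRET
```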

Manual Deploy

Preferred — pipeline-consistent. Use the Deploy Only - GCP (manual) workflow (.github/workflows/deploy-only-gcp.yml) via Actions → Run workflow. It goes through the same Cloud Deploy pipeline + service.yaml config as the automated path, so the resulting revision matches a normal main deploy.

Quick-and-dirty — bypasses the pipeline. If you just need a throwaway revision from a branch (e.g. testing), build + deploy from source with gcloud. Cloud Build does the build here. Caveat: this creates a Cloud Run revision outside the Cloud Deploy pipeline and ignores the infra repo's service.yaml, so it can drift from the canonical config — don't use it for a real prod deploy.

# One-time: authenticate + set the project
gcloud auth login
gcloud config set project curaway-dev

# Build from source and deploy a new revision (Cloud Build does the build)
gcloud run deploy curaway-backend \
  --source . \
  --project=curaway-dev --region=asia-south1

# View logs
gcloud run services logs read curaway-backend \
  --project=curaway-dev --region=asia-south1

Rollback

Cloud Run keeps every revision. To roll back, route 100% of traffic to a known-good prior revision:

# List revisions (newest first)
gcloud run revisions list --service=curaway-backend \
  --project=curaway-dev --region=asia-south1

# Send all traffic to a specific revision
gcloud run services update-traffic curaway-backend \
  --project=curaway-dev --region=asia-south1 \
  --to-revisions=curaway-backend-00091-xxxxx=100

Alternatively, revert the commit on main and let auto-deploy ship a fresh revision.
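If the revision to roll back to is simply the one before the latest, the lookup can be scripted. This is a sketch under assumptions: the helper previous_revision is hypothetical, and the --sort-by and --format field names in the commented wiring are the ones commonly used with gcloud run, so verify them against your gcloud version. The previous revision is not necessarily a known-good one; confirm before shifting traffic.

```shell
# Read revision names (newest first, one per line) on stdin and print the
# second one, i.e. the revision created just before the latest.
previous_revision() {
  prev=$(sed -n '2p')
  if [ -z "$prev" ]; then
    echo "no previous revision found" >&2
    return 1
  fi
  printf '%s\n' "$prev"
}

# Example wiring (not executed here):
#   prev=$(gcloud run revisions list --service=curaway-backend \
#     --project=curaway-dev --region=asia-south1 \
#     --sort-by='~metadata.creationTimestamp' \
#     --format='value(metadata.name)' | previous_revision)
#   gcloud run services update-traffic curaway-backend \
#     --project=curaway-dev --region=asia-south1 \
#     --to-revisions="$prev=100"
```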


Frontend -- Firebase Hosting

The Curaway frontend is deployed on Firebase Hosting (project curaway-dev) with automatic deploys on push to main of curaway-ai/curaway-frontend.

Auto-Deploy

A GitHub Actions workflow in the frontend repo watches for pushes to main. Each push:

  1. Installs dependencies and runs the per-portal build matrix (Vite).
  2. Deploys the built static output to the corresponding Firebase Hosting site (e.g. curaway-dev-patient for the patient app).

Build-time config is injected from GitHub secrets at deploy time, not baked into the repo.

Environment Variables

Frontend build-time variables (e.g. VITE_*) are injected by the deploy workflow from GitHub repository secrets. There is no platform dashboard for runtime env — the apps are static SPAs, so all config is resolved at build/runtime-fetch time.

Custom Domain

The production deployment is accessible at https://app.curaway.ai via a CNAME record pointing to the Firebase Hosting site.


Documentation -- Firebase Hosting (via GitHub Actions)

The MkDocs documentation site is built in GitHub Actions and deployed to the Firebase Hosting site curaway-docs (project curaway-dev). The deploy is fully driven by the workflow below or the manual fallback. The docs site needs the Clerk publishable key injected at build time (see pk_live_Y2xlcmsuY3VyYXdheS5haSQ placeholders in the built HTML), and the workflow injects it from a GitHub secret rather than exposing it in the repo.

Workflow: .github/workflows/docs.yml

Triggers on pushes to main when docs/**, mkdocs.yml, scripts/build_docs.sh, firebase.json, .firebaserc, or the workflow itself changes (and on workflow_dispatch). Steps:

  1. Check out with fetch-depth: 0 (required for mkdocs-git-revision-date-localized-plugin).
  2. Install deps from docs/requirements.txt into the CI's Python 3.12 environment.
  3. Run bash scripts/build_docs.sh (wraps mkdocs build --site-dir build/docs + a ≥10-page sanity check). firebase.json serves from build/docs/.
  4. sed-inject the Clerk publishable key into every *.html file (replacing the pk_live_Y2xlcmsuY3VyYXdheS5haSQ placeholder). The key is read via env: (not direct ${{ }} shell interpolation) so a malformed value can't break out of the sed quoting.
  5. firebase deploy --only hosting:docs --project curaway-dev --non-interactive.

Authentication uses FIREBASE_TOKEN (a long-lived refresh token from firebase login:ci, stored as a repo secret). This is officially deprecated in firebase-tools v15+ but still functional at the pinned version; the permanent fix is CI using the Cloud Build SA's implicit identity once its project-level role grants land (tracked in curaway-frontend #365). The other expected secret is CLERK_PUBLISHABLE_KEY — if it's unset the workflow exits 1 before any HTML ships, preventing an unprotected docs site.
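The two guards described above (fail when the Clerk key is unset, and require a minimum number of built pages) can be sketched as one function. This is an illustration, not the contents of docs.yml or scripts/build_docs.sh; the name check_docs_build and its messages are hypothetical.

```shell
# Fail fast if the key is missing, then sanity-check the build output.
# Usage: check_docs_build SITE_DIR [MIN_PAGES]
check_docs_build() {
  site_dir=$1
  min_pages=${2:-10}
  if [ -z "${CLERK_PUBLISHABLE_KEY:-}" ]; then
    echo "CLERK_PUBLISHABLE_KEY is not set; refusing to deploy" >&2
    return 1
  fi
  # Count the generated HTML pages (tr strips the padding BSD wc adds).
  pages=$(find "$site_dir" -name '*.html' | wc -l | tr -d ' ')
  if [ "$pages" -lt "$min_pages" ]; then
    echo "only $pages HTML page(s) in $site_dir, expected at least $min_pages" >&2
    return 1
  fi
  echo "ok: $pages page(s)"
}
```

Run as check_docs_build build/docs before a deploy, it returns non-zero on either failure, so a calling script can stop before any HTML ships.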

Manual fallback (rare — CI broken, hotfix, etc.)

Try first: re-run the workflow from the GitHub Actions UI. The workflow has a workflow_dispatch trigger — go to Actions → Deploy Docs → Run workflow and pick main. This re-uses the same CI secrets (CLERK_PUBLISHABLE_KEY, FIREBASE_TOKEN) and avoids the local CLI dance entirely.

If you can't use CI (e.g. GitHub Actions billing is wedged): build + deploy locally with the Firebase CLI, authenticated as a user with Firebase Hosting access to curaway-dev.

# From the repo root, with the curaway_src venv active:
pip install -r docs/requirements.txt    # if not already installed
bash scripts/build_docs.sh              # produces build/docs/
# Inject the Clerk key (set CLERK_PUBLISHABLE_KEY in env first).
# Cross-platform sed -i (works on GNU + BSD/macOS):
find build/docs -name "*.html" -exec sed -i.bak "s|pk_live_Y2xlcmsuY3VyYXdheS5haSQ|$CLERK_PUBLISHABLE_KEY|g" {} +
find build/docs -name "*.html.bak" -delete
# Authenticate once, then deploy:
firebase login
firebase deploy --only hosting:docs --project curaway-dev

The CI workflow uses bare sed -i because it runs on GNU/Linux runners; the recipe above uses sed -i.bak + a find -delete cleanup so it works on macOS BSD sed too (SD's daily driver) without gsed.

Local Docs Preview

mkdocs serve

Available at http://localhost:8000. No Firebase involvement.


The gsync Alias

The gsync alias is a convenience command for syncing your local main branch with the remote. It is defined as:

alias gsync='git fetch origin && git reset --hard origin/main'

Destructive Command

gsync discards all local changes on the current branch. Only use it on main when you want to exactly match the remote state. Do not use it on feature branches.

When to Use gsync

  • After a deployment when you want your local main to match production.
  • When git pull reports "Already up to date" but you know there are remote changes (this can happen when the local branch has diverged). See the troubleshooting runbook for details.

Setup

Add to your shell profile (~/.zshrc or ~/.bashrc):

alias gsync='git fetch origin && git reset --hard origin/main'
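Because the alias resets whatever branch is checked out, a guarded variant can refuse to run anywhere except main. The function below is a hypothetical alternative, not the team's defined alias; it is still destructive on main, where it discards local commits and tracked changes exactly as gsync does.

```shell
# Guarded variant of gsync: refuses to reset anything except main.
gsync_safe() {
  # symbolic-ref fails on a detached HEAD, so fall back to a marker value.
  branch=$(git symbolic-ref --short HEAD 2>/dev/null) || branch="(detached)"
  if [ "$branch" != "main" ]; then
    echo "gsync_safe: on '$branch', not main; refusing to reset" >&2
    return 1
  fi
  git fetch origin && git reset --hard origin/main
}
```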

Deployment Checklist

Before deploying to production, verify:

  • [ ] All tests pass (pytest for backend, npm test for frontend)
  • [ ] Linting passes (ruff check . for backend, npm run lint for frontend)
  • [ ] Database migrations are committed and tested (alembic upgrade head)
  • [ ] Environment variables/secrets are set on Cloud Run (backend) or in the FE deploy workflow secrets for any new config
  • [ ] Feature flags are configured in Flagsmith for any new features
  • [ ] Documentation is updated for any API changes

Monitoring After Deploy

After a deployment reaches production:

  1. Check Cloud Run logs for startup errors: gcloud run services logs read curaway-backend --project=curaway-dev --region=asia-south1.
  2. Verify the health check: curl https://api.curaway.ai/ready
  3. Spot-check a key endpoint: curl -H "X-Tenant-ID: tenant-apollo-001" https://api.curaway.ai/api/v1/providers/
  4. Monitor Langfuse for LLM tracing anomalies.
  5. Check Flagsmith for any flag evaluation errors.
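Steps 2 and 3 above can be combined into a one-shot smoke check. This is a minimal sketch, assuming both endpoints answer with HTTP 200 when healthy; the helper names http_status and smoke_check are hypothetical.

```shell
# Print the HTTP status code for a GET request; extra curl args may follow the URL.
http_status() {
  url=$1
  shift
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$@" "$url"
}

# One-shot post-deploy check of /ready and the providers endpoint.
# Usage: smoke_check BASE_URL TENANT_ID
smoke_check() {
  base_url=$1
  tenant=$2
  failed=0
  ready=$(http_status "$base_url/ready") || ready=000
  providers=$(http_status "$base_url/api/v1/providers/" -H "X-Tenant-ID: $tenant") || providers=000
  echo "/ready -> $ready"
  echo "/api/v1/providers/ -> $providers"
  # Any status other than 200 marks the check as failed (an assumption).
  [ "$ready" = "200" ] || failed=1
  [ "$providers" = "200" ] || failed=1
  return "$failed"
}
```

For example: smoke_check https://api.curaway.ai tenant-apollo-001. The Langfuse and Flagsmith checks in steps 4 and 5 still need a manual look.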