Deployment¶
This runbook covers how Curaway's backend, frontend, and documentation are deployed. Since the GCP cutover (2026-06) all three live on Google Cloud: the backend on Cloud Run and both the frontend and the docs site on Firebase Hosting. (Railway and Vercel were retired.)
Architecture Overview¶
| Component | Platform | Trigger | URL |
|---|---|---|---|
| Backend API | Cloud Run (curaway-backend, project curaway-dev, region asia-south1) | Push to main on curaway-ai/curaway-backend → CI and Deploy (GCP) workflow | https://api.curaway.ai |
| Frontend | Firebase Hosting (project curaway-dev, site curaway-dev-patient) | Push to main on curaway-ai/curaway-frontend → GitHub Actions deploy | https://app.curaway.ai |
| Documentation | Firebase Hosting (project curaway-dev, site curaway-docs) | Push to main on curaway-ai/curaway-backend (docs/** or mkdocs.yml change); built + deployed by .github/workflows/docs.yml | https://docs.curaway.ai |
Backend -- Cloud Run¶
The Curaway backend is deployed on Google Cloud Run with automatic deploys triggered by pushes to the main branch.
Auto-Deploy¶
Every push to main runs the CI and Deploy (GCP) workflow (.github/workflows/ci-deploy-gcp.yml). After the fast + slow CI lanes pass, the Build, Push, Deploy job runs docker build on the runner, pushes the image to Artifact Registry, then creates a Google Cloud Deploy release on the curaway-backend-pipeline delivery pipeline (skaffold + service.yaml pulled from curaway-ai/curaway-gcp-infra), which rolls out a new Cloud Run revision. A stuck or failed prod deploy surfaces in the Cloud Deploy pipeline's release/rollout status — not in Cloud Build (the CI path doesn't use Cloud Build). The container's entrypoint is the Dockerfile CMD (python -m app.main) — there is no platform-side start command anymore.
Health Check¶
Cloud Run probes the /ready endpoint to determine if a revision is healthy before routing traffic to it. This endpoint performs lightweight checks:
- Database connectivity (PostgreSQL ping)
- Application startup complete
Use /ready, Not /health
The /health endpoint performs deep checks (Neo4j, Qdrant, Redis) and may time out during the startup-probe window. Always use /ready for deployment health checks. See the troubleshooting runbook for details.
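When checking a fresh revision by hand, the same readiness endpoint can be polled from a shell until it answers. The following is a minimal sketch, assuming curl is installed; the helper name wait_ready, the default attempt count, and the delay are illustrative choices, not part of the Curaway tooling.

```shell
# wait_ready URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until it returns a successful HTTP response or attempts run out.
# Hypothetical helper: the defaults (10 attempts, 3 s apart) are assumptions.
wait_ready() {
  url=$1
  attempts=${2:-10}
  delay=${3:-3}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP >= 400;
    # --max-time bounds each individual try.
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    # Only wait if another attempt is coming.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}
```

For example, `wait_ready https://api.curaway.ai/ready` would poll the production readiness endpoint with the default settings.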
Environment Variables¶
Runtime config is set on the Cloud Run service; secrets live in Secret Manager and are attached to the service. Never commit secrets to the repository. See the configuration reference for the full list.
# Plain (non-secret) env var:
gcloud run services update curaway-backend \
--project=curaway-dev --region=asia-south1 \
--update-env-vars=DEFAULT_TENANT_ID=tenant-apollo-001
# Secret (stored in Secret Manager, referenced by the service):
gcloud run services update curaway-backend \
--project=curaway-dev --region=asia-south1 \
--update-secrets=CLERK_WEBHOOK_SECRET=CLERK_WEBHOOK_SECRET:latest
Recently added (v1.38, 2026-04-28)¶
| Var | Required for | Notes |
|---|---|---|
| FLAGSMITH_ADMIN_TOKEN | /api/v1/admin/flags/* proxy | Admin-scope token from Flagsmith → Account → API tokens. Distinct from the runtime SDK key, so admin write access stays server-side. Without it, requests return 502 with a clear error. |
| FLAGSMITH_PROJECT_ID | /api/v1/admin/flags/* | Numeric project id (visible in Flagsmith dashboard URL) |
| FLAGSMITH_ENVIRONMENT_KEY | /api/v1/admin/flags/* | Server-side environment key, format ser.… |
| FLAGSMITH_ADMIN_API_URL | optional | Defaults to https://api.flagsmith.com/api/v1/. Override for self-hosted Flagsmith. |
| CLERK_WEBHOOK_SECRET | /api/v1/webhooks/clerk | Svix signing secret (whsec_…) from Clerk dashboard → Webhooks. Required for signature verification; missing/wrong = 401 on every event. |
| DEFAULT_TENANT_ID | optional | Defaults to tenant-apollo-001. Lets us flip the platform default tenant without grepping the codebase. |
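Since a missing variable only shows up later as a 502 or 401, it can help to check for the required names before starting the app locally or in a container shell. The sketch below is one way to do that under the assumption that the variables are visible in the current shell environment; the helper name missing_vars is hypothetical, and it does not inspect the Cloud Run service configuration itself.

```shell
# missing_vars NAME...
# Prints each named variable that is unset or empty in the current shell,
# one per line, and returns non-zero if any are missing.
# Hypothetical helper; only pass trusted, literal variable names to it.
missing_vars() {
  status=0
  for name in "$@"; do
    # Indirect lookup in POSIX sh: read the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}
```

A typical call would be `missing_vars FLAGSMITH_ADMIN_TOKEN FLAGSMITH_PROJECT_ID FLAGSMITH_ENVIRONMENT_KEY CLERK_WEBHOOK_SECRET`, which prints nothing and returns 0 when all four are set. The optional variables are left out because they have defaults.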
Manual Deploy¶
Preferred — pipeline-consistent. Use the Deploy Only - GCP (manual) workflow (.github/workflows/deploy-only-gcp.yml) via Actions → Run workflow. It goes through the same Cloud Deploy pipeline + service.yaml config as the automated path, so the resulting revision matches a normal main deploy.
Quick-and-dirty — bypasses the pipeline. If you just need a throwaway revision from a branch (e.g. testing), build + deploy from source with gcloud. Cloud Build does the build here. Caveat: this creates a Cloud Run revision outside the Cloud Deploy pipeline and ignores the infra repo's service.yaml, so it can drift from the canonical config — don't use it for a real prod deploy.
# One-time: authenticate + set the project
gcloud auth login
gcloud config set project curaway-dev
# Build from source and deploy a new revision (Cloud Build does the build)
gcloud run deploy curaway-backend \
--source . \
--project=curaway-dev --region=asia-south1
# View logs
gcloud run services logs read curaway-backend \
--project=curaway-dev --region=asia-south1
Rollback¶
Cloud Run keeps every revision. To roll back, route 100% of traffic to a known-good prior revision:
# List revisions (newest first)
gcloud run revisions list --service=curaway-backend \
--project=curaway-dev --region=asia-south1
# Send all traffic to a specific revision
gcloud run services update-traffic curaway-backend \
--project=curaway-dev --region=asia-south1 \
--to-revisions=curaway-backend-00091-xxxxx=100
Alternatively, revert the commit on main and let auto-deploy ship a fresh revision.
Frontend -- Firebase Hosting¶
The Curaway frontend is deployed on Firebase Hosting (project curaway-dev) with automatic deploys on push to main of curaway-ai/curaway-frontend.
Auto-Deploy¶
A GitHub Actions workflow in the frontend repo watches for pushes to main. Each push:
- Installs dependencies and runs the per-portal build matrix (Vite).
- Deploys the built static output to the corresponding Firebase Hosting site (e.g. curaway-dev-patient for the patient app).
Build-time config is injected from GitHub secrets at deploy time, not baked into the repo.
Environment Variables¶
Frontend build-time variables (e.g. VITE_*) are injected by the deploy workflow from GitHub repository secrets. There is no platform dashboard for runtime env — the apps are static SPAs, so all config is resolved at build/runtime-fetch time.
Custom Domain¶
The production deployment is accessible at https://app.curaway.ai via a CNAME record pointing to the Firebase Hosting site.
Documentation -- Firebase Hosting (via GitHub Actions)¶
The MkDocs documentation site is built in GitHub Actions and deployed to the Firebase Hosting site curaway-docs (project curaway-dev). The deploy is fully driven by the workflow below or the manual fallback. The docs site needs the Clerk publishable key injected at build time (see pk_live_Y2xlcmsuY3VyYXdheS5haSQ placeholders in the built HTML), and the workflow injects it from a GitHub secret rather than exposing it in the repo.
Workflow: .github/workflows/docs.yml¶
Triggers on pushes to main when docs/**, mkdocs.yml, scripts/build_docs.sh, firebase.json, .firebaserc, or the workflow itself changes (and on workflow_dispatch). Steps:
- Check out with fetch-depth: 0 (required for mkdocs-git-revision-date-localized-plugin).
- Install deps from docs/requirements.txt into the CI's Python 3.12 environment.
- Run bash scripts/build_docs.sh (wraps mkdocs build --site-dir build/docs + a ≥10-page sanity check). firebase.json serves from build/docs/.
- sed-inject the Clerk publishable key into every *.html file (replacing the pk_live_Y2xlcmsuY3VyYXdheS5haSQ placeholder). The key is read via env: (not direct ${{ }} shell interpolation) so a malformed value can't break out of the sed quoting.
- firebase deploy --only hosting:docs --project curaway-dev --non-interactive.
Authentication uses FIREBASE_TOKEN (a long-lived refresh token from firebase login:ci, stored as a repo secret). This is officially deprecated in firebase-tools v15+ but still functional at the pinned version; the permanent fix is CI using the Cloud Build SA's implicit identity once its project-level role grants land (tracked in curaway-frontend #365). The other expected secret is CLERK_PUBLISHABLE_KEY — if it's unset the workflow exits 1 before any HTML ships, preventing an unprotected docs site.
Manual fallback (rare — CI broken, hotfix, etc.)¶
Try first: re-run the workflow from the GitHub Actions UI. The workflow has a workflow_dispatch trigger — go to Actions → Deploy Docs → Run workflow and pick main. This re-uses the same CI secrets (CLERK_PUBLISHABLE_KEY, FIREBASE_TOKEN) and avoids the local CLI dance entirely.
If you can't use CI (e.g. GitHub Actions billing is wedged): build + deploy locally with the Firebase CLI, authenticated as a user with Firebase Hosting access to curaway-dev.
# From the repo root, with the curaway_src venv active:
pip install -r docs/requirements.txt # if not already installed
bash scripts/build_docs.sh # produces build/docs/
# Inject the Clerk key (set CLERK_PUBLISHABLE_KEY in env first).
# Cross-platform sed -i (works on GNU + BSD/macOS):
find build/docs -name "*.html" -exec sed -i.bak "s|pk_live_Y2xlcmsuY3VyYXdheS5haSQ|$CLERK_PUBLISHABLE_KEY|g" {} +
find build/docs -name "*.html.bak" -delete
# Authenticate once, then deploy:
firebase login
firebase deploy --only hosting:docs --project curaway-dev
The CI workflow uses bare sed -i because it runs on GNU/Linux runners; the recipe above uses sed -i.bak + a find -delete cleanup so it works on macOS BSD sed too (SD's daily driver) without gsed.
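One risk in the manual recipe is running the sed step with CLERK_PUBLISHABLE_KEY unset, which would silently replace the placeholder with an empty string. CI guards against this by exiting before any HTML ships; a local equivalent might look like the sketch below. The function name inject_clerk_key is hypothetical, and the sketch assumes the key contains no `|` or `&` characters, which sed would treat specially in the replacement.

```shell
# Placeholder string as it appears in the built HTML.
PLACEHOLDER='pk_live_Y2xlcmsuY3VyYXdheS5haSQ'

# inject_clerk_key DIR
# Refuses to run when CLERK_PUBLISHABLE_KEY is unset or empty, then applies
# the same cross-platform sed -i.bak recipe to every *.html file under DIR.
# Hypothetical wrapper around the manual steps; not part of the repo scripts.
inject_clerk_key() {
  dir=$1
  if [ -z "${CLERK_PUBLISHABLE_KEY:-}" ]; then
    echo "CLERK_PUBLISHABLE_KEY is not set; refusing to inject" >&2
    return 1
  fi
  # sed -i.bak works on both GNU and BSD/macOS sed.
  find "$dir" -name '*.html' \
    -exec sed -i.bak "s|$PLACEHOLDER|$CLERK_PUBLISHABLE_KEY|g" {} +
  # Remove the backup files sed left behind.
  find "$dir" -name '*.html.bak' -delete
}
```

With this in place, `inject_clerk_key build/docs && firebase deploy --only hosting:docs --project curaway-dev` only deploys when the injection step actually ran.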
Local Docs Preview¶
Available at http://localhost:8000. No Firebase involvement.
The gsync Alias¶
The gsync alias is a convenience command for syncing your local main branch with the remote. It is defined as:
Destructive Command
gsync discards all local changes on the current branch. Only use it on main when you want to exactly match the remote state. Do not use it on feature branches.
When to Use gsync¶
- After a deployment when you want your local main to match production.
- When git pull reports "Already up to date" but you know there are remote changes (this can happen when the local branch has diverged). See the troubleshooting runbook for details.
Setup¶
Add to your shell profile (~/.zshrc or ~/.bashrc):
Deployment Checklist¶
Before deploying to production, verify:
- [ ] All tests pass (pytest for backend, npm test for frontend)
- [ ] Linting passes (ruff check . for backend, npm run lint for frontend)
- [ ] Database migrations are committed and tested (alembic upgrade head)
- [ ] Environment variables/secrets are set on Cloud Run (backend) or in the FE deploy workflow secrets for any new config
- [ ] Feature flags are configured in Flagsmith for any new features
- [ ] Documentation is updated for any API changes
Monitoring After Deploy¶
After a deployment reaches production:
- Check Cloud Run logs for startup errors: gcloud run services logs read curaway-backend --project=curaway-dev --region=asia-south1.
- Verify the health check: curl https://api.curaway.ai/ready
- Spot-check a key endpoint: curl -H "X-Tenant-ID: tenant-apollo-001" https://api.curaway.ai/api/v1/providers/
- Monitor Langfuse for LLM tracing anomalies.
- Check Flagsmith for any flag evaluation errors.
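The two curl checks above can be wrapped so that a bad status code is reported explicitly instead of being read off the response body. This is a rough sketch, assuming curl is installed and that HTTP 200 is the expected status for both endpoints (the runbook does not state the expected codes); the helper name expect_status is hypothetical.

```shell
# expect_status EXPECTED URL [extra curl args...]
# Requests URL and compares the HTTP status code with EXPECTED.
# Hypothetical helper; the expected codes passed to it are assumptions.
expect_status() {
  expected=$1
  url=$2
  shift 2
  # --write-out '%{http_code}' prints only the status code; the body is
  # discarded. curl reports 000 when no HTTP response was received.
  actual=$(curl --silent --output /dev/null --max-time 10 \
    --write-out '%{http_code}' "$@" "$url" || true)
  if [ "$actual" = "$expected" ]; then
    echo "ok $actual $url"
  else
    echo "FAIL $actual (expected $expected) $url" >&2
    return 1
  fi
}
```

Usage would be `expect_status 200 https://api.curaway.ai/ready` followed by `expect_status 200 https://api.curaway.ai/api/v1/providers/ -H "X-Tenant-ID: tenant-apollo-001"`.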