From Notebook to Vercel: Mayavi's Deployment Architecture
Modal for the FastAPI service. Vercel for the dashboard and landing. DuckDB for run persistence. Auto-deploy on every push to main. Bearer-token auth — and the leak we caught before going public. The deployment that ties the engine to the dashboard. Series closes.
Part of the Platform & pipeline series.
This is the last post in the 15-post series. It walks the deployment: what runs where, how a git push origin main becomes a live update on app.mayavi.sambhal-labs.com minutes later, and the one security incident the deployment design caught before it shipped.
The topology
Four services, three vendors, one auth secret:
| Service | Host | URL | Source |
|---|---|---|---|
| FastAPI (mayavi.api) | Modal | morellato26--mayavi-api-fastapi-app.modal.run | mayavi/api/modal_app.py |
| Dashboard (Next.js) | Vercel | app.mayavi.sambhal-labs.com | web/ |
| Marketing landing | Vercel | mayavi.sambhal-labs.com | landing/ (separate Vercel project) |
| Umbrella site | Vercel | sambhal-labs.com | external repo (sambhal-labs/sambhal-labs-website) |
Three things to notice:
- The dashboard talks to the API across origins. app.mayavi.sambhal-labs.com → morellato26--mayavi-api-fastapi-app.modal.run. Bearer-token auth.
- api.mayavi.sambhal-labs.com is deliberately not provisioned. Modal gates custom domains behind a paid tier. We use the default Modal URL until a paying customer or funding round justifies the upgrade.
- The marketing landing is a separate Vercel project from the dashboard, with its own design tokens (mirrored in landing/styles.css). Brand parity, deploy independence.
Run persistence: DuckDB on a Modal Volume
The API service writes every submitted run + per-step actions + market snapshots into a DuckDB file at data/runs/runs.duckdb. On Modal, that path is inside a named persistent Volume (the same Volume mechanism Post 8 uses for model checkpoints; the precise mechanics are in [[reference_modal_volume_persistence]]). The Volume outlives container scale-to-zero — a queued run submitted at 11 PM survives until the worker container rehydrates at 2 AM.
DuckDB-specific subtleties:
- Schema is forward-migratable. mayavi/sim/runner.py:_ensure_runs_schema runs ALTER TABLE on connect to add new columns; a contributor pulling a new branch doesn't need to nuke their local runs.duckdb.
- No native ON DELETE CASCADE. Phase 5's PR N caught a retention-policy leak where the 5000-row ceiling deleted from runs but left orphaned rows in actions + market_snapshots. The fix is application-side cascade: truncate_runs_to_max_rows deletes from the child tables first, then the parent.
- Read-after-write consistency on the same connection. API submission, run execution, and load_run all share one connection per request; transactions are explicit.
The auth model — and the bearer-token leak we caught
The dashboard's bearer-token auth has one rule: the token never reaches the client bundle.
The naive Next.js pattern would set NEXT_PUBLIC_API_KEY in the Vercel env, then read it from a client component to attach an Authorization header before fetch. That ships the token in the JavaScript the browser downloads. Anyone viewing the page source can grep it out.
Phase-5 PR E caught exactly this. The pre-PR-E build was inlining NEXT_PUBLIC_API_KEY into the static _next/static/chunks/*.js. Any visitor to app.mayavi.sambhal-labs.com could open DevTools, grep the chunks, and exfiltrate the token. With that token they could submit arbitrary runs to the Modal-deployed API, exhausting our compute budget and persisting attacker-controlled data into our DuckDB.
The fix landed three changes together:
- Server-side proxy routes under web/app/api/ (/api/runs, /api/runs/[id], /api/runs/[id]/report, /api/scenarios). The Next.js route handlers run in Vercel's Node runtime, read MAYAVI_API_KEY from the server-only env, attach the Authorization: Bearer <key> header server-side, and proxy upstream to Modal. The client only ever sees same-origin URLs.
- Webpack DefinePlugin audit in next.config.ts to fail the build if any NEXT_PUBLIC_API_KEY reference survives. Defense in depth: even if a developer accidentally reads from the public-env name, the build won't compile.
- CI assertion: tests/web/test_no_secrets_in_client_bundle.py (run on every Vercel preview deploy) curls a representative chunk and asserts the bearer-key pattern isn't present.
The bearer never reaches the client. Iframe + PDF downloads work the same way: <iframe src="/api/runs/<id>/report"> is same-origin, the Next.js route handler proxies upstream with the bearer attached, the iframe just sees an HTML body. PDF download is identical (server-side Playwright render, streamed back).
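The core of a bundle check like the CI assertion above might look like the sketch below. It is not the contents of tests/web/test_no_secrets_in_client_bundle.py: the key pattern (an mvk_ prefix plus a long token) is invented for illustration because the post does not state the real key format, and find_bundle_leaks is a hypothetical helper name. Fetching the chunk is left out so the scanning logic stays testable on plain strings.

```python
import re
from typing import List, Optional

# Assumption: keys look like a fixed prefix plus a long token. The real key
# format is not stated in the post, so this pattern is purely illustrative.
BEARER_KEY_PATTERN = re.compile(r"mvk_[A-Za-z0-9]{24,}")
PUBLIC_ENV_NAME = "NEXT_PUBLIC_API_KEY"


def find_bundle_leaks(chunk_text: str, known_secret: Optional[str] = None) -> List[str]:
    """Return human-readable findings for one downloaded JS chunk (empty if clean)."""
    findings = []
    # A surviving reference to the public env name means the old pattern is back.
    if PUBLIC_ENV_NAME in chunk_text:
        findings.append(f"reference to {PUBLIC_ENV_NAME}")
    # Anything shaped like a key is suspicious even if it is not the current one.
    if BEARER_KEY_PATTERN.search(chunk_text):
        findings.append("string matching the bearer-key pattern")
    # If CI knows the real secret, also check for a literal copy of it.
    if known_secret and known_secret in chunk_text:
        findings.append("literal copy of the configured secret")
    return findings
```

A test would download a representative chunk from the preview deploy, pass its text through this function, and assert the result is an empty list.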
The env-var contract
Every "misconfigured" case has to fail fast, not silently. The FastAPI service reads:
| Variable | Default | Behavior on misconfigure |
|---|---|---|
| MAYAVI_API_KEY | (required) | Service won't start without it. |
| MAYAVI_API_RUNS_DB | data/runs/runs.duckdb | Path the API reads/writes. In prod, points at the Modal Volume mount. |
| MAYAVI_API_RUNS_MAX_ROWS | 5000 | Retention ceiling. Non-int or <1 raises at submission. |
| MAYAVI_API_LOG_FORMAT | json | text for local dev. Misconfigured raises at startup. |
| MAYAVI_API_CORS_ORIGINS | unset | Comma-separated extras beyond the *.mayavi.sambhal-labs.com defaults. |
| MAYAVI_API_RATE_LIMIT_REQUESTS_PER_MINUTE | 60 | Token-bucket per bearer key on POST /runs only. |
| SENTRY_DSN | unset → no-op | Sentry stays off until set. Sample rates outside [0.0, 1.0] raise at startup. |
| MAYAVI_API_TEST_FAKE_EXECUTE | unset | Test escape hatch: short-circuits to complete in <500 ms without forking an EVM. Production deploys must leave this unset. PR N added a log.error + sentry_sdk.set_tag when the flag is set, so an accidentally-enabled prod deploy fires a Sentry-tagged error on every run. |
The contract is enforced by tests/api/test_env_var_contract.py — every variable in this table has a "misconfigured raises" assertion. A future env var that doesn't fail-fast can't be merged without adding the test row.
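To make the fail-fast idea concrete, here is a minimal sketch of a loader for a few of the variables in the table. It is not the service's actual configuration code: ApiSettings, load_settings, and _positive_int are hypothetical names, everything is validated in one place for brevity (the table says the row ceiling is checked at submission), and the >= 1 rule for the rate limit is an assumption.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ApiSettings:
    api_key: str
    runs_db: str
    runs_max_rows: int
    log_format: str
    rate_limit_per_minute: int


def _positive_int(env: Mapping[str, str], name: str, default: int) -> int:
    """Parse an integer >= 1, raising instead of silently falling back."""
    raw = env.get(name)
    if raw is None:
        return default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None
    if value < 1:
        raise ValueError(f"{name} must be >= 1, got {value}")
    return value


def load_settings(env: Mapping[str, str]) -> ApiSettings:
    """Build settings from an environment mapping, failing on any misconfiguration."""
    api_key = env.get("MAYAVI_API_KEY", "").strip()
    if not api_key:
        raise ValueError("MAYAVI_API_KEY is required; refusing to start without it")
    log_format = env.get("MAYAVI_API_LOG_FORMAT", "json")
    if log_format not in ("json", "text"):
        raise ValueError(f"MAYAVI_API_LOG_FORMAT must be 'json' or 'text', got {log_format!r}")
    return ApiSettings(
        api_key=api_key,
        runs_db=env.get("MAYAVI_API_RUNS_DB", "data/runs/runs.duckdb"),
        runs_max_rows=_positive_int(env, "MAYAVI_API_RUNS_MAX_ROWS", 5000),
        log_format=log_format,
        # Assumption: a rate limit below 1 is treated as a misconfiguration.
        rate_limit_per_minute=_positive_int(
            env, "MAYAVI_API_RATE_LIMIT_REQUESTS_PER_MINUTE", 60
        ),
    )
```

In production this would be called as load_settings(os.environ); taking a mapping rather than reading the environment directly is what makes a "misconfigured raises" assertion per variable cheap to write.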
The deploy pipeline
git push origin main:
- GitHub Actions: ci.yml runs lint, type-check, unit tests + 6 per-layer coverage gates (sim ≥85, agents ≥80, svm ≥70, chains ≥90, api ≥85, protocols ≥90). The fork job runs nightly only, because fork tests pin a block, hit Alchemy, and cost real CUs.
- GitHub Actions: deploy.yml fires after CI passes and runs two jobs in parallel:
  - Modal job: uv run modal deploy mayavi/api/modal_app.py. Modal keeps the previous deploy live until the new one is healthy.
  - Vercel job: vercel deploy --prebuilt --prod against the web/ directory.
- Smoke test: curl -fsS https://morellato26--mayavi-api-fastapi-app.modal.run/healthz. 5 retries, 5 s backoff. The pytest equivalent is tests/api/test_modal_deploy.py marked @pytest.mark.modal.
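The smoke test's retry loop could be written in Python along these lines. This is a sketch, not the contents of tests/api/test_modal_deploy.py: wait_until_healthy and http_ok are hypothetical names, and "5 retries" is read the way curl's --retry reads it, as one initial attempt plus up to five more.

```python
import time
import urllib.request
from typing import Callable

HEALTHZ_URL = "https://morellato26--mayavi-api-fastapi-app.modal.run/healthz"


def http_ok(url: str, timeout: float = 10.0) -> bool:
    """True if the URL answers with a 2xx status; network and HTTP errors count as failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # URLError, HTTPError and socket timeouts all derive from OSError
        return False


def wait_until_healthy(
    url: str,
    check: Callable[[str], bool] = http_ok,
    retries: int = 5,
    backoff_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe once, then retry up to `retries` times with a fixed pause in between."""
    for attempt in range(retries + 1):
        if check(url):
            return True
        if attempt < retries:
            sleep(backoff_s)  # fixed backoff, matching the 5 s in the pipeline
    return False
```

A deploy step would call wait_until_healthy(HEALTHZ_URL) and fail the job on False. The check and sleep parameters exist so the loop can be exercised without touching the network.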
Concurrency is grouped so a second push within the deploy window cancels the in-flight job and supersedes it — last-merged-wins, no half-deployed states.
Rollback
Modal:
modal app history mayavi-api
modal app rollback mayavi-api <revision-id>
Vercel:
vercel rollback <previous-deployment-url>
Both vendors keep the previous N deploys reachable. Rollback is a one-command operation that completes in seconds.
The dashboard's run lifecycle
Submitting a run from app.mayavi.sambhal-labs.com/scenarios/<type>/new:
1. Client POSTs the scenario YAML to /api/runs (same-origin Next.js route handler).
2. Route handler reads MAYAVI_API_KEY from server-env, attaches bearer, proxies to Modal's POST /runs.
3. FastAPI service writes a queued row into DuckDB, returns the run_id.
4. Client redirects to /runs/<run_id> and polls /api/runs/<run_id> via the same same-origin proxy.
5. Modal's run-executor worker picks up the queued row, forks the EVM, runs the scenario, persists the result, transitions the row to complete.
6. Client's poll sees complete, fetches /api/runs/<run_id>/report (iframe): same-origin proxy, Plotly HTML body streams back.
7. PDF download: /api/runs/<run_id>/report.pdf triggers a server-side Playwright render, streams the PDF.
On a cold Modal container, step 5 takes ~10–30 s (the EVM-init cost). On a warm container, it's under 2 s for the bundled demo scenarios.
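The polling half of that lifecycle reduces to a small loop. The sketch below is written in Python for consistency with the backend code; the dashboard itself is Next.js, so this is not the client's implementation. poll_until_complete is a hypothetical helper, and the response shape (a JSON object with a status field that moves from queued to complete) is an assumption based on the states named above. The interval and timeout defaults are also illustrative.

```python
import time
from typing import Any, Callable, Dict


def poll_until_complete(
    run_id: str,
    fetch_status: Callable[[str], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_status(run_id) until it reports 'complete' or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        run = fetch_status(run_id)  # e.g. a GET against /api/runs/<run_id>
        if run.get("status") == "complete":
            return run
        if clock() >= deadline:
            raise TimeoutError(f"run {run_id} did not complete within {timeout_s} s")
        sleep(interval_s)
```

The timeout needs to comfortably exceed the cold-container case, since a run on a freshly started container spends its first 10 to 30 seconds on EVM initialization before it can report complete.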
What this whole architecture buys
Two properties worth naming explicitly:
- Compute scales to zero. Modal's containers idle at zero replicas when no runs are queued. We pay for the CPU-seconds of actual run execution, not for an always-on host. For an early-stage project, that's the difference between a $30/month bill and a $300/month bill.
- The credibility wall extends to the dashboard. Every claim a user sees (the run's report, the JSON eval, the saved historical run) is the same bundle the engine's CLI produces. There's no "marketing version" of the data on the dashboard. What you see is what mayavi artifact rendered.
The 15-post series, closed
This is post 15. Reading order if you came in cold:
- Introducing Mayavi — what it is.
- The Credibility Wall — why we fork mainnet.
- Inside the Engine — architecture tour.
- Aave V3 Across Six Chains
- Compound V3 Comet
- SparkLend Thin Adapter
- Curve Depeg Cascade
- PPO Aave Borrower Saturation
- Vesting Saturation
- Liquidator vs Heuristic
- EIGEN Season 1 Replay
- ENA Vesting Cliff Replay
- Multichain Matrix Pipeline
- Determinism Is a Feature
- From Notebook to Vercel — this post.
The series ships across five themes (foundations, protocols, RL findings, replays, platform). Every post embeds a real on-chain bundle or quotes a real eval JSON. Every claim traces to a file in the repo. No fictional improvements. That's the contract.
The next set of posts won't be a numbered series. They'll be on-the-record write-ups of new scenarios, new chains, new protocol integrations, new RL findings. Subscribe via the RSS feed to catch them as they ship.