Multi-Region API Routing Architecture
Status: Draft
Date: 2026-02-08

1. Region Naming Convention
1.1 Industry Research
Every major platform has converged on a similar hierarchy but with different encoding choices:

| Provider | Format | Example | Hierarchy |
|---|---|---|---|
| AWS | {continent}-{cardinal}-{n} | us-east-1 | continent-direction-number |
| AWS (short) | {continent}{cardinal}{n} | use1 | compressed, no delimiters |
| AWS (AZ) | {region}{letter} | us-east-1a | region + zone letter |
| GCP | {continent}-{subregion}{n} | us-central1 | continent-subregion-number |
| GCP (zone) | {region}-{letter} | us-central1-a | region + zone letter |
| Azure | {region} | eastus, westus2 | flat slug, no delimiters |
| Azure (regional DNS) | {service}-{region}-01 | contoso-westus2-01.regional.azure-api.net | service-region-instance |
| Cloudflare | {iata} | DFW, IAD | 3-letter IATA, uppercase |
| Vercel | {iata}{n} | sfo1, iad1, dub1 | city code + number |
| Hetzner | {location}{n} | fsn1, nbg1, ash1 | location code + number |
| Hetzner (DC) | {location}-dc{n} | fsn1-dc14 | location + datacenter number |
| Fly.io | {iata} | dfw, iad, ams | 3-letter IATA airport code |
| Kubernetes | topology.kubernetes.io/zone | us-east-2a | follows cloud provider convention |
1.2 Format
IATA airport codes map directly to metro areas where datacenters operate. Developers recognize them immediately. The number distinguishes multiple facilities in the same metro.

1.3 Why IATA
Fly.io, Cloudflare, Fastly, and Wikimedia all use IATA codes for physical infrastructure. k0rdent sells bare metal in specific physical locations — the same category. Cloud-style codes (us-west-1) encode abstract geography because you never know the specific datacenter. With bare metal, the physical location is the product.
IATA codes are globally unique, assigned by an international body, 3 characters, and human-memorable. sfo1 carries more information in 4 characters than usca1 does in 5 — it tells you the metro, not just the state.
1.4 Subdomain Format
Regional gateways are addressed as {region}.api.example.com (e.g., sfo1.api.example.com). The same region code appears in subdomains, IDs, DB columns, K8s labels, and logs (FR-18).
1.5 Region Registry
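The registry can stay a compiled-in map, which keeps lookups in-memory with zero network calls (NFR-18). A minimal TypeScript sketch; the field names, metro values, and the set of regions shown are illustrative, not the real registry contents:

```typescript
// Hypothetical in-memory region registry. Field names and the regions
// listed are illustrative stand-ins, not the real registry contents.
interface RegionEntry {
  code: string;      // IATA + number, e.g. "sfo1" (see 1.2)
  metro: string;     // human-readable metro area
  endpoint: string;  // regional gateway subdomain (see 1.4)
}

const REGIONS: Record<string, RegionEntry> = {
  sfo1: { code: "sfo1", metro: "San Francisco", endpoint: "https://sfo1.api.example.com" },
  lax1: { code: "lax1", metro: "Los Angeles",   endpoint: "https://lax1.api.example.com" },
  ams1: { code: "ams1", metro: "Amsterdam",     endpoint: "https://ams1.api.example.com" },
};

// Validation is a pure map lookup: no database, no network call (NFR-18).
function isKnownRegion(code: string): boolean {
  return Object.prototype.hasOwnProperty.call(REGIONS, code);
}
```

Because the registry is immutable at runtime, adding a region is a deploy, not a migration.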
2. Request IDs vs Resource IDs
2.1 Request IDs
Every API request gets a unique Request ID for tracing, debugging, and correlation across services. Request IDs embed the region to enable grep-based debugging across the entire stack.

Format: req_{region}-{timestamp}-{entropy}
- req_ — Platform prefix for easy grepping in logs
- sfo1 — Region code where the request was processed
- 1770564159296 — Unix timestamp in milliseconds (sortable, debuggable)
- 7d4b9e1f3a5b — Random entropy (12 hex characters)
- Grep any log file for sfo1 and see all activity in that region
- Trace a request across services without additional lookups
- Instant debugging: see the request ID, know which region handled it
- Correlate metrics, traces, and logs by region prefix
- Returned in the X-Request-Id response header
- Logged by all services handling the request
- Used for distributed tracing correlation
- NOT exposed to end users (internal debugging only)
2.2 Resource IDs
Resources (clusters, servers, organizations, etc.) get globally unique, opaque IDs that do NOT contain region information. This decouples resource identity from physical location.

Format: {prefix}_{base62}
| Resource | Prefix | Example |
|---|---|---|
| Organization | org_ | org_8TcVx2WkZddNmK3Pt9JwX7BzWrLM |
| Server | srv_ | srv_3KpQm9WnXccFjH2Ls8DkT6VzRqYU |
| Cluster | cls_ | cls_6NZtkvWLBbbmHfPi7L6oz7KZpqET |
| Stack | stk_ | stk_5MfRp4WjYbbHmG8Nt2LvS9CxPqZK |
| Workflow Run | run_ | run_7NhTq6WlAbbKmF5Rt3MxU8DzSqWJ |
| Pool | pool_ | pool_2LgPn8WmXccGjE7Mt4KwV9BySrTL |
| Allocation | alloc_ | alloc_9QjSr3WnZddMmH6Pt5LxW2CzUrYK |
| API Key | key_ | key_4KfQm7WkYccJmG3Nt8MvX9BzSqWL |
| Event | evt_ | evt_6MgRp2WlXbbKmF9Rt5NxU3DzTqZJ |
org_system is reserved for platform-level admin operations. (TBD whether this is still needed; it was originally reserved for a different purpose.)
Components:
- {prefix}_ — Resource type for debugging clarity (e.g., cls_, srv_, org_)
- {base62} — 26-character base62-encoded unique identifier (case-sensitive)
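Generation can be sketched as follows. This is an illustration of the format, not the production implementation (a real generator should use bias-free base62 sampling; the simple modulo here slightly skews the distribution):

```typescript
import { randomBytes } from "node:crypto";

const BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Sketch of {prefix}_{base62} ID generation (FR-08): 26 base62 chars,
// opaque, no region information. Note: b % 62 has a small modulo bias;
// a production generator should reject-sample instead.
function newResourceId(prefix: string): string {
  let body = "";
  for (const b of randomBytes(26)) body += BASE62[b % 62];
  return `${prefix}_${body}`;
}
```

newResourceId("cls") yields IDs shaped like the cls_ examples in the table above.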
- Resources can migrate between regions without ID changes
- Multi-region resources don’t have a single “home” region
- Simpler, shorter IDs (30 chars vs 40+)
- Region is stored as metadata on the resource, queryable via API
- Resource identity is independent of physical topology
Because the ID carries no region, the gateway resolves a request’s region from (in priority order):
- Subdomain (sfo1.api.example.com)
- Header (X-Region: sfo1)
- Query param (?region=sfo1)
- Request body ({ "region": "sfo1" })
- Session context (project/org default region)
- Database lookup (stored on resource record)
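The cascade above can be sketched as one pure function. The request shape and field names here are illustrative stand-ins, not the gateway's real types; the final DB-lookup/fan-out step is represented by returning null:

```typescript
// Illustrative request shape; not the real gateway types.
interface IncomingRequest {
  host: string;                         // e.g. "sfo1.api.example.com"
  headers: Record<string, string>;      // lowercased header names
  query: Record<string, string>;
  body?: { region?: string };
  session?: { defaultRegion?: string };
}

// FR-02 priority: Subdomain → Header → Query → Body → Session → (DB/fan-out).
function resolveRegion(req: IncomingRequest): string | null {
  const subdomain = req.host.match(/^([a-z]{3}\d+)\.api\./)?.[1];
  return (
    subdomain ??
    req.headers["x-region"] ??
    req.query["region"] ??
    req.body?.region ??
    req.session?.defaultRegion ??
    null // caller falls through to DB lookup or fan-out/error
  );
}
```

First match wins, so a subdomain always overrides a header, which keeps routing predictable (FR-02).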
3. API Design
2.1 Core Principle
Region is a routing concern, not a resource hierarchy. The API path describes what; the subdomain/header/session describes where (FR-06).

2.2 Atlas API (Operator Console)
sfo1.api.example.com/v1/region/global/servers returns only servers in that region.
2.3 Arc API (Customer Console)
POST /compute/clusters { "name": "prod", "region": "sfo1" }. Once a resource exists, its region is stored as metadata on the record, not embedded in its ID (FR-08).
2.4 Region Resolution Cascade
Every request resolves a region before reaching application code (FR-01). The cascade follows the deterministic priority of FR-02: Subdomain → Header → Query → Body → Session → DB Lookup → Fan-out/Error.

2.5 Persona Experiences
Arc developer, single-region org: never specifies a region. The session default applies on every request (FR-10).

3. Database Topology
3.1 Option A: Single Primary + Regional Read Replicas (MVP)
3.2 Option B: Hybrid — Regional for Fast Writes, Central for State of Record
3.3 Recommendation: Start A, Evolve to B
3.4 Write Latency by Type
| Write | Latency Sensitivity | Path |
|---|---|---|
| Create cluster | Low (async workflow) | Arc → Workflow Queue → Primary |
| Provision server | Low (async workflow) | Atlas → Workflow Queue → Primary |
| Update user settings | High | Arc → Primary (cross-region) |
| Mark notification read | High | Arc → Primary (cross-region) |
| Login/session | High (auth handles this) | Auth → Primary |
4. Routing Architecture
4.1 Request Flow
4.2 Request Flow Examples
Arc user creates a cluster (multi-region org, no subdomain): the gateway finds no subdomain or X-Region header, falls through the cascade to the "region" field in the request body, and the mutation is enqueued through the workflow orchestrator to the primary DB (FR-14).

4.3 Gateway Implementation (Hono)
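A condensed sketch of the middleware logic, written against a stand-in context type rather than Hono's real Context so the flow is visible without framework details. It resolves a region (FR-01), stamps a request ID (FR-09), and rejects region-less mutations with 400 (FR-03). The "global" placeholder for region-less reads is an assumption, not part of the spec:

```typescript
import { randomBytes } from "node:crypto";

// Stand-in for Hono's Context; only the fields this sketch needs.
interface GatewayContext {
  method: string;
  host: string;
  headers: Record<string, string>;         // lowercased request headers
  responseHeaders: Record<string, string>;
  status: number;
}

function gatewayMiddleware(c: GatewayContext): void {
  // Abbreviated cascade: subdomain, then X-Region header (see 2.4 / FR-02).
  const region =
    c.host.match(/^([a-z]{3}\d+)\.api\./)?.[1] ?? c.headers["x-region"] ?? null;

  if (!region && c.method !== "GET") {
    c.status = 400; // FR-03: mutations never silently default to a region
    return;
  }

  // FR-09: req_{region}-{timestamp}-{entropy}. "global" is a hypothetical
  // placeholder for region-less reads that fan out (FR-04).
  const requestId = `req_${region ?? "global"}-${Date.now()}-${randomBytes(6).toString("hex")}`;
  c.responseHeaders["X-Request-Id"] = requestId;
  if (region) c.responseHeaders["X-Region"] = region;
  c.status = 200;
}
```

In the real gateway this would be a Hono middleware that calls next() after stamping the context; the error path and fan-out handling are elided here.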
4.4 SDK Configuration
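A hedged sketch of what the SDK surface could look like; the option names and shapes are assumptions, not a published SDK. The key property is FR-10: a single-region org omits region entirely, and the global endpoint plus session default do the work:

```typescript
// Hypothetical SDK options; names are illustrative.
interface ClientOptions {
  apiKey: string;
  region?: string;  // pin a region explicitly (multi-region orgs)
  baseUrl?: string; // full override, wins over region
}

function resolveBaseUrl(opts: ClientOptions): string {
  if (opts.baseUrl) return opts.baseUrl;
  if (opts.region) return `https://${opts.region}.api.example.com`;
  return "https://api.example.com"; // gateway resolves region via session/DB
}
```

resolveBaseUrl({ apiKey: "...", region: "sfo1" }) returns "https://sfo1.api.example.com"; omitting region returns the global endpoint.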
5. Request ID Generation
Request IDs are generated at the gateway for every API request. They provide distributed tracing and enable grep-based debugging across services.

5.1 Generation Function
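A minimal sketch of the generator; the function name and signature are illustrative, since the doc only fixes the output format (FR-09):

```typescript
import { randomBytes } from "node:crypto";

// FR-09 format: req_{region}-{timestamp}-{entropy}
function generateRequestId(region: string): string {
  const ts = Date.now();                          // ms epoch: sortable, debuggable
  const entropy = randomBytes(6).toString("hex"); // 6 bytes -> 12 hex chars
  return `req_${region}-${ts}-${entropy}`;
}
```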
5.2 Region Extraction from Request IDs
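Because the format is fixed, the region comes back out with a single regex. A sketch (the function name is illustrative):

```typescript
// Parses req_{region}-{timestamp}-{entropy}; returns null when the
// string is not a well-formed request ID.
function regionFromRequestId(requestId: string): string | null {
  const m = /^req_([a-z0-9]+)-(\d+)-([0-9a-f]{12})$/.exec(requestId);
  return m ? m[1] : null;
}
```

Using the sample ID from 2.1: regionFromRequestId("req_sfo1-1770564159296-7d4b9e1f3a5b") returns "sfo1".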
6. Functional Requirements
Any alternative proposal must satisfy these. If it can’t, the burden is on the proposer to explain why the requirement is wrong.

Critical Requirements
| ID | Requirement | Rationale |
|---|---|---|
| FR-01 | Resolve target region for every request before it reaches application code. | Routing is infrastructure, not application logic. |
| FR-02 | Follow deterministic priority: Subdomain → Header → Query → Body → Session → DB Lookup → Fan-out/Error. | Developers must predict where requests go. |
| FR-06 | Region never appears as a path segment. | Decouples API contract from topology changes. |
| FR-08 | Resource IDs are globally unique and opaque (base62, 26 chars). Region stored as metadata. | Enables resource migration between regions without ID changes. |
| FR-09 | Request IDs embed region, timestamp, and entropy. Format: req_{region}-{ts}-{entropy}. | Enables grep-based debugging and distributed tracing. |
| FR-10 | Single-region org users never need to specify a region. | The DX bar. If they think about regions, we failed. |
| FR-11 | Arc requests only return resources for the authenticated org, regardless of gateway. | Tenant isolation is non-negotiable. |
| FR-14 | Infrastructure mutations go through the workflow orchestrator to primary DB. | Eventual consistency on “does this cluster exist” is not acceptable. |
7. Non-Functional Requirements
MVP and production targets. The gap between them defines when to scale.

Critical Targets
| ID | Metric | MVP | Production |
|---|---|---|---|
| NFR-02 | Same-region read (Arc list/get) | < 100ms p95 | < 50ms p95 |
| NFR-03 | Cross-region write (settings, notifications) | < 150ms p95 | < 100ms p95 |
| NFR-07 | Per-region read availability | 99.5% monthly | 99.9% monthly |
| NFR-10 | Mothership availability | 99.5% monthly | 99.9% monthly |
| NFR-13 | Primary DB down | All writes fail. Reads continue. 503 + Retry-After. | Managed failover within 60-120s. RTO < 5 min. |
| NFR-19 | Every request emits OTel span: region, region_source, request_id, org_id, latency_ms, status_code | Required | Required |
8. Scaling Roadmap
Each item is a deliberate MVP limitation. The table below summarizes what to monitor, what triggers action, and what to build. Details follow.

| # | Monitor | Trigger | Build | Effort |
|---|---|---|---|---|
| 8.1 | Cross-region write p95 | > 200ms | Regional Redis for user state | 2-3 sprints |
| 8.2 | Write availability, failover RTO | RTO > 5 min or SLA > 99.5% | Managed PG failover → standby → multi-primary | 1 → 2 → 6+ sprints |
| 8.3 | Customer demand | HA across failure domains requested | Federation layer over single-region clusters | 4-6 sprints |
| 8.4 | Gateway latency from distance | p95 > 100ms or 3+ regions active | GeoDNS → edge gateways → edge compute | 1 → 2-3 → 4+ sprints |
| 8.5 | Session validation latency | > 50ms per request or > 500 RPS | Regional Redis session cache | 1-2 sprints |
8.1 Single-Primary Write Latency
What you’ll see: NFR-03 cross-region write latency for user settings and notification reads climbing toward 200ms p95. UI feels sluggish on interactions that trigger writes. Customers in distant regions experience noticeably worse responsiveness than those near the mothership.
Why it happens: All writes go to the single PG primary in the mothership region. A user in ams1 marking a notification as read incurs a transatlantic round-trip. This is acceptable at 20-80ms but degrades as regions get farther from the primary or write volume increases.
What to build: Regional Redis (or Turso/SQLite) for user-scoped state — notification read status, user preferences, UI state. These writes hit the local store immediately and async-sync to central PG. Infrastructure state (clusters, servers, org config) stays in central PG where consistency matters.
Effort: 2-3 sprints. Requires defining a consistency model per data type, deploying regional Redis, and implementing sync workers.
Dependencies: None. Can be deployed independently per region.
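The write path can be sketched with in-memory stand-ins (a Map for the regional store, an array for the sync queue; all names are illustrative). The point is the ordering: acknowledge after the local write, replay to central PG asynchronously:

```typescript
// Stand-ins for the regional Redis and the sync worker's queue.
const localStore = new Map<string, boolean>();
const syncQueue: Array<{ key: string; value: boolean; ts: number }> = [];

// User-scoped write (e.g. "mark notification read"): lands in the local
// store first, then is enqueued for async replay to the central primary.
function markNotificationRead(notificationId: string): void {
  localStore.set(notificationId, true);
  syncQueue.push({ key: notificationId, value: true, ts: Date.now() });
}
```

Infrastructure mutations do not take this path; they still go through the workflow orchestrator to the primary (FR-14).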
8.2 Single-Primary Availability
What you’ll see: NFR-13 — the mothership primary goes down. All writes across all regions fail immediately. Reads continue from replicas. Duration depends on recovery — manual intervention could take 10-30 minutes. This is the architecture’s single biggest risk.
Why it happens: One PG primary handles all writes. No standby, no automatic failover. The managed database service provides backups but not instant promotion.
What to build (staged):
Stage 1 — Managed failover. Enable Multi-AZ on managed PostgreSQL. Standby in a second availability zone within the mothership region. Automatic failover in 60-120s. No application code changes. Effort: 1 sprint (infrastructure configuration).
Stage 2 — Cross-region standby. Promote a replica in a second region to synchronous standby. If the mothership fails, a DNS update points writes to the standby. Requires connection drain and cache invalidation. Effort: 2 sprints.
Stage 3 — Multi-primary. CockroachDB, Spanner, or PostgreSQL BDR. Local writes in every region. Only justified by contractual SLA requirements from very large customers. Effort: 6+ sprints with dedicated database engineering.
Dependencies: Stage 1 is independent. Stage 2 requires monitoring from NFR-19. Stage 3 requires significant application changes.

8.3 Single-Region Clusters
What you’ll see: Customers ask for high-availability workloads that survive a full region failure. Competitors offer multi-region Kubernetes. Sales loses deals where cross-region resilience is a hard requirement.
Why it happens: A cluster lives in exactly one region (its region is fixed at creation and stored as metadata per FR-08). All nodes are co-located. If sfo1 goes down, clusters in sfo1 are down.
What to build: A federation layer. A “multi-region deployment” is a logical resource that owns multiple single-region clusters (e.g., one in sfo1, one in lax1). The routing architecture doesn’t change — each underlying cluster still has a single region. The federation layer handles cross-cluster orchestration, health monitoring, and failover. Effort: 4-6 sprints.
Dependencies: Requires §8.4 GeoDNS for meaningful cross-region traffic steering. Requires fat-pipe interconnects between DCs for viable cross-region networking.
8.4 Manual Region Selection
What you’ll see: api.example.com resolves to a single gateway (or round-robin). Users in Amsterdam hit a US gateway before being routed to ams1. NFR-02 read latency is inflated by an unnecessary cross-region hop.
Why it happens: No geographic DNS routing. api.example.com points to one place. Developers pick their region via subdomain, header, or SDK config — there’s no automatic “nearest region” behavior.
What to build (staged):
Stage 1 — GeoDNS. Route53 latency-based routing or Cloudflare load balancing. api.example.com resolves to the nearest healthy regional gateway. DNS configuration only, no code changes. Effort: 1 sprint.
Stage 2 — Edge gateways. Lightweight gateways at CDN edge (Cloudflare Workers, CloudFront Functions). TLS termination + region resolution at edge. Reduces first-byte latency. Effort: 2-3 sprints.
Stage 3 — Edge compute. Move read-heavy operations (list caches, notification counts) to edge. Requires cache invalidation design. Effort: 4+ sprints.
Dependencies: Stage 1 requires 3+ active regions to be meaningful. Stage 2 requires NFR-19 observability to measure improvement. Stage 3 pairs with §8.1 regional state.
8.5 Centralized Session Auth
What you’ll see: Session validation adds measurable latency to every request because the auth DB is in the mothership region. At scale, the auth service becomes a throughput bottleneck.
Why it happens: Stateful sessions live in mothership PG. Every request that needs session validation from a non-mothership region pays a cross-region round-trip.
What to build: Regional Redis session cache, populated on login, validated locally. Session revocation propagated via pub/sub with < 5s delay. This uses the same regional Redis infrastructure as §8.1 — same deployment, same operational cost. Effort: 1-2 sprints if regional Redis already exists from §8.1.
Dependencies: Pairs naturally with §8.1. Deploy together for shared infrastructure cost.

8.6 Phase Summary
9. Decision Summary
| Decision | Choice | Reference |
|---|---|---|
| Region code format | IATA + number → sfo1, lax2 | FR-17 |
| Physical topology (zone, rack, cage) | Metadata on server records, not routing | FR-19 |
| Subdomain format | sfo1.api.example.com | §1.4 |
| API prefix | /v1/{service} → /v1/region/global, /v1/region/{region} | §2.2 |
| Region in API path | Never | FR-06 |
| Resource ID format | {prefix}_{base62} (26 chars, opaque) | §2.2, FR-08 |
| Request ID format | req_{region}-{timestamp}-{entropy} | §2.1, FR-09 |
| Database topology (MVP) | Single primary + regional read replicas | §3, NFR-03 |
| Resolution priority | Subdomain → Header → Query → Body → Session → DB → Fan-out/Error | FR-02 |
| Cross-region clusters | Not for MVP | §8.3 |
| Write availability risk | Accepted for MVP, managed failover first | NFR-13, §8.2 |
| Monitoring baseline | OTel on every request from day one | NFR-19 |
Appendix A: All Functional Requirements
Region Resolution
| ID | Requirement | Rationale |
|---|---|---|
| FR-01 | Resolve target region for every request before it reaches application code. | Routing is infrastructure, not application logic. |
| FR-02 | Follow deterministic priority: Subdomain → Header → Query → Body → Session → DB Lookup → Fan-out/Error. | Developers must predict where requests go. |
| FR-03 | Mutations without a resolved region return 400, never silently default. | Wrong-region creates are infrastructure incidents. |
| FR-04 | List operations without a region fan out to all org-accessible regions. | GET /clusters returns all clusters, not a random subset. |
| FR-05 | Resolution adds < 5ms overhead. | It’s a parse, not a network call. |
API Contract
| ID | Requirement | Rationale |
|---|---|---|
| FR-06 | Region never appears as a path segment. | Decouples API contract from topology changes. |
| FR-07 | OpenAPI spec is identical across all regional gateways. | One SDK, one set of docs. |
| FR-08 | Resource IDs are globally unique and opaque ({prefix}_{base62}, 26 chars). Region stored as metadata. | Enables resource migration between regions without ID changes. |
| FR-09 | Request IDs embed region, timestamp, and entropy (req_{region}-{ts}-{entropy}). | Enables grep-based debugging and distributed tracing. |
| FR-10-ORG | Organization resolved from session, never from URL. | Prevents enumeration, simplifies API surface. |
| FR-10 | Single-region org users never need to specify a region. | The DX bar. If they think about regions, we failed. |
Multi-Tenancy
| ID | Requirement | Rationale |
|---|---|---|
| FR-11 | Arc requests only return resources for the authenticated org, regardless of gateway. | Tenant isolation is non-negotiable. |
| FR-12 | Resource creation rejected if target region not in org’s allowed set. | API enforces region access, not just UI. |
| FR-13 | Atlas requests require platform-level auth, enforced at gateway before routing. | Atlas exposes cross-org data. |
Data Consistency
| ID | Requirement | Rationale |
|---|---|---|
| FR-14 | Infrastructure mutations go through the workflow orchestrator to primary DB. | Eventual consistency on “does this cluster exist” is not acceptable. |
| FR-15 | Read replicas serve Arc list/get operations. Reads don’t hit primary unless read-after-write is needed. | Local read latency is the point of replicas. |
| FR-16 | After creation, resource visible in same-session lists within 5 seconds. | Read-after-write for the creating user. |
Naming
| ID | Requirement | Rationale |
|---|---|---|
| FR-17 | Region codes use IATA + number format (e.g., sfo1, lax2). | Industry standard for physical infrastructure, instantly recognizable. |
| FR-18 | Same region string used in subdomains, IDs, DB columns, K8s labels, and logs. | One code everywhere, no translation tables. |
| FR-19 | Physical topology below region (zone, rack, cage) is metadata on resources, not part of the routing hierarchy or API contract. | Provider topology changes don’t break API contracts. |
Appendix B: All Non-Functional Requirements
Latency
| ID | Metric | MVP | Production | Measured At |
|---|---|---|---|---|
| NFR-01 | Gateway resolution overhead | < 5ms p99 | < 2ms p99 | Request arrival → region resolved |
| NFR-02 | Same-region read (Arc list/get) | < 100ms p95 | < 50ms p95 | Gateway → response, local replica |
| NFR-03 | Cross-region write (settings, notifications) | < 150ms p95 | < 100ms p95 | Round-trip including primary write |
| NFR-04 | Infrastructure mutation acceptance | < 500ms p95 | < 300ms p95 | Time to 202 Accepted + enqueue |
| NFR-05 | Fan-out list (all org regions) | < 500ms p95 | < 250ms p95 | Parallel query + merge |
| NFR-06 | Replication lag | < 5s p99 | < 1s p99 | pg_stat_replication |
Availability
| ID | Metric | MVP | Production |
|---|---|---|---|
| NFR-07 | Per-region reads | 99.5% monthly | 99.9% monthly |
| NFR-08 | Per-region writes | 99.0% monthly | 99.5% monthly |
| NFR-09 | Global endpoint (api.example.com) | 99.5% monthly | 99.9% monthly |
| NFR-10 | Mothership availability | 99.5% monthly | 99.9% monthly |
Failover
| ID | Scenario | MVP Behavior | Production Behavior |
|---|---|---|---|
| NFR-11 | Regional gateway down | {region}.api.example.com returns 503. Global endpoint routes to healthy regions. | Automatic DNS failover within 60s. |
| NFR-12 | Regional replica down | Reads fall back to primary (higher latency). | Automatic replica promotion within 30s. |
| NFR-13 | Primary DB down | All writes fail. Reads continue from replicas. API returns 503 with Retry-After. | Managed failover restores writes within 60-120s. RTO < 5 min. |
| NFR-14 | Mesh partition | Local reads continue. Writes fail with clear error. No silent data loss. | Regional write buffer retries for 5 min, then fails with audit trail. |
| NFR-15 | Degraded mode | Responses include X-Region, X-Request-Id. Degraded responses add X-Degraded: true + reason. | Same, plus GET /health/region with replica lag, primary connectivity, uptime. |
Throughput
| ID | Metric | MVP | Production |
|---|---|---|---|
| NFR-16 | RPS per regional gateway | 100 sustained | 1,000 sustained |
| NFR-17 | Concurrent fan-out ops | 10 | 100 |
| NFR-18 | Region registry lookup | In-memory, zero network calls | Same |
Observability
| ID | Requirement | MVP | Production |
|---|---|---|---|
| NFR-19 | Every request emits OTel span: region, region_source, request_id, org_id, latency_ms, status_code | Required | Required |
| NFR-20 | Cross-region write latency tracked separately from same-region reads, per-region p50/p95/p99 | Required | Required |
| NFR-21 | Replication lag monitored per replica, alert at 5s (MVP) / 1s (prod) | Required | Required |
| NFR-22 | Fan-out tracks per-region sub-request latency | Nice-to-have | Required |
| NFR-23 | Gateway exposes Prometheus /metrics: resolution latency, routing source distribution, error rates | Required | Required |