The global online gambling and sports betting market is projected to surpass $126 billion by 2027, and the platforms competing for that revenue routinely support more than 50,000 concurrent players during peak events. That scale creates a narrow engineering problem: an iGaming cloud infrastructure has to deliver sub-30-millisecond response times, survive a 40x traffic spike during a World Cup final, and simultaneously prove to four or five regulators that player data never left an approved border. Most general-purpose cloud architectures are built for one or two of those constraints. iGaming platforms need all three at once.
This guide walks through how modern iGaming operators actually build for that combination — compute and network topology, stateful WebSocket scaling, database concurrency control, and the jurisdictional rules that shape where every byte of player data can physically sit. It draws on current AWS, OVHcloud, Continent 8, and Google Cloud reference architectures, alongside the statutory frameworks operators must satisfy in Malta, Germany, Brazil, and New Jersey.
TL;DR
- Real-time betting and live-dealer games require sub-30ms round-trip latency, which pushes core transactional logic onto single-tenant bare-metal servers rather than shared public cloud instances.
- WebSocket-based session persistence needs sticky routing, consistent hashing, and a pub/sub layer (Redis or Kafka) to synchronize state across edge nodes.
- Database layers combine ACID-compliant engines (PostgreSQL, MySQL InnoDB) for ledgers with MVCC for read-heavy audit paths and in-memory stores for session state.
- Malta, Germany, Brazil, and New Jersey each impose different physical server localization and data residency rules — there is no single compliant architecture that works everywhere.
- Hybrid edge appliances (AWS Outposts, Google Distributed Cloud) let operators keep regulated workloads on sovereign hardware while running CDN and analytics in the public cloud.
Why standard cloud architectures fall short for iGaming
In live sports betting, a 500-millisecond delay is a financial vulnerability, not just a UX inconvenience. It opens an arbitrage window for “court-siding” — placing bets on events that have already concluded by exploiting broadcast delay. That single constraint reshapes the entire infrastructure decision tree.
Traditional multi-tenant public cloud environments introduce the “noisy neighbor” effect: virtualized workloads sharing physical hardware cause unpredictable jitter and round-trip time spikes. For a game engine calculating live odds or generating random numbers under regulatory audit, that unpredictability is unacceptable. This is why iGaming operators consistently isolate core transactional and game-logic engines on single-tenant bare-metal servers or private clouds, frequently orchestrated through OpenStack-based hypervisors and custom management APIs that avoid the resource contention inherent to shared infrastructure.
The hardware baseline
The table below summarizes the technical profile operators typically specify for compute nodes running RNG, betting logic, and ledger transactions.
| Component | Typical configuration | Why it matters |
|---|---|---|
| Compute | Dual Intel Xeon Scalable Silver/Gold, high base clock | Minimizes RNG and betting-logic execution time; avoids CPU scheduler queue delays |
| Memory | 96–256 GB DDR4/DDR5 ECC RAM | Real-time correction of single-bit memory errors; prevents crashes during peak load |
| Storage | Dual NVMe SSDs in RAID | Sustains high concurrent write IOPS without thread stalls |
| Network | Dual-bonded 100–200 Gbps NICs, Tier-1 peering | Maximizes burst capacity, minimizes network hops |
| Colocation | Tier III+/IV facilities, N+1 or 2N power redundancy | Commonly cited availability of about 99.982% (Tier III) to 99.995% (Tier IV); isolates localized power failures |
The cost trade-off matters as much as the technical spec. Public cloud egress charges can reach $0.09 per gigabyte on platforms like AWS, and that adds up fast when a live match generates continuous odds-update traffic to tens of thousands of sockets. Dedicated server pricing is predictable month over month — which is exactly the property operators need when a single high-volume event can otherwise erode margin through unplanned cloud consumption.
Stateful sessions: scaling WebSockets without losing game state
Live odds, live-dealer video, and real-time game state depend on persistent, bidirectional connections. Standard HTTP request-response cycles carry too much overhead — repeated TCP handshakes and verbose headers — for that job, so platforms upgrade to WebSocket: a single full-duplex TCP socket established once and held open for the duration of play.
Every frame sent from client to server is masked with a 32-bit XOR key, a measure RFC 6455 introduced to stop crafted payloads from poisoning caches on intermediary proxies. Platforms typically negotiate permessage-deflate compression to shrink repetitive JSON payloads; the extension applies only to data messages, so control frames (ping, pong, close) always travel uncompressed.
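The masking step is small enough to show in full. The following is a minimal sketch of the XOR transform described in RFC 6455, section 5.3; the function name `mask_payload` and the sample JSON payload are illustrative assumptions, and a real client library performs this step internally.

```python
import os


def mask_payload(payload: bytes, key: bytes) -> bytes:
    """XOR every payload byte with key[i % 4] (RFC 6455, section 5.3).

    The transform is its own inverse, so the server unmasks a frame
    by applying the same function with the same key.
    """
    if len(key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))


# Hypothetical odds update; the client picks a fresh random key per frame.
key = os.urandom(4)
frame_body = b'{"market":"1x2","odds":2.35}'
masked = mask_payload(frame_body, key)

# Applying the mask a second time restores the original bytes.
restored = mask_payload(masked, key)
```

Because the key travels in the frame header, masking is not encryption; confidentiality still comes from TLS.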
The harder problem is scaling this statefully. Because each socket holds an active player session in memory, you cannot route a reconnecting player to an arbitrary backend node the way you would with stateless HTTP. Operators typically combine four techniques:
- Sticky sessions with IP hashing — Nginx’s ip_hash (or equivalent) maps a client to the same backend node on reconnect, avoiding expensive cross-server state sync.
- Consistent hashing — minimizes the percentage of session keys that must be remapped when application shards scale up or down, preventing hotspots.
- Distributed pub/sub — a message broker (Redis or Apache Kafka) propagates state changes instantly across edge nodes, since players in the same game room are often connected to different physical servers.
- Connection pooling and keep-alives — heartbeat frames at fixed intervals prevent silent termination by intermediary infrastructure (ALBs, Cloudflare edges, ISP firewalls), with exponential-backoff reconnection logic on the client side to preserve gameplay state after a drop.
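The consistent-hashing item in the list above can be sketched in a few lines. This is an illustrative ring with virtual nodes, not any particular load balancer's implementation; the class name `HashRing`, the node names, and the choice of 100 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (hash_value, node_name)
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        # First 8 bytes of SHA-256 give a stable 64-bit position on the ring.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add(self, node: str) -> None:
        # Each server owns many small arcs, which evens out the load.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def lookup(self, session_id: str) -> str:
        # Walk clockwise to the first ring point at or after the key's hash.
        idx = bisect.bisect_left(self._ring, (self._hash(session_id), ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]


ring = HashRing(["ws-node-1", "ws-node-2", "ws-node-3"])
sessions = [f"session-{i}" for i in range(10_000)]
before = {s: ring.lookup(s) for s in sessions}

ring.add("ws-node-4")  # scale out by one shard
moved = [s for s in sessions if ring.lookup(s) != before[s]]
print(f"{len(moved) / len(sessions):.1%} of sessions remapped")
```

Going from three shards to four should remap roughly a quarter of the sessions in expectation, and every remapped session lands on the new node. A plain `hash(key) % N` scheme would instead reshuffle most keys on the same change.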
5G rollout has pushed the achievable latency floor down to under 10 milliseconds, versus roughly 200ms on legacy 4G — which is why real-time multiplayer engines and fast-action sports betting now target sub-30ms round-trip time, well below the 100ms threshold that’s acceptable for standard casino systems.
Database architecture: concurrency control for the ledger
The database layer is the system of record for account balances, open wagers, and transaction history, and it’s where most iGaming platforms actually fail under load — not at the network edge. Maintaining integrity at scale requires a hybrid design: ACID-compliant relational databases (PostgreSQL, MySQL InnoDB) for ledger entries and balances, in-memory key-value stores (Redis) for session state, and document databases (MongoDB) for telemetry.
Without strict concurrency control, simultaneous writes to the same row — a player spinning a slot while a deposit posts — produce dirty reads, lost updates, and phantom reads. There are three concurrency paradigms in active use, each with a distinct trade-off profile:
| Paradigm | Mechanism | Best for | Trade-off |
|---|---|---|---|
| Pessimistic (PCC) | Exclusive/shared locks (e.g., SELECT FOR UPDATE) acquired upfront | Ledger balances, jackpot pools, payment processing | Guarantees integrity under contention; higher latency from lock queuing and deadlock risk |
| Optimistic (OCC) | Transactions proceed lock-free on private copies, version-checked at commit | Low-contention profile updates, config lookups | Minimal latency at low contention; “rollback storms” under high write contention |
| MVCC | Writes create new timestamped row versions; readers never block writers | High-frequency reads, ledger audits, session lookups | Eliminates read-write contention; increases storage overhead and vacuuming load |
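The optimistic row of the table is the easiest to demonstrate end to end. The sketch below uses Python's built-in `sqlite3` purely so it runs anywhere; the `profile` table, its `version` column, and the `update_nickname` helper are hypothetical. The same compare-and-bump pattern applies on PostgreSQL or InnoDB, where the pessimistic alternative would be a `SELECT ... FOR UPDATE` inside the transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profile ("
    "player_id INTEGER PRIMARY KEY, nickname TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO profile VALUES (1, 'ace', 1)")
conn.commit()


def update_nickname(conn, player_id, new_name, expected_version):
    """Write only if the row still carries the version we read earlier."""
    cur = conn.execute(
        "UPDATE profile SET nickname = ?, version = version + 1 "
        "WHERE player_id = ? AND version = ?",
        (new_name, player_id, expected_version),
    )
    conn.commit()
    # rowcount == 0 means another writer got there first: caller must retry.
    return cur.rowcount == 1


# Two writers both read version 1, then race to commit.
first = update_nickname(conn, 1, "ace_of_spades", expected_version=1)
second = update_nickname(conn, 1, "ace_high", expected_version=1)
print(first, second)  # True False

final_name, final_version = conn.execute(
    "SELECT nickname, version FROM profile WHERE player_id = 1"
).fetchone()
```

The losing writer pays for a full retry, which is cheap when collisions are rare and expensive when many writers target the same row.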
Under extreme write contention — a jackpot pool adjustment or a burst of bets on one live match — the probability of a rollback under optimistic concurrency control can be modeled statistically as a function of the transaction arrival rate (λ), the hold time of the transaction (t), and the degree of resource overlap (d):
P(rollback) = 1 − e^(−λtd)
As any of those three variables rises, rollback probability climbs toward 1 — which is exactly why pure OCC is a poor fit for jackpot pools but works fine for low-contention profile edits.
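Plugging numbers into the formula makes the contrast concrete. The arrival rates, hold times, and overlap values below are made-up inputs chosen only to illustrate a low-contention and a high-contention case; they are not measurements.

```python
import math


def rollback_probability(arrival_rate: float, hold_time: float, overlap: float) -> float:
    """P(rollback) = 1 - exp(-lambda * t * d), as given in the text."""
    return 1.0 - math.exp(-arrival_rate * hold_time * overlap)


# Hypothetical profile edits: 5 tx/s, 20 ms hold time, 1% resource overlap.
profile = rollback_probability(arrival_rate=5, hold_time=0.02, overlap=0.01)

# Hypothetical jackpot pool: 2,000 tx/s, 20 ms hold time, every tx hits one row.
jackpot = rollback_probability(arrival_rate=2000, hold_time=0.02, overlap=1.0)

print(f"profile edits: {profile:.2%}")  # about 0.10%
print(f"jackpot pool:  {jackpot:.2%}")  # effectively 100%
```

Under these assumed inputs the exponent moves from 0.001 to 40, which is the difference between an occasional retry and a transaction that almost never commits on the first attempt.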
To avoid write-locking bottlenecks entirely during peak load, more advanced platforms implement lock-free reservation systems: instead of locking a balance row for the duration of a transaction, the application registers an atomic “intent to change” (a reservation), and defers the actual row update and lock acquisition to commit time. This keeps transaction ingestion flowing without exhausting the thread pool.
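The data flow of a reservation scheme can be sketched as follows. This is a single-threaded, in-memory illustration only: the class `ReservationLedger` and its method names are hypothetical, and it does not itself provide the atomicity a production system would need from its underlying store.

```python
import itertools


class ReservationLedger:
    """Tracks a settled balance plus pending 'intent to change' entries."""

    def __init__(self, settled_cents: int):
        self.settled_cents = settled_cents
        self._pending = {}  # reservation id -> amount held
        self._ids = itertools.count(1)

    @property
    def available_cents(self) -> int:
        # Funds already promised to open reservations are not spendable.
        return self.settled_cents - sum(self._pending.values())

    def reserve(self, amount_cents: int):
        """Register an intent; return a reservation id, or None if refused."""
        if amount_cents <= 0 or amount_cents > self.available_cents:
            return None
        rid = next(self._ids)
        self._pending[rid] = amount_cents
        return rid

    def commit(self, rid: int) -> None:
        """Apply the deferred balance update for one reservation."""
        self.settled_cents -= self._pending.pop(rid)

    def release(self, rid: int) -> None:
        """Drop a reservation without touching the settled balance."""
        self._pending.pop(rid, None)


ledger = ReservationLedger(settled_cents=10_000)
bet_a = ledger.reserve(6_000)  # accepted, 4,000 remains available
bet_b = ledger.reserve(6_000)  # refused: would overdraw the balance
ledger.commit(bet_a)           # the settled balance changes only here
```

The point of the pattern is that the balance row is written once, at commit, while admission decisions are made against the cheaper reservation record.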
High-availability targets of 99.99% uptime also require real-time replication, and operators choose between two models depending on their risk tolerance:
- Synchronous replication writes to primary and replica simultaneously — zero data loss (RPO of zero), but every write pays a network round-trip.
- Asynchronous replication commits locally first and propagates after — lower write latency, but a small replication lag window where a primary failure could lose data.
Managed engines like Amazon Aurora or Amazon RDS combine both properties reasonably well: Aurora natively replicates across Availability Zones, so a secondary can be promoted to primary within seconds after a localized failure, preserving transaction state without manual intervention.
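Two small calculations help size these trade-offs: how much downtime a 99.99% target actually permits, and what a synchronous replica adds to each write. The function names and the 1 ms and 2 ms figures are assumptions for illustration, and the latency model is deliberately simplified to local commit time plus one replica round-trip.

```python
def downtime_budget_minutes(availability_pct: float, days: int = 365) -> float:
    """Minutes of permitted downtime per period for a given availability target."""
    return (1 - availability_pct / 100) * days * 24 * 60


def write_latency_ms(local_commit_ms: float, replica_rtt_ms: float, synchronous: bool) -> float:
    """Simplified model: a synchronous write waits one replica round-trip."""
    return local_commit_ms + (replica_rtt_ms if synchronous else 0.0)


budget = downtime_budget_minutes(99.99)
print(f"{budget:.2f} minutes per year")  # about 52.56

# Hypothetical figures: 1 ms local commit, 2 ms round-trip to the replica.
sync_ms = write_latency_ms(1.0, 2.0, synchronous=True)    # 3.0
async_ms = write_latency_ms(1.0, 2.0, synchronous=False)  # 1.0
```

With under an hour of downtime allowed per year, a failover that needs manual intervention can consume most of the budget in a single incident.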
Multi-jurisdictional compliance: the real constraint on architecture
This is where iGaming infrastructure diverges most sharply from standard SaaS cloud design. Licensing regulators don’t just require “a cloud provider” — they mandate the physical location of servers, how data replicates, and what security certifications the environment carries. Four representative jurisdictions illustrate how differently these rules are written.
| Jurisdiction | Server localization | Data residency & replication | Compliance controls |
|---|---|---|---|
| Malta (MGA) | EU/EEA or approved third country. | Requires system traceability and audit trails accessible to MGA on demand. | ISO 27001 focus; risk-based supervision calibrated to licensee profile. |
| Germany (GGL) | Databases/servers must be physically located within the EU/EEA. | Mandatory integration with LUGAS (deposit tracking) and OASIS (self-exclusion). | Host-provider enforcement prioritized; IP-blocking of access providers ruled unlawful (BVerwG 8 C 3.24). |
| Brazil (SPA) | ISO 27001-certified data centers within Brazil required. | Real-time reporting to SIGAP; daily logs of identity, financial flows, and betting history. | Mandatory “Face Match” biometrics; BRL 5M financial reserve; 12% GGR tax. |
| New Jersey (DGE) | Primary gaming/RNG equipment must reside in Atlantic City or DGE-approved facilities. | Geofencing mandatory; wide-area systems must run from central databases in-state. | Strict in-state backup requirements; DGE-approved internal control submissions. |
Malta’s flexibility is deceptive — operators can position application engines almost anywhere in the EU/EEA, but must document the exact physical datacenter location, rack ID, internal and external IP addresses, and encryption protocol for replication traffic, and provide the MGA unhindered electronic access for ad-hoc virtual audits. Germany’s regime layers strict marketing controls on top of residency rules: promotional advertising for online poker or slots is banned between 6 AM and 9 PM across TV, radio, and internet. Operators satisfy age-verification requirements without violating GDPR by running facial age-estimation entirely on-device — the biometric payload never reaches a central server, which preserves compliance while still producing an auditable verification record.
Brazil’s framework, built under Law 14,790/2023, is arguably the most demanding technically: beyond the concession fee, operators need a local legal subsidiary with at least 20% local shareholder capital, and the entire platform must pass technical audits from accredited labs such as GLI, BMM North America, or eCOGRA within the first year of operation.
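One practical way to keep these rules from being forgotten is to encode the server-localization column of the table as a deploy-time check. The sketch below is a simplified assumption, not a compliance tool: the `Site` fields, the `approved_third_country` flag, the jurisdiction keys, and the site names are all hypothetical, and real licensing adds conditions (facility approval, certifications, reporting integrations) that a boolean check cannot capture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Site:
    name: str
    country: str                          # ISO 3166-1 alpha-2 code
    us_state: Optional[str] = None
    in_eu_eea: bool = False
    approved_third_country: bool = False  # hypothetical MGA-approval flag


# One predicate per jurisdiction, mirroring the localization column above.
RULES = {
    "MT": lambda s: s.in_eu_eea or s.approved_third_country,
    "DE": lambda s: s.in_eu_eea,
    "BR": lambda s: s.country == "BR",
    "US-NJ": lambda s: s.country == "US" and s.us_state == "NJ",
}


def placement_allowed(jurisdiction: str, site: Site) -> bool:
    """Return True if regulated workloads for the jurisdiction may run at site."""
    if jurisdiction not in RULES:
        raise KeyError(f"no localization rule defined for {jurisdiction!r}")
    return RULES[jurisdiction](site)


frankfurt = Site("fra-colo", "DE", in_eu_eea=True)
sao_paulo = Site("gru-colo", "BR")
atlantic_city = Site("acy-colo", "US", us_state="NJ")

for code in RULES:
    allowed = [s.name for s in (frankfurt, sao_paulo, atlantic_city)
               if placement_allowed(code, s)]
    print(code, allowed)
```

Running a check like this in the deployment pipeline turns a residency violation into a failed build rather than an audit finding.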
Hybrid edge topologies: reconciling residency with cloud scale
Strict localization rules in jurisdictions like New Jersey and Brazil historically forced operators onto legacy on-premises datacenters. Hyperscalers have since closed that gap with physical edge appliances — AWS Outposts, Azure Stack, and Google Distributed Cloud — deployed directly inside approved colocation facilities. The pattern lets operators run regulated database engines, RNGs, and transaction ledgers on hardware physically inside the state or country border, while offloading non-regulated workloads like CDN caching or analytics to the nearest public cloud region.
Deploying this in practice means coordinating three constraints simultaneously: environmental (42U racks need verified delivery paths and ASHRAE-compliant thermal handling), power (5–15 kVA redundant feeds, typically single-phase in North American facilities), and network (Outposts connect to a customer-owned local edge, which colocation providers like Continent 8 simplify via NEaaS-delivered virtualized routing and direct peering).
AWS documents five distinct hybrid patterns operators choose between, each suited to a different compliance posture:
| Pattern | Best fit |
|---|---|
| Regional deployment | Regional boundary already aligns with local gambling approvals. |
| Local Zone & Wavelength Zone | Managed edge computing without owning physical hardware. |
| Local Zone & Outpost | Local database residency combined with public cloud edge processing. |
| Wavelength Zone & Outpost | High-frequency mobile betting apps within specific telco network zones. |
| Outposts for primary & secondary sites | Strict physical localization requiring redundant active-active deployments within the state. |
To stop regulated data from leaking into a public-region bucket by mistake, operators pair these deployments with landing zone guardrails via AWS Control Tower and Organizations — non-bypassable policies that block APIs from copying rows out of Outpost storage into public-region resources. On the Google Cloud side, Google Distributed Cloud Hosted can be configured with NVIDIA H100 GPUs to run air-gapped instances of Gemma 2 on-premises, enabling conversational search, compliance reporting, and player-behavior monitoring without any PII or transaction data leaving the sovereign facility.
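One of the simpler guardrails in this family is a service control policy (SCP) that denies requests aimed at any Region outside an approved list. The sketch below only builds the policy document; the function name, the `Sid`, and the choice of `sa-east-1` are assumptions for illustration. It does not by itself cover every copy path out of Outpost storage, and production policies usually exempt global services through a `NotAction` list, which is omitted here.

```python
import json


def region_deny_scp(allowed_regions):
    """Build an SCP document denying requests outside the approved Regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",  # hypothetical identifier
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    # aws:RequestedRegion is the Region the API call targets.
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


# Hypothetical Brazil-only organizational unit.
policy = region_deny_scp({"sa-east-1"})
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Attaching the document to an organizational unit happens through AWS Organizations and is not shown; because an SCP applies to every principal in the member accounts, an account administrator cannot override it from inside the account.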
Security architecture: DDoS, fraud detection, and encryption
iGaming platforms are a persistent target for volumetric DDoS attacks, application exploits, and bot networks — and because major matches are time-sensitive, even brief downtime translates directly into lost betting volume. The layered defense typically includes:
- Private exchange routing — networks like Continent 8’s Gaming Exchange route transaction and live-odds telemetry entirely off the public internet between operators, aggregators, and payment providers, neutralizing public-facing DDoS exposure while cutting latency.
- CDN-terminated public traffic — TCP/TLS handshakes terminate close to the player at global CDN edges, backed by high-capacity scrubbing networks.
- Game-specific edge firewalls — tools like OVHcloud’s Game DDoS protection support up to 100 custom L3/L4 rules per IP on bare-metal servers, filtering malformed traffic and UDP reflection floods before they reach application servers.
- ML-driven threat and fraud analysis — Amazon GuardDuty continuously monitors VPC flow logs and API access for credential compromise and DNS anomalies, while models on SageMaker or Vertex AI score bet timing and transaction patterns for match-fixing, bonus abuse, and bot activity.
- Self-exclusion synchronization — real-time cross-referencing against local databases like GamStop (UK) or OASIS (Germany) to restrict registered players.
On the cryptography side, TLS 1.3 secures data in transit, column-level encryption protects high-value fields like balances, and keys sit in Hardware Security Modules or cloud-native key management. A growing number of high-volume operators are also beginning to pilot post-quantum cryptographic libraries in transactional pipelines, ahead of the risk that today’s encrypted data could be decrypted retroactively once practical quantum attacks emerge.
Putting it together: the modular hybrid blueprint
No single deployment model satisfies both the performance ceiling and the residency floor an iGaming platform needs. A pure public-cloud approach breaks on data residency law; a pure on-premises approach loses the elasticity needed to absorb a World Cup traffic spike. The pattern that holds up in practice is a modular hybrid: latency-critical and strictly regulated engines — ledgers, RNGs, jackpot pools — isolated on bare-metal or localized private cloud inside the approved jurisdiction, while non-regulated services — containerized microservices, player acquisition, global CDN — run on public cloud platforms like Amazon EKS or Google Kubernetes Engine, connected back to the regulated core over private, low-jitter links.
In our work advising operators on this exact split, the recurring mistake isn’t underestimating the performance requirements — it’s underestimating how early the residency rules need to shape the architecture diagram. Retrofitting compliance onto an already-built public-cloud platform is dramatically more expensive than designing the data boundary in from day one.