iGaming cloud infrastructure: architecture, performance, and compliance guide

The global online gambling and sports betting market is projected to surpass $126 billion by 2027, and the platforms competing for that revenue routinely support more than 50,000 concurrent players during peak events. That scale creates a narrow engineering problem: an iGaming cloud infrastructure has to deliver sub-30-millisecond response times, survive a 40x traffic spike during a World Cup final, and simultaneously prove to four or five regulators that player data never left an approved border. Most general-purpose cloud architectures are built for one or two of those constraints. iGaming platforms need all three at once.

This guide walks through how modern iGaming operators actually build for that combination — compute and network topology, stateful WebSocket scaling, database concurrency control, and the jurisdictional rules that shape where every byte of player data can physically sit. It draws on current AWS, OVHcloud, Continent 8, and Google Cloud reference architectures, alongside the statutory frameworks operators must satisfy in Malta, Germany, Brazil, and New Jersey.

TL;DR

  • Real-time betting and live-dealer games require sub-30ms round-trip latency, which pushes core transactional logic onto single-tenant bare-metal servers rather than shared public cloud instances.
  • WebSocket-based session persistence needs sticky routing, consistent hashing, and a pub/sub layer (Redis or Kafka) to synchronize state across edge nodes.
  • Database layers combine ACID-compliant engines (PostgreSQL, MySQL InnoDB) for ledgers with MVCC for read-heavy audit paths and in-memory stores for session state.
  • Malta, Germany, Brazil, and New Jersey each impose different physical server localization and data residency rules — there is no single compliant architecture that works everywhere.
  • Hybrid edge appliances (AWS Outposts, Google Distributed Cloud) let operators keep regulated workloads on sovereign hardware while running CDN and analytics in the public cloud.

Why standard cloud architectures fall short for iGaming

In live sports betting, a 500-millisecond delay is a financial vulnerability, not just a UX inconvenience. It opens an arbitrage window for “court-siding” — placing bets on events that have already concluded by exploiting broadcast delay. That single constraint reshapes the entire infrastructure decision tree.

Traditional multi-tenant public cloud environments introduce the “noisy neighbor” effect: virtualized workloads sharing physical hardware cause unpredictable jitter and round-trip time spikes. For a game engine calculating live odds or generating random numbers under regulatory audit, that unpredictability is unacceptable. This is why iGaming operators consistently isolate core transactional and game-logic engines on single-tenant bare-metal servers or private clouds, frequently orchestrated through OpenStack-based hypervisors and custom management APIs that avoid the resource contention inherent to shared infrastructure.

The hardware baseline

The table below summarizes the technical profile operators typically specify for compute nodes running RNG, betting logic, and ledger transactions.

| Component | Typical configuration | Why it matters |
|---|---|---|
| Compute | Dual Intel Xeon Scalable Silver/Gold, high base clock | Minimizes RNG and betting-logic execution time; avoids CPU scheduler queue delays |
| Memory | 96–256 GB DDR4/DDR5 ECC RAM | Real-time correction of single-bit memory errors; prevents crashes during peak load |
| Storage | Dual NVMe SSDs in RAID | Sustains high concurrent write IOPS without thread stalls |
| Network | Dual-bonded 100–200 Gbps NICs, Tier-1 peering | Maximizes burst capacity, minimizes network hops |
| Colocation | Tier III+/IV facilities, N+1 or 2N power redundancy | Supports roughly 99.982% (Tier III) to 99.995% (Tier IV) availability; isolates localized power failures |

The cost trade-off matters as much as the technical spec. Public cloud egress charges can reach $0.09 per gigabyte on platforms like AWS, and that adds up fast when a live match generates continuous odds-update traffic to tens of thousands of sockets. Dedicated server pricing is predictable month over month — which is exactly the property operators need when a single high-volume event can otherwise erode margin through unplanned cloud consumption.
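To make the egress exposure concrete, here is a back-of-the-envelope calculation. Only the $0.09/GB rate comes from the figure above; the socket count, payload size, update frequency, and match length are illustrative assumptions, not measurements from any operator.

```python
# Rough egress cost for one live event. All traffic figures are assumed
# for illustration; only the per-GB rate comes from the text above.

EGRESS_USD_PER_GB = 0.09  # upper-end public cloud egress rate cited above


def egress_cost_usd(sockets: int, bytes_per_update: int,
                    updates_per_second: float, duration_seconds: int,
                    usd_per_gb: float = EGRESS_USD_PER_GB) -> float:
    """Return the egress bill for pushing odds updates to every open socket."""
    total_bytes = sockets * bytes_per_update * updates_per_second * duration_seconds
    return total_bytes / 1e9 * usd_per_gb  # decimal gigabytes


# Assumed: 50,000 sockets, 2,000-byte updates, 10 updates/s, a 2-hour match.
cost = egress_cost_usd(50_000, 2_000, 10, 2 * 60 * 60)
print(f"${cost:,.2f} for 7,200 GB of odds traffic")
```

The bill scales linearly with every factor, so doubling the update rate or the audience doubles the cost of that single event.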

Stateful sessions: scaling WebSockets without losing game state

Live odds, live-dealer video, and real-time game state depend on persistent, bidirectional connections. Standard HTTP request-response cycles carry too much overhead — repeated TCP handshakes and verbose headers — for that job, so platforms upgrade to WebSocket: a single full-duplex TCP socket established once and held open for the duration of play.

Client                                                Server
|------ HTTP GET /ws (Upgrade: websocket) ---------------->|  [Initial Handshake]
|<----- HTTP 101 Switching Protocols ----------------------|  [Protocol Upgraded]
|                                                          |
|================= ESTABLISHED TCP STREAM =================|
|<----- Binary/Text Frames (client frames XOR-masked) ---->|  [Active Game Play]
|<----- Ping / Pong Heartbeats --------------------------->|  [Idle Maintenance]
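The handshake in the diagram hinges on one small computation. As a minimal sketch, the function below shows how a server derives the Sec-WebSocket-Accept value it returns with the 101 response. The header names and the fixed GUID come from RFC 6455 rather than from this guide, and the function name is only illustrative.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept header from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key taken from the RFC 6455 handshake example.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

SHA-1 is used here only because the protocol mandates it for this echo check; it proves the server understood the upgrade request and is not a security boundary.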

Every frame sent from client to server is masked with a 32-bit XOR key, a measure RFC 6455 introduced to stop crafted payloads from poisoning the caches of intermediary proxies. Platforms typically apply permessage-deflate compression to shrink repetitive JSON payloads; control frames (ping, pong, close) are never compressed under that extension, which keeps connection-maintenance traffic cheap to process.
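The masking step itself is a plain byte-wise XOR against a 4-byte key. Here is a minimal sketch of that transform, assuming the payload and key have already been extracted from the frame; the function name and the sample JSON payload are illustrative.

```python
import os


def mask_payload(payload: bytes, masking_key: bytes) -> bytes:
    """Apply WebSocket masking: XOR each byte with key[i mod 4].

    The operation is its own inverse, so the same call unmasks.
    """
    if len(masking_key) != 4:
        raise ValueError("masking key must be exactly 4 bytes")
    return bytes(b ^ masking_key[i % 4] for i, b in enumerate(payload))


key = os.urandom(4)  # clients pick a fresh random key for every frame
masked = mask_payload(b'{"odds": 2.35}', key)
assert mask_payload(masked, key) == b'{"odds": 2.35}'
```

Because the key travels in the frame header, masking provides no confidentiality. TLS still does that job.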

The harder problem is scaling this statefully. Because each socket holds an active player session in memory, you cannot route a reconnecting player to an arbitrary backend node the way you would with stateless HTTP. Operators typically combine four techniques:

  • Sticky sessions with IP hashing — Nginx’s ip_hash (or equivalent) maps a client to the same backend node on reconnect, avoiding expensive cross-server state sync.
  • Consistent hashing — minimizes the percentage of session keys that must be remapped when application shards scale up or down, preventing hotspots.
  • Distributed pub/sub — a message broker (Redis or Apache Kafka) propagates state changes instantly across edge nodes, since players in the same game room are often connected to different physical servers.
  • Connection pooling and keep-alives — heartbeat frames at fixed intervals prevent silent termination by intermediary infrastructure (ALBs, Cloudflare edges, ISP firewalls), with exponential-backoff reconnection logic on the client side to preserve gameplay state after a drop.
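Of the four techniques above, consistent hashing is the least intuitive, so a small sketch may help. It assumes a ring with virtual nodes and SHA-256 as the hash; the class name, node names, and virtual-node count are all hypothetical choices, not taken from any particular platform.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes=(), vnodes: int = 100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) tuples
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add(self, node: str) -> None:
        # Each physical node owns many points on the ring to even out load.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, session_id: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring point clockwise from the session's hash, wrapping around.
        idx = bisect.bisect_right(self._ring, (self._hash(session_id), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["ws-1", "ws-2", "ws-3", "ws-4"])
sessions = [f"player-{n}" for n in range(10_000)]
before = {s: ring.node_for(s) for s in sessions}
ring.add("ws-5")
moved = [s for s in sessions if ring.node_for(s) != before[s]]
print(f"{len(moved) / len(sessions):.0%} of sessions remapped")
```

Going from four nodes to five should remap roughly a fifth of sessions, all of them onto the new node. A naive `hash(key) % N` scheme would remap about 80% for the same change.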

5G rollout has pushed the achievable latency floor down to under 10 milliseconds, versus roughly 200ms on legacy 4G — which is why real-time multiplayer engines and fast-action sports betting now target sub-30ms round-trip time, well below the 100ms threshold that’s acceptable for standard casino systems.

Database architecture: concurrency control for the ledger

The database layer is the system of record for account balances, open wagers, and transaction history, and under load it tends to fail before the network edge does. Maintaining integrity at scale requires a hybrid design: ACID-compliant relational databases (PostgreSQL, MySQL InnoDB) for ledger entries and balances, in-memory key-value stores (Redis) for session state, and document databases (MongoDB) for telemetry.

Without strict concurrency control, simultaneous writes to the same row — a player spinning a slot while a deposit posts — produce dirty reads, lost updates, and phantom reads. There are three concurrency paradigms in active use, each with a distinct trade-off profile:

| Paradigm | Mechanism | Best for | Trade-off |
|---|---|---|---|
| Pessimistic (PCC) | Exclusive/shared locks (e.g., SELECT FOR UPDATE) acquired upfront | Ledger balances, jackpot pools, payment processing | Guarantees integrity under contention; higher latency from lock queuing and deadlock risk |
| Optimistic (OCC) | Transactions proceed lock-free on private copies, version-checked at commit | Low-contention profile updates, config lookups | Minimal latency at low contention; “rollback storms” under high write contention |
| MVCC | Writes create new timestamped row versions; readers never block writers | High-frequency reads, ledger audits, session lookups | Eliminates read-write contention; increases storage overhead and vacuuming load |
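The optimistic row is the easiest to demonstrate in a few lines. The sketch below uses Python's built-in sqlite3 module purely so it runs anywhere; the table, column names, and helper function are hypothetical. The pessimistic row would instead rely on SELECT ... FOR UPDATE in PostgreSQL or MySQL, which SQLite does not support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profile ("
    "player_id INTEGER PRIMARY KEY, nickname TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO profile VALUES (1, 'ace', 1)")
conn.commit()


def update_nickname(conn, player_id: int, new_nickname: str,
                    expected_version: int) -> bool:
    """Optimistic update: succeeds only if nobody bumped the version since we read it."""
    cur = conn.execute(
        "UPDATE profile SET nickname = ?, version = version + 1 "
        "WHERE player_id = ? AND version = ?",
        (new_nickname, player_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows touched means a concurrent writer won


# Two writers both read version 1; only the first commit is accepted.
first = update_nickname(conn, 1, "ace_of_spades", expected_version=1)
second = update_nickname(conn, 1, "wildcard", expected_version=1)
print(first, second)  # True False
```

The losing writer has to re-read and retry. Under heavy contention those retries pile up into the rollback storms the table mentions.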

Under extreme write contention — a jackpot pool adjustment or a burst of bets on one live match — the probability of a rollback under optimistic concurrency control can be modeled statistically as a function of the transaction arrival rate (λ), the hold time of the transaction (t), and the degree of resource overlap (d):

P(rollback) = 1 − e^(−λtd)

As any of those three variables rises, rollback probability climbs toward 1 — which is exactly why pure OCC is a poor fit for jackpot pools but works fine for low-contention profile edits.
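Plugging numbers into the formula shows how sharply the two workloads diverge. The inputs below are assumptions chosen for illustration: λ in transactions per second, t in seconds, and d as a dimensionless overlap fraction between 0 and 1.

```python
import math


def rollback_probability(arrival_rate: float, hold_time: float, overlap: float) -> float:
    """P(rollback) = 1 - exp(-lambda * t * d), as given above."""
    return 1.0 - math.exp(-arrival_rate * hold_time * overlap)


# Assumed workloads: sparse profile edits vs. a burst on one jackpot row.
profile = rollback_probability(arrival_rate=50, hold_time=0.005, overlap=0.001)
jackpot = rollback_probability(arrival_rate=2000, hold_time=0.005, overlap=0.9)
print(f"profile edits: {profile:.4%}   jackpot pool: {jackpot:.4%}")
```

Under these assumptions the profile workload rolls back about 0.025% of the time, while the jackpot workload rolls back more than 99.9% of the time.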

To avoid write-locking bottlenecks entirely during peak load, more advanced platforms implement lock-free reservation systems: instead of locking a balance row for the duration of a transaction, the application registers an atomic “intent to change” (a reservation), and defers the actual row update and lock acquisition to commit time. This keeps transaction ingestion flowing without exhausting the thread pool.
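The bookkeeping behind a reservation can be sketched in a few lines. This is a single-process illustration only: the class and method names are hypothetical, amounts are assumed to be integer minor units, and a production system would need the reserve step to be an atomic operation in whatever store holds the balance, which this sketch does not demonstrate.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ReservationLedger:
    """Tracks a committed balance plus pending 'intent to change' reservations."""

    balance: int  # committed funds, in minor units (e.g. cents)
    _pending: dict = field(default_factory=dict)
    _ids: itertools.count = field(default_factory=itertools.count)

    @property
    def available(self) -> int:
        # Funds not already earmarked by an open reservation.
        return self.balance - sum(self._pending.values())

    def reserve(self, amount: int) -> int:
        """Register intent to debit; the committed balance is untouched."""
        if amount <= 0 or amount > self.available:
            raise ValueError("invalid amount or insufficient available funds")
        reservation_id = next(self._ids)
        self._pending[reservation_id] = amount
        return reservation_id

    def commit(self, reservation_id: int) -> None:
        """Apply the deferred debit to the committed balance."""
        self.balance -= self._pending.pop(reservation_id)

    def release(self, reservation_id: int) -> None:
        """Cancel the reservation, e.g. when a bet is rejected."""
        self._pending.pop(reservation_id)


wallet = ReservationLedger(balance=10_000)
bet = wallet.reserve(2_500)
print(wallet.balance, wallet.available)  # 10000 7500
wallet.commit(bet)
print(wallet.balance, wallet.available)  # 7500 7500
```

Because `available` already subtracts open reservations, a second bet cannot overspend the wallet while the first is still in flight, even though the balance itself is only written at commit.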

High-availability targets of 99.99% uptime also require real-time replication, and operators choose between two models depending on their risk tolerance:

  • Synchronous replication writes to primary and replica simultaneously — zero data loss (RPO of zero), but every write pays a network round-trip.
  • Asynchronous replication commits locally first and propagates after — lower write latency, but a small replication lag window where a primary failure could lose data.
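Two quick calculations put numbers on this trade-off: how much downtime a 99.99% target actually allows, and how many writes sit unreplicated during an asynchronous lag window. The write rate and lag below are assumed values for illustration.

```python
def downtime_budget_minutes(availability: float, days: int = 365) -> float:
    """Minutes of downtime permitted per period at a given availability."""
    return (1.0 - availability) * days * 24 * 60


def writes_at_risk(writes_per_second: float, replication_lag_seconds: float) -> float:
    """Committed writes not yet on the replica if the primary fails right now."""
    return writes_per_second * replication_lag_seconds


print(round(downtime_budget_minutes(0.9999), 2))  # 52.56 minutes per year
# Assumed: 3,000 ledger writes/s with a 250 ms asynchronous replication lag.
print(writes_at_risk(3_000, 0.25))  # 750.0
```

With synchronous replication the second figure is zero by construction, which is the RPO-of-zero property described above.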

Managed engines such as Amazon Aurora and Amazon RDS balance the two reasonably well: Aurora replicates its storage across Availability Zones, so a replica can be promoted to primary after a localized failure, typically in under a minute, preserving committed transaction state without manual intervention.

Multi-jurisdictional compliance: the real constraint on architecture

This is where iGaming infrastructure diverges most sharply from standard SaaS cloud design. Licensing regulators don’t just require “a cloud provider” — they mandate the physical location of servers, how data replicates, and what security certifications the environment carries. Four representative jurisdictions illustrate how differently these rules are written.

| Jurisdiction | Server localization | Data residency & replication | Compliance controls |
|---|---|---|---|
| Malta (MGA) | EU/EEA or approved third country. | Requires system traceability and audit trails accessible to MGA on demand. | ISO 27001 focus; risk-based supervision calibrated to licensee profile. |
| Germany (GGL) | Databases/servers must be physically located within the EU/EEA. | Mandatory integration with LUGAS (deposit tracking) and OASIS (self-exclusion). | Host-provider enforcement prioritized; IP-blocking of access providers ruled unlawful (BVerwG 8 C 3.24). |
| Brazil (SPA) | ISO 27001-certified data centers within Brazil required. | Real-time reporting to SIGAP; daily logs of identity, financial flows, and betting history. | Mandatory “Face Match” biometrics; BRL 5M financial reserve; 12% GGR tax. |
| New Jersey (DGE) | Primary gaming/RNG equipment must reside in Atlantic City or DGE-approved facilities. | Geofencing mandatory; wide-area systems must run from central databases in-state. | Strict in-state backup requirements; DGE-approved internal control submissions. |

Malta’s flexibility is deceptive. Operators can position application engines almost anywhere in the EU/EEA, but must document the exact physical datacenter location, rack ID, internal and external IP addresses, and encryption protocol for replication traffic, and provide the MGA unhindered electronic access for ad-hoc virtual audits.

Germany’s regime layers strict marketing controls on top of residency rules: promotional advertising for online poker or slots is banned between 6 AM and 9 PM across TV, radio, and internet. For age verification, one way operators stay within GDPR is to run facial age-estimation entirely on-device: the biometric payload never reaches a central server, yet the check still produces an auditable verification record.

Brazil’s framework, built under Law 14,790/2023, is arguably the most demanding technically: beyond the concession fee, operators need a local legal subsidiary with at least 20% local shareholder capital, and the entire platform must pass technical audits from accredited labs such as GLI, BMM North America, or eCOGRA within the first year of operation.
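One practical way to keep these rules from drifting out of the architecture is to encode them as data and check every deployment plan against them. The sketch below is a simplified illustration of that idea, not a statement of the law: the location codes, the partial EU/EEA set, the workload names, and the function are all hypothetical, and Malta's approved third countries are deliberately left out.

```python
from dataclasses import dataclass

# Illustrative subset only; a real rule set would list every EU/EEA member.
EU_EEA = frozenset({"MT", "DE", "IE", "NL", "SE"})


@dataclass(frozen=True)
class ResidencyRule:
    regulator: str
    allowed_locations: frozenset
    note: str


# Simplified from the jurisdiction table above.
RULES = {
    "malta": ResidencyRule("MGA", EU_EEA, "approved third countries not modeled"),
    "germany": ResidencyRule("GGL", EU_EEA, "servers and databases inside the EU/EEA"),
    "brazil": ResidencyRule("SPA", frozenset({"BR"}), "ISO 27001-certified data center"),
    "new_jersey": ResidencyRule("DGE", frozenset({"US-NJ"}),
                                "Atlantic City or DGE-approved facility"),
}


def placement_violations(jurisdiction: str, workload_locations: dict) -> list:
    """Return the regulated workloads placed outside the allowed locations."""
    rule = RULES[jurisdiction]
    return sorted(
        name for name, location in workload_locations.items()
        if location not in rule.allowed_locations
    )


plan = {"ledger": "BR", "rng": "BR", "player-db": "US-VA"}
print(placement_violations("brazil", plan))  # ['player-db']
```

Run as a pre-deployment check, a table like this flags a misplaced regulated database before it ever holds player data. It only covers workloads the operator has already classified as regulated.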

Hybrid edge topologies: reconciling residency with cloud scale

Strict localization rules in jurisdictions like New Jersey and Brazil historically forced operators onto legacy on-premises datacenters. Hyperscalers have since closed that gap with physical edge appliances — AWS Outposts, Azure Stack, and Google Distributed Cloud — deployed directly inside approved colocation facilities. The pattern lets operators run regulated database engines, RNGs, and transaction ledgers on hardware physically inside the state or country border, while offloading non-regulated workloads like CDN caching or analytics to the nearest public cloud region.

Hybrid Edge Architecture (AWS Outposts)

AWS Parent Region
  |
  |  [Managed Control Plane via Service Link]
  v
Local Colocation Facility (e.g., Continent 8)
  |-- AWS Outpost (42U rack, 5–15 kVA, redundant power)
  |     |-- Local Gateway (routes local traffic directly)
  |     +-- Dedicated EC2/EKS nodes (regulated database & RNG logic)
  |
  +-- Network Edge as a Service (NEaaS)
        |-- Virtual firewalls & IDS/IPS
        +-- Cloud Connect MPLS (private link to parent region)

Deploying this in practice means coordinating three constraints simultaneously: environmental (42U racks need verified delivery paths and ASHRAE-compliant thermal handling), power (5–15 kVA redundant feeds, typically single-phase in North American facilities), and network (Outposts connect to a customer-owned local edge, which colocation providers like Continent 8 simplify via NEaaS-delivered virtualized routing and direct peering).

AWS documents five distinct hybrid patterns operators choose between, each suited to a different compliance posture:

| Pattern | Best fit |
|---|---|
| Regional deployment | Regional boundary already aligns with local gambling approvals. |
| Local Zone & Wavelength Zone | Managed edge computing without owning physical hardware. |
| Local Zone & Outpost | Local database residency combined with public cloud edge processing. |
| Wavelength Zone & Outpost | High-frequency mobile betting apps within specific telco network zones. |
| Outposts for primary & secondary sites | Strict physical localization requiring redundant active-active deployments within the state. |

To stop regulated data from leaking into a public-region bucket by mistake, operators pair these deployments with landing zone guardrails via AWS Control Tower and Organizations: preventive policies, such as service control policies, that member accounts cannot override and that block API calls copying data out of Outpost storage into public-region resources.

On the Google Cloud side, Google Distributed Cloud Hosted can be configured with NVIDIA H100 GPUs to run air-gapped instances of Gemma 2 on-premises, enabling conversational search, compliance reporting, and player-behavior monitoring without any PII or transaction data leaving the sovereign facility.
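As a rough illustration of what an AWS guardrail can look like, the sketch below builds a region-deny service control policy as a Python dictionary and prints it as JSON. The `aws:RequestedRegion` condition key is a standard AWS global condition key; the function name, the statement ID, and the choice of `sa-east-1` as the approved region are assumptions for the example. A region-level deny is coarser than the Outpost-to-region boundary described above, so treat it as one building block rather than the whole control.

```python
import json


def region_deny_policy(allowed_regions) -> dict:
    """Build an SCP document that denies requests outside the approved regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",  # hypothetical statement ID
                "Effect": "Deny",
                # Simplification: real policies usually exempt global services
                # such as IAM by using NotAction instead of a blanket "*".
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


# Assumed example: a Brazil-facing account restricted to the São Paulo region.
policy = region_deny_policy({"sa-east-1"})
print(json.dumps(policy, indent=2))
```

Attaching a policy like this at the organizational-unit level means individual accounts cannot grant themselves an exception, which is what makes it a guardrail rather than a convention.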

Security architecture: DDoS, fraud detection, and encryption

iGaming platforms are a persistent target for volumetric DDoS attacks, application exploits, and bot networks — and because major matches are time-sensitive, even brief downtime translates directly into lost betting volume. The layered defense typically includes:

  • Private exchange routing — networks like Continent 8’s Gaming Exchange route transaction and live-odds telemetry entirely off the public internet between operators, aggregators, and payment providers, neutralizing public-facing DDoS exposure while cutting latency.
  • CDN-terminated public traffic — TCP/TLS handshakes terminate close to the player at global CDN edges, backed by high-capacity scrubbing networks.
  • Game-specific edge firewalls — tools like OVHcloud’s Game DDoS protection support up to 100 custom L3/L4 rules per IP on bare-metal servers, filtering malformed traffic and UDP reflection floods before they reach application servers.
  • ML-driven threat and fraud analysis — Amazon GuardDuty continuously monitors VPC flow logs and API access for credential compromise and DNS anomalies, while models on SageMaker or Vertex AI score bet timing and transaction patterns for match-fixing, bonus abuse, and bot activity.
  • Self-exclusion synchronization — real-time cross-referencing against local databases like GamStop (UK) or OASIS (Germany) to restrict registered players.

On the cryptography side, TLS 1.3 secures data in transit, column-level encryption protects high-value fields like balances, and keys sit in Hardware Security Modules or cloud-native key management. A growing number of high-volume operators are also beginning to pilot post-quantum cryptographic libraries in transactional pipelines, ahead of the risk that today’s encrypted data could be decrypted retroactively once practical quantum attacks emerge.
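Enforcing the TLS 1.3 floor is usually a one-line setting at whatever layer terminates connections. As a minimal sketch using Python's standard ssl module, the helper below builds a server-side context that refuses anything older; the function name and optional certificate arguments are illustrative, and it assumes the interpreter is linked against an OpenSSL build with TLS 1.3 support.

```python
import ssl


def tls13_server_context(certfile=None, keyfile=None) -> ssl.SSLContext:
    """Server-side TLS context that rejects handshakes below TLS 1.3."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    if certfile:
        # Paths are supplied by the caller; none are assumed here.
        ctx.load_cert_chain(certfile, keyfile)
    return ctx


ctx = tls13_server_context()
print(ctx.minimum_version.name)  # TLSv1_3
```

Clients that can only negotiate TLS 1.2 or earlier fail the handshake against this context, so the setting is worth testing against older payment and data-feed integrations before rollout.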

Putting it together: the modular hybrid blueprint

No single deployment model satisfies both the performance ceiling and the residency floor an iGaming platform needs. A pure public-cloud approach breaks on data residency law; a pure on-premises approach loses the elasticity needed to absorb a World Cup traffic spike. The pattern that holds up in practice is a modular hybrid: latency-critical and strictly regulated engines — ledgers, RNGs, jackpot pools — isolated on bare-metal or localized private cloud inside the approved jurisdiction, while non-regulated services — containerized microservices, player acquisition, global CDN — run on public cloud platforms like Amazon EKS or Google Kubernetes Engine, connected back to the regulated core over private, low-jitter links.

In our work advising operators on this exact split, the recurring mistake isn’t underestimating the performance requirements — it’s underestimating how early the residency rules need to shape the architecture diagram. Retrofitting compliance onto an already-built public-cloud platform is dramatically more expensive than designing the data boundary in from day one.

FAQ

What latency does an iGaming platform actually need?

It depends on the product. Standard casino systems (slots and table games) generally operate well at around 100 ms response time. Real-time multiplayer games and fast-action sports betting require sub-30 ms round-trip latency to maintain fair play and prevent arbitrage exploits such as court-siding, where even a 500 ms delay can create a betting window on events that have already occurred. Modern 5G edge deployments can reduce latency below 10 ms, compared with the roughly 200 ms typical of legacy 4G networks.

Should iGaming platforms use public cloud or dedicated servers?

Most mature platforms use a hybrid approach. Latency-sensitive and highly regulated components—such as RNGs, ledgers, and jackpot pools—typically run on dedicated bare-metal servers to eliminate virtualization noise and meet regulatory residency requirements. Less critical workloads, including CDN delivery and player analytics, are often hosted in the public cloud for scalability and cost efficiency during traffic spikes. Public cloud egress fees can also make fully cloud-based live betting infrastructure expensive at scale.

How does data residency differ between Malta, Germany, Brazil, and New Jersey?

Requirements vary significantly. Malta allows primary infrastructure anywhere within the EU/EEA but requires real-time replication of regulated data to a Maltese data center. Germany requires servers and databases to remain within the EU/EEA. Brazil requires systems and user databases to be hosted in ISO 27001-certified data centers located in Brazil. New Jersey has the strictest rules, requiring primary gaming servers, equipment, and RNGs to be physically located in Atlantic City casino hotels or DGE-approved facilities.

Why do iGaming platforms use WebSockets instead of standard HTTP APIs?

Live odds, dealer streams, and real-time game state require continuous bidirectional communication. Traditional HTTP request-response cycles introduce unnecessary overhead through repeated requests and headers. WebSockets establish a persistent full-duplex connection that allows lightweight data frames to be exchanged with minimal latency after the initial handshake.

What database concurrency model is best for a betting platform?

Most betting platforms combine several concurrency models. Pessimistic locking is used for high-contention financial operations such as balance updates and jackpot pools. Optimistic concurrency control works well for low-contention tasks like profile updates. MVCC supports high-volume read operations such as player history and audit logs because readers do not block writers. Advanced platforms may also implement lock-free reservation mechanisms for better scalability during peak traffic.

Can an operator use AWS Outposts to satisfy strict data residency laws?

Yes. AWS Outposts and similar edge platforms allow regulated services such as databases, RNGs, and transaction ledgers to run on hardware physically located within an approved jurisdiction while integrating with public cloud services for less regulated workloads. Successful deployments require appropriate power, cooling, networking, and secure connectivity back to the parent cloud region.

How do operators protect real-time betting platforms from DDoS attacks?

Operators typically use layered protection. Private gaming networks keep critical traffic off the public internet, CDN edge services absorb large-scale attacks, edge firewalls block malicious traffic before it reaches application servers, and AI-driven monitoring systems detect fraud and traffic anomalies in real time.

How should an operator start planning multi-jurisdiction iGaming infrastructure?

Start with regulatory requirements rather than infrastructure technology. Data residency, audit, and localization rules should define the architecture from the outset because redesigning infrastructure later to achieve compliance is significantly more expensive than building it correctly from the beginning. Gart Solutions' cloud architects help operators map regulatory requirements to scalable hybrid cloud architectures before infrastructure investments begin.