A website that loads in 800ms in New York can take 4 seconds in Tokyo if your infrastructure isn't built for global delivery. Geographic performance gaps are often invisible to product teams who test from their office locations — but brutally obvious to the users who experience them. Understanding and monitoring regional performance is essential for any service with international users.
Why Performance Varies by Location
Before you can fix geographic performance issues, understand why they exist:
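Physical distance — The speed of light limits data transfer. Light in optical fiber covers roughly 200,000 km per second, so a server in Virginia is 150ms+ round-trip from Singapore simply due to physics, before any routing or processing overhead.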
Network routing — Internet routing isn't always the most direct path. A request might hop through multiple countries based on BGP routing decisions that prioritize cost over speed.
CDN coverage — If your CDN has limited presence in a region, users in that region don't get the benefit of edge caching.
Server infrastructure — If all your origin servers are in US-East, users in Southeast Asia always hit those servers for cache misses.
DNS resolution — A DNS provider with limited global presence can add significant latency for users in underserved regions.
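To make the physical-distance factor concrete, the following is a minimal sketch that estimates a lower bound on round-trip time from great-circle distance. The coordinates are approximate values chosen for illustration, and the fiber speed of 200,000 km/s is a common rule of thumb rather than a measured figure; real paths are longer than the great circle, so observed latency will be higher.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Rule-of-thumb speed of light in optical fiber (about 2/3 of c). Assumption.
FIBER_SPEED_KM_PER_S = 200_000.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_rtt_ms(distance_km):
    """Lower bound on round-trip time over an ideal, straight fiber path."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


# Approximate coordinates (assumed): Ashburn, Virginia and Singapore
distance_km = great_circle_km(39.04, -77.49, 1.35, 103.82)
rtt_floor_ms = min_rtt_ms(distance_km)
print(f"{distance_km:.0f} km, RTT floor about {rtt_floor_ms:.0f} ms")
```

Comparing this floor with the latency you actually measure gives a rough sense of how much of a regional gap is physics and how much is routing, CDN coverage, or server-side work.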
Setting Up Global Synthetic Monitoring
Synthetic monitoring from multiple regions reveals your geographic performance baseline:
# Multi-region monitoring configuration
monitor:
  name: "Homepage - Global Latency"
  url: "https://example.com"
  interval: 300  # 5 minutes
  regions:
    - name: "US East (Virginia)"
      location: us-east-1
    - name: "US West (Oregon)"
      location: us-west-2
    - name: "Europe (Frankfurt)"
      location: eu-central-1
    - name: "Asia Pacific (Singapore)"
      location: ap-southeast-1
    - name: "Asia Pacific (Tokyo)"
      location: ap-northeast-1
    - name: "South America (São Paulo)"
      location: sa-east-1
    - name: "Middle East (Bahrain)"
      location: me-south-1
    - name: "Africa (Cape Town)"
      location: af-south-1
  metrics:
    - ttfb
    - lcp
    - full_load_time
    - dns_lookup_time
    - tcp_connect_time
    - tls_handshake_time
  alerts:
    - condition: "ttfb > 2000ms from any region"
      severity: warning
    - condition: "full_load_time > 5000ms from any region"
      severity: critical
Regional Performance Baseline Analysis
Once you have multi-region data, establish baselines:
| Region | Typical TTFB | Typical LCP | Total Load | Gap vs Best (Total Load) |
|---|---|---|---|---|
| US East | 120ms | 1.8s | 2.1s | baseline |
| US West | 180ms | 2.1s | 2.5s | +400ms |
| Europe | 210ms | 2.4s | 2.8s | +700ms |
| Singapore | 420ms | 3.8s | 4.5s | +2.4s |
| São Paulo | 380ms | 3.5s | 4.2s | +2.1s |
| Tokyo | 380ms | 3.4s | 4.1s | +2.0s |
This table immediately reveals that users in Singapore, São Paulo, and Tokyo wait roughly twice as long for a full page load as US East users (4.1s to 4.5s versus 2.1s). A gap that large is significant enough to affect conversion rates.
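The gap column can be derived mechanically once the per-region numbers exist. Here is a small sketch that takes the total load times from the table and computes each region's gap and slowdown factor relative to the fastest region; the function name and output shape are illustrative choices, not part of any particular monitoring tool.

```python
# Total load times in milliseconds, copied from the baseline table
total_load_ms = {
    "US East": 2100,
    "US West": 2500,
    "Europe": 2800,
    "Singapore": 4500,
    "São Paulo": 4200,
    "Tokyo": 4100,
}


def gaps_vs_best(load_times):
    """Return each region's absolute gap and slowdown factor vs the fastest region."""
    best = min(load_times.values())
    return {
        region: {"gap_ms": ms - best, "slowdown": round(ms / best, 2)}
        for region, ms in load_times.items()
    }


report = gaps_vs_best(total_load_ms)

# Print the worst regions first
for region, row in sorted(report.items(), key=lambda kv: kv[1]["gap_ms"], reverse=True):
    print(f"{region:10s} +{row['gap_ms']}ms  {row['slowdown']}x")
```

Sorting by gap puts the regions that most need attention at the top of the report.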
Diagnosing Regional Performance Issues
When a region is slow, diagnose whether the bottleneck is at the network level or application level:
# Diagnose regional performance from a specific location

# Using traceroute to see the network path
traceroute example.com

# DNS resolution time through a given resolver (run this from each location)
dig @8.8.8.8 example.com | grep "Query time"

# Full connection timing with curl.
# Each value is cumulative from the start of the request, so subtract
# the previous value to get the duration of an individual phase.
curl -o /dev/null -s -w "DNS: %{time_namelookup}s\n\
TCP: %{time_connect}s\n\
TLS: %{time_appconnect}s\n\
TTFB: %{time_starttransfer}s\n\
Total: %{time_total}s\n" https://example.com
Interpreting results:
| Slow Phase | Likely Cause | Solution |
|---|---|---|
| DNS lookup | Slow DNS resolver or propagation issue | Use an authoritative DNS provider with a global anycast network (Cloudflare DNS or Route 53) |
| TCP connect | Geographic distance to server | Add CDN or regional origin server |
| TLS handshake | No TLS session resumption, certificate issues | Enable TLS session tickets, use CDN |
| TTFB | Slow server/database, no CDN caching | Improve caching, add regional origin |
| Transfer time | Large response size, slow bandwidth | Compress responses, reduce payload |
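One detail matters when reading curl's output against this table: the `-w` timing variables are cumulative from the start of the request, so `time_connect` already includes DNS lookup and `time_appconnect` already includes DNS and TCP. The sketch below converts a set of cumulative timings into per-phase durations. The sample values are invented for illustration, and the function name is hypothetical.

```python
def phase_durations_ms(t):
    """Convert curl's cumulative -w timings (seconds) into per-phase milliseconds."""
    # time_appconnect is 0 for plain HTTP, where no TLS handshake happens
    tls_done = t["time_appconnect"] or t["time_connect"]
    phases = {
        "dns": t["time_namelookup"],
        "tcp": t["time_connect"] - t["time_namelookup"],
        "tls": tls_done - t["time_connect"],
        "server_wait": t["time_starttransfer"] - tls_done,
        "transfer": t["time_total"] - t["time_starttransfer"],
    }
    return {name: round(seconds * 1000) for name, seconds in phases.items()}


# Invented sample: what a slow, distant region might look like
sample = {
    "time_namelookup": 0.030,
    "time_connect": 0.250,
    "time_appconnect": 0.700,
    "time_starttransfer": 1.150,
    "time_total": 1.300,
}

phases = phase_durations_ms(sample)
slowest = max(phases, key=phases.get)
print(phases, "slowest phase:", slowest)
```

In this invented sample the TLS handshake and the server wait tie as the largest phases, which would point toward enabling session resumption and improving caching before anything else.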
CDN Performance Analysis
A CDN should dramatically improve geographic performance by serving content from edge nodes close to users. Verify your CDN is working:
def analyze_cdn_performance(monitoring_data):
    """
    Analyze whether the CDN is actually serving requests from edge nodes.
    Reports cache hit rate and hit vs. miss TTFB for each region.
    """
    def is_hit(check):
        # Header values vary by CDN ("HIT", "Hit from cloudfront", ...),
        # so compare case-insensitively on the leading token.
        return str(check.get('x-cache', '')).lower().startswith('hit')

    def mean(values):
        return sum(values) / len(values) if values else 0

    results = {}
    for region, checks in monitoring_data.items():
        # Anything that is not a hit is treated as a miss
        hit_latencies = [c['ttfb_ms'] for c in checks if is_hit(c)]
        miss_latencies = [c['ttfb_ms'] for c in checks if not is_hit(c)]
        results[region] = {
            'cache_hit_rate': len(hit_latencies) / len(checks) if checks else 0,
            'avg_ttfb_hit': mean(hit_latencies),
            'avg_ttfb_miss': mean(miss_latencies),
            # Only meaningful when both hits and misses were observed
            'cdn_benefit_ms': (
                mean(miss_latencies) - mean(hit_latencies)
                if hit_latencies and miss_latencies else 0
            ),
        }
    return results
If your cache hit rate is below 70%, you're not getting the full benefit of your CDN. Common causes:
- Too many unique URLs (query parameters not normalized)
- Short cache TTLs
- Bypassing CDN for authenticated requests
- Vary headers forcing cache fragmentation
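The first cause in this list, unnormalized query parameters, is usually fixed in the CDN's cache key configuration. To illustrate the logic, here is a sketch in Python that drops parameters which do not change the response and sorts the rest, so equivalent URLs map to one cache entry. The set of ignored parameters is a hypothetical example; which parameters are safe to drop depends on your application.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list: tracking parameters that do not affect the response body
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"}


def normalize_cache_key(url):
    """Return a canonical form of the URL suitable for use as a cache key."""
    parts = urlsplit(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    params.sort()  # same parameters in a different order share one key
    # The fragment is dropped because it is never sent to the server
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(params), "")
    )


a = normalize_cache_key("https://Example.com/p?b=2&utm_source=mail&a=1")
b = normalize_cache_key("https://example.com/p?a=1&b=2&fbclid=xyz")
print(a, a == b)
```

Both URLs normalize to the same key, so the second request can be served from the entry cached by the first.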
Real User Geographic Data
Synthetic monitoring shows potential performance — RUM shows actual user experience:
-- Geographic RUM analysis with business impact
SELECT
    geo_country,
    geo_region,
    COUNT(*) AS sessions,
    PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY lcp_ms) AS lcp_p75,
    PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY ttfb_ms) AS ttfb_p75,
    SUM(CASE WHEN purchased THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS conversion_rate,
    SUM(revenue) AS total_revenue
FROM user_sessions
WHERE session_date > CURRENT_DATE - INTERVAL '30 days'
GROUP BY geo_country, geo_region
HAVING COUNT(*) > 50
ORDER BY sessions DESC;
This query reveals both performance AND business impact by region. If Singapore has terrible LCP (4.2s) and poor conversion rate (1.2% vs 3.8% global average), there's a quantifiable business case for performance improvement there.
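For readers less familiar with `PERCENTILE_CONT`, the following sketch reproduces its linear-interpolation behavior in Python, which can be handy for spot-checking query results against a raw export. The session values are invented for illustration.

```python
import math


def percentile_cont(values, fraction):
    """Continuous percentile with linear interpolation between neighboring values."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    rank = fraction * (len(ordered) - 1)  # fractional index into the sorted list
    lower, upper = math.floor(rank), math.ceil(rank)
    weight = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


# Invented LCP samples in milliseconds, keyed by country
lcp_ms_by_country = {
    "SG": [3900, 4100, 4300, 4800],
    "US": [1500, 1700, 1800, 2400],
}

lcp_p75 = {
    country: percentile_cont(samples, 0.75)
    for country, samples in lcp_ms_by_country.items()
}
print(lcp_p75)
```

The 75th percentile is used rather than the mean because it reflects what a typical slower user experiences without being dominated by a few extreme outliers.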
Regional Alerting Strategy
Set regional alerts that account for expected baseline differences:
# Region-aware alerting
alerts:
  # Absolute threshold (critical for any region)
  - name: "Any Region Unavailable"
    condition: "availability < 95% in any region"
    severity: critical

  # Regional thresholds - account for expected latency differences
  - name: "US Region Latency High"
    condition: "p95_latency > 2000ms from us-east-1 or us-west-2"
    severity: warning
  - name: "Europe Region Latency High"
    condition: "p95_latency > 3000ms from eu-central-1 or eu-west-1"
    severity: warning
  - name: "APAC Region Latency High"
    condition: "p95_latency > 5000ms from ap-southeast-1 or ap-northeast-1"
    severity: warning

  # Relative degradation (any region gets 50% worse than its baseline)
  - name: "Regional Performance Regression"
    condition: |
      region_latency > (region_baseline_latency * 1.5) for 5 minutes
    severity: critical
    message: "Region is 50% slower than baseline: possible issue"
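The relative-degradation rule is the least obvious of these to evaluate, so here is a minimal sketch of one way it could work: fire only when the most recent samples all exceed the region's baseline by the given factor. The requirement of several consecutive samples is an assumption added here to avoid alerting on a single slow check; it is not part of the configuration above, and the function name is hypothetical.

```python
def sustained_regression(samples_ms, baseline_ms, factor=1.5, min_consecutive=3):
    """True if the latest `min_consecutive` samples all exceed baseline * factor."""
    if len(samples_ms) < min_consecutive:
        return False  # not enough data to call it sustained
    threshold = baseline_ms * factor
    return all(sample > threshold for sample in samples_ms[-min_consecutive:])


# Invented example: a region with a 400ms baseline (threshold is 600ms)
baseline = 400
steady_regression = [410, 650, 700, 720]
single_spike = [410, 650, 590, 720]

print(sustained_regression(steady_regression, baseline))  # last three all above 600
print(sustained_regression(single_spike, baseline))       # 590 breaks the streak
```

Because each region is compared against its own baseline, the same rule works for a region that normally responds in 120ms and one that normally responds in 420ms.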
Infrastructure Decisions Driven by Geographic Data
Geographic performance data should drive infrastructure decisions:
CDN expansion — If your monitoring shows high TTFB in Southeast Asia, investigate whether your CDN has adequate edge presence there or whether you should add another CDN provider with better coverage.
Multi-region deployment — If CDN alone doesn't solve TTFB issues (because your origin is far away), deploy application instances in regions where you have significant users.
Database read replicas — High TTFB often indicates database round-trips. Read replicas in distant regions can reduce database latency for read-heavy operations.
DNS routing — Use latency-based routing (AWS Route 53, Cloudflare Load Balancing) to automatically route users to the closest healthy server.
# AWS Route 53 - latency-based routing example
# (illustrative pseudo-config, not literal CloudFormation or Terraform syntax)
records:
  - name: api.example.com
    type: A
    region: us-east-1
    latency_based: true
    health_check: api-us-east-health
  - name: api.example.com
    type: A
    region: eu-west-1
    latency_based: true
    health_check: api-eu-west-health
  - name: api.example.com
    type: A
    region: ap-southeast-1
    latency_based: true
    health_check: api-ap-health
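Conceptually, latency-based routing with health checks reduces to "among the healthy regions, pick the one with the lowest latency for this user." The sketch below shows only that decision, with invented latency numbers; it is not how Route 53 or any other DNS provider implements it internally.

```python
def pick_region(latency_ms_by_region, healthy_regions):
    """Return the healthy region with the lowest latency, or None if none is healthy."""
    candidates = {
        region: latency
        for region, latency in latency_ms_by_region.items()
        if region in healthy_regions
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Invented latencies as seen by a user in Southeast Asia
latencies = {"us-east-1": 210, "eu-west-1": 160, "ap-southeast-1": 35}

normal = pick_region(latencies, {"us-east-1", "eu-west-1", "ap-southeast-1"})
failover = pick_region(latencies, {"us-east-1", "eu-west-1"})
print(normal, failover)
```

The second call shows the failover case: once the nearest region is marked unhealthy, the user is sent to the next-fastest region instead of receiving errors.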
Monitoring Regional Failover
When a region becomes unhealthy, traffic should automatically route to healthy regions. Monitor that this failover works:
def test_regional_failover(regions, health_check_url):
    """
    Verify that traffic correctly routes away from unhealthy regions.

    synthetic_check() and alert() are placeholders for your monitoring
    tool's probe and notification functions.
    """
    results = {}
    for region in regions:
        # Run the check from each region
        check = synthetic_check(
            url=health_check_url,
            from_region=region,
            timeout=5,
        )
        results[region] = {
            "responding": check.status == 200,
            # Which server actually handled the request; assumes the
            # x-served-by header carries the serving region's name
            "responding_region": check.headers.get("x-served-by"),
            "latency_ms": check.latency_ms,
        }

    # Alert if a failing region is still serving its own traffic
    for region, result in results.items():
        if not result["responding"] and result["responding_region"] == region:
            alert(f"Region {region} is unhealthy but still routing traffic to itself")
    return results
Conclusion
Geographic performance monitoring transforms "some users complain it's slow" into specific, diagnosable, fixable problems. By running synthetic checks from multiple regions, analyzing RUM data by country, and correlating performance with business metrics, you build a clear picture of where your global users are struggling and what the business impact is. AzMonitor runs monitoring checks from multiple global locations simultaneously, giving you the regional performance data you need to make informed infrastructure decisions and catch regional degradation before it affects enough users to show up in support tickets.