The API gateway is the front door to your entire service catalog. Every request passes through it before reaching your backend services. When the gateway has problems — high latency, misconfigured routing, failed authentication plugins — the impact is total. Every API is affected simultaneously. Yet many teams monitor their backend services carefully while treating the gateway as an afterthought.

What to Monitor in an API Gateway

API gateways introduce a unique layer of infrastructure between clients and services. This layer can fail or degrade in ways that are distinct from application-level failures:

Routing failures — A route misconfiguration can send traffic to the wrong backend or return 404s for valid endpoints.

Plugin failures — Authentication plugins, rate limiters, and request transformers can fail while the gateway itself stays up.

Latency overhead — The gateway adds latency to every request. This overhead should be consistently small (< 5ms). Spikes indicate gateway-level problems.

TLS termination — Certificate issues at the gateway level affect all services simultaneously.

Connection pool exhaustion — If the gateway runs out of upstream connections, all services degrade together.

AWS API Gateway Monitoring

AWS API Gateway exposes metrics through CloudWatch. The most important ones:

| Metric | Description | Alert Threshold |
|---|---|---|
| Count | Total number of API calls | Baseline deviation |
| Latency | Full request latency including integration | > 3000ms |
| IntegrationLatency | Backend response time (excludes gateway) | > 2500ms |
| 4XXError | Client-side errors | > 2% rate |
| 5XXError | Server-side errors | > 1% rate |
| CacheHitCount | Response cache hits | Track efficiency |
| CacheMissCount | Response cache misses | Track efficiency |

The difference between Latency and IntegrationLatency is the gateway's own overhead. If Latency is 500ms and IntegrationLatency is 490ms, the gateway adds ~10ms — normal. If the gateway adds 200ms, something's wrong.
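To make the subtraction concrete, here is a minimal sketch that derives per-period gateway overhead from paired Latency and IntegrationLatency averages. The `Datapoint` shape, the `gateway_overhead` helper, and the 50 ms flag threshold are assumptions for illustration; they are not the format CloudWatch returns.

```python
from typing import NamedTuple


class Datapoint(NamedTuple):
    """One period's average latencies in milliseconds (hypothetical shape)."""
    period: str
    latency_ms: float
    integration_latency_ms: float


def gateway_overhead(points, flag_above_ms=50.0):
    """Return a (period, overhead_ms, flagged) tuple for each datapoint."""
    rows = []
    for p in points:
        # Overhead is what the gateway adds on top of the backend call.
        overhead = max(p.latency_ms - p.integration_latency_ms, 0.0)
        rows.append((p.period, overhead, overhead > flag_above_ms))
    return rows


sample = [
    Datapoint("12:00", 500.0, 490.0),  # about 10 ms of overhead
    Datapoint("12:05", 700.0, 500.0),  # 200 ms of overhead, worth investigating
]
for period, overhead, flagged in gateway_overhead(sample):
    print(f"{period} overhead={overhead:.0f}ms flagged={flagged}")
```

Subtracting averages gives the average overhead. Subtracting percentiles (p99 minus p99) is only an approximation, because the two percentiles may come from different requests.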

CloudWatch Alarms for API Gateway

# Create CloudWatch alarm for 5XX errors
aws cloudwatch put-metric-alarm \
  --alarm-name "APIGateway-5XXErrors-High" \
  --alarm-description "API Gateway 5XX error rate above 1%" \
  --metric-name 5XXError \
  --namespace AWS/ApiGateway \
  --dimensions Name=ApiName,Value=my-api \
  --statistic Average \
  --period 300 \
  --threshold 0.01 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 2 \
  --alarm-actions "arn:aws:sns:us-east-1:123456789:alerts"

# Alarm for p99 latency
# API Gateway publishes metrics per dimension set, so the alarm must name the API
aws cloudwatch put-metric-alarm \
  --alarm-name "APIGateway-Latency-High" \
  --alarm-description "API Gateway p99 latency above 3000ms" \
  --metric-name Latency \
  --namespace AWS/ApiGateway \
  --dimensions Name=ApiName,Value=my-api \
  --extended-statistic p99 \
  --period 300 \
  --threshold 3000 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 3 \
  --alarm-actions "arn:aws:sns:us-east-1:123456789:alerts"

Kong Gateway Monitoring

Kong is a popular open-source API gateway with rich plugin support. Monitor it through its Admin API and Prometheus metrics:

# Kong Prometheus plugin configuration
plugins:
  - name: prometheus
    config:
      status_code_metrics: true
      latency_metrics: true
      bandwidth_metrics: true
      upstream_health_metrics: true

Key Kong Prometheus metrics:

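# Metric names below are those exposed by the Prometheus plugin in Kong 3.x

# Request rate by service
sum by (service) (rate(kong_http_requests_total[5m]))

# Error rate per service (5xx share of all requests)
# Aggregate both sides by service; otherwise each 5xx series is divided by itself
sum by (service) (rate(kong_http_requests_total{code=~"5.."}[5m]))
/ sum by (service) (rate(kong_http_requests_total[5m]))

# Gateway latency (not including upstream), p99
histogram_quantile(0.99, sum by (le) (rate(kong_kong_latency_ms_bucket[5m])))

# Upstream (backend) latency, p99
histogram_quantile(0.99, sum by (le) (rate(kong_upstream_latency_ms_bucket[5m])))

# Bandwidth usage in bytes per second
sum by (service) (rate(kong_bandwidth_bytes[5m]))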
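To see what the error-rate query computes, the sketch below performs the same per-service aggregation in Python over two snapshots of a request counter. The snapshot format (a dict keyed by service and status code) and the function name are hypothetical; real values would come from Prometheus.

```python
from collections import defaultdict


def error_ratio_by_service(before, after):
    """Share of 5xx responses per service between two counter snapshots.

    Each snapshot maps (service, status_code) -> cumulative request count.
    """
    total = defaultdict(float)
    errors = defaultdict(float)
    for (service, code), count in after.items():
        # Counter increase over the window; a missing key means a new series.
        delta = count - before.get((service, code), 0)
        if delta < 0:
            # Counter reset (for example a gateway restart): count from zero.
            delta = count
        total[service] += delta
        if 500 <= code <= 599:
            errors[service] += delta
    return {s: errors[s] / total[s] for s in total if total[s] > 0}


before = {("users", 200): 1000, ("users", 500): 10, ("orders", 200): 400}
after = {("users", 200): 1900, ("users", 500): 110, ("orders", 200): 600}
print(error_ratio_by_service(before, after))
```

The key step is summing across status codes per service before dividing. Dividing series by series would compare each 5xx series only with itself.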

Kong Health Check Configuration

Kong can actively health-check upstream services. Configure this to get automatic failover and visibility into upstream health:

# Kong upstream with active health checks
upstreams:
  - name: user-service
    healthchecks:
      active:
        healthy:
          interval: 10
          successes: 2
        unhealthy:
          interval: 5
          http_failures: 3
          http_statuses: [429, 500, 503]
        http_path: "/health"
        timeout: 1
      passive:
        healthy:
          successes: 5
        unhealthy:
          http_failures: 5
          http_statuses: [500, 503]
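The thresholds above can be read as streak counters: a target flips to unhealthy after `http_failures` failing probes in a row, and back to healthy after `successes` passing probes in a row. The sketch below is a simplified model of that behaviour under those assumptions, not Kong's implementation. The `TargetHealth` class is hypothetical, and it treats every status outside the unhealthy list as a success.

```python
class TargetHealth:
    """Simplified model of threshold-based health state for one target."""

    def __init__(self, successes=2, http_failures=3,
                 unhealthy_statuses=(429, 500, 503)):
        self.successes_needed = successes
        self.failures_needed = http_failures
        self.unhealthy_statuses = set(unhealthy_statuses)
        self.healthy = True
        self._ok_streak = 0
        self._fail_streak = 0

    def record_probe(self, status_code):
        """Feed one probe result and return the resulting health state."""
        if status_code in self.unhealthy_statuses:
            self._fail_streak += 1
            self._ok_streak = 0
            if self._fail_streak >= self.failures_needed:
                self.healthy = False
        else:
            self._ok_streak += 1
            self._fail_streak = 0
            if self._ok_streak >= self.successes_needed:
                self.healthy = True
        return self.healthy


target = TargetHealth()
probes = [200, 503, 503, 503, 200, 200]
states = [target.record_probe(code) for code in probes]
print(states)
```

With the active settings shown earlier (probes every 5 seconds while unhealthy checks run, three failures required), a dead target would take roughly 15 seconds to be removed from rotation, which is the trade-off to weigh when tuning these numbers.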

Monitor Kong's upstream health through the Admin API:

# Check upstream health status
# Each entry in .data identifies the backend in its "target" field
curl -s http://kong-admin:8001/upstreams/user-service/health | jq '
  .data[] | {
    target: .target,
    health: .health,
    weight: .weight
  }
'

# Output:
# {
#   "target": "user-service-1:8080",
#   "health": "HEALTHY",
#   "weight": 100
# }
# {
#   "target": "user-service-2:8080",
#   "health": "UNHEALTHY",
#   "weight": 0
# }
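To turn this response into an alert signal, a small filter over the parsed JSON is enough. The sketch below assumes the response has already been fetched and decoded into a dict. The `unhealthy_targets` helper is hypothetical, and it reads the backend identifier from `target` with a fallback to `address`, so check which field your Kong version returns.

```python
# Health values treated as a problem; HEALTHCHECKS_OFF is deliberately ignored.
PROBLEM_STATES = {"UNHEALTHY", "DNS_ERROR"}


def unhealthy_targets(health_response):
    """Return the identifiers of targets reported in a problem state."""
    flagged = []
    for entry in health_response.get("data", []):
        if entry.get("health") in PROBLEM_STATES:
            name = entry.get("target") or entry.get("address") or "unknown"
            flagged.append(name)
    return flagged


sample = {
    "data": [
        {"target": "user-service-1:8080", "health": "HEALTHY", "weight": 100},
        {"target": "user-service-2:8080", "health": "UNHEALTHY", "weight": 100},
    ]
}
print(unhealthy_targets(sample))
```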

Nginx API Gateway Monitoring

For teams using Nginx as an API gateway, the key metrics come from the ngx_http_stub_status_module or the commercial Nginx Plus status:

# Enable status module
server {
    listen 8080;
    location /nginx_status {
        stub_status;  # the "on" argument is only required before nginx 1.7.5
        allow 127.0.0.1;
        deny all;
    }
}

Parse the status output:

# Nginx status endpoint output:
# Active connections: 45
# server accepts handled requests
#  1000 1000 5432
# Reading: 0 Writing: 5 Waiting: 40

# Parse with curl and awk
curl -s http://localhost:8080/nginx_status | \
  awk '/Active/ {print "active_connections=" $3} 
       /Reading/ {print "reading=" $2, "writing=" $4, "waiting=" $6}'
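If the numbers need to feed a metrics pipeline rather than a shell, the same output can be parsed in Python. This is a minimal sketch that works on the status text itself; the `parse_stub_status` name and the derived `dropped` field (accepts minus handled) are additions for illustration.

```python
import re


def parse_stub_status(text):
    """Parse ngx_http_stub_status_module output into a dict of integers."""
    active = re.search(r"Active connections:\s*(\d+)", text)
    counters = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    states = re.search(
        r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
    )
    if not (active and counters and states):
        raise ValueError("unexpected stub_status format")
    accepts, handled, requests_total = map(int, counters.groups())
    reading, writing, waiting = map(int, states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests_total,
        # Connections accepted but not handled, e.g. when worker limits are hit.
        "dropped": accepts - handled,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


sample = """Active connections: 45
server accepts handled requests
 1000 1000 5432
Reading: 0 Writing: 5 Waiting: 40
"""
print(parse_stub_status(sample))
```

The accepts, handled, and requests values are cumulative counters, so a collector would normally record the difference between two scrapes rather than the raw numbers.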

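Open-source Nginx does not expose per-upstream status. For upstream visibility, log the timing variables that ngx_http_upstream_module provides, such as $upstream_response_time and $upstream_addr, or use the Nginx Plus API:

# Log upstream timing per request (reference this format from an access_log directive)
log_format upstream_timing '$remote_addr "$request" $status '
                           'rt=$request_time urt=$upstream_response_time '
                           'upstream=$upstream_addr';

upstream api_backends {
    server backend1:8080;
    server backend2:8080;

    # Keep up to 32 idle connections to the backends open per worker process
    keepalive 32;
}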

Gateway Latency Overhead Analysis

Calculate how much latency the gateway adds versus what your backend contributes:

Total Latency = Gateway Overhead + Backend Processing + Network

Gateway Overhead = Total Latency - Integration Latency

Target gateway overhead by gateway type:

| Gateway | Acceptable Overhead | High Overhead |
|---|---|---|
| AWS API Gateway | < 10ms | > 50ms |
| Kong | < 5ms | > 20ms |
| Nginx | < 2ms | > 10ms |
| Traefik | < 3ms | > 15ms |
| Envoy | < 2ms | > 10ms |
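The targets in the table can be applied mechanically. The sketch below is one way to do that, with the thresholds copied from the table. The dictionary keys, the `classify_overhead` function, and the "elevated" label for the band between the two columns are assumptions made for this example.

```python
# Milliseconds, copied from the table: (acceptable_below, high_above).
OVERHEAD_THRESHOLDS_MS = {
    "aws_api_gateway": (10, 50),
    "kong": (5, 20),
    "nginx": (2, 10),
    "traefik": (3, 15),
    "envoy": (2, 10),
}


def classify_overhead(gateway, total_ms, upstream_ms):
    """Classify gateway overhead as 'acceptable', 'elevated', or 'high'."""
    acceptable_below, high_above = OVERHEAD_THRESHOLDS_MS[gateway]
    # Gateway overhead = total latency - integration (upstream) latency.
    overhead = total_ms - upstream_ms
    if overhead < acceptable_below:
        return "acceptable"
    if overhead > high_above:
        return "high"
    return "elevated"


print(classify_overhead("kong", total_ms=104, upstream_ms=100))
print(classify_overhead("kong", total_ms=130, upstream_ms=100))
```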

When gateway overhead spikes, check for:

Plugin execution taking too long
Rate limiter checking Redis with high latency
JWT validation with slow key fetching
Excessive logging or request/response transformation

Monitoring Route Configuration

Invalid route configurations are a common source of incidents. Monitor the health of routing:

# Validate all routes are reachable
import requests


def validate_gateway_routes(gateway_admin_url,
                            public_base_url="https://api.example.com",
                            sample_check=True):
    """
    Fetch all configured routes and validate each one responds correctly.

    Kong paginates the /routes listing; only the first page is read here.
    """
    # Get all configured routes
    listing = requests.get(f"{gateway_admin_url}/routes", timeout=5)
    listing.raise_for_status()
    routes = listing.json()['data']

    results = []
    for route in routes:
        # A route without an attached service has 'service' set to null
        service_name = (route.get('service') or {}).get('id', 'unknown')
        paths = route.get('paths') or []

        if not paths or not sample_check:
            continue

        # Test the first path
        test_path = paths[0]
        try:
            response = requests.get(
                f"{public_base_url}{test_path}",
                headers={"X-Monitor": "true"},
                timeout=5,
                allow_redirects=False
            )
            status = response.status_code
        except requests.RequestException:
            # Connection errors and timeouts count as unhealthy
            status = None

        results.append({
            "route_id": route['id'],
            "service": service_name,
            "path": test_path,
            "status": status,
            "healthy": status is not None and status not in [404, 502, 503]
        })

    unhealthy = [r for r in results if not r['healthy']]
    return {
        "total_routes": len(results),
        "healthy": len(results) - len(unhealthy),
        "unhealthy": unhealthy
    }

Cross-Gateway Latency Comparison

If you run multiple API gateways (for different environments or regions), compare their latency profiles:

| Gateway Instance | Avg Latency | P95 Latency | Error Rate |
|---|---|---|---|
| gateway-us-east-1 | 8ms | 24ms | 0.02% |
| gateway-us-west-2 | 11ms | 31ms | 0.03% |
| gateway-eu-west-1 | 9ms | 28ms | 0.02% |
| gateway-ap-east-1 | 45ms | 130ms | 0.08% |

The ap-east-1 gateway is significantly slower — likely a configuration issue or a slow upstream DNS lookup in that region.
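Spotting the slow instance by eye works for four gateways but not for forty. One simple approach, sketched below, compares each instance's P95 with the fleet median and flags anything above a multiple of it. The `latency_outliers` function and the factor of 2 are assumptions chosen for illustration; the P95 values are taken from the table.

```python
from statistics import median


def latency_outliers(p95_by_instance, factor=2.0):
    """Return instances whose P95 latency exceeds factor times the fleet median."""
    baseline = median(p95_by_instance.values())
    return sorted(
        name for name, p95 in p95_by_instance.items()
        if p95 > factor * baseline
    )


p95_ms = {
    "gateway-us-east-1": 24,
    "gateway-us-west-2": 31,
    "gateway-eu-west-1": 28,
    "gateway-ap-east-1": 130,
}
print(latency_outliers(p95_ms))
```

The median is used instead of the mean so that one very slow instance does not raise the baseline it is being compared against.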

Alerting for Gateway Issues

alerts:
  # Gateway is adding excessive latency
  - name: "Gateway Latency Overhead High"
    condition: "gateway_overhead_ms > 50"
    severity: warning
    
  # High proportion of gateway errors
  - name: "Gateway Error Rate Critical"
    condition: "gateway_5xx_rate > 0.01"
    severity: critical
    runbook: "https://wiki.example.com/runbooks/api-gateway-5xx"
    
  # Upstream connections exhausted
  - name: "Gateway Connection Pool Exhausted"
    condition: "gateway_upstream_connections_available < 10"
    severity: critical
    
  # Route check failure
  - name: "Gateway Route Misconfigured"
    condition: "gateway_route_health_check_failed = 1"
    severity: critical

Conclusion

The API gateway sits at the intersection of all your services — when it has problems, everything has problems. Thorough gateway monitoring covers latency overhead (distinguish backend latency from gateway overhead), routing validation, plugin health, upstream connection pools, and error rates. AzMonitor can monitor your gateway's public endpoints from multiple regions to catch routing issues and latency problems that internal monitoring might miss, providing an outside-in view of your API perimeter health.

Tags: API gateway, Kong, AWS API Gateway, API monitoring


AzMonitor Team

The AzMonitor team writes guides based on experience monitoring millions of endpoints daily across 10,000+ customer environments. Our expertise covers uptime monitoring, SRE practices, and reliability engineering.


