API Authentication Monitoring: Keeping Auth Flows Healthy

Authentication failures are uniquely disruptive. When your API is slow, users complain; when authentication breaks, they cannot get in at all. A broken login endpoint, an expired OAuth token, or a misconfigured JWT validation can lock everyone out simultaneously, and because auth failures often look like user errors at first, they can go undetected longer than other outages.

Why Auth Monitoring Is Harder Than Regular API Monitoring

Standard API monitoring checks: is the endpoint responding? Auth monitoring has to go further:

Is the authentication server reachable?
Are tokens being issued correctly?
Is token validation working end-to-end?
Are refresh flows functioning?
Have certificates or signing keys expired?

Each of these can fail independently. Your API might respond fine to requests with valid tokens while completely failing to issue new tokens — so existing users stay logged in while new users or expired sessions can't authenticate.

Monitoring OAuth2 Flows

OAuth2 is one of the most common authorization frameworks for modern APIs, and its token endpoint is your most critical dependency:

monitor:
  name: "OAuth2 Token Endpoint"
  url: "https://auth.example.com/oauth/token"
  method: POST
  interval: 60
  headers:
    Content-Type: "application/x-www-form-urlencoded"
  body: "grant_type=client_credentials&client_id=${MONITOR_CLIENT_ID}&client_secret=${MONITOR_CLIENT_SECRET}&scope=monitoring:health"
  assertions:
    - type: status_code
      value: 200
    - type: json_body
      path: "$.access_token"
      operator: exists
    - type: json_body
      path: "$.token_type"
      value: "Bearer"
    - type: json_body
      path: "$.expires_in"
      operator: greater_than
      value: 0
    - type: response_time
      operator: less_than
      value: 2000

Create a dedicated monitoring OAuth2 client with minimal scopes — just enough to issue tokens and verify the flow works, without broad access to your data.
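
When a declarative monitor is not available, the same assertions can be expressed in a script. The following is a minimal sketch rather than a definitive implementation: `validate_token_response` and its `expected_scope` parameter are hypothetical names, and `payload` is assumed to be the already-parsed JSON body of the token response.

```python
def validate_token_response(payload, expected_scope=None):
    """Return a list of problems found in a parsed OAuth2 token response."""
    problems = []

    if not payload.get("access_token"):
        problems.append("missing access_token")

    # Token type comparison is case-insensitive (RFC 6749, section 5.1).
    if str(payload.get("token_type", "")).lower() != "bearer":
        problems.append("token_type is not Bearer")

    expires_in = payload.get("expires_in")
    if not isinstance(expires_in, (int, float)) or expires_in <= 0:
        problems.append("expires_in is missing or not positive")

    # The scope field is optional in the response; only compare when present.
    granted = payload.get("scope")
    if expected_scope is not None and granted is not None:
        if set(granted.split()) != set(expected_scope.split()):
            problems.append(f"unexpected scope: {granted}")

    return problems
```

An empty list means the response looks healthy. A scope mismatch is worth alerting on separately, since it may mean the monitoring client has been granted more access than intended.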

Multi-Step OAuth2 Monitoring

For authorization code flows, you need to simulate the full flow:

// Automated OAuth2 flow test
async function testOAuthFlow(config) {
  const steps = {};
  
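  // Step 1: Build the authorization URL (URLSearchParams URL-encodes each value)
  const authParams = new URLSearchParams({
    client_id: config.clientId,
    redirect_uri: config.redirectUri,
    response_type: 'code',
    scope: 'openid profile',
    // Random anti-CSRF value; global `crypto` is available in browsers and recent Node.js releases
    state: crypto.randomUUID(),
  });
  const authUrl = `${config.authServer}/authorize?${authParams}`;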
  
  steps.authUrlGenerated = { status: 'ok', url: authUrl };
  
  // Step 2: Exchange code for token (using test credentials)
  const tokenResponse = await fetch(`${config.authServer}/oauth/token`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams({
      grant_type: 'authorization_code',
      code: config.testCode,
      redirect_uri: config.redirectUri,
      client_id: config.clientId,
      client_secret: config.clientSecret,
    }),
  });
  
  const tokenData = await tokenResponse.json();
  steps.tokenIssued = {
    status: tokenResponse.ok ? 'ok' : 'error',
    hasAccessToken: !!tokenData.access_token,
    hasRefreshToken: !!tokenData.refresh_token,
  };
  
  // Step 3: Validate the token
  const validateResponse = await fetch(`${config.apiBase}/user/me`, {
    headers: { 'Authorization': `Bearer ${tokenData.access_token}` },
  });
  
  steps.tokenValidated = {
    status: validateResponse.ok ? 'ok' : 'error',
    statusCode: validateResponse.status,
  };
  
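  // Step 4: Test refresh flow (skipped when no refresh token was issued)
  if (!tokenData.refresh_token) {
    steps.refreshWorked = { status: 'skipped', hasNewAccessToken: false };
    return steps;
  }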
  const refreshResponse = await fetch(`${config.authServer}/oauth/token`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams({
      grant_type: 'refresh_token',
      refresh_token: tokenData.refresh_token,
      client_id: config.clientId,
      client_secret: config.clientSecret,
    }),
  });
  
  const refreshData = await refreshResponse.json();
  steps.refreshWorked = {
    status: refreshResponse.ok ? 'ok' : 'error',
    hasNewAccessToken: !!refreshData.access_token,
  };
  
  return steps;
}

JWT Monitoring

JWTs can fail in several ways that monitoring should catch:

| Failure Mode | Symptom | Detection Method |
|---|---|---|
| Expired signing key | All new tokens invalid | Verify newly issued tokens |
| Key rotation mismatch | Old tokens rejected early | Check token exp claim |
| Algorithm mismatch | Signature validation fails | Test full issue+validate cycle |
| Claim missing | Authorization failures | Parse and validate JWT claims |
| Clock skew | Valid tokens rejected | Check nbf and iat claims |
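
The clock-skew row can be illustrated with a small check on the time-based claims. This is a sketch under assumptions: `check_time_claims` is a hypothetical helper, `claims` is an already-decoded JWT payload, and the 60-second `leeway` is an arbitrary default, not a recommendation. Validators differ on whether they reject a future `iat`, so treat that finding as a hint.

```python
import time

def check_time_claims(claims, now=None, leeway=60):
    """Flag exp/nbf/iat values that a validator allowing `leeway` seconds might reject."""
    now = time.time() if now is None else now
    problems = []

    exp = claims.get("exp")
    if exp is None:
        problems.append("exp claim missing")
    elif exp + leeway <= now:
        problems.append("token expired")

    # A not-before time ahead of our clock suggests the issuer's clock runs fast.
    nbf = claims.get("nbf")
    if nbf is not None and nbf - leeway > now:
        problems.append("nbf is in the future (possible clock skew)")

    iat = claims.get("iat")
    if iat is not None and iat - leeway > now:
        problems.append("iat is in the future (possible clock skew)")

    return problems
```

Running this against a freshly issued token from each auth server instance can surface a drifting clock before validators start rejecting otherwise valid tokens.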

JWT Validation Monitor

import jwt
import requests
import time
from datetime import datetime

def monitor_jwt_health(token_endpoint, validation_endpoint, credentials):
    """
    End-to-end JWT health check:
    1. Issue a new token
    2. Decode and validate claims
    3. Use token to hit a protected endpoint
    """
    results = {}
    
    # Issue token
    token_response = requests.post(
        token_endpoint,
        data=credentials,
        timeout=5
    )
    
    if token_response.status_code != 200:
        return {"healthy": False, "step": "token_issuance", 
                "status_code": token_response.status_code}
    
    token_data = token_response.json()
    access_token = token_data.get("access_token")
    
    if not access_token:
        return {"healthy": False, "step": "token_issuance", 
                "error": "No access_token in response"}
    
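    # Decode without signature verification, only to inspect claims.
    # The signature itself is exercised by the protected-endpoint call below.
    try:
        claims = jwt.decode(access_token, options={"verify_signature": False})
        exp = claims.get("exp")
        iat = claims.get("iat", 0)
        
        if exp is None:
            return {"healthy": False, "step": "claim_validation", 
                    "error": "Token has no exp claim"}
        
        if exp < time.time():
            return {"healthy": False, "step": "claim_validation", 
                    "error": "Token already expired"}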
        
        results["expires_in"] = exp - time.time()
        results["issued_at"] = datetime.fromtimestamp(iat).isoformat()
        
    except jwt.DecodeError as e:
        return {"healthy": False, "step": "token_decode", "error": str(e)}
    
    # Use token against protected endpoint
    protected_response = requests.get(
        validation_endpoint,
        headers={"Authorization": f"Bearer {access_token}"},
        timeout=5
    )
    
    results["validation_status"] = protected_response.status_code
    results["healthy"] = protected_response.status_code == 200
    
    return results
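
When the protected endpoint rejects a freshly issued token, a key rotation mismatch is one possible cause. The sketch below checks whether the token's `kid` header appears in a JWKS document. It assumes the JWKS has already been fetched and parsed elsewhere; both function names are hypothetical, and since `kid` is an optional header, a token without one is reported as not published.

```python
import base64
import json

def unverified_header(token):
    """Decode the JWT header segment without verifying the signature."""
    header_b64 = token.split(".")[0]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

def signing_key_published(token, jwks):
    """Return True if the token's `kid` appears in the JWKS document."""
    kid = unverified_header(token).get("kid")
    published = {key.get("kid") for key in jwks.get("keys", [])}
    return kid is not None and kid in published
```

A `False` result alongside a 401 from the protected endpoint points toward the key set rather than the token contents.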

API Key Monitoring

API keys are simpler than OAuth2 but still need monitoring:

# Monitor API key authentication
monitor:
  name: "API Key Authentication Check"
  url: "https://api.example.com/v1/account"
  method: GET
  headers:
    X-API-Key: "${MONITORING_API_KEY}"
  assertions:
    - type: status_code
      value: 200
    - type: json_body
      path: "$.account.status"
      value: "active"

Keys used by API key monitors should:

Have read-only permissions
Have no access to sensitive data
Be dedicated to monitoring (so you can rotate them independently)
Have a longer expiration than normal user keys

Monitoring API Key Expiration

Many APIs issue API keys with expiration dates. Build a check that alerts before keys expire:

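import requests
from datetime import datetime, timezone

def check_api_key_expiration(api_keys_config):
    """Alert if any monitoring API key expires within 30 days"""
    alerts = []
    
    for service, config in api_keys_config.items():
        response = requests.get(
            config['key_info_endpoint'],
            headers={'Authorization': f'Bearer {config["admin_token"]}'},
            timeout=5
        )
        # Fail loudly on HTTP errors instead of parsing an error body
        response.raise_for_status()
        
        key_info = response.json()
        expiry = datetime.fromisoformat(key_info['expires_at'])
        # Treat timestamps without an offset as UTC so the subtraction is valid
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        days_until_expiry = (expiry - datetime.now(timezone.utc)).days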
        
        if days_until_expiry < 30:
            alerts.append({
                'service': service,
                'key_id': key_info['id'],
                'expires_in_days': days_until_expiry,
                'severity': 'critical' if days_until_expiry < 7 else 'warning'
            })
    
    return alerts

SAML and SSO Monitoring

Enterprise applications often use SAML for single sign-on. Monitoring SAML flows is more complex because they're browser-redirect-based, but you can monitor critical components:

# Monitor IdP metadata endpoint
monitor:
  name: "SAML IdP Metadata"
  url: "https://idp.example.com/metadata"
  method: GET
  assertions:
    - type: status_code
      value: 200
    - type: response_contains
      value: "EntityDescriptor"
    - type: response_time
      operator: less_than
      value: 2000

# Monitor SP metadata endpoint  
monitor:
  name: "SAML SP Metadata"
  url: "https://app.example.com/auth/saml/metadata"
  method: GET
  assertions:
    - type: status_code
      value: 200
    - type: response_contains
      value: "AssertionConsumerService"

Also monitor your SAML certificate expiration — a common cause of sudden SSO outages:

# Check SAML certificate expiry
openssl x509 -in saml-sp.crt -noout -enddate

# Output: notAfter=Mar 15 12:00:00 2026 GMT
# Set an alert 90 days before this date
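
To turn that output into an alertable number, the `notAfter` line can be parsed in a script. This is a minimal sketch that assumes the line has the shape shown above; `days_until_cert_expiry` is a hypothetical helper name. It relies on `ssl.cert_time_to_seconds` from the Python standard library, which parses this certificate date format.

```python
import ssl
import time

def days_until_cert_expiry(enddate_line, now=None):
    """Parse an `openssl x509 -enddate` output line and return whole days until expiry."""
    # Expected input shape: "notAfter=Mar 15 12:00:00 2026 GMT"
    not_after = enddate_line.strip().split("=", 1)[1]
    expires_at = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    # Floor division keeps already-expired certificates negative.
    return int((expires_at - now) // 86400)
```

A scheduled job could feed the openssl output into this function and alert once the result drops below 90.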

Monitoring Auth Provider Dependencies

If you use Auth0, Okta, Cognito, or another identity provider, monitor their status and your dependency on them:

| Dependency | What to Monitor | Alert Threshold |
|---|---|---|
| Auth0 | Token endpoint latency | > 500ms |
| Auth0 | Status page | Any degradation |
| Okta | Authorization server availability | Any downtime |
| Cognito | Token issuance latency | > 1000ms |
| Your JWKS endpoint | Key set availability | Any downtime |

Track auth latency separately from API latency so you know when auth is contributing to overall request time:

Total Request Time = Auth Latency + Application Logic + Database

If auth latency is 400ms and your SLA is 500ms, you have only 100ms for everything else.
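
The arithmetic above can be written as two small helpers. Both names are hypothetical, and the sketch assumes that auth latency and total request time are measured separately and reported in milliseconds.

```python
def remaining_budget_ms(sla_ms, auth_ms):
    """Return the milliseconds left for application logic and database work."""
    return sla_ms - auth_ms

def auth_share(auth_ms, total_ms):
    """Return auth latency as a fraction of total request time."""
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    return auth_ms / total_ms

# With the numbers from the text: 500 ms SLA, 400 ms spent on auth.
budget = remaining_budget_ms(500, 400)   # 100
share = auth_share(400, 500)             # 0.8
```

Tracking the share over time shows whether auth is taking up a growing part of each request, even while total latency stays within the SLA.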

Auth Failure Alerting

Configure alerts specifically for authentication failures:

alerts:
  - name: "Auth Token Endpoint Down"
    condition: "oauth_token_endpoint_available = false"
    severity: critical
    message: "Authentication service is unavailable - users cannot log in"
    
  - name: "High Auth Failure Rate"
    condition: "auth_failure_rate > 5% for 5 minutes"
    severity: critical
    message: "Authentication is failing for more than 5% of attempts"
    
  - name: "JWT Validation Failures"
    condition: "jwt_validation_errors > 10 per minute"
    severity: warning
    message: "JWT validation errors spiking - possible key rotation issue"
    
  - name: "Refresh Token Failure"
    condition: "refresh_token_failure_rate > 2%"
    severity: warning
    message: "Refresh token flow is failing - users will be logged out"
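
A condition such as the failure-rate rule can be evaluated from raw auth events. The sketch below is a simplification: it computes the failure rate over the trailing five-minute window and does not require the threshold to be exceeded continuously for five minutes. The function names and the `(timestamp, succeeded)` event shape are assumptions for illustration.

```python
def failure_rate(events, now, window_seconds=300):
    """Failure fraction over the trailing window.

    `events` is an iterable of (timestamp, succeeded) pairs.
    Returns None when the window contains no attempts.
    """
    recent = [ok for ts, ok in events if now - window_seconds < ts <= now]
    if not recent:
        return None
    failures = sum(1 for ok in recent if not ok)
    return failures / len(recent)

def should_alert(events, now, threshold=0.05, window_seconds=300):
    """Return True when the windowed failure rate is strictly above the threshold."""
    rate = failure_rate(events, now, window_seconds)
    return rate is not None and rate > threshold
```

Returning `None` for an empty window keeps "no traffic" distinct from "no failures", which matters when the auth service is so broken that no attempts are being recorded at all.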

Conclusion

Authentication is the gateway to your entire service. An outage in auth doesn't just degrade functionality — it locks users out completely. Comprehensive auth monitoring covers the full authentication lifecycle: token issuance, validation, refresh, and expiration. It monitors your dependencies on external identity providers and alerts early enough that you can fix issues before they cause widespread lockout. AzMonitor makes it straightforward to build multi-step authentication health checks that catch these failures before your users do.

Tags: API authentication, OAuth2, JWT monitoring, API security


AzMonitor Team

The AzMonitor team writes guides based on experience monitoring millions of endpoints daily across 10,000+ customer environments. Our expertise covers uptime monitoring, SRE practices, and reliability engineering.

