REST APIs are the backbone of modern web services. Every time a user clicks a button, submits a form, or loads a dashboard, dozens of API calls fire behind the scenes. When those APIs degrade or go down, everything grinds to a halt — and users notice before your team does. That's why REST API monitoring isn't optional anymore; it's as essential as server uptime checks.
What REST API Monitoring Actually Covers
REST API monitoring goes far beyond simply pinging an endpoint and checking for a 200 response. Comprehensive monitoring tracks availability, latency, response body correctness, authentication flows, and downstream dependencies. The difference between naive monitoring and real monitoring is catching issues before they compound into incidents.
The Four Dimensions of API Health
Availability — Is the endpoint responding at all? A simple HTTP check confirms the service is reachable, but availability without context is misleading.
Correctness — Does the response body contain the expected data? An API can return 200 with an empty array or malformed JSON, which is technically "up" but functionally broken.
Latency — How long does the endpoint take to respond? A p95 latency of 800ms might be acceptable; 4000ms is not.
Consistency — Does the API behave the same across regions, over time, and under load? Flaky behavior is often worse than clean downtime because it's harder to diagnose.
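As a rough illustration of how these dimensions differ, the sketch below grades a single check result and then derives a consistency score from several results. The result shape (`status`, `latencyMs`, `body`), the function names, and the 800ms default are assumptions made for this example, not part of any particular monitoring tool.

```javascript
// Hypothetical check result: { status, latencyMs, body }.
// status is null when the request never completed.
function assessCheck(result, { maxLatencyMs = 800, isBodyValid = () => true } = {}) {
  const available = result.status !== null && result.status < 500;
  // A 200 with an empty or malformed body is "up" but not correct.
  const correct = available && isBodyValid(result.body);
  const fast = available && result.latencyMs <= maxLatencyMs;
  return { available, correct, fast };
}

// Consistency needs more than one sample: here it is the share of
// recent checks that passed every other dimension.
function consistency(results, options) {
  if (results.length === 0) return null;
  const passed = results.filter((r) => {
    const a = assessCheck(r, options);
    return a.available && a.correct && a.fast;
  }).length;
  return passed / results.length;
}
```

A response with status 200 and an empty array would come back as available and fast but not correct, which is the "technically up, functionally broken" case described above.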
Setting Up REST API Monitoring
Defining Your Check Configuration
A typical REST API monitor needs these parameters:
monitor:
  name: "Payment API - Charge Endpoint"
  url: "https://api.example.com/v2/charges"
  method: POST
  interval: 60 # seconds
  timeout: 5000 # milliseconds
  headers:
    Content-Type: "application/json"
    Authorization: "Bearer ${API_KEY}"
  body: |
    {
      "amount": 100,
      "currency": "usd",
      "source": "monitor_test"
    }
  assertions:
    - type: status_code
      value: 200
    - type: response_time
      operator: less_than
      value: 2000
    - type: json_body
      path: "$.status"
      value: "success"
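One way a monitor might evaluate assertions like the ones in this configuration is sketched below. The function names (`resolvePath`, `evaluateAssertions`) and the result shape (`status`, `latencyMs`, `body`) are assumptions for illustration, not a specific tool's API, and the path resolver only handles simple dotted paths such as `$.status`.

```javascript
// Resolve a simple dotted path such as "$.status" or "$.data.id".
// Wildcards and filters are deliberately out of scope for this sketch.
function resolvePath(obj, path) {
  return path
    .replace(/^\$\.?/, "")
    .split(".")
    .filter(Boolean)
    .reduce((value, key) => (value == null ? undefined : value[key]), obj);
}

// Returns a list of human-readable failures; an empty list means the check passed.
function evaluateAssertions(assertions, result) {
  const failures = [];
  for (const a of assertions) {
    switch (a.type) {
      case "status_code":
        if (result.status !== a.value) {
          failures.push(`status_code: expected ${a.value}, got ${result.status}`);
        }
        break;
      case "response_time":
        // Only the less_than operator is handled; other operators pass silently here.
        if (a.operator === "less_than" && result.latencyMs >= a.value) {
          failures.push(`response_time: ${result.latencyMs}ms is not below ${a.value}ms`);
        }
        break;
      case "json_body": {
        const actual = resolvePath(result.body, a.path);
        if (actual !== a.value) {
          failures.push(`json_body ${a.path}: expected ${a.value}, got ${actual}`);
        }
        break;
      }
      default:
        failures.push(`unknown assertion type: ${a.type}`);
    }
  }
  return failures;
}
```

Collecting every failure, rather than stopping at the first, makes the resulting alert more useful: a slow response with a wrong body reports both problems at once.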
Authentication Strategies
Most production APIs require authentication. The four most common patterns are:
| Auth Method | Header Format | Use Case |
|---|---|---|
| Bearer Token | Authorization: Bearer {token} | OAuth2, JWT APIs |
| API Key | X-API-Key: {key} | Simple service APIs |
| Basic Auth | Authorization: Basic {b64} | Legacy systems |
| HMAC Signature | X-Signature: {hash} | Webhook endpoints |
For monitoring purposes, create a dedicated monitoring API key with read-only or minimal permissions. Never use admin credentials in monitoring configurations.
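The header formats in the table can be produced with a small helper like the sketch below, which uses Node.js's built-in `crypto` module. The `auth` object shape and the function name are hypothetical, and the HMAC branch assumes HMAC-SHA256 over the raw request body with hex encoding; real APIs vary in algorithm, signed content, and header name.

```javascript
const { createHmac } = require("node:crypto");

// Build the auth header for one of the four patterns in the table.
// `auth.method` and the credential field names are assumptions for this sketch.
function authHeaders(auth, body = "") {
  switch (auth.method) {
    case "bearer":
      return { Authorization: `Bearer ${auth.token}` };
    case "api_key":
      return { "X-API-Key": auth.key };
    case "basic": {
      const encoded = Buffer.from(`${auth.username}:${auth.password}`).toString("base64");
      return { Authorization: `Basic ${encoded}` };
    }
    case "hmac": {
      // Assumed scheme: hex-encoded HMAC-SHA256 of the raw body.
      const signature = createHmac("sha256", auth.secret).update(body).digest("hex");
      return { "X-Signature": signature };
    }
    default:
      throw new Error(`Unsupported auth method: ${auth.method}`);
  }
}
```

In practice the credentials would come from a secret store or environment variables scoped to the monitoring key, never from values hard-coded in the monitor definition.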
Response Validation
The real power of API monitoring is validating what comes back. A response validator should check:
// Example response validation logic
function validatePaymentResponse(response) {
  const body = JSON.parse(response.body);

  // Check required fields exist
  if (!body.transaction_id) throw new Error("Missing transaction_id");
  if (!body.status) throw new Error("Missing status field");

  // Check field types
  if (typeof body.amount !== "number") throw new Error("amount must be number");

  // Check business logic
  if (body.status === "pending" && !body.estimated_completion) {
    throw new Error("Pending status requires estimated_completion");
  }

  return true;
}
Monitoring Different HTTP Methods
GET endpoints are the easiest to monitor — they're idempotent and safe. POST, PUT, PATCH, and DELETE endpoints require more care because monitoring can have side effects.
Safe Monitoring Patterns for Write Operations
Test environments — Run write-operation monitors against a staging environment that mirrors production.
Idempotency keys — Use consistent idempotency keys so repeated monitor calls don't create duplicate records.
Dedicated test resources — Create test records specifically for monitoring (e.g., a "monitoring_test" user or account).
Cleanup endpoints — After POST monitors create data, a companion DELETE monitor cleans it up.
# Example: POST then DELETE monitoring sequence
# 1. Create resource
POST /api/users
Body: {"email": "monitor-test@internal.example.com", "role": "test"}
# 2. Validate creation
GET /api/users/{{created_id}}
# 3. Clean up
DELETE /api/users/{{created_id}}
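The create, validate, and clean-up sequence can be sketched as a single function that always attempts the DELETE, even when an earlier step fails. Here `client` is a hypothetical HTTP wrapper whose `post`, `get`, and `delete` methods resolve to `{ status, body }`, the `id` field in the create response is assumed, and the `Idempotency-Key` header is a common convention that your API may or may not support.

```javascript
// `client` is injected so the same logic works with any HTTP library.
async function runWriteCheck(client, idempotencyKey) {
  let createdId = null;
  try {
    // 1. Create the test resource, reusing the same key on every run.
    const created = await client.post("/api/users", {
      headers: { "Idempotency-Key": idempotencyKey },
      body: { email: "monitor-test@internal.example.com", role: "test" },
    });
    if (created.status !== 200 && created.status !== 201) {
      return { ok: false, step: "create", status: created.status };
    }
    createdId = created.body.id;

    // 2. Validate that the resource can be read back.
    const fetched = await client.get(`/api/users/${createdId}`);
    if (fetched.status !== 200) {
      return { ok: false, step: "validate", status: fetched.status };
    }
    return { ok: true };
  } finally {
    // 3. Clean up whenever something was created, so failures leave no test data behind.
    if (createdId !== null) {
      await client.delete(`/api/users/${createdId}`);
    }
  }
}
```

Putting the cleanup in `finally` means a failed validation step still removes the test record, which keeps repeated monitor runs from accumulating data.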
Latency Percentiles Matter More Than Averages
Average response time is almost useless for capacity planning and SLA management. A p50 of 200ms with a p99 of 8000ms means 1 in 100 requests takes 8 seconds or longer, and because heavy users send the most requests, your biggest or most active customers are the most likely to hit that tail.
Track these percentiles:
| Percentile | Meaning | Target (typical) |
|---|---|---|
| p50 (median) | Half of requests complete faster than this | < 200ms |
| p90 | 90% of requests complete faster than this | < 500ms |
| p95 | 95% of requests complete faster than this | < 800ms |
| p99 | 99% of requests complete faster than this | < 2000ms |
When your p99 suddenly spikes while p50 stays flat, it usually points to a database query that's slow for certain data patterns, a cold-start issue, or a garbage collection pause.
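To make the gap between averages and percentiles concrete, here is a minimal sketch that computes percentiles with the nearest-rank method over a small, made-up set of latency samples. Monitoring backends often use interpolation or streaming estimators instead, so treat this as an illustration of the idea rather than how any given tool calculates its numbers.

```javascript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("No samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Illustrative latencies in milliseconds: nine fast requests and one very slow one.
const latencies = [120, 130, 140, 150, 160, 170, 180, 190, 200, 8000];
const mean = latencies.reduce((sum, v) => sum + v, 0) / latencies.length;

console.log(mean);                      // 944
console.log(percentile(latencies, 50)); // 160
console.log(percentile(latencies, 99)); // 8000
```

The mean of 944ms describes none of the actual requests: most finished in under 200ms and one took 8 seconds, which is exactly what the p50 and p99 report.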
Multi-Region API Monitoring
An API that responds in 120ms from US-East might take 900ms from Southeast Asia if there's no CDN or regional deployment. Geographic monitoring exposes these asymmetries.
# Multi-region monitor configuration
monitor:
  name: "Search API - Global Latency"
  url: "https://api.example.com/search?q=test"
  regions:
    - us-east-1
    - eu-west-1
    - ap-southeast-1
    - sa-east-1
  alert_if:
    any_region_latency_exceeds: 1000ms
    any_region_unavailable: true
When you see latency spikes in a single region, it often indicates a CDN misconfiguration, regional infrastructure issue, or routing problem — not an application bug.
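A sketch of how per-region results might be evaluated against rules like `any_region_latency_exceeds` and `any_region_unavailable` follows. The result shape (`region`, `available`, `latencyMs`) and the `scope` label are assumptions added for illustration; the label simply separates the single-region case discussed above from a problem that shows up everywhere.

```javascript
// results: one entry per region, e.g. { region: "eu-west-1", available: true, latencyMs: 180 }
function evaluateRegions(results, { maxLatencyMs = 1000 } = {}) {
  const unavailable = results.filter((r) => !r.available).map((r) => r.region);
  const slow = results
    .filter((r) => r.available && r.latencyMs > maxLatencyMs)
    .map((r) => r.region);
  const affected = unavailable.length + slow.length;

  let scope = "none";
  if (affected > 0) {
    // Every region affected suggests the application or origin;
    // a subset suggests CDN, routing, or regional infrastructure.
    scope = affected === results.length ? "global" : "regional";
  }
  return { alert: affected > 0, unavailable, slow, scope };
}
```

Including the scope in the alert payload gives the on-call engineer a first hint about where to look before opening any dashboards.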
Alerting for API Failures
Good API monitoring alerts are specific enough to be actionable without being so granular that they cause alert fatigue.
Alert Tiers
Critical (page immediately): The endpoint is completely down, is returning 5xx errors, or authentication is failing. These affect all users.
Warning (Slack notification): p95 latency exceeds its threshold, the error rate is above 1%, or a specific region is degraded. These need investigation but aren't emergencies.
Info (dashboard only): Gradual latency trends, cache hit rate drops, or non-critical validation warnings. Track these for weekly reviews.
alerts:
  critical:
    condition: "error_rate > 5% for 3 consecutive checks"
    channels: ["pagerduty", "slack-incidents"]
  warning:
    condition: "p95_latency > 800ms for 10 minutes"
    channels: ["slack-engineering"]
  info:
    condition: "p99_latency > 2000ms"
    channels: ["dashboard"]
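The three conditions in this configuration could be evaluated roughly as in the sketch below. It assumes one summary window per minute (so "10 minutes" becomes the last ten windows) with hypothetical fields `errorRate`, `p95Ms`, and `p99Ms`; a real alerting engine would parse the condition strings rather than hard-code them.

```javascript
// True when there are at least n windows and the most recent n all match.
function lastN(windows, n, predicate) {
  return windows.length >= n && windows.slice(-n).every(predicate);
}

// windows: oldest first, one per minute, e.g. { errorRate: 0.07, p95Ms: 650, p99Ms: 1800 }
function alertTier(windows) {
  // error_rate > 5% for 3 consecutive checks
  if (lastN(windows, 3, (w) => w.errorRate > 0.05)) return "critical";
  // p95_latency > 800ms for 10 minutes
  if (lastN(windows, 10, (w) => w.p95Ms > 800)) return "warning";
  // p99_latency > 2000ms
  if (lastN(windows, 1, (w) => w.p99Ms > 2000)) return "info";
  return null;
}
```

Requiring several consecutive breaches before paging is what keeps a single failed check from waking someone up.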
Correlating API Monitors with Deployments
The most valuable API monitoring data isn't the absolute values — it's how they change over time. Annotate your monitoring dashboards with deployment markers so you can correlate incidents with code changes.
Many teams use deployment hooks to automatically add annotations:
# In your CI/CD pipeline (example using curl)
curl -X POST https://api.azmonitor.com/v1/annotations \
  -H "Authorization: Bearer ${AZMONITOR_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Deployed v2.4.1",
    "description": "Payment service update: new retry logic",
    "timestamp": "'$(date -u +%Y-%m-%dT%H:%M:%SZ)'"
  }'
After adding this to your pipeline, every spike in API latency or error rate can be immediately correlated with a specific deployment, cutting diagnosis time dramatically.
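The correlation step itself can be as simple as finding the most recent deployment shortly before a spike. The sketch below assumes annotations shaped like the payload above (`title`, ISO 8601 `timestamp`) and a hypothetical 30-minute lookback window; it is an illustration of the idea, not how any particular dashboard implements it.

```javascript
// Returns the latest deployment within `lookbackMinutes` before the spike, or null.
function findSuspectDeployment(deployments, spikeTime, lookbackMinutes = 30) {
  const spikeMs = Date.parse(spikeTime);
  const windowStart = spikeMs - lookbackMinutes * 60 * 1000;

  const inWindow = deployments.filter((d) => {
    const at = Date.parse(d.timestamp);
    return at >= windowStart && at <= spikeMs;
  });
  // Newest first, so the deployment closest to the spike wins.
  inWindow.sort((a, b) => Date.parse(b.timestamp) - Date.parse(a.timestamp));
  return inWindow[0] ?? null;
}
```

A match is only a starting point for investigation: a deployment that precedes a spike is a suspect, not a confirmed cause.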
Common REST API Monitoring Mistakes
Monitoring only the happy path: If you only test successful requests, you'll miss scenarios where error handling is broken or error responses are malformed.
Ignoring redirect chains: An API that unexpectedly returns a 301 or 302 may be misconfigured. Monitor the final response, not just the first hop.
Using production credentials: Monitoring credentials leaked in logs or error reports can become a security incident. Use dedicated, scoped monitoring keys.
Setting timeouts too high: With a 30-second timeout, a hung endpoint goes unreported for at least 30 seconds. Set timeouts slightly above your p99 latency, typically 3-5 seconds for internal APIs.
Not monitoring API versioning: When you deprecate /v1/ endpoints, keep monitors on the old version until it is retired. They tell you whether clients still calling the deprecated paths are getting working responses.
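For the redirect-chain mistake in particular, a monitor can follow each hop itself and record the chain. The sketch below assumes Node.js 18 or later, where the built-in `fetch` with `redirect: "manual"` returns the 3xx response and its `Location` header; the function name, the `fetchFn` injection point, and the hop limit are choices made for this example.

```javascript
// Follows redirects one hop at a time and reports the final status plus the chain.
async function followRedirects(url, { fetchFn = fetch, maxHops = 5 } = {}) {
  const hops = [];
  let current = url;
  for (let i = 0; i <= maxHops; i++) {
    const res = await fetchFn(current, { redirect: "manual" });
    hops.push({ url: current, status: res.status });

    const location = res.headers.get("location");
    const isRedirect = res.status >= 300 && res.status < 400 && location;
    if (!isRedirect) {
      return { finalStatus: res.status, hops };
    }
    // Location may be relative, so resolve it against the current URL.
    current = new URL(location, current).toString();
  }
  throw new Error(`Redirect chain exceeded ${maxHops} hops`);
}
```

Asserting on both `finalStatus` and the length of `hops` catches the case where an endpoint still works but has quietly gained an extra redirect.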
Building a Monitoring Coverage Map
Before you can say your API is properly monitored, map your endpoints against monitoring coverage:
| Endpoint | Method | Monitored | Frequency | Assertions |
|---|---|---|---|---|
| /api/auth/login | POST | Yes | 1 min | Status, token present |
| /api/users/{id} | GET | Yes | 5 min | Status, schema |
| /api/payments/charge | POST | Yes | 2 min | Status, tx_id present |
| /api/reports/export | GET | No | — | — |
| /api/webhooks/stripe | POST | No | — | — |
Any row marked "No" in that table is a blind spot. Prioritize by business impact: payment and authentication endpoints should be checked every minute from multiple regions.
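If the endpoint inventory lives in code or can be exported from an API spec, the coverage map can be generated instead of maintained by hand. This is a minimal sketch using the rows from the table above; the `monitored` flag and the function name are assumptions about how such an inventory might be represented.

```javascript
// Inventory mirroring the coverage table above.
const endpoints = [
  { path: "/api/auth/login", method: "POST", monitored: true },
  { path: "/api/users/{id}", method: "GET", monitored: true },
  { path: "/api/payments/charge", method: "POST", monitored: true },
  { path: "/api/reports/export", method: "GET", monitored: false },
  { path: "/api/webhooks/stripe", method: "POST", monitored: false },
];

// Summarize coverage and list the unmonitored endpoints.
function coverageReport(list) {
  if (list.length === 0) return { coveragePct: 0, gaps: [] };
  const gaps = list.filter((e) => !e.monitored).map((e) => `${e.method} ${e.path}`);
  const covered = list.length - gaps.length;
  return { coveragePct: Math.round((covered / list.length) * 100), gaps };
}

console.log(coverageReport(endpoints));
// { coveragePct: 60, gaps: [ 'GET /api/reports/export', 'POST /api/webhooks/stripe' ] }
```

Running a report like this in CI makes a newly added, unmonitored endpoint visible at review time rather than during an incident.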
Conclusion
REST API monitoring is one of the highest-leverage investments an engineering team can make. The goal isn't to collect data for its own sake — it's to catch problems before users do, understand performance trends before they become incidents, and give your team confidence when deploying changes. AzMonitor makes it straightforward to configure multi-region API checks with custom assertions, deployment annotations, and intelligent alerting so your team spends less time firefighting and more time building.