
gRPC Monitoring: Observability for High-Performance RPC Services

Learn how to monitor gRPC services effectively, including status codes, streaming RPCs, interceptors, and performance benchmarking for production systems.

AzMonitor Team · January 22, 2025 · 9 min read · 1,189 words · Updated January 20, 2026
Tags: gRPC, API monitoring, RPC, microservices

gRPC powers some of the most performance-sensitive services in modern infrastructure. Google uses it extensively internally, and Netflix, Square, and Lyft rely on it for service-to-service communication. If you're running gRPC services in production, you need a monitoring strategy built for its characteristics, because much of the REST monitoring playbook doesn't carry over.

gRPC vs REST: Why Monitoring Differs

gRPC uses HTTP/2 as its transport layer and Protocol Buffers for serialization. This gives you bidirectional streaming, multiplexing, and extremely efficient binary encoding — but it also means:

  • Traditional HTTP monitors can't easily inspect gRPC traffic
  • Status codes are gRPC-specific and travel in the grpc-status trailer; the HTTP/2 response status is normally 200 even when the RPC fails
  • Binary payloads require schema knowledge to validate
  • Streaming RPCs have different health semantics than request-response

Most infrastructure tools — load balancers, proxies, HTTP monitors — weren't designed with gRPC in mind. Proper monitoring requires intentional setup.

gRPC Status Codes

gRPC defines its own status codes. Understanding them is essential for writing meaningful monitors:

| Code | Name | Meaning | HTTP Equivalent |
|---|---|---|---|
| 0 | OK | Success | 200 |
| 1 | CANCELLED | Client cancelled | 499 |
| 2 | UNKNOWN | Unknown error | 500 |
| 3 | INVALID_ARGUMENT | Bad request | 400 |
| 4 | DEADLINE_EXCEEDED | Timeout | 504 |
| 5 | NOT_FOUND | Resource not found | 404 |
| 8 | RESOURCE_EXHAUSTED | Rate limit hit | 429 |
| 13 | INTERNAL | Internal server error | 500 |
| 14 | UNAVAILABLE | Service unavailable | 503 |

Alert thresholds should be set per status code. INVALID_ARGUMENT errors are usually client bugs; UNAVAILABLE errors are server-side problems that need immediate attention.
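
One way to encode per-code alerting is a small lookup that routes each status name to an alert class. The sketch below is illustrative only: the `alertClass` type, the `classify` function, and the grouping of every code other than INVALID_ARGUMENT and UNAVAILABLE are assumptions to adapt per service, not a gRPC convention.

```go
package main

// alertClass is a hypothetical label describing how a status should be alerted on.
type alertClass string

const (
	classIgnore alertClass = "ignore" // success, nothing to alert on
	classClient alertClass = "client" // likely a caller bug; track, page rarely
	classServer alertClass = "server" // server-side fault; alert promptly
)

// classByCode groups the status names from the table above. Only
// INVALID_ARGUMENT (client) and UNAVAILABLE (server) come from the text;
// the other assignments are assumptions.
var classByCode = map[string]alertClass{
	"OK":                 classIgnore,
	"CANCELLED":          classClient,
	"INVALID_ARGUMENT":   classClient,
	"NOT_FOUND":          classClient,
	"RESOURCE_EXHAUSTED": classClient,
	"UNKNOWN":            classServer,
	"DEADLINE_EXCEEDED":  classServer,
	"INTERNAL":           classServer,
	"UNAVAILABLE":        classServer,
}

// classify returns the alert class for a status name. Unlisted codes are
// treated as server-side faults so that new codes are never silently ignored.
func classify(code string) alertClass {
	if c, ok := classByCode[code]; ok {
		return c
	}
	return classServer
}
```

Defaulting unknown codes to the server class is a deliberately noisy choice: it is usually cheaper to downgrade an alert than to miss one.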

The gRPC Health Check Protocol

gRPC defines a standard health checking protocol you should implement in every service:

// health.proto (from grpc/grpc)
syntax = "proto3";

package grpc.health.v1;

message HealthCheckRequest {
  string service = 1;
}

message HealthCheckResponse {
  enum ServingStatus {
    UNKNOWN = 0;
    SERVING = 1;
    NOT_SERVING = 2;
    SERVICE_UNKNOWN = 3;
  }
  ServingStatus status = 1;
}

service Health {
  rpc Check(HealthCheckRequest) returns (HealthCheckResponse);
  rpc Watch(HealthCheckRequest) returns (stream HealthCheckResponse);
}

Implement this in your services:

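// Go implementation
import (
  "context"
  "database/sql"

  "google.golang.org/grpc/health/grpc_health_v1"
)

// healthServer embeds UnimplementedHealthServer so it satisfies
// grpc_health_v1.HealthServer without a custom Watch implementation.
// The db field is assumed to be a *sql.DB; swap in your own dependency.
type healthServer struct {
  grpc_health_v1.UnimplementedHealthServer
  db *sql.DB
}

func (s *healthServer) Check(
  ctx context.Context,
  req *grpc_health_v1.HealthCheckRequest,
) (*grpc_health_v1.HealthCheckResponse, error) {
  // Check dependencies, honoring the caller's deadline
  if err := s.db.PingContext(ctx); err != nil {
    return &grpc_health_v1.HealthCheckResponse{
      Status: grpc_health_v1.HealthCheckResponse_NOT_SERVING,
    }, nil
  }
  return &grpc_health_v1.HealthCheckResponse{
    Status: grpc_health_v1.HealthCheckResponse_SERVING,
  }, nil
}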

Then monitor it with grpc-health-probe:

# Install grpc-health-probe
go install github.com/grpc-ecosystem/grpc-health-probe@latest

# Check a service
grpc-health-probe -addr=localhost:50051 -service=UserService

# Output: status: SERVING
# Exit codes: 0 = SERVING, 1 = invalid arguments, 2 = connection failed or timed out,
#             3 = RPC failed or timed out, 4 = RPC succeeded but status is not SERVING

This can be integrated into Kubernetes liveness and readiness probes:

livenessProbe:
  exec:
    command:
      - /bin/grpc_health_probe
      - -addr=:50051
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  exec:
    command:
      - /bin/grpc_health_probe
      - -addr=:50051
  initialDelaySeconds: 5
  periodSeconds: 10

Instrumenting gRPC Services with Interceptors

Interceptors (gRPC's equivalent of middleware) are the cleanest way to add observability without modifying every handler:

// Unary server interceptor for metrics
func metricsInterceptor(
  ctx context.Context,
  req interface{},
  info *grpc.UnaryServerInfo,
  handler grpc.UnaryHandler,
) (interface{}, error) {
  start := time.Now()
  
  resp, err := handler(ctx, req)
  
  duration := time.Since(start)
  statusCode := status.Code(err)
  
  // Record metrics
  requestCounter.WithLabelValues(
    info.FullMethod,
    statusCode.String(),
  ).Inc()
  
  requestDuration.WithLabelValues(
    info.FullMethod,
  ).Observe(duration.Seconds())
  
  return resp, err
}

// Register the interceptor
server := grpc.NewServer(
  grpc.UnaryInterceptor(metricsInterceptor),
)

The go-grpc-prometheus library provides ready-made Prometheus metrics:

import grpc_prometheus "github.com/grpc-ecosystem/go-grpc-prometheus"

server := grpc.NewServer(
  grpc.UnaryInterceptor(grpc_prometheus.UnaryServerInterceptor),
  grpc.StreamInterceptor(grpc_prometheus.StreamServerInterceptor),
)

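// Latency histograms are off by default; enable them so that
// grpc_server_handling_seconds_* is exported.
grpc_prometheus.EnableHandlingTimeHistogram()
grpc_prometheus.Register(server)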

// Expose metrics
http.Handle("/metrics", promhttp.Handler())
go http.ListenAndServe(":9090", nil)

Key Metrics to Monitor

Once instrumented, track these metrics for each service:

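# Request rate (requests per second) by service and method
sum by (grpc_service, grpc_method) (rate(grpc_server_handled_total[5m]))

# Error rate by service and method
sum by (grpc_service, grpc_method) (rate(grpc_server_handled_total{grpc_code!="OK"}[5m]))
/ sum by (grpc_service, grpc_method) (rate(grpc_server_handled_total[5m]))

# p99 latency by service and method
histogram_quantile(0.99,
  sum by (le, grpc_service, grpc_method) (rate(grpc_server_handling_seconds_bucket[5m]))
)

# In-flight requests (aggregate first: the two counters carry different label sets)
sum by (grpc_service, grpc_method) (grpc_server_started_total)
- sum by (grpc_service, grpc_method) (grpc_server_handled_total)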
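
The p99 query is an estimate, not an exact percentile: it interpolates linearly inside whichever cumulative bucket contains the target rank, so its accuracy depends on bucket boundaries. The sketch below approximates that calculation for classic histograms. The `bucket` type, the `inf` helper, and the `quantile` function are illustrative names, and edge-case handling is simplified compared with Prometheus itself.

```go
package main

import (
	"math"
	"sort"
)

// inf is the upper bound of the catch-all bucket (le="+Inf").
var inf = math.Inf(1)

// bucket is one cumulative histogram bucket: count is the number of
// observations less than or equal to upperBound.
type bucket struct {
	upperBound float64
	count      float64
}

// quantile estimates the q-quantile (0 <= q <= 1) from cumulative buckets
// by linear interpolation. It sorts the slice in place by upper bound.
func quantile(q float64, buckets []bucket) float64 {
	if len(buckets) == 0 || q < 0 || q > 1 {
		return math.NaN()
	}
	sort.Slice(buckets, func(i, j int) bool {
		return buckets[i].upperBound < buckets[j].upperBound
	})
	total := buckets[len(buckets)-1].count
	if total == 0 {
		return math.NaN()
	}
	rank := q * total
	// First bucket whose cumulative count reaches the target rank.
	i := sort.Search(len(buckets), func(i int) bool {
		return buckets[i].count >= rank
	})
	if math.IsInf(buckets[i].upperBound, 1) {
		// The rank falls in the +Inf bucket: report the highest finite bound.
		if i == 0 {
			return math.NaN()
		}
		return buckets[i-1].upperBound
	}
	lowerBound, lowerCount := 0.0, 0.0
	if i > 0 {
		lowerBound, lowerCount = buckets[i-1].upperBound, buckets[i-1].count
	}
	inBucket := buckets[i].count - lowerCount
	if inBucket == 0 {
		return buckets[i].upperBound
	}
	return lowerBound + (buckets[i].upperBound-lowerBound)*(rank-lowerCount)/inBucket
}
```

A practical consequence: if most requests land in one wide bucket, the reported p99 can move very little even when real latency shifts, so bucket boundaries should bracket the alert thresholds.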

Visualize these in a dashboard with alerts:

| Metric | Warning | Critical |
|---|---|---|
| Error rate | > 1% | > 5% |
| p99 latency | > 500ms | > 2000ms |
| Deadline exceeded rate | > 0.5% | > 2% |
| In-flight requests | > 100 | > 500 |
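
If alerts are evaluated in code rather than in an alerting rule, the same thresholds can be expressed as data. This is a minimal sketch: the metric keys, the `severity` type, and the `evaluate` function are hypothetical names, with rates stored as fractions and latency in milliseconds.

```go
package main

// severity is a hypothetical alert level.
type severity int

const (
	sevOK severity = iota
	sevWarning
	sevCritical
)

// threshold holds the warning and critical limits for one metric.
type threshold struct {
	warn float64
	crit float64
}

// thresholds mirrors the table above. Keys are illustrative names.
var thresholds = map[string]threshold{
	"error_rate":             {warn: 0.01, crit: 0.05},
	"p99_latency_ms":         {warn: 500, crit: 2000},
	"deadline_exceeded_rate": {warn: 0.005, crit: 0.02},
	"in_flight_requests":     {warn: 100, crit: 500},
}

// evaluate returns the severity of value for the named metric. The second
// result is false when the metric has no configured threshold.
func evaluate(metric string, value float64) (severity, bool) {
	t, found := thresholds[metric]
	if !found {
		return sevOK, false
	}
	switch {
	case value > t.crit:
		return sevCritical, true
	case value > t.warn:
		return sevWarning, true
	default:
		return sevOK, true
	}
}
```

The comparisons are strict, matching the ">" in the table, so a value exactly at a limit does not fire.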

Monitoring Streaming RPCs

Streaming RPCs — server streaming, client streaming, bidirectional — require extra attention because they hold connections open for extended periods.

Server-Side Streaming

// Example: streaming log entries
func (s *LogServer) StreamLogs(
  req *pb.LogRequest,
  stream pb.LogService_StreamLogsServer,
) error {
  startTime := time.Now()
  messageCount := 0

  // Record duration and message count however the stream ends
  defer func() {
    streamDuration.Observe(time.Since(startTime).Seconds())
    streamMessageCount.Observe(float64(messageCount))
  }()

  ctx := stream.Context()
  for {
    select {
    case <-ctx.Done():
      // Client cancelled or deadline expired; stop even if no logs are pending
      return ctx.Err()
    case entry, ok := <-s.logChan:
      if !ok {
        return nil // channel closed: end the stream with OK
      }
      if err := stream.Send(entry); err != nil {
        streamErrors.Inc()
        return err
      }
      messageCount++
    }
  }
}

Monitor streaming RPCs differently from unary calls:

  • Stream duration — How long streams stay open
  • Messages per stream — Total messages sent/received
  • Stream error rate — Streams that end with non-OK status
  • Concurrent streams — Number of active streaming connections
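
The four signals above can be tracked with a small aggregate before being exported to a metrics backend. The sketch below uses only the standard library. The `streamStats` type and its methods are hypothetical, and a production setup would more likely use a gauge and histograms from its metrics client.

```go
package main

import "sync"

// streamStats aggregates stream duration, messages per stream, stream
// error rate, and concurrent streams. All names here are illustrative.
type streamStats struct {
	mu           sync.Mutex
	active       int     // streams currently open
	finished     int     // streams that have ended
	failed       int     // streams that ended with a non-OK status
	messages     int     // messages across finished streams
	totalSeconds float64 // summed duration of finished streams
}

// begin records that a stream has opened.
func (s *streamStats) begin() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.active++
}

// end records a finished stream with its duration, message count, and outcome.
func (s *streamStats) end(seconds float64, messages int, failed bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.active--
	s.finished++
	s.messages += messages
	s.totalSeconds += seconds
	if failed {
		s.failed++
	}
}

// errorRate returns the fraction of finished streams that failed.
func (s *streamStats) errorRate() float64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.finished == 0 {
		return 0
	}
	return float64(s.failed) / float64(s.finished)
}

// meanDuration returns the average duration of finished streams in seconds.
func (s *streamStats) meanDuration() float64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.finished == 0 {
		return 0
	}
	return s.totalSeconds / float64(s.finished)
}
```

In a handler, `begin` would be called on entry and `end` from a deferred function, so that cancelled and failed streams are counted as well as clean ones.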

Testing gRPC with grpcurl

For ad-hoc testing and monitoring integration, grpcurl lets you interact with gRPC services without writing code:

# List services (requires server reflection)
grpcurl -plaintext localhost:50051 list

# List methods of a service
grpcurl -plaintext localhost:50051 list UserService

# Call a method
grpcurl -plaintext \
  -d '{"user_id": "monitor-test-123"}' \
  localhost:50051 \
  UserService/GetUser

# With TLS
grpcurl \
  -d '{"user_id": "monitor-test-123"}' \
  api.example.com:443 \
  UserService/GetUser

You can wrap grpcurl calls in shell scripts for basic health monitoring:

#!/bin/bash
RESULT=$(grpcurl -plaintext \
  -d '{"service": "UserService"}' \
  localhost:50051 \
  grpc.health.v1.Health/Check 2>&1)

# Match the exact status value; a bare "SERVING" would also match NOT_SERVING
if echo "$RESULT" | grep -q '"status": "SERVING"'; then
  echo "HEALTHY"
  exit 0
else
  echo "UNHEALTHY: $RESULT"
  exit 1
fi
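
A substring search for SERVING also matches NOT_SERVING, so a stricter check parses the JSON that grpcurl prints and compares the status field exactly. The following is a minimal sketch of that check. The `healthResponse` type and `isServing` function are hypothetical helpers, and the code assumes grpcurl's default JSON output.

```go
package main

import "encoding/json"

// healthResponse mirrors the JSON form of grpc.health.v1.HealthCheckResponse.
type healthResponse struct {
	Status string `json:"status"`
}

// isServing reports whether output is a health response whose status is
// exactly SERVING. Anything that is not valid JSON (for example an error
// message from a failed connection) counts as unhealthy.
func isServing(output []byte) bool {
	var resp healthResponse
	if err := json.Unmarshal(output, &resp); err != nil {
		return false
	}
	return resp.Status == "SERVING"
}
```

Treating unparseable output as unhealthy keeps the monitor fail-closed: a crashed probe or unreachable server never reads as a passing check.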

Distributed Tracing for gRPC

For multi-service gRPC architectures, distributed tracing is essential. OpenTelemetry provides gRPC instrumentation:

import (
  "go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
  "google.golang.org/grpc"
  "google.golang.org/grpc/credentials/insecure"
)

// Server with tracing. Recent otelgrpc releases instrument through stats
// handlers; the older interceptor constructors are deprecated.
server := grpc.NewServer(
  grpc.StatsHandler(otelgrpc.NewServerHandler()),
)

// Client with tracing (propagates trace context)
conn, err := grpc.NewClient(
  "service:50051",
  // Plaintext for brevity; use TLS credentials in production
  grpc.WithTransportCredentials(insecure.NewCredentials()),
  grpc.WithStatsHandler(otelgrpc.NewClientHandler()),
)

With tracing enabled, a slow gRPC call becomes immediately diagnosable — you can see which service in the call chain introduced the latency.

Load Balancer and Service Mesh Considerations

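gRPC over HTTP/2 doesn't work well with traditional L4 load balancers. An L4 balancer distributes connections, not requests, and a gRPC client multiplexes many RPCs over a single long-lived HTTP/2 connection. All of that client's requests therefore land on whichever server accepted the connection, and fleet-wide averages can look healthy while individual servers are overloaded.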
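
The effect is easy to see in a toy model. The sketch below compares per-connection assignment with per-request assignment for a handful of clients. Both functions and the simple round-robin policy are simplifying assumptions, not a model of any particular load balancer.

```go
package main

// perConnection models an L4 balancer: each client's single long-lived
// connection is pinned to one backend, so all of its RPCs land there.
func perConnection(requestsPerClient []int, backends int) []int {
	if backends <= 0 {
		return nil
	}
	load := make([]int, backends)
	for client, n := range requestsPerClient {
		// The connection is assigned round-robin once, at connect time.
		load[client%backends] += n
	}
	return load
}

// perRequest models an L7 balancer that spreads individual RPCs
// round-robin across backends, regardless of which client sent them.
func perRequest(requestsPerClient []int, backends int) []int {
	if backends <= 0 {
		return nil
	}
	load := make([]int, backends)
	next := 0
	for _, n := range requestsPerClient {
		for i := 0; i < n; i++ {
			load[next%backends]++
			next++
		}
	}
	return load
}
```

With two clients sending 900 and 100 requests to three backends, `perConnection` yields loads of 900, 100, and 0, while `perRequest` yields 334, 333, and 333.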

Proper gRPC load balancing requires:

  • L7-aware load balancers (Envoy, NGINX with grpc_pass, AWS ALB)
  • Client-side load balancing
  • Service mesh (Istio, Linkerd)

Monitor load distribution across instances:

# Check if load is evenly distributed
# Aggregate to one rate per instance, then compare spread to mean
stddev(sum by (instance) (rate(grpc_server_handled_total[5m])))
/ avg(sum by (instance) (rate(grpc_server_handled_total[5m])))

A high coefficient of variation (> 0.3) indicates uneven load distribution.
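
For reference, the coefficient of variation is the standard deviation of the per-instance request rates divided by their mean. A minimal sketch of the calculation follows. The function name is illustrative, and it uses the population standard deviation, as PromQL's `stddev` does.

```go
package main

import "math"

// coefficientOfVariation returns stddev/mean for a set of per-instance
// request rates. It returns NaN for empty input or a zero mean.
func coefficientOfVariation(rates []float64) float64 {
	if len(rates) == 0 {
		return math.NaN()
	}
	var sum float64
	for _, r := range rates {
		sum += r
	}
	mean := sum / float64(len(rates))
	if mean == 0 {
		return math.NaN()
	}
	var squared float64
	for _, r := range rates {
		d := r - mean
		squared += d * d
	}
	// Population standard deviation: divide by n, not n-1.
	return math.Sqrt(squared/float64(len(rates))) / mean
}
```

Two instances handling 50 and 150 requests per second give a mean of 100 and a standard deviation of 50, so the coefficient is 0.5, above the 0.3 threshold mentioned above.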

Conclusion

gRPC monitoring requires understanding its unique characteristics: binary protocols, streaming RPCs, custom status codes, and HTTP/2 multiplexing. Start with the standard health check protocol, add interceptor-based instrumentation, and build dashboards around error rate and latency percentiles broken down by method. AzMonitor can monitor gRPC health check endpoints and integrate with your existing observability stack to give you unified visibility across REST, GraphQL, and gRPC services.
