The decision between a public and private status page isn't just about who can see it. It also determines what you say, how much detail you share, and what kind of trust you build. Many organizations need both: a public page for customers and a private page for internal teams with richer operational detail.
Public Status Pages
A public status page is accessible to anyone without authentication. This is the status page that customers bookmark and check when they experience issues.
What Public Status Pages Should Show
## Public Status Page Content Guide
### SHOW customers:
- Current operational status (Up/Degraded/Down) per service component
- Recent incident history (last 90 days minimum)
- Active incident updates with timestamps
- Scheduled maintenance windows (with advance notice)
- Historical uptime percentages (builds credibility)
- Subscriber notification signup
### DON'T show customers:
- Internal service names they don't interact with ("redis-cluster-primary")
- Technical root cause details before you understand them
- Infrastructure configuration details (security risk)
- Names of engineers responding (privacy, not relevant)
- Speculation about causes before confirmed
- Metrics that require context to interpret (CPU %, connection pool utilization)
Public Status Page Component Naming
Translate infrastructure to customer language:
| Internal Name | Customer-Facing Name |
|---|---|
| api-gateway-prod | API |
| postgresql-primary | Data storage |
| redis-cluster | Real-time features |
| cdn-edge-nodes | Website & app loading |
| worker-fleet | Background processing |
| auth-service | Login & authentication |
| stripe-integration | Billing & payments |
| notification-service | Email & SMS notifications |
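One way to apply this translation consistently is an explicit allowlist mapping. The following is a minimal sketch using the names from the table above; `CUSTOMER_FACING_NAMES` and `to_public_name` are hypothetical names chosen for illustration, not part of any particular status page product. Services without a mapping return `None`, so a newly added internal component stays hidden until someone deliberately gives it a customer-facing name.

```python
from typing import Optional

# Mapping taken from the naming table; extend it as components are added.
CUSTOMER_FACING_NAMES = {
    "api-gateway-prod": "API",
    "postgresql-primary": "Data storage",
    "redis-cluster": "Real-time features",
    "cdn-edge-nodes": "Website & app loading",
    "worker-fleet": "Background processing",
    "auth-service": "Login & authentication",
    "stripe-integration": "Billing & payments",
    "notification-service": "Email & SMS notifications",
}


def to_public_name(internal_name: str) -> Optional[str]:
    """Return the customer-facing name, or None if the service has no mapping."""
    return CUSTOMER_FACING_NAMES.get(internal_name)


if __name__ == "__main__":
    for name in ("auth-service", "redis-cluster-primary"):
        print(name, "->", to_public_name(name))
```

Treating "unmapped" as "not shown" is a design choice: it fails closed, which fits the guidance to keep internal service names off the public page.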
Public Incident Communication Standards
## Public Incident Writing Guidelines
### First update (within 10 minutes):
What: State the symptom in customer terms
Example: "Some users may experience errors when attempting to log in."
What NOT to say: "Authentication microservice is returning 503s"
Why: Customers don't know what that means and it sounds like you're
unsure about your own architecture.
### Progress updates (every 15-30 minutes):
Good: "We have identified the cause and are implementing a fix."
Bad: "We are still investigating the issue and hope to resolve it soon."
### Resolution:
Good: "Login is now working normally. The issue lasted 23 minutes.
A detailed incident report will be published within 3 business days."
Bad: "Issue resolved." (No context, no commitment to learning)
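The timing targets in these guidelines can be turned into a simple reminder check. This is a sketch only, assuming a 10-minute deadline for the first update and treating 30 minutes (the upper end of the 15-30 minute range) as the maximum gap between later updates; the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

FIRST_UPDATE_DEADLINE = timedelta(minutes=10)  # "within 10 minutes"
MAX_UPDATE_GAP = timedelta(minutes=30)         # upper end of "every 15-30 minutes"


def next_update_due(incident_start: datetime, last_update: Optional[datetime]) -> datetime:
    """Return the time by which the next public update should be posted."""
    if last_update is None:
        # No update has been posted yet, so the first-update deadline applies.
        return incident_start + FIRST_UPDATE_DEADLINE
    return last_update + MAX_UPDATE_GAP


def is_update_overdue(
    incident_start: datetime, last_update: Optional[datetime], now: datetime
) -> bool:
    """True when the public page has gone quiet for longer than the guideline allows."""
    return now > next_update_due(incident_start, last_update)


if __name__ == "__main__":
    start = datetime(2025, 8, 13, 14, 22, tzinfo=timezone.utc)
    print(next_update_due(start, None).isoformat())
```

A check like this could feed a reminder to whoever owns the public page; it does not decide what the update should say.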
Private Status Pages
Private status pages serve internal audiences — engineering, customer success, support, and leadership — with more detail than customers need.
Internal Status Page Audiences
Different internal audiences need different information:
Engineering team:
- Which specific services are affected
- Current metrics (error rate, latency, affected requests per second)
- Active incident channel link and war room details
- Deploy history for correlation
- Database and infrastructure metrics
Customer success/support:
- Which customers are affected
- Talking points for customer conversations
- Workarounds available for customers
- Expected resolution timeline
- Escalation path if customers need immediate help
Leadership/executives:
- Business impact (revenue affected, customers affected)
- Current status in plain English
- Estimated resolution time
- External communication status
Building a Private Status Dashboard
# private_status_api.py
from datetime import datetime, timezone
from functools import wraps

from flask import Flask, jsonify, request

app = Flask(__name__)

# NOTE: validate_internal_token() and the get_*() helpers used below are
# application-specific and are not defined in this example. Wire them to your
# own SSO/API-key check, monitoring data, and incident tracker.


def require_internal_auth(f):
    """Decorator that requires internal authentication (SSO, API key, etc.)."""
    @wraps(f)
    def decorated(*args, **kwargs):
        auth_token = request.headers.get("Authorization")
        if not validate_internal_token(auth_token):
            return jsonify({"error": "Unauthorized"}), 401
        return f(*args, **kwargs)
    return decorated


@app.route('/internal/status')
@require_internal_auth
def internal_status():
    """Rich status for the internal team, with more detail than the public page."""
    return jsonify({
        # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated)
        "generated_at": datetime.now(timezone.utc).isoformat(),

        # Public status (same as public page)
        "public_status": get_public_component_status(),

        # Internal-only: detailed service health (values here are illustrative)
        "internal_services": {
            "api_gateway": {
                "status": "operational",
                "error_rate_pct": 0.02,
                "p99_latency_ms": 187,
                "requests_per_second": 1240
            },
            "postgresql_primary": {
                "status": "operational",
                "connections_used": 245,
                "connections_max": 500,
                "replication_lag_ms": 12
            },
            "redis_cluster": {
                "status": "operational",
                "memory_used_pct": 68,
                "hit_rate_pct": 94.2
            }
        },

        # Active incidents with full detail
        "active_incidents": get_active_incidents_with_detail(),

        # Customer impact (illustrative values)
        "customer_impact": {
            "enterprise_customers_affected": 0,
            "total_requests_affected_last_hour": 0,
            "active_support_tickets_related": 3
        },

        # Recent deployments (correlation context)
        "recent_deployments": get_deployments_last_4_hours()
    })


@app.route('/internal/status/customer-facing')
@require_internal_auth
def customer_success_view():
    """
    Simplified view optimized for the customer success team.
    Shows customer impact without technical infrastructure detail.
    """
    return jsonify({
        "overall_status": get_simple_status(),
        "customer_talking_points": get_current_talking_points(),
        "affected_enterprise_accounts": get_affected_enterprise_accounts(),
        "available_workarounds": get_current_workarounds(),
        "eta": get_resolution_eta()
    })
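The dashboard code calls `validate_internal_token` without showing it. One possible shape, assuming a single shared token supplied through an environment variable and sent as a `Bearer` header, is sketched below. The variable name `INTERNAL_STATUS_TOKEN` is hypothetical, and an SSO-backed deployment would more likely verify a signed session or JWT instead.

```python
import hmac
import os
from typing import Optional

# Assumption: one shared secret, configured via an environment variable.
INTERNAL_TOKEN_ENV = "INTERNAL_STATUS_TOKEN"  # hypothetical variable name


def validate_internal_token(auth_header: Optional[str]) -> bool:
    """Accept 'Bearer <token>' headers whose token matches the configured secret."""
    expected = os.environ.get(INTERNAL_TOKEN_ENV)
    if not expected or not auth_header:
        # Fail closed when the secret is not configured or the header is missing.
        return False
    scheme, _, token = auth_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest runs in constant time, so timing does not reveal a partial match.
    return hmac.compare_digest(token.encode(), expected.encode())
```

Failing closed when the secret is unset means a misconfigured deployment rejects every request rather than exposing internal metrics.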
Hybrid Approach: Public and Private
Most mature organizations run both:
## Hybrid Status Page Architecture
### Public Status Page (status.example.com)
- Publicly accessible
- Customer-friendly language
- Component status + incident history
- Subscriber notifications
- Historical uptime
- Links to postmortems after incidents
### Internal Status Dashboard (internal-status.example.com)
- SSO required (company employees only)
- Full technical metrics
- Real-time customer impact data
- Active incident coordination
- Deployment history for correlation
- Links to war room channels
### How Information Flows Between Them
- Public page updated manually by Communications Lead during incidents
- Internal dashboard auto-updated from monitoring data
- Internal team sees full picture; customers see appropriate subset
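To illustrate the "appropriate subset" idea, the sketch below reduces internal service health to one status per public component and drops every metric along the way. It assumes status strings of `operational`, `degraded`, and `down`, and a hypothetical `PUBLIC_COMPONENTS` mapping from public components to the internal services behind them. Because the public page is updated manually, output like this would serve as a draft for the Communications Lead to review, not something to publish automatically.

```python
# Statuses ranked from best to worst so the worst one wins in a roll-up.
SEVERITY = {"operational": 0, "degraded": 1, "down": 2}

# Assumption: which internal services back each public component.
PUBLIC_COMPONENTS = {
    "API": ["api_gateway"],
    "Data storage": ["postgresql_primary"],
    "Real-time features": ["redis_cluster"],
}


def draft_public_status(internal_services: dict) -> dict:
    """Reduce internal service health to one status per public component.

    Metrics such as latency or connection counts are dropped; only the worst
    status among the backing services is kept. Unknown status strings are
    ranked alongside "down" so they are never reported as healthy.
    """
    draft = {}
    for component, services in PUBLIC_COMPONENTS.items():
        statuses = [
            internal_services[name]["status"]
            for name in services
            if name in internal_services
        ]
        if statuses:
            draft[component] = max(statuses, key=lambda s: SEVERITY.get(s, 2))
    return draft


if __name__ == "__main__":
    internal = {
        "api_gateway": {"status": "degraded", "p99_latency_ms": 900},
        "postgresql_primary": {"status": "operational", "connections_used": 245},
    }
    print(draft_public_status(internal))
```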
What to Publish After Incidents
Post-incident transparency builds long-term trust:
Public Postmortem Structure
# Incident Report: Login Service Disruption
## August 13, 2025 | 14:22 - 14:45 UTC | 23 Minutes
### Summary
Some users experienced errors when attempting to log in to [Product].
The issue affected approximately 15% of login attempts during the 23-minute period.
### What happened
A configuration change deployed at 14:10 UTC contained an error that caused
the authentication service to reject valid login requests for a subset of users.
### What we did
Our monitoring system detected elevated error rates at 14:22 UTC and immediately
paged our on-call team. We identified the cause at 14:35 UTC and rolled back the
configuration change, restoring full service by 14:45 UTC.
### What we're doing to prevent recurrence
- Added automated testing for authentication configuration changes
- Implemented a canary deployment process for auth service changes
- Improved monitoring to detect this class of error more quickly
### Credits
Users affected during this period are entitled to a service credit per our SLA.
Credits will be automatically applied to your next invoice.
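The durations in a report like this are easy to check mechanically before publishing. The sketch below uses the four timestamps from the example report; the event labels and helper names are assumptions made for illustration. It confirms that the 23-minute figure runs from the alert to restoration, and it also shows a 12-minute gap between the configuration deploy and the alert, which the example report does not count toward the incident duration.

```python
from datetime import datetime, timezone


def utc(hour: int, minute: int) -> datetime:
    """Timestamp on the incident date (August 13, 2025) in UTC."""
    return datetime(2025, 8, 13, hour, minute, tzinfo=timezone.utc)


# Times taken from the example report; the key names are illustrative.
timeline = {
    "config_deployed": utc(14, 10),
    "alert_fired": utc(14, 22),
    "cause_identified": utc(14, 35),
    "service_restored": utc(14, 45),
}


def minutes_between(start_key: str, end_key: str) -> int:
    """Whole minutes elapsed between two events in the timeline."""
    delta = timeline[end_key] - timeline[start_key]
    return int(delta.total_seconds() // 60)


reported_duration = minutes_between("alert_fired", "service_restored")      # 23
time_to_detect = minutes_between("config_deployed", "alert_fired")          # 12
time_to_identify = minutes_between("alert_fired", "cause_identified")       # 13
time_to_restore = minutes_between("cause_identified", "service_restored")   # 10
```

Whether the window before detection belongs in the published duration is a judgment call for the report's author; the arithmetic only makes the choice visible.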
Status Page Trust Factors
What makes customers trust a status page:
| Trust Builder | Trust Destroyer |
|---|---|
| Historical transparency (showing past incidents) | No incident history (looks like you hide problems) |
| Quick initial acknowledgment (< 10 minutes) | Long silence followed by "we were aware all along" |
| Specific impact information | Vague "some users" language without clarification |
| Resolution time estimates (when confident) | Premature "almost resolved" declarations |
| Proactive updates on a schedule | Irregular or long gaps between updates |
| Postmortem links after incidents | Incidents that "disappear" from history |
| Uptime percentage displayed | No uptime data (implies you're hiding it) |
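For the uptime percentage row, the underlying arithmetic is simple. The sketch below assumes downtime is counted as whole minutes of full outage within a fixed window; the function name and the 90-day default are illustrative choices. How to weight a partial outage, such as one that affects only a fraction of requests, is a policy decision this sketch does not make.

```python
def uptime_percentage(downtime_minutes: float, window_days: int = 90) -> float:
    """Percentage of the window during which the service was up."""
    total_minutes = window_days * 24 * 60
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime must be between 0 and the window length")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


if __name__ == "__main__":
    # A single 23-minute incident over 90-day and 30-day windows.
    print(round(uptime_percentage(23, window_days=90), 3))  # 99.982
    print(round(uptime_percentage(23, window_days=30), 3))  # 99.947
```

The same incident looks different depending on the window, so it helps to state the window next to the displayed number.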
Conclusion
Public and private status pages serve fundamentally different audiences with different information needs. The public page builds customer trust through transparency about what matters to them; the private page gives internal teams the operational detail they need to respond effectively. The discipline of maintaining both — updating the public page with appropriate customer language while giving internal teams rich technical context — is a communication skill that improves with practice. AzMonitor's status page product provides both public status pages with historical uptime and incident history, and the monitoring data that powers automated status updates, making it practical to maintain the communication standards that build lasting customer trust.