Building a Status Page: Everything You Need to Know

Learn how to build an effective status page that keeps customers informed during outages, builds trust, and reduces support ticket volume.

AzMonitor Team · July 2, 2025 · 9 min read · 1,356 words · Updated January 20, 2026
Tags: status page, incident communication, customer trust, transparency

A status page is the public face of your reliability. When things go wrong — and they always do eventually — your status page determines whether customers feel informed and respected, or abandoned and deceived. Companies that invest in transparent, real-time status pages retain customers through incidents. Companies that stay quiet and hope nobody notices lose them.

What a Status Page Should Show

Before building anything, clarify what your status page needs to communicate:

Component status — Which parts of your service are currently operational? Break down into meaningful components that match how users think about your product.

Current incidents — Any ongoing issues, with real-time updates as the situation evolves.

Incident history — Past incidents with resolution details, demonstrating your track record and improvement over time.

Scheduled maintenance — Upcoming planned maintenance so users can plan around it.

System metrics — Response time and uptime graphs give users confidence even when everything's green.

Designing Your Component Structure

Component names should match user-facing functionality, not internal architecture:

| Don't Use (Internal) | Use Instead (User-facing) |
|---|---|
| postgres-primary-us-east | Database |
| api-gateway-prod | API |
| nginx-load-balancer | Website |
| redis-cluster-1 | Checkout |
| worker-pool-v2 | Background Processing |

Users don't care what your internal systems are called. They care whether they can log in, complete a purchase, or access their data. Name components accordingly:

# Status page components
components:
  - name: "Website"
    description: "Public-facing website and documentation"
    group: "Core Services"
    
  - name: "API"
    description: "Developer API and integrations"
    group: "Core Services"
    
  - name: "Dashboard"
    description: "Customer dashboard and management console"
    group: "Core Services"
    
  - name: "Authentication"
    description: "Login and account management"
    group: "Core Services"
    
  - name: "Payment Processing"
    description: "Payments, subscriptions, and billing"
    group: "Billing"
    
  - name: "Email Notifications"
    description: "Transactional emails and alerts"
    group: "Notifications"
    
  - name: "Webhooks"
    description: "Event delivery to your systems"
    group: "Developer Services"

Component Status Levels

Use consistent status levels that users can quickly understand:

| Status | Meaning | Color |
|---|---|---|
| Operational | Everything working normally | Green |
| Degraded Performance | Slower than usual, still functional | Yellow |
| Partial Outage | Some users or features affected | Orange |
| Major Outage | Service unavailable or severely impaired | Red |
| Under Maintenance | Planned maintenance in progress | Blue |

Be honest about the distinction between degraded and partial outage. A 50% error rate is a partial outage, not "degraded performance."
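
One way to keep these levels consistent in code is to give them an explicit severity order, so the page-wide banner can be derived from the worst component. The sketch below is only an illustration: the `Status` enum, the `overall_status` function, and the numeric ordering (including where maintenance sits) are assumptions made for this example, not part of any particular status page product.

```python
from enum import IntEnum


class Status(IntEnum):
    """Status levels from the table above, ordered by severity (assumed order)."""
    OPERATIONAL = 0
    UNDER_MAINTENANCE = 1
    DEGRADED_PERFORMANCE = 2
    PARTIAL_OUTAGE = 3
    MAJOR_OUTAGE = 4


def overall_status(component_statuses):
    """Return the most severe status among all components.

    An empty mapping is treated as fully operational.
    """
    return max(component_statuses.values(), default=Status.OPERATIONAL)


# Hypothetical snapshot of three components
components = {
    "Website": Status.OPERATIONAL,
    "API": Status.DEGRADED_PERFORMANCE,
    "Payment Processing": Status.PARTIAL_OUTAGE,
}
print(overall_status(components).name)  # PARTIAL_OUTAGE
```

Deriving the banner from the worst component avoids a page that shows "All systems operational" at the top while a component below it is orange.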

Status Page Update Quality

The quality of your status updates matters as much as their frequency. Good updates are:

Specific — "Users cannot complete purchases" not "some users may experience issues."

Honest — If you don't know the cause yet, say so. "We are investigating the root cause" is better than a vague statement that implies you know more than you do.

Forward-looking — What are you doing about it? When will you have more information?

Jargon-free — Write for a business user, not an engineer.

Compare:

Bad update:
"We are experiencing issues with our infrastructure. Our team is working on it."

Good update:
"Users are currently unable to log in to their accounts. 
Our engineering team identified the issue at 14:22 UTC and is working on a fix.
We expect to have more information by 15:00 UTC.
Your data is safe and unaffected."
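
The good update follows a repeatable shape: who is affected, what the team knows, and when the next update will arrive. A small helper can enforce that shape so nobody posts a vague message under pressure. This is a sketch under assumptions: `build_update` and its parameter names are hypothetical, the date is made up for the example, and all timestamps are assumed to be in UTC already.

```python
from datetime import datetime, timezone


def build_update(impact, identified_at, next_update_at, reassurance=""):
    """Compose a status update stating impact, timeline, and next update time."""
    if not impact.strip():
        raise ValueError("State the user-visible impact, e.g. 'Users cannot log in'.")
    if next_update_at <= identified_at:
        raise ValueError("The next update must come after the issue was identified.")

    fmt = "%H:%M UTC"  # datetimes are assumed to be in UTC
    lines = [
        impact.strip(),
        f"Our engineering team identified the issue at {identified_at.strftime(fmt)} "
        "and is working on a fix.",
        f"We expect to have more information by {next_update_at.strftime(fmt)}.",
    ]
    if reassurance:
        lines.append(reassurance)
    return "\n".join(lines)


message = build_update(
    impact="Users are currently unable to log in to their accounts.",
    identified_at=datetime(2025, 7, 2, 14, 22, tzinfo=timezone.utc),
    next_update_at=datetime(2025, 7, 2, 15, 0, tzinfo=timezone.utc),
    reassurance="Your data is safe and unaffected.",
)
print(message)
```

Requiring a next-update time at call site is the design point here: it makes the "forward-looking" rule hard to skip.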

Automatic vs Manual Status Updates

The best status pages combine automation (for speed) with human review (for accuracy):

Automated status changes — Monitoring tools can automatically update component status when checks fail. This is the fastest way to post initial alerts.

Human-written incident descriptions — The "what's happening" and "what we're doing" context requires human judgment. Automate the "something is wrong" notification; write the "here's what's happening" manually.

# Auto-update component status based on monitoring
# (status_page is a placeholder for your status page provider's client)
def handle_monitor_failure(monitor_result):
    """
    Automatically update status page when monitors fail.
    """
    component_map = {
        "website-homepage": "Website",
        "api-health": "API",
        "checkout-flow": "Payment Processing",
        "login-endpoint": "Authentication"
    }
    
    component_name = component_map.get(monitor_result.monitor_name)
    if not component_name:
        return  # No status page mapping for this monitor
    
    failures = monitor_result.consecutive_failures

    # Assumes this handler runs once per failed check, so each threshold
    # below is hit exactly once per outage.
    if failures == 3:
        # Mark the component degraded after 3 consecutive failures
        status_page.update_component(
            name=component_name,
            status="degraded_performance",
            notify_subscribers=True
        )
    elif failures == 5:
        # Escalate to partial outage after 5 consecutive failures
        status_page.update_component(
            name=component_name,
            status="partial_outage",
            notify_subscribers=True
        )

        # Create the incident once, at the moment of escalation
        status_page.create_incident(
            name=f"{component_name} - Investigating Issues",
            status="investigating",
            components=[component_name],
            body="We are investigating reports of issues affecting "
                 f"{component_name}. Our team has been notified."
        )

Subscriber Notifications

Email subscribers as soon as an incident is opened. Proactive notification means customers don't have to go looking for your status page while they're already experiencing problems:

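# Send subscriber notification
# Note: get_subscribers() and email.send() stand in for your own subscriber
# store and mail client; they are not library functions.
def notify_subscribers(incident, notification_type):
    """
    Send email notifications to status page subscribers.
    """
    subscribers = get_subscribers(
        components=incident.affected_components,
        notification_types=[notification_type]
    )
    
    # Both templates are rendered eagerly, so incident.end_time and
    # incident.duration_minutes must exist (even if unset) for new incidents.
    templates = {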
        "incident_created": {
            "subject": f"[Investigating] {incident.name} | YourService Status",
            "body": f"""
We're writing to let you know we are investigating an issue affecting {', '.join(incident.affected_components)}.

Current status: {incident.status}

We'll continue to provide updates as we learn more. You can follow along at:
https://status.yourservice.com

This message was sent because you subscribed to updates for {', '.join(incident.affected_components)}.
"""
        },
        "incident_resolved": {
            "subject": f"[Resolved] {incident.name} | YourService Status",
            "body": f"""
The issue affecting {', '.join(incident.affected_components)} has been resolved.

Duration: {incident.start_time} to {incident.end_time}
Total downtime: {incident.duration_minutes} minutes

We apologize for any inconvenience. We'll publish a detailed postmortem within 5 business days.

https://status.yourservice.com
"""
        }
    }
    
    template = templates[notification_type]
    
    for subscriber in subscribers:
        email.send(
            to=subscriber.email,
            subject=template["subject"],
            body=template["body"]
        )

Scheduled Maintenance Communication

Post maintenance windows at least 72 hours in advance:

# Scheduled maintenance post template
maintenance:
  name: "Database Infrastructure Upgrade"
  scheduled_start: "2025-07-15T02:00:00Z"
  scheduled_end: "2025-07-15T04:00:00Z"
  
  announcement_date: "2025-07-11T09:00:00Z"  # 89 hours before start (meets the 72-hour minimum)
  
  components:
    - "API"
    - "Dashboard"
    - "Payment Processing"
    
  description: |
    We will be performing a scheduled upgrade to our database infrastructure
    to improve performance and reliability.
    
    During this 2-hour window, the following services will be unavailable:
    • API (all endpoints)
    • Dashboard access
    • Payment processing
    
    Authentication will remain available.
    
    We recommend completing any time-sensitive transactions before 02:00 UTC.
    
  reminders:
    - time: 24h_before
      message: "Reminder: Database maintenance begins in 24 hours (02:00-04:00 UTC)"
    - time: 1h_before
      message: "Maintenance begins in 1 hour. Please save your work."
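
The 72-hour rule and the reminder offsets are easy to get wrong by a few hours when dates are typed by hand, so it can help to compute them. The following is a minimal sketch using only the standard library; the function names and the `MIN_NOTICE` constant are assumptions for illustration, and the dates simply reuse the maintenance window from the template.

```python
from datetime import datetime, timedelta, timezone

MIN_NOTICE = timedelta(hours=72)  # lead time recommended in the text


def latest_announcement(scheduled_start):
    """Latest moment an announcement can go out and still give 72 hours' notice."""
    return scheduled_start - MIN_NOTICE


def has_enough_notice(announcement, scheduled_start):
    """True if the announcement precedes the start by at least MIN_NOTICE."""
    return scheduled_start - announcement >= MIN_NOTICE


def reminder_times(scheduled_start, offsets=(timedelta(hours=24), timedelta(hours=1))):
    """Absolute send times for reminders, given offsets before the start."""
    return [scheduled_start - offset for offset in offsets]


start = datetime(2025, 7, 15, 2, 0, tzinfo=timezone.utc)
print(latest_announcement(start).isoformat())  # 2025-07-12T02:00:00+00:00
print(has_enough_notice(datetime(2025, 7, 11, 9, 0, tzinfo=timezone.utc), start))  # True
print(has_enough_notice(datetime(2025, 7, 13, 9, 0, tzinfo=timezone.utc), start))  # False
print([t.isoformat() for t in reminder_times(start)])
```

A check like `has_enough_notice` could run when a maintenance window is created, rejecting windows that are announced too late.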

Status Page Design Principles

Single source of truth — Your status page should be the definitive source for service status. If your status page says green and Twitter says it's down, you have a credibility problem.

Always available — Your status page should be hosted on separate infrastructure from your main service. If your service goes down and takes the status page with it, that's a communications failure.

Embedded status widgets — Provide embeddable widgets your customers can add to their own status pages or dashboards:

<!-- Example embeddable status badge -->
<script>
  window.StatusPage = window.StatusPage || {};
  window.StatusPage.pageId = "YOUR_PAGE_ID";
</script>
<script src="https://status.yourservice.com/embed/script.js"></script>

Historical metrics — Show uptime history on the status page. 99.98% uptime over the past 90 days is more reassuring than a current green light with no historical context.
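
To put the 99.98% figure in concrete terms, the arithmetic behind an uptime percentage can be sketched as follows. The function names are hypothetical, and the calculation assumes uptime is measured as total minutes in the window minus recorded downtime minutes, which is one common convention rather than the only one.

```python
def uptime_percent(downtime_minutes, window_days=90):
    """Uptime over the window as a percentage, from total downtime in minutes."""
    total_minutes = window_days * 24 * 60
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime must be between 0 and the window length")
    return 100 * (1 - downtime_minutes / total_minutes)


def downtime_budget_minutes(target_percent, window_days=90):
    """Minutes of downtime allowed in the window at a given uptime target."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - target_percent / 100)


print(round(uptime_percent(26), 2))              # 99.98
print(round(downtime_budget_minutes(99.98), 1))  # 25.9
```

In other words, 99.98% over 90 days corresponds to roughly 26 minutes of total downtime across 129,600 minutes.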

Measuring Status Page Effectiveness

Track these metrics to understand if your status page is working:

| Metric | How to Measure | Target |
|---|---|---|
| Subscriber count | Count email subscriptions | Growing over time |
| Subscriber notifications sent | Count per incident | 100% for P1/P2 |
| Time to first status update | Incident start to first post | < 10 minutes |
| Support ticket reduction | Compare tickets during incidents | > 20% fewer tickets |
| Status page traffic during incidents | Google Analytics spike analysis | High = users are checking |
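
Two of these metrics reduce to simple calculations once you record incident timestamps and ticket counts. The sketch below shows one possible way to compute them; the function names are hypothetical and the numbers are made-up illustrative values, not measurements.

```python
from datetime import datetime, timezone


def minutes_to_first_update(incident_start, first_post):
    """Minutes between the start of an incident and the first status page post."""
    return (first_post - incident_start).total_seconds() / 60


def ticket_reduction_percent(baseline_tickets, current_tickets):
    """Percentage drop in tickets versus a comparable earlier incident."""
    if baseline_tickets <= 0:
        raise ValueError("baseline_tickets must be positive")
    return 100 * (baseline_tickets - current_tickets) / baseline_tickets


# Illustrative values only
start = datetime(2025, 7, 2, 14, 15, tzinfo=timezone.utc)
first_post = datetime(2025, 7, 2, 14, 22, tzinfo=timezone.utc)
print(minutes_to_first_update(start, first_post))  # 7.0
print(ticket_reduction_percent(200, 150))          # 25.0
```

The ticket comparison is only meaningful between incidents of similar scope and duration, so choose the baseline incident with care.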

Conclusion

A status page is one of the highest-ROI investments in customer trust you can make. It takes 30 minutes to set up the basics, requires ongoing discipline to maintain honestly, and pays dividends every time something goes wrong — which is guaranteed to happen. The combination of real-time monitoring and a proactive status page means customers learn about issues from you, not from each other on Twitter. AzMonitor integrates with status pages to automatically trigger component updates when monitoring checks fail, ensuring your status page reflects reality without requiring manual intervention during the most stressful moments.
