Uptime monitoring service active; monitors public endpoint at least every 5 minutes; alerts on downtime

ab-000976 · deployment-readiness.monitoring-alerting.uptime-monitoring

Severity: high · active

Why it matters

Without uptime monitoring, you learn about production outages from users, not from alerts. A 5-minute check interval is the operational threshold between acceptable detection time and prolonged silent downtime. SOC 2 A1.1 requires availability controls; NIST SI-4 requires system monitoring; ISO 25010 reliability.availability cannot be demonstrated without instrumentation. An undetected 2-hour outage during off-hours can erase a week of user trust. Checks that run at least every 5 minutes and feed an alert channel limit exposure to minutes, not hours.

Severity rationale

High because without uptime checks at least every 5 minutes, production outages remain undetected until users report them, extending mean time to detection and amplifying business impact.

Remediation

Set up Uptime Robot (free tier supports 5-minute intervals) or Datadog Synthetics for your production endpoint.

For Uptime Robot: sign up at uptimerobot.com, create a monitor for https://your-app.com, set the interval to 5 minutes, and add a Slack webhook or email alert channel. Document the monitoring dashboard URL in DEPLOYMENT.md.

For Datadog:

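# Datadog Synthetics API test (Terraform shown; the same test can be created in the UI)
resource "datadog_synthetics_test" "uptime" {
  name      = "Production uptime"
  type      = "api"
  subtype   = "http"
  status    = "live"  # required by the provider; "paused" disables the test
  message   = "Production endpoint is down."  # add your notification handle here
  locations = ["aws:us-east-1"]

  request_definition {
    method = "GET"
    url    = "https://your-app.com/api/health"
  }

  # Fail the check unless the endpoint returns HTTP 200
  assertion {
    type     = "statusCode"
    operator = "is"
    target   = "200"
  }

  options_list {
    tick_every = 300  # seconds (5 minutes)
  }
}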

Alert to a Slack channel tied to your on-call rotation.
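
For intuition about what a hosted monitor does on each 5-minute tick, here is a minimal Python sketch. It is not a substitute for Uptime Robot or Datadog. It assumes a Slack incoming webhook, which accepts a JSON body with a `text` field. The function names, the `SLACK_WEBHOOK_URL` placeholder, and the 10-second timeout are illustrative assumptions, not part of this check.

```python
import json
import urllib.error
import urllib.request

CHECK_URL = "https://your-app.com/api/health"  # endpoint from the example above
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/..."  # placeholder, supply your own


def probe(url, timeout=10.0):
    """Return (is_up, detail) for a single GET against url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, TLS error, or timeout.
        return False, f"unreachable: {exc}"


def build_alert(url, detail):
    """Build the JSON body for a Slack incoming webhook."""
    return json.dumps({"text": f"DOWN: {url} ({detail})"}).encode("utf-8")


def check_once(url=CHECK_URL, webhook=SLACK_WEBHOOK_URL):
    """Probe once and post to Slack if the endpoint is down."""
    is_up, detail = probe(url)
    if not is_up:
        req = urllib.request.Request(
            webhook,
            data=build_alert(url, detail),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req, timeout=10.0)
    return is_up
```

If you experiment with this, run `check_once()` from a scheduler outside the production host (for example, cron every 5 minutes). A probe on the same machine as the app goes down with it, which is the main reason to prefer a hosted service.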

Detection

ID: uptime-monitoring
Severity: high
What to look for: Enumerate every monitoring integration present in the project: Pingdom, Uptime Robot, Datadog Synthetics, AWS CloudWatch, or New Relic. Check configuration files, documentation, and code for references to monitoring endpoints. Verify that the monitoring interval is at most 5 minutes.
Pass criteria: An uptime monitoring service is configured and actively monitoring the production endpoint with a frequency of 5 minutes or less. Alerts are sent to the team (Slack, email, PagerDuty) on downtime.
Fail criteria: No uptime monitoring service is configured, or monitoring is configured but not actively running, or monitoring interval exceeds 5 minutes.
Skip (N/A) when: The project is not planned for production, or is API-only with no public HTTP endpoint.
Detail on fail: "No uptime monitoring service detected. Production endpoint is not monitored for availability." or "Uptime Robot configured but monitoring only every 30 minutes; exceeds the 5-minute threshold." or "No alert destination configured for downtime notifications."
Remediation: Set up uptime monitoring. Using Uptime Robot (free tier available):
1. Sign up at uptimerobot.com
2. Add a monitor for your production endpoint: https://your-app.com
3. Set check interval to 5 minutes
4. Add notification channels (Slack webhook, email)
5. Document the monitoring dashboard URL in your DEPLOYMENT.md
Or use Datadog:
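```
# conf.d/http_check.d/conf.yaml (Datadog Agent HTTP check); Synthetics tests are
# created in the Datadog UI or Terraform, not in datadog.yaml
init_config:

instances:
  - name: Uptime Check
    url: https://your-app.com/health
    timeout: 5                    # seconds
    min_collection_interval: 300  # seconds (5 minutes)

# Alerting: create a Datadog monitor on the http.can_connect service check
# and notify your Slack channel from its message.
```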
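
The pass and fail criteria above can be read as a small decision procedure. The following Python sketch shows one way to encode it. `MonitorConfig` is a hypothetical summary of what an audit found, not the schema of any real monitoring service. Treating a missing alert destination as a failure is inferred from the pass criteria and the "Detail on fail" examples.

```python
from dataclasses import dataclass, field

MAX_INTERVAL_SECONDS = 300  # the 5-minute threshold from the pass criteria


@dataclass
class MonitorConfig:
    """Hypothetical audit finding; field names are illustrative only."""
    service: str            # e.g. "Uptime Robot"
    interval_seconds: int   # how often the endpoint is checked
    active: bool = True     # False if configured but paused or disabled
    alert_channels: list = field(default_factory=list)  # e.g. ["slack"]


def evaluate(monitor):
    """Return ("pass" or "fail", detail) following the criteria above."""
    if monitor is None:
        return "fail", "No uptime monitoring service detected."
    if not monitor.active:
        return "fail", f"{monitor.service} configured but not actively running."
    if monitor.interval_seconds > MAX_INTERVAL_SECONDS:
        minutes = monitor.interval_seconds // 60
        return "fail", (
            f"{monitor.service} configured but monitoring only every "
            f"{minutes} minutes; exceeds the 5-minute threshold."
        )
    if not monitor.alert_channels:
        return "fail", "No alert destination configured for downtime notifications."
    return "pass", (
        f"{monitor.service} checks every {monitor.interval_seconds}s "
        "and alerts on downtime."
    )
```

For example, `evaluate(MonitorConfig("Uptime Robot", 1800, alert_channels=["slack"]))` fails on the interval rule, while the same configuration with `interval_seconds=300` passes, because the threshold is inclusive.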

External references

iso-25010:2011 · reliability.availability — Availability — degree to which system is operational and accessible
nist:rev5 · SI-4 — System Monitoring
soc2:2017 · A1.1 — Availability — uptime monitoring and notification

Taxons

observability

History

2026-04-18·v1.0.0·Initial import from deployment-readiness·automated