Without exported metrics, you cannot answer the most basic operational questions: how many connections are active right now, is message throughput degrading, is the outbound queue backing up? Invisible failures (a Redis adapter that silently stops publishing, a connection leak growing at 10 sockets per hour) go undetected until they cause an outage. ISO 25010 lists availability and recoverability among its reliability sub-characteristics, and neither can be verified without visibility into system health; a real-time service with no observability is operationally blind.
Low because missing metrics delay detection of connection leaks and queue backlogs, turning slow-burn failures into surprise outages rather than paged alerts.
Export at minimum three metrics (an active-connections gauge, a messages-processed counter, and a queue-depth gauge) to a Prometheus-compatible endpoint or your existing observability backend. The example below registers the first two:
import promClient from 'prom-client';

const activeConns = new promClient.Gauge({
  name: 'ws_active_connections',
  help: 'Live WebSocket connections',
});

const msgTotal = new promClient.Counter({
  name: 'ws_messages_total',
  help: 'Messages processed since startup',
});

io.on('connection', (socket) => {
  activeConns.inc();
  socket.on('disconnect', () => activeConns.dec());
  socket.on('send_message', () => msgTotal.inc());
});

app.get('/metrics', async (_req, res) => {
  res.set('Content-Type', promClient.register.contentType);
  res.end(await promClient.register.metrics());
});
Scrape /metrics with Prometheus, or forward the metrics to Datadog or CloudWatch. Add an alert that fires when ws_active_connections exceeds your tested ceiling.
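The example above registers the connection gauge and the message counter but not queue depth. Here is a minimal sketch of one way to cover it, under stated assumptions: the InstrumentedQueue class, the ws_outbound_queue_depth metric name, and the stand-in gauge are all hypothetical. The only prom-client behavior relied on is that a Gauge exposes set(value).

```javascript
// Hypothetical outbound queue that reports its depth on every change.
// `gauge` is any object with a set(number) method; a prom-client Gauge fits.
class InstrumentedQueue {
  constructor(gauge) {
    this.items = [];
    this.gauge = gauge;
    this.gauge.set(0); // start from a known value
  }

  enqueue(message) {
    this.items.push(message);
    this.gauge.set(this.items.length);
  }

  dequeue() {
    const message = this.items.shift(); // undefined when the queue is empty
    this.gauge.set(this.items.length);
    return message;
  }
}

// Stand-in gauge so the sketch runs without dependencies. In the service you
// would pass something like
//   new promClient.Gauge({ name: 'ws_outbound_queue_depth', help: 'Messages waiting to be sent' })
// instead (metric name is an assumption).
const fakeGauge = {
  value: 0,
  set(v) {
    this.value = v;
  },
};

const queue = new InstrumentedQueue(fakeGauge);
queue.enqueue({ room: 'lobby', body: 'hi' });
queue.enqueue({ room: 'lobby', body: 'again' });
queue.dequeue();
```

Setting the gauge at the point of mutation keeps the exported value in step with the queue, so a backlog shows up on the next scrape. If your queue lives in Redis or another external store, you would more likely sample its length periodically.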
ID: community-realtime.realtime-ux.observability-metrics
Severity: low
What to look for: Count all metrics and observability integrations. Enumerate the metric types tracked: active connection count, message throughput, queue depth, error rates, latency percentiles. Count the observability backends: Prometheus, Datadog, CloudWatch, or equivalent.
Pass criteria: The system exposes at least 3 real-time metrics to an observability backend (Prometheus, Datadog, CloudWatch, etc.) including at minimum active connection count.
Fail criteria: No metrics are exposed, or metrics are only logged without being sent to an observability system.
Skip (N/A) when: Never — observability is essential for production systems.
Cross-reference: For broader monitoring patterns and error tracking, the SaaS Error Handling Audit covers observability infrastructure.
Detail on fail: "No metrics infrastructure. Unable to monitor connection count, throughput, or queue health."
Remediation: Export real-time metrics to an observability system:
import promClient from 'prom-client';

// Assumes `io` is a Socket.IO-style server and `app` an Express-style app,
// both created elsewhere in the service.
const activeConnections = new promClient.Gauge({
  name: 'websocket_active_connections',
  help: 'Number of active WebSocket connections',
});

const messagesThroughput = new promClient.Counter({
  name: 'websocket_messages_total',
  help: 'Total number of messages processed',
});

io.on('connection', (socket) => {
  activeConnections.inc();
  socket.on('disconnect', () => activeConnections.dec());

  // Per-socket listeners must be registered inside the connection handler,
  // where `socket` is in scope.
  socket.on('message', () => {
    messagesThroughput.inc();
  });
});

// Expose Prometheus endpoint
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', promClient.register.contentType);
  // register.metrics() returns a Promise in current prom-client releases.
  res.end(await promClient.register.metrics());
});
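The remediation above covers connection count and throughput. The error-rate and latency metrics named under "What to look for" could be captured with a small handler wrapper, sketched here under assumptions. The instrumentHandler helper and the stand-in metric objects are hypothetical; the sketch relies only on a prom-client Histogram exposing observe(value) and a Counter exposing inc().

```javascript
// Wraps a message handler so that each call records its duration and any
// thrown error. `latency` needs observe(seconds); `errors` needs inc().
function instrumentHandler(handler, { latency, errors }) {
  return async (...args) => {
    const start = process.hrtime.bigint();
    try {
      return await handler(...args);
    } catch (err) {
      errors.inc();
      throw err; // rethrow so existing error handling still sees the failure
    } finally {
      const seconds = Number(process.hrtime.bigint() - start) / 1e9;
      latency.observe(seconds);
    }
  };
}

// Stand-ins so the sketch runs without dependencies. In the service these
// would be a promClient.Histogram and a promClient.Counter.
const fakeLatency = {
  samples: [],
  observe(v) {
    this.samples.push(v);
  },
};
const fakeErrors = {
  count: 0,
  inc() {
    this.count += 1;
  },
};

// Hypothetical wiring inside the connection handler:
//   socket.on('message', instrumentHandler(handleMessage, {
//     latency: messageLatency,
//     errors: messageErrors,
//   }));
```

Recording durations in a histogram lets the backend derive percentiles at query time, so the service never has to compute them itself.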