ChatRoom Monitoring Guide
This guide covers using Prometheus and Grafana to monitor the ChatRoom application.
Quick Start
Start Monitoring Services
bash
# Start full monitoring stack
docker compose --profile monitoring up -d
Access URLs
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (default credentials: admin/admin)
- Application Metrics: http://localhost:8080/metrics
Available Metrics
HTTP Metrics
| Metric Name | Type | Description |
|---|---|---|
| http_requests_total | Counter | Total HTTP requests |
| http_request_duration_seconds | Histogram | HTTP request latency |
| http_requests_in_flight | Gauge | Current requests being processed |
WebSocket Metrics
| Metric Name | Type | Description |
|---|---|---|
| chat_ws_connections | Gauge | Current WebSocket connections |
| chat_ws_messages_total | Counter | Total WebSocket messages |
Note: The application exposes chat_ws_connections and chat_ws_messages_total for WebSocket monitoring.
Business Metrics
Metrics related to application business logic:
| Metric Name | Type | Description |
|---|---|---|
| chatroom_users_total | Counter | Total registered users |
| chatroom_rooms_total | Counter | Total rooms created |
| chatroom_messages_total | Counter | Total messages sent |
Note: Custom business metrics can be added following the Prometheus client library patterns.
Grafana Dashboard
Import Dashboard
- Log in to Grafana (http://localhost:3000)
- Navigate to Dashboards > Import
- Upload the grafana-dashboard.json file (if available in deploy/prometheus/)
- Select the Prometheus data source
- Click Import
Dashboard Panels
- Overview: Request volume, error rate, P99 latency
- WebSocket: Connection count, message throughput
- System: CPU, memory, goroutine count
Alert Rules
Example Alert Rules
yaml
groups:
  - name: chatroom
    rules:
      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"
      - alert: HighLatency
        expr: histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High latency detected"
      - alert: TooManyConnections
        expr: chat_ws_connections > 1000
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Too many WebSocket connections"
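The `for` field in each rule delays firing until the expression has been true at every evaluation for at least that long. The following is a minimal sketch of that behaviour, not Prometheus code: the function name `alert_states` and the sample values are hypothetical, and the 15-second evaluation spacing is an assumption for illustration.

```python
def alert_states(samples, threshold, for_seconds):
    """Simulate the pending/firing lifecycle of a threshold alert.

    samples: list of (timestamp_seconds, value), one per rule evaluation.
    Returns one state per sample: "inactive", "pending", or "firing".
    """
    states = []
    active_since = None  # timestamp at which the condition first became true
    for ts, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = ts
            # The alert fires once the condition has held for the full duration.
            held_for = ts - active_since
            states.append("firing" if held_for >= for_seconds else "pending")
        else:
            # Any evaluation where the condition is false resets the timer.
            active_since = None
            states.append("inactive")
    return states


# Hypothetical connection counts, evaluated every 15 seconds against the
# TooManyConnections rule (threshold 1000, for: 1m).
samples = [(0, 900), (15, 1200), (30, 1300), (45, 1100),
           (60, 1250), (75, 1500), (90, 800)]
print(alert_states(samples, threshold=1000, for_seconds=60))
```

A short spike above the threshold therefore stays in the pending state and never notifies anyone, which is why the connection alert uses a shorter `for` value than the error-rate and latency alerts.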
Common PromQL Queries
Request Rate
rate(http_requests_total[5m])
Error Rate
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))
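To make the error-rate query concrete, here is a rough sketch of the arithmetic it performs, assuming a single series and ignoring counter resets and the boundary extrapolation that `rate()` applies. The function names and the counter values are hypothetical.

```python
def per_second_rate(first_value, last_value, window_seconds):
    """Average per-second increase of a counter across the window."""
    return (last_value - first_value) / window_seconds


def error_ratio(total_first, total_last, errors_first, errors_last,
                window_seconds=300):
    """Fraction of requests that returned 5xx during the window.

    Returns None when there was no traffic (PromQL would yield NaN for 0/0).
    """
    total_rate = per_second_rate(total_first, total_last, window_seconds)
    if total_rate == 0:
        return None
    error_rate = per_second_rate(errors_first, errors_last, window_seconds)
    return error_rate / total_rate


# Hypothetical counter readings at the start and end of a 5-minute window:
# 600 requests in total, 30 of them with a 5xx status.
print(error_ratio(1000, 1600, 10, 40))
```

The result is a ratio between 0 and 1, so a dashboard panel showing it as a percentage should multiply by 100 or use a percent unit.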
P99 Latency
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))
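The P99 value is an estimate interpolated from bucket counts, not an exact measurement. The sketch below is a simplified reimplementation of that idea for classic histograms, assuming cumulative buckets that end in a `+Inf` bucket and include at least one finite bucket. The bucket boundaries and counts are hypothetical.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count); the largest bound
    must be math.inf. Requires 0 < q <= 1.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total  # number of observations at or below the quantile
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile lies beyond the last finite bucket, so the
                # best available answer is that bucket's upper bound.
                return buckets[-2][0]
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Hypothetical latency buckets (seconds) with cumulative counts.
buckets = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
print(histogram_quantile(0.75, buckets))
print(histogram_quantile(0.99, buckets))
```

Because the estimate can never exceed the highest finite bucket bound, the bucket layout needs a boundary above the alerting threshold for a rule such as `> 1` to be able to trigger.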
WebSocket Connection Count
chat_ws_connections
WebSocket Message Rate
rate(chat_ws_messages_total[5m])
Troubleshooting
Metrics Endpoint Not Responding
- Check if application is running: curl http://localhost:8080/health
- Check metrics endpoint: curl http://localhost:8080/metrics
- Check Prometheus target status: http://localhost:9090/targets
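When the endpoint responds but dashboards stay empty, it can help to confirm that the expected metric families are actually exposed. The following is a small sketch that reads the `# TYPE` lines of the Prometheus text format. The helper names are hypothetical, and the expected list is simply copied from the tables in this guide.

```python
import urllib.request

# Metric names taken from the tables in this guide.
EXPECTED = [
    "http_requests_total",
    "http_request_duration_seconds",
    "http_requests_in_flight",
    "chat_ws_connections",
    "chat_ws_messages_total",
]


def metric_families(exposition_text):
    """Map each metric family declared by a '# TYPE' line to its type."""
    families = {}
    for line in exposition_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            families[parts[2]] = parts[3]
    return families


def missing_metrics(exposition_text, expected=EXPECTED):
    """Return the expected metric names that the endpoint does not declare."""
    families = metric_families(exposition_text)
    return [name for name in expected if name not in families]


def fetch_metrics(url="http://localhost:8080/metrics", timeout=5):
    """Download the raw metrics text (requires the application to be running)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the application running, `print(missing_metrics(fetch_metrics()))` lists anything absent. Labelled metrics may not appear until they have been observed at least once, so a missing name right after startup is not necessarily a fault.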
Grafana No Data
- Confirm that the Prometheus data source configuration is correct
- Check that the dashboard time range covers a period with data
- Verify PromQL query syntax
Configuration Files
The monitoring stack configuration files are located in:
deploy/prometheus/
├── prometheus.yml # Prometheus configuration
├── grafana-dashboard.json # Grafana dashboard definition
└── alert-rules.yml # Alert rules (optional)
Prometheus Configuration Example
yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'chatroom'
    static_configs:
      - targets: ['chatroom:8080']
    metrics_path: /metrics
Further Reading
- Prometheus Documentation
- Grafana Documentation
- Architecture Documentation — System metrics implementation details