Monitoring architecture
Configuring Prometheus
▸prometheus.yml
yaml
global:
  scrape_interval: 15s      # how often targets are scraped
  evaluation_interval: 15s  # how often alerting rules are evaluated

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

rule_files:
  - 'alert_rules.yml'

scrape_configs:
  # Prometheus scrapes its own metrics
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # Host-level metrics (CPU, memory, disk)
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']

  # Application metrics
  - job_name: 'app'
    static_configs:
      - targets: ['app:3000']
    metrics_path: '/metrics'
▸Docker Compose for Prometheus
yaml
version: '3.8'
services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      # Mount the rules file referenced by rule_files in prometheus.yml
      - ./alert_rules.yml:/etc/prometheus/alert_rules.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana
    environment:
      # Default password for local use only; change it before exposing Grafana
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana_data:/var/lib/grafana
    ports:
      - "3000:3000"

  node-exporter:
    image: prom/node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'

# The alertmanager and app targets from prometheus.yml are not defined in this file.
volumes:
  prometheus_data:
  grafana_data:
Exporting metrics from the application
▸Node.js (prom-client)
javascript
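const express = require('express');
const client = require('prom-client');

const app = express();

// Collect default process metrics (CPU, memory, event loop lag, ...)
client.collectDefaultMetrics();

// Custom metric: histogram of HTTP request durations
const httpRequestDuration = new client.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status'],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 5],
});

// Record the duration of every request once the response has been sent
app.use((req, res, next) => {
  const end = httpRequestDuration.startTimer();
  res.on('finish', () => {
    const route = req.route ? req.route.path : 'unknown';
    end({ method: req.method, route, status: res.statusCode });
  });
  next();
});

// /metrics endpoint scraped by Prometheus
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', client.register.contentType);
  res.end(await client.register.metrics());
});

// Port 3000 matches the 'app:3000' scrape target in prometheus.yml
app.listen(3000);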
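To make the `buckets` option more concrete, here is a minimal sketch of how a Prometheus-style histogram counts observations. It is not prom-client's implementation: `createHistogram`, `observe`, and `snapshot` are hypothetical names used only for illustration, and labels are ignored. The point it shows is that buckets are cumulative: each one counts every observation less than or equal to its upper bound, and an implicit `+Inf` bucket counts everything.

```javascript
// Simplified model of a cumulative histogram (illustration only, not prom-client code).
function createHistogram(upperBounds) {
  const bounds = [...upperBounds].sort((a, b) => a - b);
  // One counter per bound plus the implicit +Inf bucket at the end.
  const counts = new Array(bounds.length + 1).fill(0);
  let sum = 0;

  return {
    observe(value) {
      sum += value;
      // Cumulative: increment every bucket whose upper bound is >= value.
      bounds.forEach((le, i) => {
        if (value <= le) counts[i] += 1;
      });
      counts[bounds.length] += 1; // +Inf matches every observation
    },
    snapshot() {
      const buckets = bounds.map((le, i) => ({ le: String(le), count: counts[i] }));
      buckets.push({ le: '+Inf', count: counts[bounds.length] });
      return { buckets, sum, count: counts[bounds.length] };
    },
  };
}

// Same bucket bounds as the http_request_duration_seconds example.
const demo = createHistogram([0.01, 0.05, 0.1, 0.5, 1, 5]);
[0.004, 0.03, 0.2, 0.7, 7].forEach((seconds) => demo.observe(seconds));
const snap = demo.snapshot();
console.log(snap.buckets.map((b) => `le=${b.le}: ${b.count}`).join('\n'));
```

The 7-second request lands only in `+Inf`, which is a hint that the largest finite bound should sit above the slowest response you expect to measure.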
▸C# (prometheus-net)
csharp
using Prometheus;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Metrics are created once and reused across requests
var counter = Metrics.CreateCounter("http_requests_total", "Total HTTP requests");
var histogram = Metrics.CreateHistogram("http_request_duration", "Request duration in seconds");

app.MapGet("/", async context =>
{
    counter.Inc();
    // NewTimer() records the elapsed time when it is disposed
    using (histogram.NewTimer())
    {
        await context.Response.WriteAsync("Hello World!");
    }
});

// Expose /metrics on the application's own port (prometheus-net.AspNetCore)
app.MapMetrics();

app.Run();
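Both client libraries answer a scrape with the Prometheus text exposition format. As a rough sketch of what Prometheus receives for a single unlabeled counter, the hypothetical helper `renderCounter` below (not part of prom-client or prometheus-net) builds that text by hand; real libraries generate it for you.

```javascript
// Render one unlabeled counter in the Prometheus text exposition format.
// renderCounter is a hypothetical helper for illustration only.
function renderCounter(name, help, value) {
  return [
    `# HELP ${name} ${help}`,   // human-readable description
    `# TYPE ${name} counter`,   // metric type
    `${name} ${value}`,         // sample: metric name and current value
  ].join('\n') + '\n';
}

const scrapeBody = renderCounter('http_requests_total', 'Total HTTP requests', 3);
console.log(scrapeBody);
```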
PromQL queries
promql
# Average CPU usage in user mode, in percent, per instance
avg by (instance) (rate(node_cpu_seconds_total{mode="user"}[5m])) * 100

# Memory usage, in percent
(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100

# HTTP requests per second
rate(http_requests_total[5m])

# 95th percentile of response time
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_bucket[5m])))
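The percentile query is the least obvious of these, so here is a simplified sketch of the idea behind `histogram_quantile`: find the bucket that contains the target rank, then interpolate linearly inside it. This is an approximation written for illustration, not Prometheus source code; `histogramQuantile` and the sample bucket counts are assumptions.

```javascript
// Approximate a quantile from cumulative buckets, in the spirit of PromQL's
// histogram_quantile(). Simplified sketch; expects at least one finite bucket
// plus the +Inf bucket.
function histogramQuantile(q, buckets) {
  // buckets: [{ le: upperBound (Infinity for +Inf), count: cumulativeCount }]
  const sorted = [...buckets].sort((a, b) => a.le - b.le);
  const total = sorted[sorted.length - 1].count;
  if (total === 0) return NaN;

  const rank = q * total;
  const i = sorted.findIndex((b) => b.count >= rank);

  // The rank falls into +Inf: the best available answer is the highest finite bound.
  if (sorted[i].le === Infinity) return sorted[sorted.length - 2].le;

  const lower = i === 0 ? 0 : sorted[i - 1].le;    // lower edge of the bucket
  const below = i === 0 ? 0 : sorted[i - 1].count; // observations below that edge
  const inBucket = sorted[i].count - below;
  if (inBucket === 0) return lower;

  // Assume observations are spread evenly inside the bucket.
  return lower + (sorted[i].le - lower) * ((rank - below) / inBucket);
}

// Hypothetical cumulative counts for 100 requests.
const sampleBuckets = [
  { le: 0.1, count: 50 },
  { le: 0.5, count: 90 },
  { le: 1, count: 98 },
  { le: Infinity, count: 100 },
];
const p95 = histogramQuantile(0.95, sampleBuckets);
console.log(p95); // about 0.8125: rank 95 lies 5/8 of the way through the 0.5..1 bucket
```

Because the result is interpolated, its accuracy depends on how closely the bucket bounds bracket the latencies you care about.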
Alerts
yaml
groups:
  - name: alerts
    rules:
      - alert: HighCPU
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"

      - alert: ServiceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Service {{ $labels.job }} is down"
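The `for` clause means an alert fires only after its expression has stayed true for the whole duration; until then it is pending. The sketch below models that behavior under simplifying assumptions (a single alert, no labels); `createAlertTracker` is a hypothetical name, not a Prometheus API.

```javascript
// Simplified model of an alerting rule's `for` clause (illustration only).
function createAlertTracker(forSeconds) {
  let activeSince = null; // time at which the condition first became true

  // Call once per rule evaluation; returns 'inactive', 'pending' or 'firing'.
  return function evaluate(nowSeconds, conditionIsTrue) {
    if (!conditionIsTrue) {
      activeSince = null; // a single false evaluation resets the timer
      return 'inactive';
    }
    if (activeSince === null) activeSince = nowSeconds;
    return nowSeconds - activeSince >= forSeconds ? 'firing' : 'pending';
  };
}

// ServiceDown uses `for: 1m`; rules are evaluated every 15s (evaluation_interval).
// The condition holds from t=0 to t=60 and clears at t=75.
const serviceDown = createAlertTracker(60);
const states = [0, 15, 30, 45, 60, 75].map((t) => serviceDown(t, t !== 75));
console.log(states.join(' -> '));
// pending -> pending -> pending -> pending -> firing -> inactive
```

This is why a short blip in `up` does not page anyone: the target has to stay down across several consecutive evaluations.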
Configuring Alertmanager
yaml
global:
  slack_api_url: 'https://hooks.slack.com/services/xxx'

route:
  receiver: 'slack-notifications'

receivers:
  - name: 'slack-notifications'
    slack_configs:
      - channel: '#alerts'
        send_resolved: true
Conclusion
Prometheus and Grafana are a standard monitoring stack. Together, metrics, alerts, and dashboards provide observability for production applications.