Setting Up Kubernetes Monitoring with Prometheus and Grafana
Monitoring your Kubernetes clusters is crucial for maintaining healthy applications and infrastructure. Today, we’ll set up a complete monitoring stack using Prometheus and Grafana.
Why Monitor Kubernetes?
- Resource Utilization: Track CPU, memory, and storage usage
- Application Health: Monitor pod status and application metrics
- Alerting: Get notified before issues become critical
- Capacity Planning: Understand growth patterns
Installation with Helm
First, let’s install the kube-prometheus-stack chart, which bundles Prometheus, Grafana, and Alertmanager:
# Add the Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Install kube-prometheus-stack
# (admin123 is a placeholder; choose a strong Grafana password outside a throwaway test cluster)
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --set grafana.adminPassword=admin123
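The command above sets the Grafana admin password inline. As a minimal sketch, assuming you would rather generate that value than type it, the Python below creates a random password and assembles the same `helm install` arguments. The helper names `generate_password` and `build_install_args` are hypothetical; the chart, namespace, and `grafana.adminPassword` key are taken from the command above.

```python
import secrets
import shlex


def generate_password(num_bytes: int = 18) -> str:
    """Return a URL-safe random string (letters, digits, '-' and '_')."""
    return secrets.token_urlsafe(num_bytes)


def build_install_args(password: str, release: str = "prometheus",
                       namespace: str = "monitoring") -> list[str]:
    """Assemble the helm install arguments used in this post (hypothetical helper)."""
    return [
        "helm", "install", release,
        "prometheus-community/kube-prometheus-stack",
        "--namespace", namespace,
        "--create-namespace",
        "--set", f"grafana.adminPassword={password}",
    ]


if __name__ == "__main__":
    args = build_install_args(generate_password())
    # Print a shell-quoted command line instead of running it.
    print(shlex.join(args))
```

The URL-safe alphabet contains no commas, which matters because `--set` treats commas as separators between values.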
Key Components
Prometheus Server
- Metrics Collection: Scrapes metrics from various sources
- Time Series Database: Stores metrics data
- Query Engine: PromQL for querying metrics
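To make the query engine concrete, here is a rough sketch of how a client could build a request for the Prometheus HTTP API instant-query endpoint (`/api/v1/query`) and read the result. The base URL `http://localhost:9090` is an assumption (for example, a local port-forward), and the helper names and the sample payload are illustrative only.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body: str) -> dict[str, float]:
    """Map each series' `instance` label to its sample value."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    samples = {}
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # the value arrives as a string
        samples[series["metric"].get("instance", "")] = float(value)
    return samples


# Hand-written sample shaped like an instant-vector response.
SAMPLE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"instance": "node-a:9100"}, "value": [1700000000, "1"]},
            {"metric": {"instance": "node-b:9100"}, "value": [1700000000, "0"]},
        ],
    },
})

if __name__ == "__main__":
    print(build_query_url("http://localhost:9090", "up"))
    print(parse_instant_vector(SAMPLE))
```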
Grafana
- Visualization: Dashboards that chart Prometheus metrics
- Alerting: Visual alert management
- Data Sources: Connects to Prometheus
Alertmanager
- Alert Routing: Routes alerts to different channels
- Grouping: Groups similar alerts
- Silencing: Temporarily mutes notifications for matching alerts
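As a conceptual illustration of grouping and silencing, the sketch below is a simplified model and not Alertmanager's actual implementation. It drops alerts whose labels match a silence, then buckets the rest by a chosen set of labels. The alert data, label names, and function names are made up for the example.

```python
from collections import defaultdict

Alert = dict[str, str]  # an alert is modeled as its label set


def is_silenced(alert: Alert, silences: list[Alert]) -> bool:
    """A silence matches when all of its label pairs appear on the alert."""
    return any(all(alert.get(k) == v for k, v in s.items()) for s in silences)


def group_alerts(alerts: list[Alert], group_by: list[str],
                 silences: list[Alert]) -> dict[tuple[str, ...], list[Alert]]:
    """Drop silenced alerts, then group the rest by the `group_by` label values."""
    groups: dict[tuple[str, ...], list[Alert]] = defaultdict(list)
    for alert in alerts:
        if is_silenced(alert, silences):
            continue
        key = tuple(alert.get(label, "") for label in group_by)
        groups[key].append(alert)
    return dict(groups)


ALERTS = [
    {"alertname": "HighCPU", "namespace": "shop", "pod": "api-1"},
    {"alertname": "HighCPU", "namespace": "shop", "pod": "api-2"},
    {"alertname": "PodRestarting", "namespace": "batch", "pod": "job-7"},
]
SILENCES = [{"namespace": "batch"}]  # e.g. during planned maintenance

if __name__ == "__main__":
    grouped = group_alerts(ALERTS, ["alertname", "namespace"], SILENCES)
    for key, members in grouped.items():
        print(key, [a["pod"] for a in members])
```

Here the two `HighCPU` alerts end up in one group, so they would produce one notification rather than two, while the `batch` alert is suppressed by the silence.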
Essential Kubernetes Metrics
# Node CPU usage
100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
# Pod memory usage
container_memory_usage_bytes{pod!="", container!="POD"}
# Pod restart count
increase(kube_pod_container_status_restarts_total[1h])
# Node disk usage
100 - ((node_filesystem_avail_bytes * 100) / node_filesystem_size_bytes)
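The percentage arithmetic in the CPU and disk queries can be checked by hand. This sketch mirrors the two formulas in plain Python with made-up sample values. The function names are hypothetical, and the inputs stand in for what `irate(...)` and the filesystem gauges would return.

```python
def cpu_usage_percent(idle_rates: list[float]) -> float:
    """100 - avg(idle rate) * 100, where each rate is one core's idle fraction."""
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100 - avg_idle * 100


def disk_usage_percent(avail_bytes: float, size_bytes: float) -> float:
    """100 - (available * 100 / size), as in the node disk usage query."""
    return 100 - (avail_bytes * 100) / size_bytes


if __name__ == "__main__":
    # Four cores that were idle 75%, 50%, 100%, and 75% of the time.
    print(cpu_usage_percent([0.75, 0.5, 1.0, 0.75]))    # prints 25.0
    # 25 GiB available on a 100 GiB filesystem.
    print(disk_usage_percent(25 * 2**30, 100 * 2**30))  # prints 75.0
```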
Tomorrow, we’ll explore advanced Prometheus queries and create custom alerting rules for your specific use cases.
How do you monitor your Kubernetes clusters? Share your monitoring strategies!