Setting Up Kubernetes Monitoring with Prometheus and Grafana

Monitoring your Kubernetes clusters is crucial for maintaining healthy applications and infrastructure. Today, we’ll set up a complete monitoring stack using Prometheus and Grafana.

Why Monitor Kubernetes?

Resource Utilization: Track CPU, memory, and storage usage
Application Health: Monitor pod status and application metrics
Alerting: Get notified before issues become critical
Capacity Planning: Understand growth patterns

Installation with Helm

First, let’s install the Prometheus stack:

# Add the Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Install kube-prometheus-stack (admin123 is a placeholder; pick a strong Grafana password outside throwaway clusters)
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --set grafana.adminPassword=admin123
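
Once the install finishes, it's worth confirming that everything came up. Here's a minimal sketch, assuming the release name (prometheus) and namespace (monitoring) from the command above. `show_stack` is a hypothetical helper name, not something Helm or the chart provides.

```shell
# Values taken from the helm install command above.
NAMESPACE="monitoring"
RELEASE="prometheus"

# Hypothetical helper: show the release status, then the pods and
# services the chart created in the monitoring namespace.
show_stack() {
  helm status "$RELEASE" --namespace "$NAMESPACE"
  kubectl get pods,svc --namespace "$NAMESPACE"
}
```

Run `show_stack` and check that the pods reach the Running state. On a fresh cluster, image pulls can take a few minutes.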

Key Components

Prometheus Server

Metrics Collection: Scrapes metrics over HTTP from targets such as node-exporter, kube-state-metrics, and the kubelet
Time Series Database: Stores metrics data
Query Engine: PromQL for querying metrics
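
To see the query engine in action, we can send PromQL to the Prometheus HTTP API. This is a rough sketch, not the only way to do it. The service name below assumes the chart's default naming for a release called prometheus (confirm it with `kubectl get svc -n monitoring`), and `forward_prometheus` and `prom_query` are hypothetical helper names.

```shell
# Assumed local address once the port-forward below is running.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Hypothetical helper: forward the Prometheus service to localhost:9090.
# The service name is an assumption based on the chart's default naming.
forward_prometheus() {
  kubectl port-forward --namespace monitoring \
    svc/prometheus-kube-prometheus-prometheus 9090:9090
}

# Hypothetical helper: run an instant query against /api/v1/query.
# curl -G with --data-urlencode sends the expression URL-encoded.
prom_query() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}
```

With `forward_prometheus` running in one terminal, `prom_query 'up'` in another returns JSON with one series per scrape target, where a value of 1 means the last scrape succeeded.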

Grafana

Visualization: Dashboards built from panels that chart query results
Alerting: Create and manage alert rules from the UI
Data Sources: Queries Prometheus, which the chart preconfigures as a data source
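
To log in to Grafana, we need a way to reach it and the admin password. The sketch below assumes the chart's default names for a release called prometheus: a Service and a Secret both named prometheus-grafana, with the Service listening on port 80. Both helper names are hypothetical.

```shell
# Hypothetical helper: forward Grafana to http://localhost:3000.
# Assumes the default service name <release>-grafana on port 80.
forward_grafana() {
  kubectl port-forward --namespace monitoring svc/prometheus-grafana 3000:80
}

# Hypothetical helper: read the admin password from the Secret the
# chart creates (assumed name <release>-grafana, key admin-password).
grafana_admin_password() {
  kubectl get secret --namespace monitoring prometheus-grafana \
    -o jsonpath='{.data.admin-password}' | base64 --decode
}
```

With `forward_grafana` running, open http://localhost:3000 and sign in as admin with the value printed by `grafana_admin_password`.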

Alertmanager

Alert Routing: Routes alerts to receivers such as email, Slack, or PagerDuty
Grouping: Bundles related alerts into a single notification
Silencing: Temporarily mutes notifications for matching alerts
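
To check what is currently firing or silenced, we can query the Alertmanager v2 API directly. As before, this is a sketch under assumptions: the service name follows the chart's default naming for a release called prometheus, and the helper names are hypothetical.

```shell
# Assumed local address once the port-forward below is running.
AM_URL="${AM_URL:-http://localhost:9093}"

# Hypothetical helper: forward the Alertmanager service to localhost:9093.
# The service name is an assumption based on the chart's default naming.
forward_alertmanager() {
  kubectl port-forward --namespace monitoring \
    svc/prometheus-kube-prometheus-alertmanager 9093:9093
}

# Hypothetical helper: list the alerts Alertmanager currently holds (JSON).
list_alerts() {
  curl -s "${AM_URL}/api/v2/alerts"
}

# Hypothetical helper: list configured silences (JSON).
list_silences() {
  curl -s "${AM_URL}/api/v2/silences"
}
```

With `forward_alertmanager` running, `list_alerts` shows each alert with its labels and status, which is handy for checking how grouping and silences affect what gets sent.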

Essential Kubernetes Metrics

# Node CPU usage
100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Pod memory usage
container_memory_usage_bytes{pod!="", container!="POD"}

# Pod restart count
increase(kube_pod_container_status_restarts_total[1h])

# Node disk usage
100 - ((node_filesystem_avail_bytes * 100) / node_filesystem_size_bytes)
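
The CPU and disk expressions both turn a "free" fraction into a "used" percentage. Here is the same arithmetic worked through with made-up numbers; the helper names and sample values are illustrative only and don't come from a real cluster.

```shell
# Hypothetical helper: used disk space as a percentage, mirroring
# 100 - ((avail * 100) / size) from the node disk usage query.
disk_used_percent() {
  awk -v avail="$1" -v size="$2" \
    'BEGIN { printf "%.1f\n", 100 - (avail * 100) / size }'
}

# Hypothetical helper: CPU busy percentage from the average idle rate,
# mirroring 100 - (idle_rate * 100) from the node CPU usage query.
cpu_busy_percent() {
  awk -v idle="$1" 'BEGIN { printf "%.1f\n", 100 - idle * 100 }'
}

disk_used_percent 25 100   # 25 GiB available on a 100 GiB filesystem
cpu_busy_percent 0.85      # cores idle 85% of the time on average
```

These two calls print 75.0 and 15.0: a filesystem with a quarter of its space free is 75% used, and a node idling 85% of the time is 15% busy.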

Tomorrow, we’ll explore advanced Prometheus queries and create custom alerting rules for your specific use cases.

How do you monitor your Kubernetes clusters? Share your monitoring strategies!
