What is Prometheus?

Prometheus is a monitoring system and time series database designed to collect metrics from systems and applications, with query and alerting capabilities.

What is Prometheus?

Prometheus is an open-source monitoring tool that collects metrics from systems and applications, stores them in a time series database, and provides query and alerting capabilities.

Prometheus Architecture

Main Components

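Prometheus Server: Scrapes targets over HTTP, stores the samples in its local time series database, and evaluates recording and alerting rules
Exporters: Processes that expose metrics from a system or application in a format Prometheus can scrape
Pushgateway: Intermediary that accepts pushed metrics from short-lived jobs so that Prometheus can scrape them
Alertmanager: Deduplicates, groups, and routes alerts to receivers
Service Discovery: Finds scrape targets automatically instead of listing them by hand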

Data Flow

Applications → Exporters → Prometheus Server → Alertmanager → Alerts
                                   ↓
                             TSDB Database
                                   ↓
                            PromQL Queries

Metrics and Types

Metric Types

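Counter: Cumulative value that only increases, or resets to zero when the process restarts
Gauge: Value that can go up or down
Histogram: Observations counted into cumulative buckets, plus their sum and count
Summary: Quantiles calculated on the client side, plus the sum and count of observations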

Metric Examples

# Counter
http_requests_total{method="GET", status="200"} 1024

# Gauge
memory_usage_bytes{instance="server1"} 1073741824

# Histogram
http_request_duration_seconds_bucket{le="0.1"} 100
http_request_duration_seconds_bucket{le="0.5"} 200
http_request_duration_seconds_bucket{le="1.0"} 250
http_request_duration_seconds_bucket{le="+Inf"} 300
http_request_duration_seconds_sum 150.5
http_request_duration_seconds_count 300
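
The histogram lines above are cumulative: each `le` bucket counts every observation less than or equal to its bound, so the `+Inf` bucket always equals `_count`. The following is a minimal Python sketch of that bookkeeping, not the code of any Prometheus client library; the `Histogram` class and its method names are hypothetical.

```python
import math
from bisect import bisect_left


class Histogram:
    """Toy histogram mirroring the cumulative-bucket layout shown above."""

    def __init__(self, name, upper_bounds):
        self.name = name
        # Bounds are kept sorted; +Inf is always the last bucket.
        self.upper_bounds = sorted(upper_bounds) + [math.inf]
        self.counts = [0] * len(self.upper_bounds)  # per bucket, not cumulative
        self.total = 0.0

    def observe(self, value):
        # bisect_left finds the first bound >= value, i.e. the matching "le" bucket.
        self.counts[bisect_left(self.upper_bounds, value)] += 1
        self.total += value

    def samples(self):
        """Return (series, value) pairs with cumulative bucket counts."""
        out, running = [], 0
        for bound, count in zip(self.upper_bounds, self.counts):
            running += count
            le = "+Inf" if math.isinf(bound) else str(bound)
            out.append((f'{self.name}_bucket{{le="{le}"}}', running))
        out.append((f"{self.name}_sum", self.total))
        out.append((f"{self.name}_count", running))
        return out
```

Observing a value of 0.3 with bounds 0.1, 0.5, and 1.0 raises the 0.5, 1.0, and +Inf buckets in the rendered output, which is why bucket counts never decrease from one `le` to the next.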

Configuration

Basic Configuration

global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "alert_rules.yml"

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']
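
Each target in the configuration above is an HTTP endpoint that returns metrics as plain text. As a rough illustration of what sits behind such a target, here is a sketch built only on the Python standard library. The `METRICS` dictionary, the function names, and port 8000 are assumptions made for this example; a real service would normally use an official client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process state; a real application would update these values.
METRICS = {
    'http_requests_total{method="GET",status="200"}': 1024,
    "memory_usage_bytes": 1073741824,
}


def render_metrics(metrics):
    """Render a dict of series -> value as text exposition lines."""
    return "".join(f"{series} {value}\n" for series, value in metrics.items())


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Prometheus scrapes the /metrics path unless configured otherwise.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever serving /metrics; port 8000 is an arbitrary choice."""
    HTTPServer(("localhost", port), MetricsHandler).serve_forever()
```

Calling `serve()` and adding a job with the target `localhost:8000` to `scrape_configs` would let Prometheus collect these two series at every scrape interval.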

Service Discovery

scrape_configs:
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
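
The `keep` rule above drops every discovered pod whose `prometheus.io/scrape` annotation is not exactly `true`. A small sketch of that decision, assuming the usual relabeling behavior (source label values joined with `;`, regex matched against the whole string); the function name is hypothetical, and Python's `re` stands in for the RE2 syntax that Prometheus uses.

```python
import re


def keep_target(labels, source_labels, regex, separator=";"):
    """Return True if the target survives a `keep` relabel rule."""
    # Missing labels are treated as empty strings before joining.
    joined = separator.join(labels.get(name, "") for name in source_labels)
    # fullmatch mirrors the anchoring applied to relabel regexes.
    return re.fullmatch(regex, joined) is not None
```

For example, `keep_target({"__meta_kubernetes_pod_annotation_prometheus_io_scrape": "true"}, ["__meta_kubernetes_pod_annotation_prometheus_io_scrape"], "true")` keeps the pod, while a pod without the annotation is dropped.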

PromQL - Query Language

Basic Queries

# Simple query
up

# Filtering by labels
http_requests_total{method="GET"}

# Aggregations
sum(http_requests_total) by (method)

# Time functions
rate(http_requests_total[5m])

# Mathematical operators
cpu_usage_percent / 100
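
To make `rate()` less of a black box, the sketch below computes a per-second increase from raw counter samples. It is a deliberate simplification: it handles counter resets but leaves out the extrapolation to the window boundaries that PromQL performs, so results can differ slightly from Prometheus. The function name is hypothetical.

```python
def simple_rate(samples):
    """Per-second increase across (timestamp, value) counter samples.

    Simplified: handles counter resets but skips the extrapolation to the
    window boundaries that PromQL's rate() performs.
    """
    if len(samples) < 2:
        return None  # a rate needs at least two samples in the window
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the counter restarted; count the new value from zero.
        increase += curr - prev if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed
```

With samples `[(0, 100), (15, 130), (30, 160), (45, 10), (60, 40)]`, the drop from 160 to 10 is treated as a restart, giving a total increase of 100 over 60 seconds.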

Advanced Queries

# Percentiles (apply rate() to the buckets first, then aggregate by le)
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))

# Changes over time
increase(http_requests_total[1h])

# Comparisons
cpu_usage_percent > 80

# Window functions
avg_over_time(cpu_usage_percent[5m])
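
The percentile query relies on linear interpolation inside histogram buckets. The sketch below shows the idea on plain cumulative counts, assuming observations are spread evenly within each bucket and that the lowest bucket starts at zero. It is an illustration of the technique, not Prometheus's source code, and it omits edge cases such as negative bucket bounds.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q (0 < q <= 1) from cumulative (upper_bound, count) pairs."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]  # the +Inf bucket holds the total count
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The quantile lies beyond the last finite bucket, so the
                # best available answer is that bucket's upper bound.
                return lower_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return math.nan
```

Using the bucket counts from the metric examples (100, 200, 250, and 300 for `le` values 0.1, 0.5, 1.0, and +Inf), the median falls halfway through the 0.1 to 0.5 bucket, while the 95th percentile lands in the +Inf bucket and is therefore reported as 1.0. This is one reason bucket boundaries should cover the latencies you care about.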

Alerts

Alert Configuration

groups:
- name: example
  rules:
  - alert: HighCPUUsage
    expr: cpu_usage_percent > 80
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High CPU usage detected"
      description: "CPU usage is above 80% for more than 5 minutes"
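
The `for: 5m` clause means the alert first becomes pending and only fires once the expression has stayed true across evaluations for the full duration. A minimal sketch of that state machine, with a hypothetical function name and timestamps in seconds:

```python
def alert_states(evaluations, hold_seconds):
    """Map (timestamp, condition_is_true) evaluations to alert states."""
    states = []
    active_since = None
    for timestamp, is_true in evaluations:
        if not is_true:
            active_since = None  # any false evaluation resets the timer
            states.append("inactive")
            continue
        if active_since is None:
            active_since = timestamp
        held = timestamp - active_since
        states.append("firing" if held >= hold_seconds else "pending")
    return states
```

A single evaluation where CPU usage dips below the threshold sends the alert back to inactive, and the five-minute wait starts over.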

Alertmanager

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'

receivers:
- name: 'web.hook'
  webhook_configs:
  - url: 'http://127.0.0.1:5001/'
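
With `group_by: ['alertname']`, alerts that share the same `alertname` are batched into one notification regardless of their other labels. The sketch below illustrates only that grouping step, assuming each alert is represented as a dictionary of labels; timing settings such as `group_wait` are not modeled, and the function name is hypothetical.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alert label sets by the values of the group_by labels."""
    groups = defaultdict(list)
    for labels in alerts:
        # Alerts missing a group_by label fall into a group with an empty value.
        key = tuple((name, labels.get(name, "")) for name in group_by)
        groups[key].append(labels)
    return dict(groups)
```

Two `HighCPUUsage` alerts from different instances end up in the same group, so the webhook receives one notification listing both.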

Popular Exporters

System

Node Exporter: Operating system metrics
Windows Exporter: Windows metrics
SNMP Exporter: SNMP metrics

Applications

JMX Exporter: Java metrics
MySQL Exporter: MySQL metrics
PostgreSQL Exporter: PostgreSQL metrics
Redis Exporter: Redis metrics

Cloud

AWS CloudWatch Exporter: AWS metrics
Azure Monitor Exporter: Azure metrics
Stackdriver Exporter: Google Cloud Monitoring metrics

Kubernetes Integration

ServiceMonitor

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: metrics
    interval: 30s

PrometheusRule

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-rules
spec:
  groups:
  - name: example
    rules:
    - alert: PodDown
      expr: up == 0
      for: 5m

Best Practices

Metrics

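Naming: Follow a consistent convention, with base units and suffixes such as _total and _seconds
Cardinality: Avoid labels with unbounded values such as user or request IDs, because every label combination creates a new time series
Retention: Set retention to match storage capacity and how far back queries need to look
Labels: Use labels for dimensions you will filter or aggregate by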
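
The cardinality advice is easier to apply with a quick estimate: the worst-case number of series for one metric is the product of the distinct values of each label. A small sketch, with a hypothetical function name and made-up label values:

```python
from math import prod


def max_series(label_values):
    """Upper bound on series for one metric: product of distinct label values."""
    return prod(len(set(values)) for values in label_values.values())
```

Three methods, three status codes, and ten instances give at most 90 series. Adding a `user_id` label with 10,000 values raises the bound to 900,000 for that single metric.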

Performance

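Scrape Interval: Choose intervals that balance resolution against load on Prometheus and its targets
Query Performance: Precompute expensive or frequently used expressions with recording rules
Storage: Size disk space for the retention period and ingestion rate
Memory: Monitor memory usage, which grows with the number of active time series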

Security

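Authentication: Require authentication for the web UI and HTTP API
Authorization: Restrict who can query and administer Prometheus
TLS: Encrypt scrape and API traffic
Network: Segment the network so that only trusted hosts can reach Prometheus and its exporters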

Related Concepts

Kubernetes - Platform that Prometheus monitors
Docker - Containers that Prometheus monitors
AWS S3 - Long-term storage for metrics through remote-storage integrations
Grafana - Visualization of Prometheus metrics
SIEM - System that can integrate with Prometheus
SOAR - Automation based on alerts
Dashboards - Metrics visualization
Logs - Complementary to metrics
Monitoring and Review - Process that Prometheus supports
Metrics and KPIs - Measurement through Prometheus
Zero Trust - Monitoring for Zero Trust
Defense in Depth - Monitoring layer