Prometheus and Grafana

Shardul | Apr 6, 2025

Prometheus

Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. It is now a standalone open-source project maintained independently of any company. Prometheus joined the Cloud Native Computing Foundation in 2016 as the second hosted project, after Kubernetes.

Components

The Prometheus ecosystem consists of multiple components, many of which are optional:

  1. the main Prometheus server which scrapes and stores time series data.
  2. client libraries for instrumenting application code.
  3. a push gateway for supporting short-lived jobs.
  4. special-purpose exporters for services like HAProxy, StatsD, Graphite, etc.
  5. an alertmanager to handle alerts.
  6. various support tools.

Prometheus Installation

Prometheus is a monitoring platform that collects metrics from monitored targets by scraping metrics HTTP endpoints on these targets.

Download the latest release of Prometheus for your platform. In this guide we use version 2.53.3. Download the archive and extract it:

wget https://github.com/prometheus/prometheus/releases/download/v2.53.3/prometheus-2.53.3.linux-amd64.tar.gz
tar zxvf prometheus-2.53.3.linux-amd64.tar.gz

This will create a directory called prometheus-2.53.3.linux-amd64 containing two binary files (prometheus and promtool), consoles and console_libraries directories containing the web interface files, a license, a notice, and several example files.

  1. Copy the two binaries to the /usr/local/bin directory.
  2. Create a directory named prometheus under /etc/.
  3. Copy all remaining files to /etc/prometheus.
cd prometheus-2.53.3.linux-amd64
mv prometheus /usr/local/bin/
mv promtool /usr/local/bin/
mkdir /etc/prometheus/
mv * /etc/prometheus/

Configuring Prometheus:

Open prometheus.yml. It contains just enough information to run Prometheus for the first time.

In the global settings, define the default interval for scraping metrics. Note that Prometheus will apply these settings to every exporter unless an individual exporter’s own settings override the globals.

This scrape_interval value tells Prometheus to collect metrics from its exporters every 15 seconds, which is frequent enough for most exporters.

The evaluation_interval option controls how often Prometheus will evaluate rules. Prometheus uses rules to create new time series and to generate alerts.

The rule_files block specifies the location of any rules we want the Prometheus server to load. For now we have no rules.
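If you add rules later, a minimal sketch of a rules file might look like the following (the file name /etc/prometheus/rules.yml and the alert itself are hypothetical examples, not part of the default setup):

```yaml
groups:
  - name: example
    rules:
      # Fire if a scrape target has been unreachable for 5 minutes.
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
```

You would then list the file under rule_files: in prometheus.yml (e.g. rule_files: ["/etc/prometheus/rules.yml"]) and restart or reload Prometheus.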

The last block, scrape_configs, controls what resources Prometheus monitors. Since Prometheus also exposes data about itself as an HTTP endpoint, it can scrape and monitor its own health. In the default configuration there is a single job, called prometheus, which scrapes the time series data exposed by the Prometheus server. The job contains a single, statically configured target: localhost on port 9090. Prometheus expects metrics to be available on targets on a path of /metrics, so this default job scrapes the URL http://localhost:9090/metrics.

global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9090']

  - job_name: "Node"
    scrape_interval: 5s

    static_configs:
      - targets: ["node1:9100", "node2:9100"]

  - job_name: "Docker-Node"
    scrape_interval: 5s

    static_configs:
      - targets: ["node1:9323", "node2:9323"]

Create Prometheus Service

Create a file named prometheus.service in /etc/systemd/system/ with the following contents to define a systemd service:

[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=root
Group=root
Type=simple
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.external-url=https://prom.example.com \
  --web.listen-address=:9090 \
  --web.enable-lifecycle \
  --web.enable-admin-api \
  --log.level=info
[Install]
WantedBy=multi-user.target

Starting Prometheus

The commands below start the Prometheus service and enable it so that it starts automatically whenever the machine boots. Run systemctl daemon-reload first so systemd picks up the new unit file.

systemctl daemon-reload
systemctl start prometheus.service
systemctl enable prometheus.service
systemctl status prometheus.service
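Once the service is running, the expression browser at http://localhost:9090/graph can query the scraped data. Two quick sanity checks using built-in metrics that exist for every scrape target:

```
# 1 for every target Prometheus can reach, 0 for targets that are down
up

# How long each scrape is taking, in seconds
scrape_duration_seconds
```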

Node Exporter

The Prometheus Node Exporter is a single static binary that you can install on any Linux node.

Download it from https://prometheus.io/download/#node_exporter, extract it, and copy the binary to /usr/local/bin/.

wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz
tar zxvf node_exporter-1.8.2.linux-amd64.tar.gz
cd node_exporter-1.8.2.linux-amd64
cp node_exporter /usr/local/bin/

Create node exporter Service

Create a file named node_exporter.service in /etc/systemd/system/ with the following contents to define a systemd service:

[Unit]
Description=Node Exporter
After=network.target
[Service]
User=root
Group=root
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
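node_exporter needs no configuration file; its listen address and collectors are controlled by command-line flags. As a sketch, the ExecStart line could be extended like this (both flags are optional; :9100 is already the default port, and the systemd collector is disabled unless explicitly enabled):

```
ExecStart=/usr/local/bin/node_exporter \
  --web.listen-address=:9100 \
  --collector.systemd
```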

Starting node exporter

The commands below start the node_exporter service and enable it so that it starts automatically whenever the machine boots, letting Prometheus collect the machine's metrics.

systemctl daemon-reload
systemctl start node_exporter.service
systemctl enable node_exporter.service
systemctl status node_exporter.service

Once the Node Exporter is installed and running, you can verify that metrics are being exported by cURLing the /metrics endpoint: curl http://localhost:9100/metrics. Repeat these steps on every node that needs to be monitored.
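With node_exporter targets being scraped, here are a few example queries you could try in the Prometheus expression browser (the metric names below are standard node_exporter metrics):

```
# Percentage of CPU time spent busy, per instance, over the last 5 minutes
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Memory available to applications, in bytes
node_memory_MemAvailable_bytes

# Free disk space on the root filesystem, in bytes
node_filesystem_avail_bytes{mountpoint="/"}
```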

cAdvisor

On every node whose containers need to be monitored, create the cAdvisor container using the commands below (set $VERSION to the cAdvisor release you want to run):

docker pull gcr.io/cadvisor/cadvisor:$VERSION

docker run \
  --volume=/:/rootfs:ro \
  --volume=/var/run:/var/run:ro \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker/:/var/lib/docker:ro \
  --volume=/dev/disk/:/dev/disk:ro \
  --publish=8080:8080 \
  --detach=true \
  --name=cadvisor \
  --privileged \
  --device=/dev/kmsg \
  gcr.io/cadvisor/cadvisor:$VERSION

Check that the container is running:

docker ps -a
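Note that the Docker-Node job in prometheus.yml above targets port 9323, which is the Docker Engine's own metrics endpoint (it must be enabled separately via metrics-addr in /etc/docker/daemon.json), whereas the cAdvisor container started here listens on port 8080. To scrape cAdvisor itself, a job along these lines would be added to prometheus.yml (job name and node names are illustrative):

```yaml
  - job_name: "cadvisor"
    scrape_interval: 5s

    static_configs:
      - targets: ["node1:8080", "node2:8080"]
```

cAdvisor exposes per-container series such as container_cpu_usage_seconds_total and container_memory_usage_bytes.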

Grafana

Grafana supports querying Prometheus. The Grafana data source for Prometheus is included since Grafana 2.5.0 (2015-10-28).

Installing Grafana

You can install Grafana in several ways. The first is using the RPM package:

sudo wget https://dl.grafana.com/oss/release/grafana-11.4.0-1.x86_64.rpm
sudo yum install -y grafana-11.4.0-1.x86_64.rpm

This way you do not need to create the Grafana service manually. The second is using the tarball:

wget https://dl.grafana.com/oss/release/grafana-11.6.0.linux-amd64.tar.gz
tar -zxvf grafana-11.6.0.linux-amd64.tar.gz
cd grafana-11.6.0.linux-amd64
mkdir -p /usr/share/grafana
cp -r * /usr/share/grafana/

Create Grafana Service

For the tarball install, create a file named grafana-server.service in /etc/systemd/system/ with the following contents:

[Unit]
Description=Grafana instance
Documentation=http://docs.grafana.org
Wants=network-online.target
After=network-online.target
After=postgresql.service mariadb.service mysqld.service influxdb.service

[Service]
EnvironmentFile=/etc/sysconfig/grafana-server
User=root
Group=root
Type=notify
Restart=on-failure
WorkingDirectory=/usr/share/grafana
RuntimeDirectory=grafana
RuntimeDirectoryMode=0750
ExecStart=/usr/share/grafana/bin/grafana server                                     \
                            --config=${CONF_FILE}                                   \
                            --pidfile=${PID_FILE_DIR}/grafana-server.pid            \
                            --packaging=rpm                                         \
                            cfg:default.paths.logs=${LOG_DIR}                       \
                            cfg:default.paths.data=${DATA_DIR}                      \
                            cfg:default.paths.plugins=${PLUGINS_DIR}                \
                            cfg:default.paths.provisioning=${PROVISIONING_CFG_DIR}

LimitNOFILE=10000
TimeoutStopSec=20
CapabilityBoundingSet=
DeviceAllow=
LockPersonality=true
MemoryDenyWriteExecute=false
NoNewPrivileges=true
PrivateDevices=true
PrivateTmp=true
ProtectClock=true
ProtectControlGroups=true
ProtectHome=true
ProtectHostname=true
ProtectKernelLogs=true
ProtectKernelModules=true
ProtectKernelTunables=true
ProtectProc=invisible
ProtectSystem=full
RemoveIPC=true
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
SystemCallArchitectures=native
UMask=0027

[Install]
WantedBy=multi-user.target

Starting Grafana

systemctl daemon-reload
systemctl start grafana-server.service
systemctl enable grafana-server.service
systemctl status grafana-server.service

By default, Grafana listens on http://localhost:3000. The default login is admin / admin; you will be asked to change the password on first login.
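Rather than adding the Prometheus data source by hand in the UI, you can provision it from a file. A minimal sketch, assuming the default provisioning path of a package install, placed at /etc/grafana/provisioning/datasources/prometheus.yaml (for the tarball install the path is conf/provisioning/ under the Grafana directory unless configured otherwise):

```yaml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
```

Restart Grafana and the data source appears pre-configured; node and container dashboards can then be imported against it.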