Prometheus Federation Cluster

Prometheus server environment:

192.168.15.100 # primary node

192.168.15.101 # federation node 1

192.168.15.102 # federation node 2

192.168.15.101 # node1, scrape target of federation node 1

192.168.15.102 # node2, scrape target of federation node 2

Deploy the Prometheus server

Install Prometheus on the primary server and on each federation server.

cd /apps
tar xvf prometheus-2.32.1.linux-amd64.tar.gz

# Create a symlink
ln -sv /apps/prometheus-2.32.1.linux-amd64 /apps/prometheus

cd /apps/prometheus
# Validate the configuration file (promtool can also check metrics and rules)
./promtool check config prometheus.yml

vim /etc/systemd/system/prometheus.service

[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network.target

[Service]
Restart=on-failure
WorkingDirectory=/apps/prometheus/
ExecStart=/apps/prometheus/prometheus --config.file=/apps/prometheus/prometheus.yml

[Install]
WantedBy=multi-user.target

Start the Prometheus service

systemctl daemon-reload
systemctl restart prometheus
systemctl enable prometheus

Deploy node_exporter

Download and extract the binary

cd /apps 
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
tar xf node_exporter-1.3.1.linux-amd64.tar.gz

# Create a symlink
ln -sv /apps/node_exporter-1.3.1.linux-amd64 /apps/node_exporter

Create the node-exporter systemd unit file

vim /etc/systemd/system/node-exporter.service

[Unit]
Description=Prometheus Node Exporter
After=network.target

[Service]
ExecStart=/apps/node_exporter/node_exporter

[Install]
WantedBy=multi-user.target

Start the node_exporter service

systemctl daemon-reload
systemctl restart node-exporter
systemctl enable node-exporter.service
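
Once the service is running, node_exporter should serve metrics on port 9100, its default listen port. A minimal sketch for spot-checking a single metric follows; metric_value is a hypothetical helper name, and the parsing assumes a simple unlabeled sample in the text exposition format.

```shell
# Print the value of an unlabeled metric from Prometheus text-format
# metrics read on stdin. metric_value is a hypothetical helper name.
metric_value() {
    # Sample lines look like "<name> <value>". HELP and TYPE lines start
    # with '#', so their first field never equals the metric name.
    awk -v name="$1" '$1 == name { print $2 }'
}

# Example against a live exporter (not executed by this sketch):
#   curl -s http://192.168.15.101:9100/metrics | metric_value node_load1
```

If the command prints a number, the exporter is up and exposing the load average; empty output suggests the service is not reachable or the metric is missing.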

Configure the federation servers to scrape node_exporter

Federation node 1 scrapes node1, and federation node 2 scrapes node2.

# Prometheus federation node 1
vim /apps/prometheus/prometheus.yml

# Add under scrape_configs:
- job_name: 'prometheus-node'
  static_configs:
    - targets: ['192.168.15.101:9100']  # node_exporter1

# Restart the Prometheus service
systemctl restart prometheus.service

# Prometheus federation node 2
vim /apps/prometheus/prometheus.yml

# Add under scrape_configs:
- job_name: 'prometheus-node'
  static_configs:
    - targets: ['192.168.15.102:9100']  # node_exporter2

# Restart the Prometheus service
systemctl restart prometheus.service

On each federation node, open the Prometheus Targets page and confirm that the node_exporter target is listed and returning data.
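
The same check can be done from the command line through the /api/v1/targets endpoint of the Prometheus HTTP API. The sketch below counts targets reported as healthy; count_up_targets is a hypothetical helper name, and the grep-based matching is a rough substitute for a real JSON parser.

```shell
# Count targets whose health is "up" in the JSON returned by
# /api/v1/targets, read on stdin. count_up_targets is a hypothetical name.
count_up_targets() {
    # grep -o prints one line per match; "|| true" keeps the pipeline
    # usable under "set -e" when nothing matches.
    { grep -oE '"health"[[:space:]]*:[[:space:]]*"up"' || true; } |
        wc -l |
        tr -d ' '
}

# Example against federation node 1 (not executed by this sketch):
#   curl -s http://192.168.15.101:9090/api/v1/targets | count_up_targets
```

With only the default self-scrape job and the prometheus-node job configured, a count of 2 would be expected on each federation node.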

Configure the primary Prometheus server to scrape the federation servers

- job_name: 'prometheus'
  # metrics_path defaults to '/metrics'
  # scheme defaults to 'http'
  static_configs:
    - targets: ['localhost:9090']

- job_name: 'prometheus-federate-2.101'
  scrape_interval: 10s
  honor_labels: true
  metrics_path: '/federate'
  params:
    'match[]':
      - '{job="prometheus"}'
      - '{__name__=~"job:.*"}'
      - '{__name__=~"node.*"}'
  static_configs:
    - targets:
      - '192.168.15.101:9090'

- job_name: 'prometheus-federate-2.102'
  scrape_interval: 10s
  honor_labels: true
  metrics_path: '/federate'
  params:
    'match[]':
      - '{job="prometheus"}'
      - '{__name__=~"job:.*"}'
      - '{__name__=~"node.*"}'
  static_configs:
    - targets:
      - '192.168.15.102:9090'
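
Before relying on these scrape jobs, the /federate endpoint of a federation node can be queried by hand with the same match[] selectors. A minimal sketch, assuming curl is installed; federate_fetch and FEDERATE_DRY_RUN are hypothetical names introduced here and are not part of Prometheus.

```shell
# Fetch series from a Prometheus /federate endpoint for the given selectors.
# federate_fetch and FEDERATE_DRY_RUN are hypothetical names for this sketch.
# Usage: federate_fetch HOST:PORT SELECTOR [SELECTOR...]
federate_fetch() {
    fed_target=$1
    shift
    fed_count=$#
    # Rewrite the positional parameters so that each selector becomes one
    # URL-encoded match[] query parameter for curl.
    while [ "$fed_count" -gt 0 ]; do
        set -- "$@" --data-urlencode "match[]=$1"
        shift
        fed_count=$((fed_count - 1))
    done
    # -G makes curl send the encoded data as a GET query string.
    set -- -s -G "http://${fed_target}/federate" "$@"
    if [ -n "${FEDERATE_DRY_RUN:-}" ]; then
        # Dry run: print the curl arguments one per line.
        printf '%s\n' "$@"
    else
        curl "$@"
    fi
}
```

For example, federate_fetch 192.168.15.101:9090 '{__name__=~"node.*"}' should print the node_exporter series held by federation node 1 in the text exposition format. Setting FEDERATE_DRY_RUN=1 shows the curl arguments without contacting the server.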

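Verify the primary Prometheus server

Open 192.168.15.100:9090/targets and confirm that the federation Prometheus targets are listed.

Verify the metric data

On the Graph page, query node_load1 and confirm that data is returned.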
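
This check can also be scripted against the HTTP API of the primary server. The sketch below lists which instance labels appear in a query result; list_instances is a hypothetical helper name, and the grep-based parsing assumes the JSON shape returned by /api/v1/query.

```shell
# List the distinct instance labels found in a Prometheus query API
# response read on stdin. list_instances is a hypothetical helper name.
list_instances() {
    # Extract every "instance":"..." pair, keep only the quoted value,
    # then sort and de-duplicate.
    { grep -oE '"instance"[[:space:]]*:[[:space:]]*"[^"]*"' || true; } |
        sed -E 's/.*"([^"]*)"$/\1/' |
        sort -u
}

# Example against the primary server (not executed by this sketch):
#   curl -s -G http://192.168.15.100:9090/api/v1/query \
#        --data-urlencode 'query=node_load1' | list_instances
```

Because the federation jobs set honor_labels: true, the original instance labels are preserved, so the output should include both 192.168.15.101:9100 and 192.168.15.102:9100 once federation is working.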