Binary installation of Prometheus
Server list:
Server name | Operating system | IP address | Service |
---|---|---|---|
test03 | Ubuntu 16.04.4 | 192.168.1.58 | Prometheus, Alertmanager, Grafana |
test02 | Ubuntu 16.04.4 | 192.168.1.57 | node_exporter |
1. Install Prometheus
- Prometheus official download page: https://prometheus.io/download/
- Download Prometheus
~# wget https://github.com/prometheus/prometheus/releases/download/v2.11.0/prometheus-2.11.0.linux-amd64.tar.gz
- Extract the Prometheus archive
~# tar xf prometheus-2.11.0.linux-amd64.tar.gz
- Move it to the /usr/local/prometheus directory
~# mv prometheus-2.11.0.linux-amd64 /usr/local/prometheus
- Configure Prometheus to run as a background service
~# cat /lib/systemd/system/prometheus.service
[Unit]
Description=https://prometheus.io
[Service]
ExecStart=/usr/local/prometheus/prometheus --config.file="/usr/local/prometheus/prometheus.yml"
[Install]
WantedBy=multi-user.target
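The listing above only displays the unit file with cat; it does not show how the file was created. One possible way, sketched here, is a small shell function that prints the unit text so it can be redirected into place. The function name render_unit is made up for this example and is not part of Prometheus or systemd; the quotes around the config path are left out because the path contains no spaces.

```shell
#!/usr/bin/env bash
# render_unit is a hypothetical helper, not part of Prometheus or systemd.
# It prints a minimal systemd unit with the same sections as the one above.
#   $1 = Description text, $2 = full ExecStart command line
render_unit() {
  local description=$1 exec_start=$2
  cat <<EOF
[Unit]
Description=${description}
[Service]
ExecStart=${exec_start}
[Install]
WantedBy=multi-user.target
EOF
}

# Example (run as root): write the unit, then let systemd re-read unit files.
# render_unit "https://prometheus.io" \
#   "/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml" \
#   > /lib/systemd/system/prometheus.service
# systemctl daemon-reload
```

The same function could be reused for the node_exporter and Alertmanager units later in this guide, since they differ only in the description and the ExecStart line.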
- Enable the Prometheus service
~# systemctl enable prometheus.service
Created symlink from /etc/systemd/system/multi-user.target.wants/prometheus.service to /lib/systemd/system/prometheus.service.
- Start the Prometheus service
~# systemctl start prometheus.service
- View the Prometheus service status
~# systemctl status prometheus.service
● prometheus.service - https://prometheus.io
   Loaded: loaded (/lib/systemd/system/prometheus.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-07-10 11:10:45 CST; 4s ago
 Main PID: 818 (prometheus)
......
- Visit: http://192.168.1.58:9090
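Besides opening the page in a browser, the server can be checked from the command line. Prometheus exposes the management endpoints /-/healthy and /-/ready; the sketch below wraps them in two helper functions. The names prom_endpoint and prom_is_up are invented for this example, and curl is assumed to be installed.

```shell
#!/usr/bin/env bash
# prom_endpoint and prom_is_up are hypothetical helper names.
# prom_endpoint builds the URL of a Prometheus management endpoint.
#   $1 = host:port, $2 = endpoint name ("healthy" or "ready")
prom_endpoint() {
  printf 'http://%s/-/%s\n' "$1" "$2"
}

# prom_is_up fails on connection errors, on a timeout after 5 seconds,
# or when the server answers with an HTTP status of 400 or above.
prom_is_up() {
  curl --fail --silent --show-error --max-time 5 --output /dev/null \
    "$(prom_endpoint "$1" "$2")"
}

# Example:
# prom_is_up 192.168.1.58:9090 healthy && echo "Prometheus is healthy"
```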
2. Install Grafana
- Install with Docker
~# docker run -d -p 3000:3000 grafana/grafana
~# docker ps
CONTAINER ID   IMAGE             COMMAND     CREATED          STATUS          PORTS                    NAMES
a6ff7bd88b42   grafana/grafana   "/run.sh"   43 seconds ago   Up 41 seconds   0.0.0.0:3000->3000/tcp   peaceful_brattain
- Visit: http://192.168.1.58:3000
- Log in to the Grafana interface:
The default account is: admin
The default password is: admin
After the first login, Grafana prompts you to change the password.
- Add a data source
- Enter the Prometheus address: http://192.168.1.58:9090
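The data source can also be added without the web interface, through Grafana's HTTP API (POST /api/datasources). The following is a rough sketch rather than the method used in this guide: datasource_payload is a made-up helper name, the values are not JSON-escaped (so they must not contain quotes or backslashes), and the admin/admin login only works until the password is changed.

```shell
#!/usr/bin/env bash
# datasource_payload is a hypothetical helper that prints the JSON body for
# Grafana's "create data source" HTTP API (POST /api/datasources).
#   $1 = data source name, $2 = Prometheus base URL
datasource_payload() {
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' \
    "$1" "$2"
}

# Example, using the default admin/admin login mentioned above:
# datasource_payload Prometheus http://192.168.1.58:9090 |
#   curl --fail --silent --show-error -u admin:admin \
#     -H 'Content-Type: application/json' \
#     -X POST --data @- http://192.168.1.58:3000/api/datasources
```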
3. Monitor the Linux server
- Install node_exporter and start it
~# wget https://github.com/prometheus/node_exporter/releases/download/v0.18.1/node_exporter-0.18.1.linux-amd64.tar.gz
~# tar xf node_exporter-0.18.1.linux-amd64.tar.gz
~# mv node_exporter-0.18.1.linux-amd64 /usr/local/node_exporter
~# cd /usr/local/node_exporter
/usr/local/node_exporter# cat /lib/systemd/system/node_exporter.service
[Unit]
Description=https://prometheus.io/docs/guides/node-exporter/
[Service]
ExecStart=/usr/local/node_exporter/node_exporter
[Install]
WantedBy=multi-user.target
/usr/local/node_exporter# systemctl enable node_exporter.service
Created symlink from /etc/systemd/system/multi-user.target.wants/node_exporter.service to /lib/systemd/system/node_exporter.service.
/usr/local/node_exporter# systemctl start node_exporter.service
/usr/local/node_exporter# systemctl status node_exporter.service
● node_exporter.service - https://prometheus.io/docs/guides/node-exporter/
   Loaded: loaded (/lib/systemd/system/node_exporter.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-07-10 14:23:35 CST; 5s ago
 Main PID: 774 (node_exporter)
   CGroup: /system.slice/node_exporter.service
           └─774 /usr/local/node_exporter/node_exporter
- Visit http://192.168.1.57:9100/metrics to view the data collected by node_exporter
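The /metrics page is long, so it can help to pull out a single metric on the command line. The sketch below filters the text exposition format with awk; metric_samples is a name invented for this example, and node_load1 is used only as a sample metric.

```shell
#!/usr/bin/env bash
# metric_samples is a hypothetical helper. It reads Prometheus text-format
# metrics on stdin and prints only the sample lines of one metric,
# skipping the "# HELP" and "# TYPE" comment lines.
#   $1 = metric name, e.g. node_load1
metric_samples() {
  awk -v name="$1" '
    /^#/ { next }                 # skip HELP and TYPE comment lines
    {
      metric = $1
      sub(/\{.*/, "", metric)     # drop the label part, if any
      if (metric == name) print
    }'
}

# Example:
# curl --silent http://192.168.1.57:9100/metrics | metric_samples node_load1
```

Matching on the exact metric name, rather than using a plain grep, keeps node_load1 from also matching node_load15.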
- Configure service discovery
cat /usr/local/prometheus/prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['localhost:9090']
  - job_name: 'host'
    file_sd_configs:
    - files: ['/usr/local/prometheus/sd_config/host.yml']
      refresh_interval: 5s
- Create the host.yml file
/usr/local/prometheus/sd_config# cat /usr/local/prometheus/sd_config/host.yml
- targets:
  - 192.168.1.57:9100
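When more hosts are added, the file can be generated instead of edited by hand. The following sketch prints a target group in the same shape as host.yml; file_sd_targets is a hypothetical helper name, and writing to a temporary file first is a general precaution, not something the original setup does.

```shell
#!/usr/bin/env bash
# file_sd_targets is a hypothetical helper that prints a file_sd target group
# in the same YAML shape as host.yml above, one list entry per argument.
file_sd_targets() {
  local target
  printf -- '- targets:\n'
  for target in "$@"; do
    printf -- "  - '%s'\n" "$target"
  done
}

# Example: write to a temporary file and rename it into place, so that
# Prometheus does not read a half-written file.
# file_sd_targets 192.168.1.57:9100 > /usr/local/prometheus/sd_config/host.yml.tmp &&
#   mv /usr/local/prometheus/sd_config/host.yml.tmp /usr/local/prometheus/sd_config/host.yml
```

Because the host job uses file_sd_configs, changes to this file are picked up without restarting Prometheus.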
- Reload the configuration file
prometheus_id=`ps -ef | grep prometheus.yml | grep -v grep | awk '{print $2}'`
kill -HUP $prometheus_id
- Check the Targets page: the host job now lists the monitored endpoint 192.168.1.57
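The reload step above sends SIGHUP without checking the file first. A slightly more defensive variant, sketched below, validates the configuration with promtool (shipped in the Prometheus tarball) and only then signals the process. The function name reload_prometheus and the PROMTOOL override variable are assumptions made for this example.

```shell
#!/usr/bin/env bash
# reload_prometheus is a hypothetical helper. It validates the configuration
# with promtool and sends SIGHUP only if the check passes.
#   $1 = path to prometheus.yml
# PROMTOOL may be set to override the promtool location (an assumption of
# this sketch; by default the copy from the tarball is used).
reload_prometheus() {
  local config=$1 pid
  "${PROMTOOL:-/usr/local/prometheus/promtool}" check config "$config" || return 1
  # -x: match the process name exactly; -o: pick the oldest matching process
  pid=$(pgrep -o -x prometheus) || return 1
  kill -HUP "$pid"
}

# Example:
# reload_prometheus /usr/local/prometheus/prometheus.yml
```

If the check fails, the running server keeps its current configuration and the error from promtool points at the offending line.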
- In Grafana, import the basic Linux monitoring dashboard, ID 9276
- After entering 9276, wait a few seconds for the template to load automatically
- View the host resource dashboard
4. Install Alertmanager
- Download Alertmanager
~# wget https://github.com/prometheus/alertmanager/releases/download/v0.18.0/alertmanager-0.18.0.linux-amd64.tar.gz
- Extract alertmanager-0.18.0.linux-amd64.tar.gz and move it to /usr/local/alertmanager
~# tar xf alertmanager-0.18.0.linux-amd64.tar.gz
~# mv alertmanager-0.18.0.linux-amd64 /usr/local/alertmanager
- Configure Alertmanager to run as a background service
~# cd /usr/local/alertmanager
/usr/local/alertmanager# cat /lib/systemd/system/alertmanager.service
[Unit]
Description=https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
[Service]
ExecStart=/usr/local/alertmanager/alertmanager --config.file=/usr/local/alertmanager/alertmanager.yml
[Install]
WantedBy=multi-user.target
- Configure email alerts
/usr/local/alertmanager# cat /usr/local/alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.163.com:25'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'xxxxxx'
  smtp_require_tls: false
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1m
  receiver: 'mail'
receivers:
- name: 'mail'
  email_configs:
  - to: '[email protected]'
- Start alertmanager
/usr/local/alertmanager# systemctl enable alertmanager.service
Created symlink from /etc/systemd/system/multi-user.target.wants/alertmanager.service to /lib/systemd/system/alertmanager.service.
/usr/local/alertmanager# systemctl start alertmanager.service
/usr/local/alertmanager# systemctl status alertmanager.service
● alertmanager.service - https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
   Loaded: loaded (/lib/systemd/system/alertmanager.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-07-10 16:28:20 CST; 2min 15s ago
 Main PID: 19847 (alertmanager)
    Tasks: 9
   Memory: 9.0M
      CPU: 290ms
   CGroup: /system.slice/alertmanager.service
           └─19847 /usr/local/alertmanager/alertmanager --config.file=/usr/local/alertmanager/alertmanager.yml
- Configure alerting in /usr/local/prometheus/prometheus.yml: point Prometheus at Alertmanager and load the rule files
# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - 127.0.0.1:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/*.yml"
/usr/local/prometheus/rules# cat /usr/local/prometheus/rules/targets.yml
groups:
- name: targets
  rules:
  # Alert for any instance that is unreachable for more than 1 minute.
  - alert: InstanceDown
    expr: up == 0
    for: 1m
    labels:
      severity: error
    annotations:
      summary: "Instance {{ $labels.instance }} down"
      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
- Reload the Prometheus configuration by sending SIGHUP to the Prometheus process (PID 818 in this example)
prometheus_id=`ps -ef | grep prometheus.yml | grep -v grep | awk '{print $2}'`
kill -HUP $prometheus_id
- View the alert rules
- Check the alert status; (active) means the alert is currently active
- Test by stopping node_exporter on test02
~# systemctl stop node_exporter.service
- Pending: the threshold has been crossed, but the alert has not yet held for the configured duration (for: 1m)
- Firing: the threshold has been crossed and the duration has been met; the alert is sent to the receiver
- Receive the alert email
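The Pending and Firing states described above can also be read from the Prometheus HTTP API (GET /api/v1/alerts) while the test is running. The sketch below counts active alerts per state. alert_states is a made-up helper name, and it uses grep instead of a real JSON parser, so treat it as a rough check only (a label that happens to be called state would be counted too).

```shell
#!/usr/bin/env bash
# alert_states is a hypothetical helper. It reads the JSON returned by
# GET /api/v1/alerts on stdin and prints the number of active alerts
# per state, one "COUNT STATE" pair per line.
alert_states() {
  grep -o '"state":"[a-z]*"' | cut -d'"' -f4 | sort | uniq -c | awk '{print $1, $2}'
}

# Example:
# curl --silent http://192.168.1.58:9090/api/v1/alerts | alert_states
```

With node_exporter stopped, the output would be expected to change from a pending count to a firing count once the one-minute duration has passed.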