一、环境

主机名	IP地址	系统	说明
localhost	192.168.11.61	Ubuntu 20.04	docker安装的prometheus
test	192.168.11.62	Ubuntu 20.04	mysql8.0

1、环境搭建

docker安装

略

docker-compose安装

略

为了快速测试，我使用docker-compose安装mysql，当然也可以自行安装

docker-compose安装mysql

mkdir /data/mysql -p
cd /data/mysql

通过cat 创建docker-compose.yaml文件

cat > docker-compose.yaml <<"EOF"
version: '3.1'
services:
  db:
    image: mysql
    restart: always
    container_name: mysql
    environment:
      TZ: Asia/Shanghai
      LANG: en_US.UTF-8
      MYSQL_ROOT_PASSWORD: 123456
    command:
      --default-authentication-plugin=mysql_native_password
      --character-set-server=utf8mb4
      --collation-server=utf8mb4_general_ci
      --lower_case_table_names=1
      --performance_schema=1
      --sql-mode=""
      --skip-log-bin
    volumes:
      #- /data/mysql/conf:/etc/mysql/conf.d #数据文件挂载
      - /data/mysql/data:/var/lib/mysql #数据文件挂载
    ports:
      - 3306:3306
EOF

运行

docker-compose up -d

检查

docker ps

STATUS列全部为up 为正常

二、监控mysql

mysqld_exporter

1、创建用户

登陆mysql

docker exec -it mysql mysql -uroot -p

输入密码：123456

创建

CREATE USER 'exporter'@'%' IDENTIFIED BY 'password' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'%';

验证

docker exec -it mysql mysql -uexporter -p

2、二进制安装（二选一）

下载地址：https://github.com/prometheus/mysqld_exporter/releases

下载解压后移动/opt目录

wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.14.0/mysqld_exporter-0.14.0.linux-amd64.tar.gz
tar vxf mysqld_exporter-0.14.0.linux-amd64.tar.gz

mkdir /opt/prometheus/

mv mysqld_exporter-0.14.0.linux-amd64 /opt/prometheus/mysqld_exporter

创建用户

useradd -M -s /usr/sbin/nologin prometheus

创建连接数据库的文件

cat >/opt/prometheus/mysqld_exporter/.mysqld_exporter.cnf << "EOF"
[client]
user=exporter
password=password
host=192.168.11.62
port=3306
EOF

更改exporter文件夹权限

chmod 600 /opt/prometheus/mysqld_exporter/.mysqld_exporter.cnf

chown prometheus:prometheus -R /opt/prometheus

创建systemd服务

cat > /etc/systemd/system/mysqld_exporter.service <<"EOF"
[Unit]
Description=Prometheus MySQL Exporter
After=network.target

[Service]
Type=simple
User=prometheus
Group=prometheus
Restart=always
ExecStart=/opt/prometheus/mysqld_exporter/mysqld_exporter \
--config.my-cnf=/opt/prometheus/mysqld_exporter/.mysqld_exporter.cnf \
--collect.auto_increment.columns \
--collect.info_schema.processlist \
--collect.binlog_size \
--collect.info_schema.tablestats \
--collect.info_schema.innodb_metrics \
--collect.info_schema.userstats \
--collect.info_schema.tables \
--collect.perf_schema.tablelocks \
--collect.perf_schema.indexiowaits \
--collect.perf_schema.tableiowaits \
--collect.slave_status

[Install]
WantedBy=multi-user.target
EOF

启动 mysqld_exporter

systemctl daemon-reload
systemctl start mysqld_exporter

加入到开机自启动

systemctl enable mysqld_exporter

检查

systemctl status mysqld_exporter

启动不了检查日志

journalctl -u mysqld_exporter -f

2、docker安装（二选一）

docker-compose运行

注：latest版本配置方法改变了，指定v0.14.0版本。如下：

mkdir /data/mysqld_exporter -p
cd /data/mysqld_exporter

cat >docker-compose.yaml<<"EOF"
version: '3.3'
services:
  mysqld-exporter:
    image: prom/mysqld-exporter
    container_name: mysqld-exporter
    restart: always
    command:
      - '--collect.info_schema.processlist'
      - '--collect.info_schema.innodb_metrics'
      - '--collect.info_schema.tablestats'
      - '--collect.info_schema.tables'
      - '--collect.info_schema.userstats'
      - '--collect.engine_innodb_status'
      - '--config.my-cnf=/my.cnf'
    volumes:
      - ./my.cnf:/my.cnf
    ports:
      - 9104:9104
EOF

创建my.cnf配置文件

cat > my.cnf << "EOF"
[client]
host=192.168.11.62
user=exporter
password=password
port=3306
EOF

启动

docker-compose up -d

检查

docker ps

3、参数解释

Name	MySQL Version	Description
collect.auto_increment.columns	5.1	从 information_schema 收集 auto_increment 列和最大值.
collect.binlog_size	5.1	收集所有注册的binlog文件的当前大小
collect.engine_innodb_status	5.1	收集SHOW ENGINE INNODB STATUS
collect.engine_tokudb_status	5.6	收集SHOW ENGINE TOKUDB STATUS .
collect.global_status	5.1	收集SHOW GLOBAL STATUS（默认启用）
collect.global_variables	5.1	收集SHOW GLOBAL VARIABLES（默认启用）
collect.info_schema.clientstats	5.5	如果以 userstat=1 运行，设置为 true 以收集客户端统计信息
collect.info_schema.innodb_metrics	5.6	从 information_schema.innodb_metrics收集指标
collect.info_schema.innodb_tablespaces	5.7	从information_schema.innodb_sys_tablespaces收集指标
collect.info_schema.innodb_cmp	5.5	从information_schema.innodb_cmp收集 InnoDB 压缩表指标。
collect.info_schema.innodb_cmpmem	5.5	从information_schema.innodb_cmpmem缓冲池压缩指标。
collect.info_schema.processlist	5.1	从 information_schema.processlist 收集线程状态计数
collect.info_schema.processlist.min_time	5.1	线程必须处于要计算的每个状态的最短时间。（默认值：0）
collect.info_schema.query_response_time	5.5	如果 query_response_time_stats 为 ON，则收集查询响应时间分布。
collect.info_schema.replica_host	5.6	从 information_schema.replica_host_status 收集指标。
collect.info_schema.tables	5.1	从information_schema.tables收集指标。
collect.info_schema.tables.databases	5.1	要为其收集表统计信息的数据库列表，或为所有
collect.info_schema.tablestats	5.1	如果以 userstat=1 运行，设置为 true 以收集表统计信息。
collect.info_schema.schemastats	5.1	如果以 userstat=1 运行，设置为 true 以收集架构统计信息
collect.info_schema.userstats	5.1	如果以 userstat=1 运行，设置为 true 以收集用户统计信息。
collect.mysql.user	5.5	从 mysql.user 表中收集数据
collect.perf_schema.eventsstatements	5.6	从 performance_schema.events_statements_summary_by_digest 收集指标。
collect.perf_schema.eventsstatements.digest_text_limit	5.6	规范化语句文本的最大长度。（默认值：120）
collect.perf_schema.eventsstatements.limit	5.6	按响应时间限制事件语句摘要的数量。（默认值：250）
collect.perf_schema.eventsstatements.timelimit	5.6	以秒为单位限制“last_seen”事件语句的存在时间。（默认值：86400）
collect.perf_schema.eventsstatementssum	5.7	从 performance_schema.events_statements_summary_by_digest 汇总收集指标。
collect.perf_schema.eventswaits	5.5	从 performance_schema.events_waits_summary_global_by_event_name 收集指标
collect.perf_schema.file_events	5.6	从 performance_schema.file_summary_by_event_name 收集指标
collect.perf_schema.file_instances	5.5	从 performance_schema.file_summary_by_instance 收集指标。
collect.perf_schema.file_instances.remove_prefix	5.5	删除 performance_schema.file_summary_by_instance 中的路径前缀。
collect.perf_schema.indexiowaits	5.6	从 performance_schema.table_io_waits_summary_by_index_usage 收集指标。
collect.perf_schema.memory_events	5.7	从 performance_schema.memory_summary_global_by_event_name 收集指标。
collect.perf_schema.memory_events.remove_prefix	5.7	删除 performance_schema.memory_summary_global_by_event_name 中的仪器前缀。
collect.perf_schema.tableiowaits	5.6	从 performance_schema.table_io_waits_summary_by_table 收集指标。
collect.perf_schema.tablelocks	5.6	从 performance_schema.table_lock_waits_summary_by_table 收集指标。
collect.perf_schema.replication_group_members	5.7	从 performance_schema.replication_group_members 收集指标。
collect.perf_schema.replication_group_member_stats	5.7	从 performance_schema.replication_group_member_stats 收集指标。
collect.perf_schema.replication_applier_status_by_worker	5.7	从 performance_schema.replication_applier_status_by_worker 收集指标。
collect.slave_status	5.1	从 SHOW SLAVE STATUS 收集（默认启用）
collect.slave_hosts	5.1	从 SHOW SLAVE HOSTS 收集
collect.heartbeat	5.1	从心跳收集
collect.heartbeat.database	5.1	从哪里收集心跳数据的数据库。（默认：心跳）
collect.heartbeat.table	5.1	从哪里收集心跳数据的表。（默认：心跳）
collect.heartbeat.utc	5.1	使用 UTC 作为当前服务器的时间戳（使用 `--utc` 调用 `pt-heartbeat` ）。（默认值：假）

4、Prometheus配置

配置prometheus去采集（拉取）mysql_exporter的监控样本数据

cd /data/docker-prometheus 

#在scrape_configs(搜刮配置):下面增加如下配置：

cat >> prometheus/prometheus.yml << "EOF"
  - job_name: 'mysqld_exporter'
    static_configs:
    - targets: ['192.168.11.62:9104']
      labels:
        instance: test服务器
EOF

重新加载配置

curl -X POST http://localhost:9090/-/reload

检查

5、常用监控指标

mysql_up                                              # 服务器是否在线
mysql_global_status_uptime                            # 运行时长，单位 s
delta(mysql_global_status_bytes_received[1m])         # 网络接收的 bytes
delta(mysql_global_status_bytes_sent[1m])             # 网络发送的 bytes


mysql_global_status_threads_connected                 # 当前的客户端连接数
mysql_global_variables_max_connections                # 允许的最大连接数
mysql_global_status_threads_running                   # 正在执行命令的客户端连接数，即非 sleep 状态
delta(mysql_global_status_aborted_connects[1m])       # 客户端建立连接失败的连接数，比如登录失败
delta(mysql_global_status_aborted_clients[1m])        # 客户端连接之后，未正常关闭的连接数


delta(mysql_global_status_commands_total{command="xx"}[1m]) > 0     # 每分钟各种命令的次数
delta(mysql_global_status_handlers_total{handler="xx"}[1m]) > 0     # 每分钟各种操作的次数
delta(mysql_global_status_handlers_total{handler="commit"}[1m]) > 0 # 每分钟 commit 的次数
delta(mysql_global_status_table_locks_immediate[1m])  # 请求获取锁，且立即获得的请求数
delta(mysql_global_status_table_locks_waited[1m])     # 请求获取锁，但需要等待的请求数。该值越少越好


delta(mysql_global_status_queries[1m])                # 每分钟的查询数
delta(mysql_global_status_slow_queries[1m])           # 慢查询数。如果未启用慢查询日志，则为 0
mysql_global_status_innodb_page_size                  # innodb 数据页的大小，单位 bytes
mysql_global_variables_innodb_buffer_pool_size        # innodb_buffer_pool 的限制体积
mysql_global_status_buffer_pool_pages{state="data"}   # 包含数据的数据页数，包括洁页、脏页
mysql_global_status_buffer_pool_dirty_pages           # 脏页数

6、触发器配置

Prometheus配置

# 报警(触发器)配置
rule_files:
  - "alert.yml"  
  - "rules/*.yml"

添加mysql触发器（告警规则）

cd /data/docker-prometheus

使用cat创建文件

cat >> prometheus/rules/mysqld.yml <<"EOF"
groups:
- name: MySQL
  rules:
  - alert: MysqlDown
    expr: mysql_up == 0
    for: 30s
    labels:
      severity: critical
    annotations:
      summary: "MySQL Down,实例:{{ $labels.instance }}"
      description: "MySQL_exporter连不上MySQL了，当前状态为：{{ $value }}"    
  - alert: MysqlTooManyConnections
    expr: max_over_time(mysql_global_status_threads_connected[1m]) / mysql_global_variables_max_connections * 100 > 80
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "Mysql连接数过多告警,实例:{{ $labels.instance }}"
      description: "MySQL连接数>80%,当前值：{{ $value }}"
  - alert: MysqlHighThreadsRunning
    expr: max_over_time(mysql_global_status_threads_running[1m]) > 20
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "Mysql运行的线程过多,实例:{{ $labels.instance }}"
      description: "Mysql运行的线程 > 20，当前运行的线程：{{ $value }}" 
  - alert: MysqlSlowQueries
    expr: increase(mysql_global_status_slow_queries[2m]) > 0
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "Mysql慢日志告警,实例:{{ $labels.instance }}"
      description: "MySQL在过去2分钟有新的{{ $value }}条慢查询"
  #MySQL innodb 日志写入停滞
  - alert: MysqlInnodbLogWaits
    expr: rate(mysql_global_status_innodb_log_waits[15m]) > 10
    for: 0m
    labels:
      severity: warning
    annotations:
      summary: "MySQL innodb日志等待,实例:{{ $labels.instance }}"
      description: "MySQL innodb日志写入停滞，当前值： {{ $value }}"
  - alert: MysqlRestarted
    expr: mysql_global_status_uptime < 60
    for: 0m
    labels:
      severity: info
    annotations:
      summary: "MySQL 重启,实例:{{ $labels.instance }}"
      description: "不到一分钟前，MySQL重启过"
EOF

检查：

vim prometheus/alert.yml

检查配置

docker exec -it prometheus promtool check config /etc/prometheus/prometheus.yml

重新加载配置

curl -X POST http://localhost:9090/-/reload

检查

http://192.168.11.61:9090/alerts?search=

或：

http://192.168.11.61:9090/rules

7、dashboard

grafana展示prometheus从mysql_exporter收集到的的数据

https://grafana.com/grafana/dashboards/7362

https://github.com/percona/grafana-dashboards/tree/main/dashboards/MySQL

Top Process States和 Process States图形修改为如下：

mysql_info_schema_threads 替换成：
mysql_info_schema_processlist_threads

数据库表监控

https://grafana.com/grafana/dashboards/9625

注意：2个图表没有数据，是因为只支持percona server 和 mariadb

Buffer pool size of total ram图形问题

如果mysql安装在linux服务上，并且这台linux服务器也是用node_exporter对其监控。那Buffer pool size of total ram这个图形面板显示不正常，解决如下：

把下面默认的表达式：

(mysql_global_variables_innodb_buffer_pool_size{instance="$host"} * 100) / on (instance) node_memory_MemTotal_bytes{instance="$host"}

修改为：

(label_replace(mysql_global_variables_innodb_buffer_pool_size{instance="$host"}, "nodename", "$1", "instance", "(.*):.*") * 100) / on(nodename) (label_replace(node_memory_MemTotal_bytes, "nodename", "$1", "instance", "(.*):.*"))

解决好，显示正常如下图：

目录CONTENT

2.10.监控mysql

一、环境

1、环境搭建

docker安装

docker-compose安装

docker-compose安装mysql

二、监控mysql

1、创建用户

2、二进制安装（二选一）

下载解压后移动/opt目录

创建用户

创建连接数据库的文件

更改exporter文件夹权限

创建systemd服务

启动 mysqld_exporter

加入到开机自启动

检查

2、docker安装（二选一）

docker-compose运行

3、参数解释

4、Prometheus配置

5、常用监控指标

6、触发器配置

添加mysql触发器（告警规则）

检查配置

重新加载配置

检查

7、dashboard

Buffer pool size of total ram图形问题

评论区