Contents
  1. 1. 前提
  2. 2. 特殊、 打印出执行时间超过3秒的connection,仅仅打印,不kill
  3. 3. 一、打印出sleep时间超过3秒的connection,仅仅打印,不kill
  4. 4. 二、kill掉query语句中带有sleep关键字(不区分大小写)的connection, 且Time超过3秒
  5. 5. 三、kill掉非系统用户的select开头,且执行时间超过3秒的 connection
  6. 6. 四、kill掉非系统用户的select,update,delete开头,且执行时间超过3秒的 connection
  7. 7. 五、kill掉指定特征的query语句
  8. 8. 六、kill掉非系统库的select开头,且执行时间超过3秒的 connection
  9. 9. 七、kill掉非系统用户的select,update,delete开头,且执行时间超过3秒,且不是被locked住 的 connection
  10. 10. 八、kill掉非系统用户,指定state(Locked、login、Updating、Sorting for order等状态),且执行时间超过3秒 的 connection
  11. 11. 九、kill掉非系统用户,指定Command(Query、Sleep、Binlog Dump、Connect等状态),且执行时间超过3秒 的 connection
  12. 12. 十、kill掉指定来源host ip ,且select开头的,且执行时间超过3s的connection
  13. 13. 十一、kill掉非系统用户,Command=Sleep,且空闲时间为3s的connection
  14. 14. 十二、kill掉非系统用户,指定特征的query,在后台运行,并打印日志
  15. 15. 十三、—victims的用法
  16. 16. 十四、kill掉非系统用户,指定库的所有链接
  17. 17. 十五、kill掉指定用户的所有链接

列出几种常用场景,并进行分析实战测试

前提

1
2
3
4
* pt-kill必须获得的权限
1. PROCESS , SUPER
2. 否则,你只能看到打印出kill id,但是实际上并没有被kill掉

特殊、 打印出执行时间超过3秒的connection,仅仅打印,不kill

  • 每2秒循环一次,超过10秒就退出pt-kill程序
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --print --ignore-self --busy-time=3 --interval 2 --run-time=10
重点注意: 这里的--busy-time=3,指的是Command=Query的连接,其他的并不会被匹配哦 , 所以一般情况下删除的都是比较安全的用户thread
重点注意2: ddl,dml,select,都是属于Command=Query
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------+
| 3 | repl | xx.xxx.126.166:60528 | NULL | Binlog Dump GTID | 492777 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 4 | repl | xx.xxx.126.165:48604 | NULL | Binlog Dump GTID | 492765 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 502 | job_heartbeat | xx.xxx.2.217:34626 | heartbeat_db | Sleep | 0 | | NULL |
| 1053 | dbadmin | localhost | heartbeat_db | Query | 1 | altering table | alter table heartbeat add column ts2 date |
| 1055 | pt_kill | xx.xxx.126.166:63167 | NULL | Sleep | 1 | | NULL |
| 1056 | dbadmin | localhost | NULL | Query | 0 | starting | show processlist |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------+
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------------+
| 3 | repl | xx.xxx.126.166:60528 | NULL | Binlog Dump GTID | 492716 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 4 | repl | xx.xxx.126.165:48604 | NULL | Binlog Dump GTID | 492704 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 502 | job_heartbeat | xx.xxx.2.217:34626 | heartbeat_db | Sleep | 0 | | NULL |
| 1053 | dbadmin | localhost | heartbeat_db | Query | 3 | updating | update heartbeat set ts = '2018-09-06 00:07:58' |
| 1055 | pt_kill | xx.xxx.126.166:63167 | NULL | Sleep | 0 | | NULL |
| 1056 | dbadmin | localhost | NULL | Query | 0 | starting | show processlist |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------------------------------------+
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------+
| 3 | repl | xx.xxx.126.166:60528 | NULL | Binlog Dump GTID | 492613 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 4 | repl | xx.xxx.126.165:48604 | NULL | Binlog Dump GTID | 492601 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 502 | job_heartbeat | xx.xxx.2.217:34626 | heartbeat_db | Sleep | 0 | | NULL |
| 1053 | dbadmin | localhost | heartbeat_db | Query | 3 | User sleep | select 1,sleep(4) |
| 1055 | pt_kill | xx.xxx.126.166:63167 | NULL | Sleep | 1 | | NULL |
| 1056 | dbadmin | localhost | NULL | Query | 0 | starting | show processlist |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+-------------------+
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+------------------+
| 3 | repl | xx.xxx.126.166:60528 | NULL | Binlog Dump GTID | 492740 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 4 | repl | xx.xxx.126.165:48604 | NULL | Binlog Dump GTID | 492728 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 502 | job_heartbeat | xx.xxx.2.217:34626 | heartbeat_db | Sleep | 0 | | NULL |
| 1053 | dbadmin | localhost | heartbeat_db | Query | 4 | starting | rollback |
| 1055 | pt_kill | xx.xxx.126.166:63167 | NULL | Sleep | 0 | | NULL |
| 1056 | dbadmin | localhost | NULL | Query | 0 | starting | show processlist |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+------------------+

一、打印出sleep时间超过3秒的connection,仅仅打印,不kill

  • 每2秒循环一次,无限循环下去
1
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --print --ignore-self --idle-time=3 --interval 2
  • 每2秒循环一次,超过10秒就退出pt-kill程序
1
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --print --ignore-self --idle-time=3 --interval 2 --run-time=10

二、kill掉query语句中带有sleep关键字(不区分大小写)的connection, 且Time超过3秒

1
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --print --ignore-self --interval 2 --match-info "(?i-xsm:(sleep))" --busy-time=3 --kill --victims all

三、kill掉非系统用户的select开头,且执行时间超过3秒的 connection

1
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))" --ignore-user="root|repl" --busy-time=3 --kill --victims all

四、kill掉非系统用户的select,update,delete开头,且执行时间超过3秒的 connection

1
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))" --ignore-user="root|repl" --busy-time=3 --kill --victims all

五、kill掉指定特征的query语句

  • kill掉非系统用户,且query语句中同时包含heartbeat 和 where ,且heartbeat在前where在后,且执行时间超过3秒的 connection
1
2
3
4
5
6
7
8
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info ".*heartbeat.*where.*" --ignore-user="root|repl" --busy-time=3 --kill --victims all
# 2018-11-15T10:38:03 KILL 1053 (Query 27 sec) select *,sleep(10) from heartbeat where id < '1000000000'
# 2018-11-15T10:38:05 KILL 1053 (Query 29 sec) select *,sleep(10) from heartbeat where id < '1000000000'
# 2018-11-15T10:38:07 KILL 1053 (Query 31 sec) select *,sleep(10) from heartbeat where id < '1000000000'
# 2018-11-15T10:38:09 KILL 1053 (Query 33 sec) select *,sleep(10) from heartbeat where id < '1000000000'

六、kill掉非系统库的select开头,且执行时间超过3秒的 connection

1
2
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))" --ignore-db="mysql|information_schema" --ignore-user="root|repl" --busy-time=3 --kill --victims all

七、kill掉非系统用户的select,update,delete开头,且执行时间超过3秒,且不是被locked住 的 connection

1
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))" --ignore-state="Locked" --ignore-user="root|repl" --busy-time=3 --kill --victims all

八、kill掉非系统用户,指定state(Locked、login、Updating、Sorting for order等状态),且执行时间超过3秒 的 connection

  • 指定Locked的connection删除掉
1
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))" --match-state="Locked" --ignore-user="root|repl" --busy-time=3 --kill --victims all

九、kill掉非系统用户,指定Command(Query、Sleep、Binlog Dump、Connect等状态),且执行时间超过3秒 的 connection

  • kill掉指定Connect的command connection
1
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))|(?i-xsm:^(update))|(?i-xsm:^(delete))" --match-command="Connect" --ignore-user="root|repl" --busy-time=3 --kill --victims all

十、kill掉指定来源host ip ,且select开头的,且执行时间超过3s的connection

1
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info "(?i-xsm:^(select))" --ignore-host="x.x.x.x" --ignore-user="root|repl" --busy-time=3 --kill --victims all

十一、kill掉非系统用户,Command=Sleep,且空闲时间为3s的connection

1
2
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --ignore-user="root|repl" --idle-time=3 --kill --victims all

十二、kill掉非系统用户,指定特征的query,在后台运行,并打印日志

1
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-info ".*heartbeat.*where.*" --ignore-user="root|repl" --busy-time=3 --daemonize --log='/root/kill.log' --kill --victims all

十三、—victims的用法

  • 背景
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
pid:1103 , 是最早开启事务的空闲进程 T1
pid:1095 , 是第二早开启事务的ddl进程 T2
pid:502 , 是最后一个开启事务的dml进程 T3
事务顺序是: T1 锁住了 T2,T3, T2锁住了T3 , T3被T1,T2锁住
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+---------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+---------------------------------------------------------+
| 3 | repl | xx.xxx.126.166:60528 | NULL | Binlog Dump GTID | 507504 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 4 | repl | xx.xxx.126.165:48604 | NULL | Binlog Dump GTID | 507492 | Master has sent all binlog to slave; waiting for more updates | NULL |
| 502 | job_heartbeat | xx.xxx.2.217:34626 | heartbeat_db | Query | 328 | Waiting for table metadata lock | insert into heartbeat(ts) values('2018-11-15 13:43:50') |
| 1095 | dbadxxx | localhost | heartbeat_db | Query | 329 | Waiting for table metadata lock | alter table heartbeat add column ts3 date |
| 1103 | dbadxxx | localhost | heartbeat_db | Sleep | 341 | | NULL |
| 1104 | dbadxxx | localhost | NULL | Query | 0 | starting | show processlist |
+------+---------------+----------------------+--------------+------------------+--------+---------------------------------------------------------------+---------------------------------------------------------+
6 rows in set (0.00 sec)
  • —match-command=”Query|Sleep” —victims oldest —busy-time=3

只kill最老的command为Query|Sleep的最老的链接

1
2
3
4
5
pt-kill --host xx.xxx.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --ignore-user="root|repl" --match-command="Query|Sleep" --busy-time=3 --victims oldest
Enter MySQL password:
# 2018-11-15T13:49:07 KILL 1103 (Sleep 330 sec) NULL
# 2018-11-15T13:49:09 KILL 1103 (Sleep 332 sec) NULL
  • —match-command=”Query|Sleep” —victims all —busy-time=3

kill 所有command=”Query|Sleep” 的所有链接

1
2
3
4
5
Enter MySQL password:
# 2018-11-15T13:48:41 KILL 1103 (Sleep 304 sec) NULL
# 2018-11-15T13:48:41 KILL 1095 (Query 292 sec) alter table heartbeat add column ts3 date
# 2018-11-15T13:48:41 KILL 502 (Query 291 sec) insert into heartbeat(ts) values('2018-11-15 13:43:50')
# 2018-11-15T13:48:41 KILL 1104 (Sleep 262 sec) NULL
  • —match-command=”Query” —victims all —busy-time=3

kill 所有 command=”Query”(默认不填也就是Query)的所有链接

1
2
3
Enter MySQL password:
# 2018-11-15T13:46:59 KILL 1095 (Query 190 sec) alter table heartbeat add column ts3 date
# 2018-11-15T13:46:59 KILL 502 (Query 189 sec) insert into heartbeat(ts) values('2018-11-15 13:43:50')
  • —match-command=”Query” —victims oldest —busy-time=3

kill 所有 command=”Query”(默认不填也就是Query)的最老的链接

1
2
Enter MySQL password:
# 2018-11-15T13:59:01 KILL 1095 (Query 912 sec) alter table heartbeat add column ts3 date

十四、kill掉非系统用户,指定库的所有链接

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
举例一、kill掉非root和repl账号,使用open_db或xf_loupan_db的所有链接
pt-kill --host 10.126.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --ignore-user="root|repl" --match-db="open_db|xf_loupan_db" --victims all --kill
dbadmin:xf_loupan_db> select *,sleep(100) from loupan_grade_photo;
ERROR 2013 (HY000): Lost connection to MySQL server during query
dbadmin:xf_loupan_db> select *,sleep(100) from loupan_grade_photo;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 245
Current database: xf_loupan_db
ERROR 2013 (HY000): Lost connection to MySQL server during query
Enter MySQL password:
# 2018-12-26T12:34:46 KILL 243 (Query 7 sec) select *,sleep(100) from loupan_grade_photo
# 2018-12-26T12:35:00 KILL 245 (Query 1 sec) select *,sleep(100) from loupan_grade_photo
特别注意: 这里面只针对db这一列是open_db或xf_loupan_db才会有效果,如果这个用户拥有多个库的权限
那么他在xx库去 select * from xf_loupan_db.loupan_grade_photo 是不会生效的
所以,要保证这个库账号只能对应一个库,才会有效果
比如这样是没有效果的:
root:(none)> show processlist;
+-----+---------+----------------------+------+---------+------+------------+----------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+---------+----------------------+------+---------+------+------------+----------------------------------------------------------+
| 248 | root | localhost | NULL | Query | 0 | starting | show processlist |
| 252 | pt_kill | 10.126.126.166:33088 | NULL | Sleep | 1 | | NULL |
| 257 | dbadmin | localhost | sys | Query | 19 | User sleep | select *,sleep(100) from xf_loupan_db.loupan_grade_photo |
+-----+---------+----------------------+------+---------+------+------------+----------------------------------------------------------+
3 rows in set (0.00 sec)

十五、kill掉指定用户的所有链接

一般我们什么时候使用它呢? 一般是在迁移某个DB,需要将这个DB上的账号链接全部清理掉,防止长连接

1
2
3
4
举例:kill掉所有lc_rx rc_ronly 两个账号的所有链接
pt-kill --host 10.126.126.164 --port 3306 --user pt_kill --ask-pass --ignore-self --print --interval 2 --match-user="lc_rx|rc_ronly" --victims all --kill
Contents
  1. 1. 前提
  2. 2. 特殊、 打印出执行时间超过3秒的connection,仅仅打印,不kill
  3. 3. 一、打印出sleep时间超过3秒的connection,仅仅打印,不kill
  4. 4. 二、kill掉query语句中带有sleep关键字(不区分大小写)的connection, 且Time超过3秒
  5. 5. 三、kill掉非系统用户的select开头,且执行时间超过3秒的 connection
  6. 6. 四、kill掉非系统用户的select,update,delete开头,且执行时间超过3秒的 connection
  7. 7. 五、kill掉指定特征的query语句
  8. 8. 六、kill掉非系统库的select开头,且执行时间超过3秒的 connection
  9. 9. 七、kill掉非系统用户的select,update,delete开头,且执行时间超过3秒,且不是被locked住 的 connection
  10. 10. 八、kill掉非系统用户,指定state(Locked、login、Updating、Sorting for order等状态),且执行时间超过3秒 的 connection
  11. 11. 九、kill掉非系统用户,指定Command(Query、Sleep、Binlog Dump、Connect等状态),且执行时间超过3秒 的 connection
  12. 12. 十、kill掉指定来源host ip ,且select开头的,且执行时间超过3s的connection
  13. 13. 十一、kill掉非系统用户,Command=Sleep,且空闲时间为3s的connection
  14. 14. 十二、kill掉非系统用户,指定特征的query,在后台运行,并打印日志
  15. 15. 十三、—victims的用法
  16. 16. 十四、kill掉非系统用户,指定库的所有链接
  17. 17. 十五、kill掉指定用户的所有链接

幸福,不在于得到的多

而在于计较的少