MySQL联表查询的索引使用
项目中一般使用的都是单表查询,但是在一些业务场景下,偶尔会选择联表查询,一直对联表查询时如何使用索引一直感到很好奇。正好近期项目中遇到一个问题,联表查询时,没有建立索引,耗时居然达到了可耻的 10分钟 ,所以趁机了解了一下。
表数据
一共3张表
knowledge
,
knowledge_question
,
knowledge_answer
,数据在
6000~10000
之间。
knowledge: 6126
knowledge_question:9647
knowledge_answer:8267
执行的语句:
SELECT DISTINCT(k.base_id) FROM knowledge AS k
LEFT JOIN knowledge_question AS q ON k.id=q.knowledge_id
LEFT JOIN knowledge_answer AS a ON k.id=a.knowledge_id
WHERE k.update_time>'2019-01-01 12:00:00' AND q.update_time>'2019-01-01 12:00:00' AND a.update_time>'2019-01-01 12:00:00'
没有索引(只有主键)
mysql > SELECT DISTINCT(k.base_id) FROM knowledge AS k LEFT JOIN knowledge_question AS q ON k.id=q.knowledge_id LEFT JOIN knowledge_answer AS a ON k.id=a.knowledge_id WHERE k.update_time>'2019-01-01 12:00:00' AND q.update_time>'2019-01-01 12:00:00' AND a.update_time>'2019-01-01 12:00:00';
+---------+
| base_id |
+---------+
| 159 |
| 413 |
| 414 |
+---------+
145 rows in set, 3 warnings (9 min 26.57 sec)
执行时间约
10分钟
,查看执行计划如下:
mysql > explain SELECT DISTINCT(k.base_id) FROM knowledge AS k LEFT JOIN knowledge_question AS q ON k.id=q.knowledge_id LEFT JOIN knowledge_answer AS a ON k.id=a.knowledge_id WHERE k.update_time>'2019-01-01 12:00:00' AND q.update_time>'2019-01-01 12:00:00' AND a.update_time>'2019-01-01 12:00:00';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------------------------+
| 1 | SIMPLE | k | NULL | ALL | NULL | NULL | NULL | NULL | 6238 | 100.00 | Using temporary |
| 1 | SIMPLE | q | NULL | ALL | NULL | NULL | NULL | NULL | 9540 | 100.00 | Using where; Distinct; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | a | NULL | ALL | NULL | NULL | NULL | NULL | 8410 | 100.00 | Using where; Distinct; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------------------------+
3 rows in set, 4 warnings (0.00 sec)
全部都是全表扫描,根据
MySQL
联表查询的算法
Nested-Loop Join
,MySQL查询的结果集是3张表的笛卡尔积,所以效率特别低。
JOIN字段建立索引
explain SELECT DISTINCT(k.base_id) FROM knowledge AS k LEFT JOIN knowledge_question AS q ON k.id=q.knowledge_id LEFT JOIN knowledge_answer AS a ON k.id=a.knowledge_id WHERE k.update_time>'2019-01-01 12:00:00' AND q.update_time>'2019-01-01 12:00:00' AND a.update_time>'2019-01-01 12:00:00';
+----+-------------+-------+------------+------+---------------+---------+---------+-------------------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+---------+---------+-------------------+------+----------+------------------------------+
| 1 | SIMPLE | k | NULL | ALL | PRIMARY | NULL | NULL | NULL | 6444 | 33.33 | Using where; Using temporary |
| 1 | SIMPLE | a | NULL | ref | idx_kid | idx_kid | 4 | knowledge_base.k.id | 1 | 33.33 | Using where; Distinct |
| 1 | SIMPLE | q | NULL | ref | idx_kid | idx_kid | 4 | knowledge_base.k.id | 1 | 33.33 | Using where; Distinct |
+----+-------------+-------+------------+------+---------------+---------+---------+-------------------+------+----------+------------------------------+
执行结果
+---------+
| base_id |
+---------+
| 159 |
| 413 |
| 414 |
+---------+
145 rows in set (0.02 sec)
耗时变成
20毫秒
Where条件建立索引
给Where条件建立索引,并不一定会使用。
比如:在表
knowledge
的字段
update
上建立索引
idx_time
:
MySQL [knowledge_base]> alter table knowledge add index idx_time(update_time);
MySQL [knowledge_base]> explain SELECT DISTINCT(k.base_id) FROM knowledge AS k LEFT JOIN knowledge_question AS q ON k.id=q.knowledge_id LEFT JOIN knowledge_answer AS a ON k.id=a.knowledge_id WHERE k.update_time>'2019-01-03 12:00:00' AND q.update_time>'2019-01-01 12:00:00' AND a.update_time>'2019-01-01 12:00:00';
+----+-------------+-------+------------+------+------------------+---------+---------+-------------------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------------+---------+---------+-------------------+------+----------+------------------------------+
| 1 | SIMPLE | k | NULL | ALL | PRIMARY,idx_time | NULL | NULL | NULL | 6444 | 19.01 | Using where; Using temporary |
| 1 | SIMPLE | a | NULL | ref | idx_kid | idx_kid | 4 | knowledge_base.k.id | 1 | 33.33 | Using where; Distinct |
| 1 | SIMPLE | q | NULL | ref | idx_kid | idx_kid | 4 | knowledge_base.k.id | 1 | 33.33 | Using where; Distinct |
+----+-------------+-------+------------+------+------------------+---------+---------+-------------------+------+----------+------------------------------+
结果执行上来看,并没有使用索引
idx_time
。
如果where条件从
k.update_time>'2019-01-03 12:00:00'
修改为
k.update_time='2019-01-03 12:00:00'
(从
>
变成
=
)
MySQL [knowledge_base]> explain SELECT DISTINCT(k.base_id) FROM knowledge AS k LEFT JOIN knowledge_question AS q ON k.id=q.knowledge_id LEFT JOIN knowledge_answer AS a ON k.id=a.knowledge_id WHERE k.update_time='2019-01-03 12:00:00' AND q.update_time>'2019-01-01 12:00:00' AND a.update_time>'2019-01-01 12:00:00';
+----+-------------+-------+------------+------+------------------+----------+---------+-------------------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------------+----------+---------+-------------------+------+----------+-----------------------+
| 1 | SIMPLE | k | NULL | ref | PRIMARY,idx_time | idx_time | 4 | const | 1 | 100.00 | Using temporary |
| 1 | SIMPLE | a | NULL | ref | idx_kid | idx_kid | 4 | knowledge_base.k.id | 1 | 33.33 | Using where; Distinct |
| 1 | SIMPLE | q | NULL | ref | idx_kid | idx_kid | 4 | knowledge_base.k.id | 1 | 33.33 | Using where; Distinct |
+----+-------------+-------+------------+------+------------------+----------+---------+-------------------+------+----------+-----------------------+
则会使用索引
idx_time
继续试验发现,如果在
knowledge_question
和knowledge_answer
表上的字段update_time
上建立索引,有时候会较大幅度的改变执行计划。 所以说,检查SQL语句是否用到索引,一定要用explain查看执行计划,MySQL优化器做了太多的工作了。
其他知识点
在建立索引的时候,会遇到
Table Metadata Lock
的问题,可以先
show processlist
,找到占用表锁的连接,然后
kill
。
MySQL [(none)]> show processlist;