MySQL RC级别下并发insert锁超时问题 - 源码分析
本系列打算用三篇文章来聊聊这个事情。这是第二篇,从源码层面来论证所做的假设。第一篇链接:
基于第一篇的假设
对于主键索引,最多存在一条主键相同的记录,该记录或者是delete-marked状态,或者是普通状态。因为对记录进行不涉及主键字段的update时总是inplace的,不存在delete+insert情况;insert时如果发现了主键相同的delete-marked记录,则直接复用该记录,即insert转为inplace update。而对于二级索引,update时总是执行delete+insert,insert时也不会复用delete-marked状态的记录。
看看代码层面是否支持。
update场景分析
update一条记录时,先更新主键索引,再判断是否需要修改了其他二级索引的字段,将被动了索引字段的索引也更新掉。
主键索引
主键update的函数入口是row_upd_clust_step,在该函数中会判断本次update是否更新了主键中的字段,若是,则调用row_upd_clust_rec_by_insert:
row_upd_changes_ord_field_binary_func函数的实现逻辑是判断该索引的 n_uniq 列的值是否被改变,如其中的列值被改变,则返回true,否则返回false。也就是说如果update语句改变了主键索引定义中指明的那些列的值,那么走delete+insert。本文关注的是未更新这些排序字段的场景,但从row_upd_clust_rec_by_insert函数的介绍可以反推出如果没有更新排序字段,那么都是采用直接更新现有记录或delete-marked记录,而不是通过delete+insert方式。为了证明这个推断,可以通过是否调用了函数btr_cur_del_mark_set_clust_rec来进行判断。
二级索引
二级索引update的函数入口是row_upd_sec_step,先通过下面的语句判断是否需要更新某个二级索引:
if (node->state == UPD_NODE_UPDATE_ALL_SEC
|| row_upd_changes_ord_field_binary(node->index, node->update,
thr, node->row, node->ext)) {
return(row_upd_sec_index_entry(node, thr));
}
#define UPD_NODE_UPDATE_ALL_SEC 5 /* an ordering field of the clustered
index record was changed, or this is
a delete operation: should update
all the secondary index records */
UPD_NODE_UPDATE_ALL_SEC标志位对应update操作更改了主键索引的排序字段(对应row_upd_clust_rec_by_insert)或删除了主键。row_upd_changes_ord_field_binary上面已经介绍过,是修改了该索引的排序字段(n_uniq)。所以只有在主键的n_uniq或本索引的n_uniq被修改的时候才需要更新二级索引。
我们接下来看,若确实需要更新二级索引,则调用row_upd_sec_index_entry函数来执行更新逻辑。
/***********************************************************//**
Updates a secondary index entry of a row.
@return DB_SUCCESS if operation successfully completed, else error
code or DB_LOCK_WAIT */
static MY_ATTRIBUTE((warn_unused_result))
dberr_t
row_upd_sec_index_entry(
在该函数中,会调用btr_cur_del_mark_set_sec_rec来将旧记录标记为delete-marked
并调用row_build_index_entry向该二级索引中插入一条新的记录
所以,从二级索引更新逻辑可以看到,都是采用delete+insert的流程。当然,像主键索引一样,可以通过判断是否调用了btr_cur_del_mark_set_sec_rec函数来确认。
综合主键索引和二级索引的情况,可以说跟我们的假设是没有冲突的。
insert场景分析
接下来看看insert一条记录是如何执行的,由于InnoDB是索引组织表,insert操作当然都是发生在每个索引上。入口函数为row_ins,其会为每个索引循环调用row_ins_index_entry_step->row_ins_index_entry来处理具体的索引插入行为。
/***************************************************************//**
Inserts an index entry to index. Tries first optimistic, then pessimistic
descent down the tree. If the entry matches enough to a delete marked record,
performs the insert by updating or delete unmarking the delete marked
record.
@return DB_SUCCESS, DB_LOCK_WAIT, DB_DUPLICATE_KEY, or some other error code */
static
dberr_t
row_ins_index_entry(
该函数根据是否为主键索引又分别调用row_ins_clust_index_entry和row_ins_sec_index_entry。我们分别看看主键索引和二级索引的处理函数。
主键索引
row_ins_clust_index_entry进一步调用row_ins_clust_index_entry_low:
/***************************************************************//**
Tries to insert an entry into a clustered index, ignoring foreign key
constraints. If a record with the same unique key is found, the other
record is necessarily marked deleted by a committed transaction, or a
unique key violation error occurs. The delete marked record is then
updated to an existing record, and we must write an undo log record on
the delete marked record.
@retval DB_SUCCESS on success
@retval DB_LOCK_WAIT on lock wait when !(flags & BTR_NO_LOCKING_FLAG)
@retval DB_FAIL if retry with BTR_MODIFY_TREE is needed
@return error code */
dberr_t
row_ins_clust_index_entry_low(
从row_ins_clust_index_entry_low函数描述知道,在主键索引插入时,如果发现有delete-marked的记录,该记录的唯一性字段跟要插入的记录一样,那么直接调用row_ins_clust_index_entry_by_modify函数进行复用。
二级索引
row_ins_sec_index_entry进一步调用row_ins_sec_index_entry_low:
/***************************************************************//**
Tries to insert an entry into a secondary index. If a record with exactly the
same fields is found, the other record is necessarily marked deleted.
It is then unmarked. Otherwise, the entry is just inserted to the index.
@retval DB_SUCCESS on success
@retval DB_LOCK_WAIT on lock wait when !(flags & BTR_NO_LOCKING_FLAG)
@retval DB_FAIL if retry with BTR_MODIFY_TREE is needed
@return error code */
dberr_t
row_ins_sec_index_entry_low(
从row_ins_sec_index_entry_low函数描述可知,不同于主键索引,只有找到该索引的所有字段都相同的delete-marked索引时才会复用,此时调用的函数为row_ins_sec_index_entry_by_modify,而其他情况都是执行插入新记录的逻辑。
从上面的分析可以知道,对于插入场景,都会在某些情况下复用delete-marked状态的老记录。而这是通过函数row_ins_must_modify_rec来判断的。
/***************************************************************//**
Checks if an index entry has long enough common prefix with an
existing record so that the intended insert of the entry must be
changed to a modify of the existing record. In the case of a clustered
index, the prefix must be n_unique fields long. In the case of a
secondary index, all fields must be equal. InnoDB never updates
secondary index records in place, other than clearing or setting the
delete-mark flag. We could be able to update the non-unique fields
of a unique secondary index record by checking the cursor->up_match,
but we do not do so, because it could have some locking implications.
@return TRUE if the existing record should be updated; FALSE if not */
UNIV_INLINE
ibool
row_ins_must_modify_rec(
/*====================*/
const btr_cur_t* cursor) /*!< in: B-tree cursor */
/* NOTE: (compare to the note in row_ins_duplicate_error_in_clust)
Because node pointers on upper levels of the B-tree may match more
to entry than to actual user records on the leaf level, we
have to check if the candidate record is actually a user record.
A clustered index node pointer contains index->n_unique first fields,
and a secondary index node pointer contains all index fields. */
return(cursor->low_match
>= dict_index_get_n_unique_in_tree(cursor->index)
&& !page_rec_is_infimum(btr_cur_get_rec(cursor)));
}
该函数的说明很清楚得介绍了什么时候会复用delete-marked记录:如果主键索引搜索到一条记录,只要delete-marked的记录与说要插入记录的前缀匹配长度等于主键索引的 n_unique 长度,那么就复用。但对于二级唯一索引,则需要该索引的每个字段( n_fields )的值都相同才会复用。
我们进一步分析该函数的实现。其中cursor-> low_match的解释为:
ulint low_match; /*!< if search mode was PAGE_CUR_LE,
the number of matched fields to the
first user record AT THE CURSOR or
to the left of it after
btr_cur_search_to_nth_level;
NOT defined for PAGE_CUR_GE or any
other search modes; see also the NOTE
in up_match! */
函数dict_index_get_n_unique_in_tree的定义为:
dict_index_get_n_unique_in_tree(
/*============================*/
const dict_index_t* index) /*!< in: an internal representation
of index (in the dictionary cache) */
ut_ad(index);
ut_ad(index->magic_n == DICT_INDEX_MAGIC_N);
ut_ad(index->cached);
if (dict_index_is_clust(index)) {
return(dict_index_get_n_unique(index));