laitimes

A live accident caused by an update statement!

author:Civen

Last night while paddling in a group, I saw a reader say such a thing.

A live accident caused by an update statement!

Probably, when executing an update statement online to modify the database data, the where condition did not bring the index, resulting in the direct collapse of the business and a wave of lessons from the boss

This time we'll take a look:

  • Why did this accident happen?
  • And how can such accidents be avoided?

On the premise, the next cases are based on the InnoDB storage engine, and the isolation level of transactions is repeatable.

Why did this accident happen?

The default transaction isolation level of the InnoDB storage engine is "repeatable read", but under this isolation level, when multiple transactions are concurrent, the problem of phantom reading occurs, the so-called phantom read refers to the execution of the same query statement twice in a row under the same transaction, and the second query statement may return rows that did not exist before.

Therefore, the InnoDB storage engine implements its own row lock, which locks the "gap" between the record itself and the record through a next-key lock (a combination of record lock and gap lock), preventing other transactions from inserting new records between this record, thus avoiding the phenomenon of phantom reading.

When we execute the update statement, we actually add an exclusive lock (X lock) to the record, which is blocked if other transactions modify the record holding the exclusive lock. In addition, the lock is not released when the update statement is executed, but when the transaction ends.

In InnoDB practice, the basic unit of locking a record is a next-key lock, but it will degenerate into a gap lock or a record lock due to some conditions. The location of the lock is precisely that the lock is added to the index, not the row.

For example, if a unique index is used in the where condition of the update statement, the next-key lock will degenerate into a record lock, that is, only one row of records will be locked.

For example, here is a database table where id is the primary key index.

A live accident caused by an update statement!

Suppose there are two transactions executed in the following order:

A live accident caused by an update statement!

It can be seen that where is an equal-value query in the update statement of transaction A, and id is a unique index, so only the record with id = 1 will be locked, so the update operation of transaction B will not block.

However, if no index is used in the where condition of the update statement, the full table will be scanned, so a next-key lock (record lock + gap lock) will be added to all records, which is equivalent to locking the entire table.

Suppose there are two transactions executed in the following order:

A live accident caused by an update statement!

As you can see, the update statement of transaction B is blocked this time.

This is because there is no index column in the where condition in the update statement of transaction A, and all records will be locked, that is, this update statement produces 4 record locks and 5 gap locks, which is equivalent to locking the entire table.

A live accident caused by an update statement!

Therefore, when executing an update statement on a database table with a very large amount of data, if no index is used, a next-key lock will be added to the entire table, and the lock will last for a long time until the end of the transaction, except for select ... from statement, other statements will be locked and cannot be executed, the business will be stagnant, and what awaits you next is the boss's scolding.

So where with the index of the update statement can avoid locking the full table records?

Not.

The key depends on the execution of this statement, the optimizer finally chooses whether to scan the index or the full table, if the full table scan is taken, the records of the full table will be locked.

And how can such accidents be avoided?

We can set the sql_safe_updates parameter in MySQL to 1 to enable the security update mode.

官方的解释: If set to 1, MySQL aborts UPDATE or DELETE statements that do not use a key in the WHERE clause or a LIMIT clause. (Specifically, UPDATE statements must have a WHERE clause that uses a key or a LIMIT clause, or both. DELETE statements must have both.) This makes it possible to catch UPDATE or DELETE statements where keys are not used properly and that would probably change or delete a large number of rows. The default value is 0.

Roughly this means when sql_safe_updates is set to 1.

The update statement must meet one of the following conditions to execute successfully:

  • Use where and there must be an index column in the where condition;
  • Use limit;
  • Use both where and limit, where there can be no index column in the where condition;

The delete statement must meet one of the following conditions for it to succeed:

  • Use where and there must be an index column in the where condition;
  • Use both where and limit, where there can be no index column in the where condition;

If the where condition brings an index column, but the optimizer finally scans the full table instead of the index, we can use force index([index_name]) to tell the optimizer which index to use, so as to avoid the hidden danger caused by locking the full table.

summary

Don't underestimate an update statement, improper use on production machines may lead to business stagnation or even crash.

When we want to execute the update statement, make sure that the index column is included in the where condition, and confirm whether the statement is an index scan on the test machine to prevent all records in the table from being locked because the entire table is scanned.

We can turn on the MySQL sql_safe_updates parameter, which prevents the where condition from not carrying the index column when the update operation.

If we find that even if the column index column is brought in the where condition, the optimizer is still scanning the full standard, then we need to use force index([index_name]) to tell the optimizer which index to use.

That's it for this time, be careful next time, don't be scolded by the boss again.