MySQL Notepad: February 2011

We had some interesting problem on one production MYSQL server today.
One developer reported there's a delete statement taking very long time to run on small tables.
The three tables involved this query, category_group, tmp_category_group category_group_info have 7400, 7600 and 25 rows respectively.
While checking slow query logs, shockingly I found the query has 3.8MM examined rows. Change the query to select statement will yield the same result.

# Query_time: 9 Lock_time: 0 Rows_sent: 0 Rows_examined: 3809728
delete from category_group where group_id = (select group_id from category_group_info x where x.group_name = 'Storylines') and category_id in ( select category_id from tmp_category_group cg where group_id = (select group_id from category_group_info x where x.group_name = 'KidsCharacter'));

# Query_time: 8 Lock_time: 0 Rows_sent: 0 Rows_examined: 3810182
select * from category_group where group_id = (select group_id from category_group_info x
where x.group_name = 'Storylines') and category_id in
( select category_id from tmp_category_group cg where group_id =
(select group_id from category_group_info x where x.group_name = 'KidsCharacter'));

The explain plan doesn't provide much help.

mysql> explain select * from category_group where group_id = (select group_id from category_group_info x where x.group_name = 'Storylines') and category_id in ( select category_id from tmp_category_group cg where group_id = (select group_id from category_group_info x where x.group_name = 'KidsCharacter'));

+----+--------------------+----------------+------+---------------+----------+---------+-------+------+--------------------------+

| id | select_type        | table          | type | possible_keys | key      | key_len | ref   | rows | Extra                    |

+----+--------------------+----------------+------+---------------+----------+---------+-------+------+--------------------------+

|  1 | PRIMARY            | category_group | ref  | group_id      | group_id | 4       | const |  501 | Using where; Using index | 

|  3 | DEPENDENT SUBQUERY | cg             | ALL  | NULL          | NULL     | NULL    | NULL  | 7396 | Using where              | 

|  4 | SUBQUERY           | x              | ALL  | NULL          | NULL     | NULL    | NULL  |   25 | Using where              | 

|  2 | SUBQUERY           | x              | ALL  | NULL          | NULL     | NULL    | NULL  |   25 | Using where              | 

+----+--------------------+----------------+------+---------------+----------+---------+-------+------+--------------------------+

4 rows in set (0.01 sec)

After extensive testing, I figure out the query will run fast if we take off where clause of this subquery.

select category_id from tmp_category_group cg where group_id = ...

That means the query run much faster on whole table of tmp_category_group than having a predicate. That doesn't make sense right?
That end up leading me to add an index on group_id of tmp_category_group and the query running lightening fast now.

Could this be some sort of Bug?

mysql> select * from category_group where group_id = (select distinct group_id from category_group_info x where x.group_name = 'Storylines') and category_id in ( select distinct category_id from tmp_category_group cg where group_id = (select distinct group_id from category_group_info x where x.group_name = 'KidsCharacter'));

Empty set (0.31 sec)

mysql> delete from category_group

-> where group_id = (select group_id from category_group_info x where x.group_name = 'Storylines')

-> and category_id in (

-> select category_id

-> from tmp_category_group cg

-> where group_id = (select group_id from category_group_info x where x.group_name = 'KidsCharacter'));

Query OK, 0 rows affected (0.36 sec)

Found a very good explanation of how Seconds_behind_master is computed in this bug report.

http://bugs.mysql.com/bug.php?id=2826

Here's how Seconds_behind_master is computed.

* When the slave connects to the master (when the slave I/O thread connects to the
master),
it does a = SELECT UNIX_TIMESTAMP() on master,
and b= SELECT UNIX_TIMESTAMP() on slave, and computes c=b-a.

* Every statement executed on the master is stored on the master's binary log together
with the timestamp when it was executed.

* So when the slave SQL thread reads a statement from the binary log, it executes it and
then it does:

seconds_behind_master= SELECT UNIX_TIMESTAMP() on slave
- timestamp of the master's binary log event
- c

MySQL Notepad

Sunday, February 20, 2011

Very slow query with high number of Rows_examined

Thursday, February 10, 2011

How Seconds_behind_master is computed

Useful Links

FEEDJIT Live Traffic Map

Blog Archive

About Me