Saturday, September 3, 2011

Large numbers of MySQL connections time out or sit in TIME_WAIT

I'm not sure how often people run into this issue. When a flood of connection requests hits a MySQL server in a very short time, you may get connection timeout errors or see a lot of connections in the TIME_WAIT state.
This happens because MySQL can't accept the connections fast enough, and its default listen queue size is rather small at 50.
You can raise the MySQL variable back_log to lift this limit.
However, back_log is not a dynamic variable, so changing it requires a MySQL server restart.
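A minimal sketch of how you might raise it in the option file; the value 500 is illustrative, not a recommendation:

```ini
# /etc/my.cnf -- back_log sized for connection bursts (500 is illustrative)
[mysqld]
back_log = 500
```

Since back_log is not dynamic, the setting only takes effect after restarting mysqld.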

back_log
 
Command-Line Format: --back_log=#
Option-File Format: back_log
Option Sets Variable: Yes, back_log
Variable Name: back_log
Variable Scope: Global
Dynamic Variable: No
Permitted Values: numeric; default 50; range 1 .. 65535

The number of outstanding connection requests MySQL can have. This comes into play when the main MySQL thread gets very many connection requests in a very short time. It then takes some time (although very little) for the main thread to check the connection and start a new thread. The back_log value indicates how many requests can be stacked during this short time before MySQL momentarily stops answering new requests. You need to increase this only if you expect a large number of connections in a short period of time.
In other words, this value is the size of the listen queue for incoming TCP/IP connections. Your operating system has its own limit on the size of this queue. The manual page for the Unix listen() system call should have more details. Check your OS documentation for the maximum value for this variable. back_log cannot be set higher than your operating system limit.
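The OS-level listen queue the manual refers to is the backlog argument to the listen() system call. A minimal Python sketch of the same concept (the backlog value 128 is illustrative):

```python
import socket

# Create a TCP listening socket; the backlog argument caps how many
# established connections the kernel will queue before the application
# accept()s them -- the same queue MySQL's back_log sizes.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
srv.listen(128)              # analogous to raising back_log in MySQL
host, port = srv.getsockname()
srv.close()
```

Just as the manual notes, the kernel silently caps this backlog at its own limit (net.core.somaxconn on Linux), so raising back_log alone may not be enough.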

Friday, May 20, 2011

In response to: What is the query plan for a running query?

Mark at Facebook recently posted a great article about how to peek at the query plan for a running query:
What is the query plan for a running query?
There are a number of cases where you can't use EXPLAIN to find out which plan a running query is using.


One reason Mark pointed out in his article: when a query uses a temporary table in a join, you have neither the temporary table's definition nor the actual data that was populated into it. Even if you could get the CREATE TABLE statement for the temp table, the data you populate into it may not match what the running query saw, depending on how dynamic your data is.


Another reason I can think of is similar to the bind variable peeking issue in Oracle. When your data is skewed, different bind variable values can result in different query plans, as demonstrated below.
In that case, EXPLAIN can't really help you unless you know which bind variable value was used at the time.


mysql> select table_type, count(*) from test_plan group by table_type;
+-------------+----------+
| table_type  | count(*) |
+-------------+----------+
| BASE TABLE  |     4910 | 
| SYSTEM VIEW |       95 | 
+-------------+----------+
2 rows in set (0.01 sec)


mysql> explain select * from test_plan where table_type='BASE TABLE' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test_plan
         type: ALL
possible_keys: table_type_inx
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5017
        Extra: Using where
1 row in set (0.00 sec)


mysql> explain select * from test_plan where table_type='SYSTEM VIEW' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test_plan
         type: ref
possible_keys: table_type_inx
          key: table_type_inx
      key_len: 194
          ref: const
         rows: 95
        Extra: Using where
1 row in set (0.00 sec)


Sunday, February 20, 2011

Very slow query with high number of Rows_examined

We hit an interesting problem on one production MySQL server today.
A developer reported that a delete statement was taking a very long time to run against small tables.
The three tables involved in this query, category_group, tmp_category_group, and category_group_info, have 7400, 7600, and 25 rows respectively.
While checking the slow query log, I was shocked to find the query had examined 3.8 million rows. Rewriting the delete as a select statement yields the same result.


# Query_time: 9  Lock_time: 0  Rows_sent: 0  Rows_examined: 3809728
delete from category_group where group_id = (select group_id from category_group_info x where x.group_name = 'Storylines') and category_id in ( select category_id from tmp_category_group cg where group_id = (select group_id from category_group_info x where x.group_name = 'KidsCharacter'));


# Query_time: 8  Lock_time: 0  Rows_sent: 0  Rows_examined: 3810182
select * from category_group where group_id = (select group_id from category_group_info x
 where x.group_name = 'Storylines') and category_id in
 ( select category_id from tmp_category_group cg where group_id =
 (select group_id from category_group_info x where x.group_name = 'KidsCharacter'));


The explain plan doesn't provide much help.


mysql> explain select * from category_group where group_id = (select group_id from category_group_info x where x.group_name = 'Storylines') and category_id in ( select category_id from tmp_category_group cg where group_id = (select group_id from category_group_info x where x.group_name = 'KidsCharacter'));
+----+--------------------+----------------+------+---------------+----------+---------+-------+------+--------------------------+
| id | select_type        | table          | type | possible_keys | key      | key_len | ref   | rows | Extra                    |
+----+--------------------+----------------+------+---------------+----------+---------+-------+------+--------------------------+
|  1 | PRIMARY            | category_group | ref  | group_id      | group_id | 4       | const |  501 | Using where; Using index | 
|  3 | DEPENDENT SUBQUERY | cg             | ALL  | NULL          | NULL     | NULL    | NULL  | 7396 | Using where              | 
|  4 | SUBQUERY           | x              | ALL  | NULL          | NULL     | NULL    | NULL  |   25 | Using where              | 
|  2 | SUBQUERY           | x              | ALL  | NULL          | NULL     | NULL    | NULL  |   25 | Using where              | 
+----+--------------------+----------------+------+---------------+----------+---------+-------+------+--------------------------+
4 rows in set (0.01 sec)

After extensive testing, I figured out that the query runs fast if we remove the where clause from this subquery:

select category_id from tmp_category_group cg where group_id = ...

That means the query runs much faster against the whole tmp_category_group table than with a predicate. That doesn't make sense, right?
That led me to add an index on group_id of tmp_category_group, and the query now runs lightning fast.
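For reference, the fix was along these lines (the index name is illustrative):

```sql
-- Index the column used in the dependent subquery's predicate,
-- so each outer-row probe becomes an index lookup instead of a full scan.
CREATE INDEX tmp_cg_group_id_inx ON tmp_category_group (group_id);
```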


Could this be some sort of bug?

mysql> select * from category_group where group_id = (select distinct group_id from category_group_info x where x.group_name = 'Storylines') and category_id in ( select distinct category_id from tmp_category_group cg where group_id = (select distinct group_id from category_group_info x where x.group_name = 'KidsCharacter'));
Empty set (0.31 sec)

mysql> delete from category_group
-> where group_id = (select group_id from category_group_info x where x.group_name = 'Storylines')
-> and category_id in (
-> select category_id
-> from tmp_category_group cg
-> where group_id = (select group_id from category_group_info x where x.group_name = 'KidsCharacter'));
Query OK, 0 rows affected (0.36 sec) 

Thursday, February 10, 2011

How Seconds_behind_master is computed

Found a very good explanation of how Seconds_behind_master is computed in this bug report.

http://bugs.mysql.com/bug.php?id=2826


Here's how Seconds_behind_master is computed.

* When the slave connects to the master (that is, when the slave I/O thread connects to the master), it runs a = SELECT UNIX_TIMESTAMP() on the master and b = SELECT UNIX_TIMESTAMP() on the slave, and computes c = b - a.

* Every statement executed on the master is stored in the master's binary log together with the timestamp of when it was executed.

* So when the slave SQL thread reads a statement from the binary log, it executes it and then computes:

seconds_behind_master = (SELECT UNIX_TIMESTAMP() on slave) - (timestamp of the master's binary log event) - c
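The steps above are simple arithmetic. A sketch in Python with purely illustrative timestamps, showing how the clock-offset correction c keeps a skewed slave clock from distorting the lag:

```python
# Clock offset measured once, when the slave I/O thread connects:
master_now = 1297300000      # SELECT UNIX_TIMESTAMP() on master (illustrative)
slave_now = 1297300007       # SELECT UNIX_TIMESTAMP() on slave  (illustrative)
c = slave_now - master_now   # slave clock runs 7 seconds ahead of the master

# Later, the slave SQL thread executes an event from the binary log:
event_ts = 1297300100        # timestamp written on the master for this event
slave_now_at_exec = 1297300112  # SELECT UNIX_TIMESTAMP() on slave at exec time

seconds_behind_master = slave_now_at_exec - event_ts - c
print(seconds_behind_master)  # 112 - 100 - 7 = 5 seconds behind
```

Without subtracting c, the 7-second clock skew would be reported as replication lag.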