Saturday, September 3, 2011

Large number of MySQL connections timing out or stuck in TIME_WAIT

Not sure how frequently people run into this issue. Sometimes, when a flood of connection requests hits a MySQL server in a very short time, you may get connection timeout errors or see a lot of connections in TIME_WAIT state. 
This happens because MySQL can't accept the connections fast enough, and its default listen queue size is rather small at 50.
You can lift this limit by setting the MySQL variable back_log.
However, this requires a MySQL server bounce, since back_log is not a dynamic variable.

back_log
 
Command-Line Format    --back_log=#
Option-File Format     back_log
Option Sets Variable   Yes, back_log
Variable Name          back_log
Variable Scope         Global
Dynamic Variable       No
Permitted Values       Type: numeric
                       Default: 50
                       Range: 1 .. 65535

The number of outstanding connection requests MySQL can have. This comes into play when the main MySQL thread gets very many connection requests in a very short time. It then takes some time (although very little) for the main thread to check the connection and start a new thread. The back_log value indicates how many requests can be stacked during this short time before MySQL momentarily stops answering new requests. You need to increase this only if you expect a large number of connections in a short period of time.
In other words, this value is the size of the listen queue for incoming TCP/IP connections. Your operating system has its own limit on the size of this queue. The manual page for the Unix listen() system call should have more details. Check your OS documentation for the maximum value for this variable. back_log cannot be set higher than your operating system limit.
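
For reference, here's a minimal sketch of raising it, assuming a typical my.cnf layout (512 is just an illustrative value; on Linux the OS listen backlog, e.g. net.core.somaxconn, also caps the effective queue size):

# /etc/my.cnf -- back_log is not dynamic, so a mysqld bounce is required
[mysqld]
back_log = 512

-- after the bounce, confirm the new value took effect
mysql> SHOW GLOBAL VARIABLES LIKE 'back_log';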

Friday, May 20, 2011

In response to: What is the query plan for a running query?

Mark at Facebook recently posted a great article about how to peek at the query plan of a running query:
What is the query plan for a running query?
There are a number of cases where you can't just use EXPLAIN to find out which plan the running query is using.


One reason Mark pointed out in his article is when the query joins against a temporary table: you don't have the temporary table's definition or the actual data that was populated into it. Even if you could get the CREATE TABLE statement for the temp table, the data you load into it may not match what the running query saw, depending on how dynamic your data is.


Another reason I can think of is similar to the bind variable peeking issue in Oracle. When your data is skewed, different bind variable values can produce different query plans, as demonstrated below. 
In that case, EXPLAIN can't really help unless you know which bind variable value was used at the time.


mysql> select table_type, count(*) from test_plan group by table_type;
+-------------+----------+
| table_type  | count(*) |
+-------------+----------+
| BASE TABLE  |     4910 | 
| SYSTEM VIEW |       95 | 
+-------------+----------+
2 rows in set (0.01 sec)


mysql> explain select * from test_plan where table_type='BASE TABLE' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test_plan
         type: ALL
possible_keys: table_type_inx
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5017
        Extra: Using where
1 row in set (0.00 sec)


mysql> explain select * from test_plan where table_type='SYSTEM VIEW' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test_plan
         type: ref
possible_keys: table_type_inx
          key: table_type_inx
      key_len: 194
          ref: const
         rows: 95
        Extra: Using where
1 row in set (0.00 sec)
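

For context, the demo table above looks like it was built from the data dictionary. A rough sketch of how you could reproduce a similarly skewed table (the exact column list and the extra insert to exaggerate the skew are assumptions; only the names test_plan and table_type_inx come from the output above):

mysql> CREATE TABLE test_plan AS
    ->   SELECT table_name, table_type FROM information_schema.tables;
mysql> -- duplicate the common value a few times to create the skew
mysql> INSERT INTO test_plan SELECT * FROM test_plan WHERE table_type = 'BASE TABLE';
mysql> ALTER TABLE test_plan ADD INDEX table_type_inx (table_type);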


Sunday, February 20, 2011

Very slow query with high number of Rows_examined

We hit an interesting problem on one of our production MySQL servers today.
A developer reported that a delete statement was taking a very long time to run against some small tables. 
The three tables involved in this query, category_group, tmp_category_group and category_group_info, have about 7,400, 7,600 and 25 rows respectively.
While checking the slow query log, I was shocked to find the query had examined 3.8 million rows. Rewriting the delete as an equivalent select yields the same result.


# Query_time: 9  Lock_time: 0  Rows_sent: 0  Rows_examined: 3809728
delete from category_group where group_id = (select group_id from category_group_info x where x.group_name = 'Storylines') and category_id in ( select category_id from tmp_category_group cg where group_id = (select group_id from category_group_info x where x.group_name = 'KidsCharacter'));


# Query_time: 8  Lock_time: 0  Rows_sent: 0  Rows_examined: 3810182
select * from category_group where group_id = (select group_id from category_group_info x
 where x.group_name = 'Storylines') and category_id in
 ( select category_id from tmp_category_group cg where group_id =
 (select group_id from category_group_info x where x.group_name = 'KidsCharacter'));


The explain plan doesn't provide much help.


mysql> explain select * from category_group where group_id = (select group_id from category_group_info x where x.group_name = 'Storylines') and category_id in ( select category_id from tmp_category_group cg where group_id = (select group_id from category_group_info x where x.group_name = 'KidsCharacter'));
+----+--------------------+----------------+------+---------------+----------+---------+-------+------+--------------------------+
| id | select_type        | table          | type | possible_keys | key      | key_len | ref   | rows | Extra                    |
+----+--------------------+----------------+------+---------------+----------+---------+-------+------+--------------------------+
|  1 | PRIMARY            | category_group | ref  | group_id      | group_id | 4       | const |  501 | Using where; Using index | 
|  3 | DEPENDENT SUBQUERY | cg             | ALL  | NULL          | NULL     | NULL    | NULL  | 7396 | Using where              | 
|  4 | SUBQUERY           | x              | ALL  | NULL          | NULL     | NULL    | NULL  |   25 | Using where              | 
|  2 | SUBQUERY           | x              | ALL  | NULL          | NULL     | NULL    | NULL  |   25 | Using where              | 
+----+--------------------+----------------+------+---------------+----------+---------+-------+------+--------------------------+
4 rows in set (0.01 sec)

After extensive testing, I figured out that the query runs fast if we take the WHERE clause off this subquery:

select category_id from tmp_category_group cg where group_id = ...

That means the query runs much faster against the whole tmp_category_group table than with the predicate. That doesn't make sense, right? Looking at the plan again, the IN subquery runs as a DEPENDENT SUBQUERY, so the full scan of tmp_category_group (7,396 rows) is repeated for each of the 501 candidate rows from category_group, which works out to roughly the 3.8 million rows examined.
That ended up leading me to add an index on group_id of tmp_category_group, and the query runs lightning fast now. 
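
The fix itself was a one-liner along these lines (the index name is arbitrary):

mysql> ALTER TABLE tmp_category_group ADD INDEX group_id_inx (group_id);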


Could this be some sort of Bug?

mysql> select * from category_group where group_id = (select distinct group_id from category_group_info x where x.group_name = 'Storylines') and category_id in ( select distinct category_id from tmp_category_group cg where group_id = (select distinct group_id from category_group_info x where x.group_name = 'KidsCharacter'));
Empty set (0.31 sec)

mysql> delete from category_group
-> where group_id = (select group_id from category_group_info x where x.group_name = 'Storylines')
-> and category_id in (
-> select category_id
-> from tmp_category_group cg
-> where group_id = (select group_id from category_group_info x where x.group_name = 'KidsCharacter'));
Query OK, 0 rows affected (0.36 sec) 

Thursday, February 10, 2011

How Seconds_behind_master is computed

Found a very good explanation of how Seconds_behind_master is computed in this bug report.

http://bugs.mysql.com/bug.php?id=2826


Here's how Seconds_behind_master is computed.

* When the slave connects to the master (i.e., when the slave I/O thread connects), it does a = SELECT UNIX_TIMESTAMP() on the master and b = SELECT UNIX_TIMESTAMP() on the slave, and computes c = b - a.

* Every statement executed on the master is stored in the master's binary log together with the timestamp when it was executed.

* So when the slave SQL thread reads a statement from the binary log, it executes it and then computes:

  Seconds_Behind_Master = (SELECT UNIX_TIMESTAMP() on the slave)
                        - (timestamp of the master's binary log event)
                        - c
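
A quick worked example with made-up numbers: say the slave's clock was 10 seconds ahead of the master's when the I/O thread connected, so c = 10. The SQL thread later executes an event stamped 1297350000 while UNIX_TIMESTAMP() on the slave returns 1297350025:

  Seconds_Behind_Master = 1297350025 - 1297350000 - 10 = 15

So the slave reports 15 seconds behind, with the clock skew between the two hosts cancelled out by c.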

Sunday, September 19, 2010

2010 Oracle Open World - MYSQL Sunday

Just came back from the MySQL Sunday sessions at 2010 Oracle Open World. Sharing some thoughts while memory is still fresh: the two Facebook sessions were pretty good. Harrison shared some insight into how Facebook runs MySQL replication from an operations perspective, and Mark's session was similar to the one he did at the MySQL Conference this April.

Global Transaction IDs
  We recently experienced some network issues between our data center and our cloud provider. Finding the right binlog position for the slaves in the cloud after a network issue can sometimes be tricky. In many cases the relay log is totally toast and we have to CHANGE MASTER back to a previous position based on Exec_Master_Log_Pos. The problem is that we can't guarantee the integrity of the slave data, because a corrupted statement could have been applied. In most cases the slave SQL thread stops because it reads an invalid statement from the corrupted relay log, but who's to say no corrupted yet 'valid' statement got applied?
  Even so, we are still considered lucky in this case, as the master is operational and all binlogs are in good shape. Imagine if the master crashes and we need to promote one slave to be the new master and sync the other slaves to it. Theoretically we need to find the slave closest to the master, promote it, and then sync the rest of the slaves. That is easier said than done in a real-world crash, because MySQL doesn't provide an easy mapping between the master's binlog position and a slave's relay log position. On top of that, each slave maintains its own relay log, which means relay log file numbers and positions can't be shared among slaves. This makes the task of syncing all slaves with a new master particularly difficult, if not impossible. What the Facebook and Google MySQL teams use is a global transaction ID (similar to the SCN in Oracle); they map this global transaction ID to relay log and binlog positions, which makes slave/master synchronization a lot easier. Unfortunately, the global transaction ID only exists in the Google and Facebook patches, not in mainline MySQL.

DRBD Master 
   The second thought follows from the first: switching between master and slaves is not an easy task, especially with a multi-slave setup, and it's particularly difficult when the master crashes instead of shutting down gracefully. Even with global transaction IDs it still involves manual work and prolonged downtime. So some users set up a protected master using DRBD. http://www.drbd.org/  DRBD gives you a pair of database hosts in a cluster with block-level mirroring between them; in case of a crash, the other node can quickly take over with minimal data loss.
With the master protected this way, we can build slaves against the cluster, which greatly reduces the need to re-sync slaves since the master is pretty much crash free.
Of course this is not a solution for everything. By eliminating the need to switch between master and slave, it also gives up some of the advantages of master/slave rotation. For example, we can't run an expensive DDL like ALTER TABLE on a slave and then switch it to master and run the same DDL on the former master, because in this setup a slave is not interchangeable with the master.

Seconds behind master
   I think most of you have already noticed that this is not a reliable indicator of whether a slave is running behind the master. The problem is that it is not a heartbeat: it's only updated when the slave receives replication events. So the slave will happily report 0 seconds behind master even when it hasn't received any event for a long time. Sometimes that's legitimate because there's really no activity on the master, but a lot of the time it's because of a network issue between master and slave and the connection simply got dropped. The workaround is to update a heartbeat table on the master and check the difference on the slave yourself. MySQL 5.5 has a built-in heartbeat mechanism, which is definitely a big plus for MySQL replication.
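
A minimal sketch of such a heartbeat, assuming something on the master (a cron job or scheduled event) refreshes the row every few seconds; the table and column names are made up for illustration:

-- on the master, refreshed every few seconds
mysql> CREATE TABLE heartbeat (id INT PRIMARY KEY, ts DATETIME);
mysql> REPLACE INTO heartbeat VALUES (1, NOW());

-- on the slave, the staleness of the replicated row approximates the real lag
mysql> SELECT TIMESTAMPDIFF(SECOND, ts, NOW()) AS lag_seconds FROM heartbeat WHERE id = 1;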

Tuesday, September 7, 2010

log-bin.~rec~ in BINLOG directory

Has anyone noticed a log-bin.~rec~ file under the MySQL binlog directory?
I didn't notice this file until recently, when our rsync script that syncs
binlogs to a backup server failed with this error.


building file list ...
rsync: link_stat "/DB/MYPROD/arch/MPROD_arch.~rec~" failed: No such file or directory (2)
rsync error: some files could not be transferred (code 23) at main.c(892) [sender=2.6.8]


A little research revealed that this is actually a by-product of a MySQL bug fix.
It's a temporary register file MySQL uses to record binlog files before adding or purging them.
It only exists briefly in the binlog directory; our rsync script was 'lucky' enough to catch it in the file list and then failed because it was gone before it could be transferred.
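
If you run into the same thing, one way to keep the backup job from tripping over the transient file is simply to exclude it (a sketch; the paths are placeholders):

# skip the short-lived register file so rsync doesn't fail when it disappears mid-run
rsync -av --exclude='*.~rec~' /DB/MYPROD/arch/ backuphost:/backup/MYPROD/arch/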

http://bugs.mysql.com/bug.php?id=45292

To fix these issues, we record the files to be purged or created before really removing or adding them. So if a failure happens such records can be used to automatically remove dangling files.
The new steps might be outlined as follows:
(purge routine - sql/log.cc - MYSQL_BIN_LOG::purge_logs)
1 - register the files to be removed in the log-bin.~rec~ placed in the data directory.
2 - update the log-bin.index.
3 - flush the log-bin.index.
4 - delete the log-bin.~rec~.
(create routine - sql/log.cc - MYSQL_BIN_LOG::open)
1 - register the file to be created in the log-bin.~rec~ placed in the data directory.
2 - open the new log-bin.
3 - update the log-bin.index.
4 - delete the log-bin.~rec~.
(recovery routine - sql/log.cc - MYSQL_BIN_LOG::open_index_file)
1 - open the log-bin.index.
2 - open the log-bin.~rec~.
3 - for each file in log-bin.~rec~.
3.1 Check if the file is in the log-bin.index and if so ignore it.
3.2 Otherwise, delete it.
The third issue can be described as follows.
The purge operation allowed removing a file that was still in use, thus leading to loss of data and possible inconsistencies between the master and slave. Roughly, the routine only took the dump threads into account, so if a slave was not connected the file might be deleted even though it was still in use.

Thursday, February 11, 2010

'Unknown command' from loading mysqldump file

If you happen to get errors like these while loading a mysqldump file,

mysql < mysqldump_021010.sql
ERROR at line 5122: Unknown command 'ufffd'

You need to increase max_allowed_packet and try again.
The error itself is not very informative and is actually somewhat misleading.
It happens because a big INSERT statement in the dump file exceeded the packet limit, and the mysql client interpreted the row data after the cutoff as a command.

In my case, running mysql --max_allowed_packet=64M fixed the issue. The default is 16M in mysql 5.0 and 5.1

Actually, according to the MySQL docs, it is safe to increase the value of this variable because the extra memory is allocated only when needed. So 1G, here we go.
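
For reference, a sketch of bumping it on both sides (values are illustrative):

# client side, just for this load
mysql --max_allowed_packet=256M db_name < mysqldump_021010.sql

-- server side: dynamic, applies to new connections; add it to my.cnf to make it stick
mysql> SET GLOBAL max_allowed_packet = 256*1024*1024;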