I didn't notice this file until recently, when our rsync script that syncs
binlogs to the backup server failed with this error:
building file list ... rsync: link_stat "/DB/MYPROD/arch/MPROD_arch.~rec~" failed: No such file or directory (2)
rsync error: some files could not be transferred (code 23) at main.c(892) [sender=2.6.8]
A little research revealed that this is actually a by-product of a MySQL bug fix.
It's a temporary register file that MySQL uses to record binlog files before adding or purging them.
It exists only briefly in the binlog directory. Our rsync script was lucky enough to catch it, then failed to sync it because the file was gone before rsync could read it.
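One way to keep the sync script from tripping over this file is to skip it by name. The following is only a sketch under stated assumptions: the helper names `files_to_sync` and `build_rsync_command` and the destination `backup-host:/backup/arch/` are made up for illustration, and the script is assumed to pass file names to rsync explicitly. The `-a` and `--exclude=PATTERN` options are standard rsync options. It does not close every race (a binlog purged between listing and transfer could still produce the same error), it only avoids the register file.

```python
import os
import tempfile

REGISTER_SUFFIX = ".~rec~"  # suffix of the transient register file seen in the error


def files_to_sync(binlog_dir):
    """List regular files in binlog_dir, skipping the transient register file."""
    names = []
    for name in sorted(os.listdir(binlog_dir)):
        if name.endswith(REGISTER_SUFFIX):
            continue  # short-lived file, not worth backing up
        if os.path.isfile(os.path.join(binlog_dir, name)):
            names.append(name)
    return names


def build_rsync_command(binlog_dir, dest, names):
    """Build an rsync argv list (hypothetical helper; it does not run rsync).

    --exclude is kept as a second guard in case a directory is passed
    as a source instead of individual files.
    """
    sources = [os.path.join(binlog_dir, n) for n in names]
    return ["rsync", "-a", "--exclude=*" + REGISTER_SUFFIX, *sources, dest]


# Demo against a throwaway directory instead of a real binlog directory.
with tempfile.TemporaryDirectory() as d:
    for name in ("MPROD_arch.000001", "MPROD_arch.index", "MPROD_arch.~rec~"):
        open(os.path.join(d, name), "w").close()
    demo_names = files_to_sync(d)
    demo_cmd = build_rsync_command(d, "backup-host:/backup/arch/", demo_names)

print(demo_names)  # ['MPROD_arch.000001', 'MPROD_arch.index']
```

The resulting list can be handed to `subprocess.run` in the real script. Filtering in the script rather than relying on `--exclude` alone is a deliberate choice here, since the error above is a `link_stat` failure on a path rsync was asked to send.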
http://bugs.mysql.com/bug.php?id=45292
To fix these issues, we record the files to be purged or created before really removing or adding them. So if a failure happens such records can be used to automatically remove dangling files.
The new steps might be outlined as follows:
(purge routine - sql/log.cc - MYSQL_BIN_LOG::purge_logs)
1 - register the files to be removed in the log-bin.~rec~ placed in the data directory.
2 - update the log-bin.index.
3 - flush the log-bin.index.
4 - delete the log-bin.~rec~.
(create routine - sql/log.cc - MYSQL_BIN_LOG::open)
1 - register the file to be created in the log-bin.~rec~ placed in the data directory.
2 - open the new log-bin.
3 - update the log-bin.index.
4 - delete the log-bin.~rec~.
(recovery routine - sql/log.cc - MYSQL_BIN_LOG::open_index_file)
1 - open the log-bin.index.
2 - open the log-bin.~rec~.
3 - for each file in log-bin.~rec~.
3.1 Check if the file is in the log-bin.index and if so ignore it.
3.2 Otherwise, delete it.
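The recovery steps above can be sketched in a few lines of Python. This is a toy model, not MySQL's implementation in sql/log.cc: it assumes both the index and the register are plain text files with one bare file name per line, and the function names `read_names` and `recover` are invented for the example.

```python
import os
import tempfile


def read_names(path):
    """Return the non-empty lines of a text file, or [] if it does not exist."""
    if not os.path.exists(path):
        return []
    with open(path) as fh:
        return [line.strip() for line in fh if line.strip()]


def recover(data_dir, index_name="log-bin.index", register_name="log-bin.~rec~"):
    """Delete files named in the register that the index does not list."""
    indexed = set(read_names(os.path.join(data_dir, index_name)))   # step 1
    removed = []
    for name in read_names(os.path.join(data_dir, register_name)):  # steps 2 and 3
        if name in indexed:
            continue  # step 3.1: still listed in the index, leave it alone
        path = os.path.join(data_dir, name)
        if os.path.exists(path):
            os.remove(path)  # step 3.2: dangling file, delete it
            removed.append(name)
    return removed


# Demo: simulate a crash during "create", after log-bin.000003 was opened
# but before the index was updated.
with tempfile.TemporaryDirectory() as d:
    for name in ("log-bin.000001", "log-bin.000002", "log-bin.000003"):
        open(os.path.join(d, name), "w").close()
    with open(os.path.join(d, "log-bin.index"), "w") as fh:
        fh.write("log-bin.000001\nlog-bin.000002\n")
    with open(os.path.join(d, "log-bin.~rec~"), "w") as fh:
        fh.write("log-bin.000003\n")
    removed = recover(d)
    remaining = sorted(os.listdir(d))

print(removed)    # ['log-bin.000003']
print(remaining)  # ['log-bin.000001', 'log-bin.000002', 'log-bin.index', 'log-bin.~rec~']
```

The same check covers an interrupted purge: if the crash happened before the index was updated, the registered files are still listed and are left alone; if it happened after, they are no longer listed and get deleted.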
The third issue can be described as follows.
The purge operation allowed a file in use to be removed, leading to loss of data and possible inconsistencies between the master and slave. Roughly, the routine only took the dump threads into account, so if a slave was not connected the file might be deleted even though it was in use.