Monday, September 29, 2008

Binary Logs and saving data

Back when I started working with computers, you actually carved each character with a machine into oddly shaped pieces of paper. My FORTRAN class required up to two dozen of these carefully manicured pieces of paper for each program. The folks in the RPG class had litterally stacks of data one or two feet deep and my mind boggled at the amount of data these poor folks had to carry around with them. My best guess now is that they may have had up to two thousand records of up to 80 characters. My lowly FORTAN class saw the RPG programming class as demi-gods as they trudged along with trays of cards under their arms.

And then one day it rained. Data cards were sensitive to humidity especially the type of humidity you find in the bottom of a rain filled gutter, 12-edge down. My fellow students and I commiserated with one of those formerly envied RPG students as we viewed what was now a paper mache lump. The object of our empathy had tripped and dropped their card tray while trying to extend an umbrella on an rare rainy day in El Cajon, California.

The data recovery process was fairly simple. Fresh IBM cards were purchased at the campus book store and then feed into a bin the card duplicator. Then another student's card deck was placed in another bin. The card duplicator was then turned on and it made a loud staccatto blast as it attempted to make exact copies of every card. But the card duplicator was an old, electromechanical device that made what could be call a 'best effort'. Sometimes the card reading device would not find all the holes in the card. But to be fair, it would often inject unneeded holes in other cards in a frustrating karmic balance.

So, a few feet of cards were hand checked. It took the better part of a day but the RPG student did recover the data. The student was very lucky in that she had access to a replica of the original data.

So how does a MySQL DBA recover from the modern equivilent, say the programmer who forgets to add the WHERE ID_number = '999' to the DELETE FROM huge_dataset statement? Use the binary log!

MySQL's binary log records 'events' that could change a database. So SELECTs are not recorded but UPDATEs, DELETEs, and other statements are. Enabling the binary log will allow the server to record these changes into a file. You can read the file mysqlbinlog program.

So the programmer forgets the last half of the above statement and you have to rebuild huge_dataset.

Step 1. Stop the server. Make sure users can not connect while work to restore the table by the process you and company favor.

Step 2. Restore the needed table(s) from you last backup (you do backup your data, right?).

Step 3. Find the binary logs files that cover the time period from the last backup to shutdown. Pipe the output of mysqlbinlog to a text file. Append log output to this file. This text file contains the SQL statements that have potential to change the data.

Step 4. Find the offending DELETE FROM huge_dataset statement and remove it from the text file created in Step 3.

Step 5. Use the text file (\. text-file) as input.

And you are back in business.

Of course my five steps are a simplification. But the general idea is there. You may also want to check out the MySQL manual on 'safe mode' if you want to avoid 'DELETE *' issues like the one above.

No comments: