Monday, February 29, 2016

Backup Restore Chain revisited.

(This is an update to Backup Restore Chain article I posted back in September.)

Again, much thanks to SQLSoldier for the initial code.

Recently I got promoted to a new team that has been without a BDA for a while, and as such started poking around, first of all making sure backups are running, and that we have a reasonable expectation of being able to recover in case of a failure or problem with our databases.

I started some of these by grabbing some random database to test the restore chain on, running the script from the previous article.  On one of the databases this query just went into a tail spin, never returning, on a hunch I changed the MaxRecursion option from 0 to 100, knowing that it should never hit more than about 30 at the time due to the data.  Just selecting out of the Logs CTE, it became very apparent that I got into a cycle where it kept repeating over and over until it hit the max recursion allowed.

Looking at the backupset table, I saw this (condensed down for clarity and size since this table is huge):

SELECT  backup_set_idbackup_Start_dateB.First_LsnB.Last_Lsn,  B.TYPE
FROM 
msdb.dbo.backupset B
WHERE database_name 'AndersTest'
  




Backup_set_id 1713355 and 1713378 has First_Lsn = Last_Lsn, and they are the same between the two records.  There was more records where this was true, just happens to be early in the result set so was easy to spot.

Question then arose in my mind if this caused a problem with the CTE, which it is obvious that it did, but to fix it, I would have to remove those records from being considered in the restore chain (or switch to a loop from a recursive CTE).

I have to admit, I had always been of the impression that ALL backups in a chain had to be restored, but then started to think about the error message when you try to skip one, it says to the effect "LSN to large, try an earlier backup."  This makes sense, but would the reverse be true, that if the LSN did not change, you would not need to restore it?

Testing this I restored backup_set_ids 1713302, 1713332 and 1713401 (i.e. the full backup, and the 5:15 am and 6:00 am transaction log backups, skipping the 5:30 and 5:45 backups).  After the last transaction log backup was restored, I recovered the database and it came up just fine.

This tested, it was time to figure out exactly how I could make the Backup Restore Chain script work, after a few, uhm, misfires, it turned out to be very simple, simply ignore the records where the two LSNs where identical, added this to both queries inside the Logs CTE:

AND B.First_lsn <> B.Last_Lsn

This simple change did the trick, now when I run the query, I got this as my result:




I have tested the following script in as many scenarios as I could think of, but if you find something wrong, please let me know, either in comments here, or to my twitter account.


In rewriting this, I also decided to make it into a stored procedure that could take the database name as a required parameter, and a date as an optional.  With the date sent in, it will use the newest FULL backup prior on or before that date, this could very easily be changed if you need to do it by date/time if you run multiple full backups per day.  I decided to add this in case I ever need to restore prior to the latest full backup.


Code for procedure below, you can also find it saved on Github

USE MASTERGO
CREATE PROCEDURE sp_RestoreChainByDB
   
@DBName SYSNAME,
   
@MaxFullBackupDate date NULL
AS
BEGIN
/****************
Created by @SQLSoldier
http://www.sqlsoldier.com/wp/sqlserver/day3of31daysofdisasterrecoverydeterminingfilestorestoredatabase

2015-09-24
Modified by Anders Pedersen
Twitter: @arrowdrive
Blog: Blog.AndersOnSQL.com
Made to work with SQL 2008R2.


2016-02-26
Modified by Anders Pedersen
Fix issue where First_LSN and Last_LSN values are the same on a record for the trasaction log backups.
These transaction log backups can be ignored in the restore chain.
Also made into a procedure, with the database name for a required parameter, and max date for the full backup to start with as a parameter
*****************/
   
DECLARE @DBBackupLSN numeric(250)

   
DECLARE @Baks TABLE (
       
BakID INT IDENTITY(11) NOT NULL PRIMARY KEY,
       
backup_set_id INT NOT NULL,
       
media_set_id INT NOT NULL,
       
first_family_number tinyint NOT NULL,
       
last_family_number tinyint NOT NULL,
       
first_lsn numeric(250) NULL,
       
last_lsn numeric(250) NULL,
       
database_backup_lsn numeric(250) NULL,
       
backup_finish_date DATETIME NULL,
       
TYPE CHAR(1) NULL,
       
family_sequence_number tinyint NOT NULL,
       
physical_device_name NVARCHAR(260) NOT NULL,
       
device_type tinyint NULL,
       
checkpoint_LSN numeric (25,0)-- Anders 2015-09-24
       
)
   
SET NOCOUNT ON;



   
-- Get the most recent full backup with all backup files
   
INSERT INTO @Baks (backup_set_id,
       
media_set_id,
       
first_family_number,
       
last_family_number,
       
first_lsn,
       
last_lsn,
       
database_backup_lsn,
       
backup_finish_date,
       
TYPE,
       
family_sequence_number,
       
physical_device_name,
       
device_type
       
,checkpoint_LSN)-- Anders
   
SELECT TOP(1WITH Ties B.backup_set_id,
       
B.media_set_id,
       
B.first_family_number,
       
B.last_family_number,
       
B.first_lsn,
       
B.last_lsn,
       
B.database_backup_lsn,
       
B.backup_finish_date,
       
B.TYPE,
       
BF.family_sequence_number,
       
BF.physical_device_name,
       
BF.device_type
       
B.checkpoint_lsn -- Anders 2015-09-24
   
FROM msdb.dbo.backupset AS B
   
INNER JOIN msdb.dbo.backupmediafamily AS BF
       
ON BF.media_set_id B.media_set_id
           
AND BF.family_sequence_number BETWEEN B.first_family_number AND B.last_family_number
   
WHERE B.database_name @DBName
   
AND B.is_copy_only 0
   
AND B.TYPE = 'D'
   
AND BF.physical_device_name NOT IN ('Nul''Nul:')
   AND 
CONVERT(DATE,B.backup_start_date) <=(CASE
                                           
WHEN @MaxFullBackupDate IS NOT NULL THEN   @MaxFullBackupDate
                                           
ELSE CONVERT(DATE,B.backup_start_dateEND-- Anders 2016-02-26
      
   
ORDER BY backup_finish_date DESCbackup_set_id;


   
-- Get the lsn that the differential backups, if any, will be based on
   
SELECT @DBBackupLSN checkpoint_LSN -- Anders 2015-09-24
   
FROM @Baks;

   
-- Get the most recent differential backup based on that full backup
   
INSERT INTO @Baks (backup_set_id,
       
media_set_id,
       
first_family_number,
       
last_family_number,
       
first_lsn,
       
last_lsn,
       
database_backup_lsn,
       
backup_finish_date,
       
TYPE,
       
family_sequence_number,
       
physical_device_name,
       
device_type)
   
SELECT TOP(1WITH Ties B.backup_set_id,
       
B.media_set_id,
       
B.first_family_number,
       
B.last_family_number,
       
B.first_lsn,
       
B.last_lsn,
       
B.database_backup_lsn,
       
B.backup_finish_date,
       
B.TYPE,
       
BF.family_sequence_number,
       
BF.physical_device_name,
       
BF.device_type
   
FROM msdb.dbo.backupset AS B
   
INNER JOIN msdb.dbo.backupmediafamily AS BF
       
ON BF.media_set_id B.media_set_id
           
AND BF.family_sequence_number BETWEEN B.first_family_number AND B.last_family_number
   
WHERE B.database_name @DBName
   
AND B.is_copy_only 0
   
AND B.TYPE = 'I'
   
AND BF.physical_device_name NOT IN ('Nul''Nul:')
   AND 
B.database_backup_lsn @DBBackupLSN

   
ORDER BY backup_finish_date DESCbackup_set_id;


   
--select * from @Baks

   -- Get the last LSN included in the differential backup,
   -- if one was found, or of the full backup
   
SELECT TOP @DBBackkupLSN last_lsn
   
FROM @Baks
   
WHERE TYPE IN ('D''I')
   
ORDER BY BakID DESC;

   
-- Get first log backup, if any, for restore, where
   -- last_lsn of previous backup is >= first_lsn of the
   -- log backup and <= the last_lsn of the log backup
   
INSERT INTO @Baks (backup_set_id,
       
media_set_id,
       
first_family_number,
       
last_family_number,
       
first_lsn,
       
last_lsn,
       
database_backup_lsn,
       
backup_finish_date,
       
TYPE,
       
family_sequence_number,
       
physical_device_name,
       
device_type)
   
SELECT TOP(1WITH Ties B.backup_set_id,
       
B.media_set_id,
       
B.first_family_number,
       
B.last_family_number,
       
B.first_lsn,
       
B.last_lsn,
       
B.database_backup_lsn,
       
B.backup_finish_date,
       
B.TYPE,
       
BF.family_sequence_number,
       
BF.physical_device_name,
       
BF.device_type
   
FROM msdb.dbo.backupset B
   
INNER JOIN msdb.dbo.backupmediafamily AS BF
       
ON BF.media_set_id B.media_set_id
           
AND BF.family_sequence_number BETWEEN B.first_family_number AND B.last_family_number
   
WHERE B.database_name @DBName
   
AND B.is_copy_only 0
   
AND B.TYPE = 'L'
   
AND BF.physical_device_name NOT IN ('Nul''Nul:')
   AND 
@DBBackupLSN BETWEEN B.first_lsn AND B.last_lsn
   
ORDER BY backup_finish_datebackup_set_id;

   
-- Get last_lsn of the first log backup that will be restored
   
SET @DBBackupLSN NULL;
   
SELECT @DBBackupLSN MAX(last_lsn)
   
FROM @Baks
   
WHERE TYPE = 'L';


   
--select * from @Baks;

   -- Recursively get all log backups, in order, to be restored
   -- first_lsn of the log backup = last_lsn of the previous log backup
   
WITH Logs
   
AS (SELECT B.backup_set_id,
           
B.media_set_id,
           
B.first_family_number,
           
B.last_family_number,
           
B.first_lsn,
           
B.last_lsn,
           
B.database_backup_lsn,
           
B.backup_finish_date,
           
B.TYPE,
           
BF.family_sequence_number,
           
BF.physical_device_name,
           
BF.device_type,
           
AS LogLevel
       
FROM msdb.dbo.backupset B
       
INNER JOIN msdb.dbo.backupmediafamily AS BF
           
ON BF.media_set_id B.media_set_id
               
AND BF.family_sequence_number BETWEEN B.first_family_number AND B.last_family_number
       
WHERE B.database_name @DBName
       
AND B.is_copy_only 0
       
AND B.TYPE = 'L'
       
AND BF.physical_device_name NOT IN ('Nul''Nul:')
       AND 
B.first_lsn @DBBackupLSN
       
AND B.First_lsn <> B.Last_Lsn -- Anders 2016-02-26
       
UNION ALL
       
SELECT B.backup_set_id,
           
B.media_set_id,
           
B.first_family_number,
           
B.last_family_number,
           
B.first_lsn,
           
B.last_lsn,
           
B.database_backup_lsn,
           
B.backup_finish_date,
           
B.TYPE,
           
BF.family_sequence_number,
           
BF.physical_device_name,
           
BF.device_type,
           
L.LogLevel 1
       
FROM msdb.dbo.backupset B
       
INNER JOIN msdb.dbo.backupmediafamily AS BF
           
ON BF.media_set_id B.media_set_id
               
AND BF.family_sequence_number BETWEEN B.first_family_number AND B.last_family_number
       
INNER JOIN Logs L ON L.database_backup_lsn B.database_backup_lsn
       
WHERE B.database_name @DBName
       
AND B.is_copy_only 0
       
AND B.TYPE = 'L'
       
AND BF.physical_device_name NOT IN ('Nul''Nul:')
       AND 
B.first_lsn L.last_lsn
       
AND B.First_lsn <> B.Last_Lsn-- Anders 2016-02-26
   
INSERT INTO @Baks (backup_set_id,
       
media_set_id,
       
first_family_number,
       
last_family_number,
       
first_lsn,
       
last_lsn,
       
database_backup_lsn,
       
backup_finish_date,
       
TYPE,
       
family_sequence_number,
       
physical_device_name,
       
device_type)
   
SELECT backup_set_id,
       
media_set_id,
       
first_family_number,
       
last_family_number,
       
first_lsn,
       
last_lsn,
       
database_backup_lsn,
       
backup_finish_date,
       
TYPE,
       
family_sequence_number,
       
physical_device_name,
       
device_type
   
FROM Logs
   
OPTION(MaxRecursion 0);

   
-- Select out just the columns needed to script restore

   
SELECT RestoreOrder Row_Number() OVER(Partition BY family_sequence_number ORDER BY BakID),
       
RestoreType CASE WHEN TYPE IN ('D''I'THEN 'Database'
               
WHEN TYPE = 'L' THEN 'Log'
           
END,
       
DeviceType CASE WHEN device_type IN (2102THEN 'Disk'
               
WHEN device_type IN (5105THEN 'Tape'
           
END,
       
PhysicalFileName physical_device_name
   
FROM @Baks
   
ORDER BY BakID;


   
SET NOCOUNT OFF;
END -- proc

Tuesday, November 10, 2015

T-SQL Tuesday #72: Data modelling gone extremely wrong


T-SQL Tuesday #72 invite

T-SQL Tuesday #72 invite


Once upon a time....

Way back in 1997 or 1998, I was called late one Friday afternoon by my manager at the consulting company I worked.  One of the large local newspapers where in serious trouble.  Friday afternoon and they had just turned on their new whiz-bang VB and SQL based solution to manage their newspaper business, everything from selling and managing subscriptions, to determining how many papers to print every day, and to which delivery vehicles they needed to be delivered on.  Basically your typical manage the newspaper kind of applications, nothing fancy, no internet self service (oh those blissful days before E-Commerce!).

I quickly got dressed up for a client meeting and back in my car, luckily I had gone straight home from the other client that day instead of out to eat, or more enjoyable activities.  Drove the hour to the news papers IT office, was met by a frazzled DBA and CIO, both seemed in need of a long nap.

While getting to know each other we walked to the Project office, while getting an idea about their problems and installation.  The SQL Server was, as prescribed by the developer of the software: SQL 6.5 running on DEC Alpha servers.

As I stepped into the office, I saw what looked like a giant flow chart covering one wall.  "The database diagram?" I asked.  Yupp.  "Well, there is your problem."  Silence.

And more silence.

"What do you mean?"

I started poking around, and finally found the main subscriber table, it looked something like this:

CREATE TABLE subscriber (
   
SubscriberID INT IDENTITY(1,1),
   
FirstNameID INT NOT NULL,
   
MiddleNameID INT NOT NULL,
   
LastNameID INT NOT NULL,
   
StreetNumberID INT NOT NULL,
   
StreetNameID INT NOT NULL,
   
StreetNameTypeID INT NOT NULL,
   
AppartmentID INT NULL,
   
CityNameID INT NOT NULL,
   
StateCodeID INT NOT NULL,
   
ZipCodeID INT NOT NULL,
   
PhoneAreaCodeID INT,
   
PhoneExchangeID INT,
   
PhoneNumberID INT,
   
SubscriptionTypeID INT
   
)


Everything was a FK out.  Now some of this makes sense, such as Street names being FK'ed out?  Sure, although being a newspaper with need to know which order to write pick lists, I would have done it a bit different.  Street number being a FK?  Really?  So it pointed from 2007 to PK value 2007, which may or may not be the street number 2007.  Was this in proper 5th normal form?  I really do not know, never took the time to check all the boxes, but I suspect that it was, if not some new 5.5th normal form.

Now some of you that have 6 and 6.5 experience might have caught on here:  Too normalized for SQL 6.5 to deal with, we ran out of joins.  What?  Yes, 6.5 had a limit of 16 joins per query.  I ended up spending all night re-designing the tables and procedures.  ETL'ing data into the new tables.

After many long hours designing and coding, we placed the new version in production....  At least it ran now.  However we very soon was starting to see corruption in the database, which we eventually traced back to a bug in the ODBC code for DEC Alpha.  YAY!  Ok now what?  The CIO, DBA, one of the NT guys and I scrambled to build an Intel based server overnight from scrap parts, BCP'ed the data over (backup/restores wouldn't work right between Intel and DEC Alpha based SQL), and had it up in production for the Sunday paper.

The next 36 hours was spent further de-normalizing the database to perform better.  After about 80 hours on site I had the system stable enough for their DBA to keep it running until the developers could sit down and take a more long term look at the application.

The database architect for this system turned out to be someone who had written several books on relational database design, but upon further investigation it was determined he had never had a job outside of academia, and had no idea how to design a database model that would work in a real setting.