Sunday, July 2, 2017

Archiving Data

Many times as a DBA I have taken over a system that has been gathering data for years.  The worst I took over had about 400 million records in the main table, with an average of 24 related records per primary record in the secondary table, and data going back 12+ years.  In 2017 this might not sound that bad, but back in 2002....

Seeing as the data had to be retrievable for any date, I could not simply delete the very old data.  These tables also had constant inserts and updates going into them, so keeping the tables available became important, i.e. the time a table was locked had to stay acceptable, with time for waiting transactions to finish.

The solution I came up with does this with variable-size batches.  Modern versions of SQL Server offer other ways to do this, but the good thing about this method is that it works regardless of SQL Server version or edition.  Azure SQL DB would need some modification to make it archive to a separate database.
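Stripped to its skeleton, the approach is: stage the candidate keys in a temp table, then loop, copying one batch of rows to the archive and deleting it inside a short transaction, with a pause between batches. A condensed sketch with illustrative table and column names (the full procedure below also handles the detail table and the temp-table indexes):

```sql
DECLARE @rc INT = 1
WHILE @rc > 0
BEGIN
    -- grab the next batch of keys from the staged key list
    INSERT INTO #CurrentBatch (SalesID)
    SELECT TOP (@BatchSize) SalesID FROM #AllItems ORDER BY SalesID
    SET @rc = @@ROWCOUNT

    BEGIN TRANSACTION
        INSERT INTO SalesArchive.dbo.Sale          -- copy the batch to the archive
        SELECT S.* FROM Sale S
        INNER JOIN #CurrentBatch CB ON S.SalesID = CB.SalesID

        DELETE S FROM Sale S                       -- then remove it from production
        INNER JOIN #CurrentBatch CB ON S.SalesID = CB.SalesID

        DELETE AI FROM #AllItems AI                -- mark the batch as done
        INNER JOIN #CurrentBatch CB ON AI.SalesID = CB.SalesID
        DELETE FROM #CurrentBatch
    COMMIT TRANSACTION  -- each committed batch survives a later failure or cancel

    WAITFOR DELAY '00:00:10'  -- let waiting transactions through
END
```

Keeping the transaction scoped to a single batch is what bounds the lock time; the pause between batches gives blocked writers a chance to get through.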

The following sample illustrates archiving data from two tables, a typical Sales header and detail pair, where data consistency must be maintained.

CREATE OR ALTER PROCEDURE prc_ArchiveSalesData (@StartDate DATE, @EndDate DATE, @BatchSize INT)
AS
BEGIN
DECLARE @rowcount INT

-- set end date to be one day after the date sent in, to account for time data.
-- Better performance than converting data in the where clause
SET @EndDate = DATEADD(DAY, 1, @EndDate)

-- check for existing temp tables, drop and create
IF OBJECT_ID('tempdb..#AllItems') IS NOT NULL
    DROP TABLE #AllItems

CREATE TABLE #AllItems (SalesID INT NOT NULL)

CREATE INDEX idxAllItems ON #AllItems (SalesID) -- lots of joins to happen on this

IF OBJECT_ID('tempdb..#CurrentBatch') IS NOT NULL
    DROP TABLE #CurrentBatch

CREATE TABLE #CurrentBatch (SalesID INT NOT NULL)

CREATE INDEX idx_Current ON #CurrentBatch (SalesID)

-- gather all items for the given date range
INSERT INTO #AllItems (SalesID)
SELECT S.SalesID FROM Sale S WHERE S.SaleDate >= @StartDate AND S.SaleDate < @EndDate

-- loop outside the transaction; that way, on a failure or a need to cancel, the successful batches stay committed
INSERT INTO #CurrentBatch (SalesID)
SELECT TOP (@BatchSize) SalesID FROM #AllItems ORDER BY SalesID ASC -- technically no need for this order by, but it makes debugging easier

SET @rowcount = @@ROWCOUNT

WHILE @rowcount > 0
BEGIN
    BEGIN TRANSACTION

    -- copy to archive tables
    INSERT INTO SalesArchive.dbo.Sale (SalesID, CustomerID, SalesPersonID, TotalSales, SaleDate)
    SELECT S.*  -- this would make the archive fail if we change the table, which we would want
    FROM Sale S
    INNER JOIN #CurrentBatch CB
        ON S.SalesID = CB.SalesID

    INSERT INTO SalesArchive.dbo.SalesDetail (SalesDetailID, SalesID, ItemID, Quantity)
    SELECT SD.*
    FROM SalesDetail SD
    INNER JOIN #CurrentBatch CB
        ON SD.SalesID = CB.SalesID

    -- delete from production tables, detail rows first
    DELETE SD
    FROM SalesDetail SD
    INNER JOIN #CurrentBatch CB
        ON SD.SalesID = CB.SalesID

    DELETE S
    FROM Sale S
    INNER JOIN #CurrentBatch CB
        ON S.SalesID = CB.SalesID

    -- delete the current batch from all items
    DELETE AI
    FROM #AllItems AI
    INNER JOIN #CurrentBatch CB
        ON AI.SalesID = CB.SalesID

    -- delete from current batch, completed
    DELETE FROM #CurrentBatch

    COMMIT TRANSACTION       -- committing the transaction here makes it possible to stop the archiving and not lose everything

    WAITFOR DELAY '00:00:10' -- 10 second pause before continuing, adjust as needed

    INSERT INTO #CurrentBatch (SalesID)
    SELECT TOP (@BatchSize) SalesID FROM #AllItems ORDER BY SalesID ASC -- technically no need for this order by, but it makes debugging easier

    SET @rowcount = @@ROWCOUNT
END -- rowcount loop
END -- proc

Hope you find this useful if you ever need to archive large amounts of data.

Thursday, March 2, 2017

SentryOne cleanup, PerformanceAnalysisDataTableAndIndexCounter table

SentryOne is a great tool for monitoring servers, but with a lot of "chatty" servers, the amount of data collected can become unwieldy.  With about 65 servers monitored, the delete functionality in SentryOne could not keep up with the deletes in the PerformanceAnalysisDataTableAndIndexCounter table (hereafter just referred to as "the table").  At the point I got brought in to clean it up there were close to a billion records in this table, and we were about to add another 110+ servers to be monitored.  The table itself was about 320 GB, the cleanup function was not working, and the manual method provided by SentryOne support could not handle the deletes without causing a lot of blocking and timeout issues.

The delete routine provided had multiple manual steps, and its WHERE clause was not well aligned with the indexes on the table.  To handle a cleanup of this size without causing undue blocking and timeouts in the data collection, I decided to find a better way.

Looking at the code provided by SentryOne, and digging a bit in the database itself, I found how to get the deletes into smaller, adjustable batches.

The first problem was to find the boundaries for the delete; this is the single most time-consuming step in the process:

   SELECT @EndID = MAX(ID) FROM table WITH (NOLOCK) WHERE TIMESTAMP <= @EndTimeStamp -- this step can take time
Since data in this table is just written to and never updated, using nolock here is ok.

After this it becomes a simple loop, deleting in batches until you hit your goal.  The delete is done with a simple TOP statement in a CTE.
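The shape of that TOP-in-a-CTE delete looks like this, with a generic table name standing in for the real one:

```sql
-- delete the @BatchSize lowest IDs still in range; repeat until nothing is deleted
;WITH batch AS
(
    SELECT TOP (@BatchSize) *
    FROM dbo.SomeBigTable            -- placeholder name, not the real table
    WHERE ID BETWEEN @BeginID AND @EndID
    ORDER BY ID ASC                  -- walk the primary key from the low end
)
DELETE FROM batch                    -- deleting through the CTE deletes the underlying rows
```

Because the CTE is just a named query over the base table, deleting from it removes the underlying rows, and TOP with the ORDER BY on the key keeps each statement's lock footprint small and predictable.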

After I was done catching up to the 15-day retention rate we wanted (hardcoded into the procedure to avoid accidentally deleting more than intended; change it if your retention rate is different), the procedure was set up to simply be called once a day.  We decided to set the retention rate in SentryOne to be one day longer than the procedure's, so if anything was ever changed, SentryOne's own delete routine would kick in if this one fails.

Hopefully this is useful for you if you run into this problem with SentryOne.  The same setup can be used for other tables as well, but PLEASE check with their support before you start deleting data, since some tables are detail "collector" tables, while others are summaries that you need for your timelines.  The same format was used to delete from a couple of other tables, including the one that collects deadlock information.

The code in its entirety is on GitHub, or below.  I would highly recommend getting it from GitHub if you are going to use it, rather than copying it from here.

The parameters need to be adjusted for your use and your server's performance; the defaults are just a safe, small run of deletes.

@BatchSize I increased as the table shrank, depending on how busy the system was.  The trick is to make each batch big enough that the cleanup finishes in a reasonable time, but not so large that it causes timeouts.  There is only one way to figure that out: run the procedure and keep a close eye on your system while it is running.

@MaxIterations is simply how many times to loop through running deletes.

-- USE SentryOne -- depends if new install, or upgrade
CREATE PROCEDURE dbo.BatchDeletePerformanceAnalysisDataTableAndIndexCounterData
    @EndDateTime DATETIME,
    @MaxIterations INT = 1,
    @BatchSize INT = 1000
AS
/*
Anders Pedersen

Made to batch delete data out of the PerformanceAnalysisDataTableAndIndexCounter table.
The delete statement provided by SentryOne could not handle a lot of data without
a lot of manual intervention (manual batches).

This uses the column with the primary key on it for deletes instead of the timestamp column,
which substantially speeds up the deletes.
*/
BEGIN
    -- SET DEADLOCK_PRIORITY LOW -- if you see a lot of other processes being deadlock victims, set this
    -- (unless it is more important to get it cleaned up than losing a few records)

    DECLARE @BeginTimeStamp INT, @EndTimeStamp INT,
            @BeginID BIGINT, @EndID BIGINT

    -- sanity check, do not let it delete to a date less than 15 days back.  Making it 16 to cover the entire day.  Close enough
    IF @EndDateTime > DATEADD(DAY, -16, GETDATE())
        SET @EndDateTime = DATEADD(DAY, -16, GETDATE())

    -- get first time stamp
    SELECT @EndTimeStamp = [dbo].[fnConvertDateTimeToTimestamp](@EndDateTime)

    -- get ID for the newest record to delete
    SELECT @EndID = MAX(ID) FROM PerformanceAnalysisDataTableAndIndexCounter WITH (NOLOCK) WHERE TIMESTAMP <= @EndTimeStamp -- this step can take time
    SELECT @BeginID = MIN(ID) FROM PerformanceAnalysisDataTableAndIndexCounter WITH (NOLOCK) WHERE ID <= @EndID

    DECLARE @cnt BIGINT = 0
    DECLARE @x INT = 1

    WHILE @x <= @MaxIterations AND @BeginID < @EndID
    BEGIN
        IF @x <> 1 -- skip the wait on the first iteration. Saves time if you are only running small batches
            WAITFOR DELAY '00:00:03' -- 3 second delay to let transactions through.  This might need adjusting

        ;WITH CTE AS
        (
            SELECT TOP (@BatchSize) * FROM PerformanceAnalysisDataTableAndIndexCounter WHERE ID BETWEEN @BeginID AND @EndID ORDER BY ID ASC
        )
        DELETE FROM CTE

        SELECT @cnt += @@ROWCOUNT
        SET @x += 1
    END

    SELECT @cnt AS RecordsDeleted
END