Addressing database bloat

Database grooming is a well documented topic when it comes to working with SCOM.

Only old state change data for monitors that are currently in a HEALTHY state, AT THE TIME that grooming runs are groomed out.  OLD state change events for monitors that are Disabled (unmonitored), in Maintenance Mode, in a Warning State, or Critical State are not groomed out of the database. So if you had an issue with a monitor in the past, and you solved it by disabling the monitor, this data NEVER gets groomed out of the database.Over time this can result in database bloat.

To address this you can run the following SQL query against the Operations database, this will groom out ALL statechange data, and only keep the number of days you have set in the UI  (The default setting is 7 days). 

USE [OperationsManager]

GO

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

BEGIN

    SET NOCOUNT ON

    DECLARE @Err int

    DECLARE @Ret int

    DECLARE @DaysToKeep tinyint

    DECLARE @GroomingThresholdLocal datetime

    DECLARE @GroomingThresholdUTC datetime

    DECLARE @TimeGroomingRan datetime

    DECLARE @MaxTimeGroomed datetime

    DECLARE @RowCount int

    SET @TimeGroomingRan = getutcdate()

    SELECT @GroomingThresholdLocal = dbo.fn_GroomingThreshold(DaysToKeep, getdate())

    FROM dbo.PartitionAndGroomingSettings

    WHERE ObjectName = ‘StateChangeEvent’

    EXEC dbo.p_ConvertLocalTimeToUTC @GroomingThresholdLocal, @GroomingThresholdUTC OUT

    SET @Err = @@ERROR

    IF (@Err <> 0)

    BEGIN

        GOTO Error_Exit

    END

     SET @RowCount = 1  

     — This is to update the settings table

    — with the max groomed data

    SELECT @MaxTimeGroomed = MAX(TimeGenerated)

    FROM dbo.StateChangeEvent

    WHERE TimeGenerated < @GroomingThresholdUTC

     IF @MaxTimeGroomed IS NULL

        GOTO Success_Exit

     — Instead of the FK DELETE CASCADE handling the deletion of the rows from

    — the MJS table, do it explicitly. Performance is much better this way.

    DELETE MJS

    FROM dbo.MonitoringJobStatus MJS

    JOIN dbo.StateChangeEvent SCE

        ON SCE.StateChangeEventId = MJS.StateChangeEventId

    JOIN dbo.State S WITH(NOLOCK)

        ON SCE.[StateId] = S.[StateId]

    WHERE SCE.TimeGenerated < @GroomingThresholdUTC

    AND S.[HealthState] in (0,1,2,3)

     SELECT @Err = @@ERROR

    IF (@Err <> 0)

    BEGIN

        GOTO Error_Exit

    END

     WHILE (@RowCount > 0)

    BEGIN

        — Delete StateChangeEvents that are older than @GroomingThresholdUTC

        — We are doing this in chunks in separate transactions on

        — purpose: to avoid the transaction log to grow too large.

        DELETE TOP (10000) SCE

        FROM dbo.StateChangeEvent SCE

        JOIN dbo.State S WITH(NOLOCK)

            ON SCE.[StateId] = S.[StateId]

        WHERE TimeGenerated < @GroomingThresholdUTC

        AND S.[HealthState] in (0,1,2,3)

         SELECT @Err = @@ERROR, @RowCount = @@ROWCOUNT

         IF (@Err <> 0)

        BEGIN

            GOTO Error_Exit

        END

    END   

     UPDATE dbo.PartitionAndGroomingSettings

    SET GroomingRunTime = @TimeGroomingRan,

        DataGroomedMaxTime = @MaxTimeGroomed

    WHERE ObjectName = ‘StateChangeEvent’

     SELECT @Err = @@ERROR, @RowCount = @@ROWCOUNT

     IF (@Err <> 0)

    BEGIN

        GOTO Error_Exit

    END 

Success_Exit:

Error_Exit:   

END

Advertisements
This entry was posted in SQL Queries and tagged , , , , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s