Maintenance Plan Cleanup Task is Not Deleting the Old Backup Files


It is common in most environments to use maintenance plans to perform regular maintenance tasks like Full, Differential and Log Backups, Index Rebuilds, Statistics Updates and Integrity Checks.

Backups are considered one of the most important tasks, and proper care needs to be taken while creating the maintenance plans that perform database backups in SQL Server. There are many options available when creating a backup maintenance plan, and they need to be chosen correctly to avoid failures or problems in the future.

Although various problems can arise when configuring backup maintenance plans, one small but hard-to-identify issue for many DBAs is this: they enable the “Maintenance Cleanup Task” to delete backups older than 3 days or so, and later find that the old backups are not getting deleted. The backup job runs successfully and reports no errors in this case, so most DBAs only identify the issue when the disk drive where the backups are stored fills up or grows too large and alerts are raised.

Coming to the solution: this problem is mostly caused by incorrect values provided to the “Maintenance Cleanup Task”. In the cleanup task, verify the “File extension” option and check that the extension of the backups to delete is correct. For example, if the backup task creates backups with the extension .bak, make sure “bak” is entered; if transaction log backups use the extension .trn, make sure the value provided is “trn”. It is important to use “bak” (without quotes) instead of “.bak”, as using .bak will not delete the old backup files. So, make sure there is no dot (.) in front of the extension bak or trn.
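For illustration, behind the scenes the Maintenance Cleanup Task generates a call to the undocumented xp_delete_file procedure, roughly like the sketch below; the folder path and cutoff date are placeholders and the exact call may vary by version. The point to notice is that the extension is passed as N'bak', with no leading dot.

EXECUTE master.dbo.xp_delete_file
    0,                        -- 0 = backup files, 1 = maintenance plan report files
    N'D:\SQLBackups',         -- folder to clean up (placeholder path)
    N'bak',                   -- file extension WITHOUT the dot
    N'2014-06-24T00:00:00',   -- delete files older than this cutoff (placeholder date)
    1                         -- 1 = include first-level subfolders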

This is applicable to the below versions of SQL Server:

SQL Server 2005
SQL Server 2008 R2
SQL Server 2012
SQL Server 2014

Hope this was helpful.

Thanks,
SQLServerF1 Team
In-Depth Blogs on SQL Server, Information about SQL Server Conferences and Events, SQL Server Frequently asked questions, SQL Server Trainings.

 

Retrying and Running a Failed Job After Some Delay in SQL Server

There can be scenarios where there is a requirement to automatically rerun a failed job after a certain amount of delay. For example, a job might fail due to an intermittent issue like a network glitch or a file currently being locked by another process, so the job needs to be attempted again after some time, by which point the issue may have resolved itself; the job then completes successfully and no manual intervention is required from the DBA.

There are multiple ways of making a job rerun itself after some time, and you can choose whichever approach you prefer, after testing the solution on a test server.

Method 1: There is an option available in the SQL Server Agent job step properties where we can specify whether that particular job step should be retried if it fails, how many times it should be retried, and how much delay there should be between each retry of the failed step.

You can configure this from SQL Server Agent -> Jobs -> Job Properties -> Steps tab -> Edit the step -> Advanced tab -> set appropriate values for “Retry Attempts” and “Retry Interval (minutes)”.

[Screenshot: Job_Step_Retry_Options]
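The same retry settings can also be applied with T-SQL through msdb.dbo.sp_update_jobstep. A minimal sketch, assuming a job named "Backup_Full" with a single step (the job name and step id are placeholders):

-- Set step 1 of the job to retry up to 3 times, waiting 10 minutes between attempts
EXEC msdb.dbo.sp_update_jobstep
    @job_name       = N'Backup_Full',  -- placeholder job name
    @step_id        = 1,               -- placeholder step id
    @retry_attempts = 3,
    @retry_interval = 10;              -- minutes between retries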

Method 2: This is applicable to jobs which run T-SQL code as part of a job step. We can add multiple schedules to a job, and the job will run on both schedules. Modify the job so that it first checks whether the previous run of this job failed and whether that run happened during the first scheduled time; if yes, run the T-SQL code again. If the last run was successful and happened during the first schedule, do nothing. If the last run was successful and happened during the second schedule, run the T-SQL code. This requires writing custom logic, of course needs a lot of testing, and is limited to jobs that consist only of T-SQL steps.
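A minimal sketch of the “did the previous run fail” part of that custom logic, assuming the job is named "App_Job" (a placeholder) and leaving out the comparison against the schedule times:

-- Look up the outcome of this job's most recent completed run.
-- Job outcome rows in sysjobhistory have step_id = 0; run_status 0 = Failed, 1 = Succeeded.
DECLARE @LastRunStatus int;

SELECT TOP (1) @LastRunStatus = h.run_status
FROM msdb.dbo.sysjobhistory AS h
    INNER JOIN msdb.dbo.sysjobs AS j ON j.job_id = h.job_id
WHERE j.name = N'App_Job'      -- placeholder job name
  AND h.step_id = 0
ORDER BY h.run_date DESC, h.run_time DESC;

IF @LastRunStatus = 0
BEGIN
    -- Previous run failed: place the real T-SQL workload of the job here.
    PRINT 'Previous run failed - rerunning the workload.';
END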

Method 3: Create another job which monitors whether our critical job completed successfully or not and, if it failed, simply reruns that particular job. This is much simpler to implement compared to Method 2 and also works for any type of job step, such as T-SQL or maintenance plan related steps.

Below is sample code which can be used in the new job that monitors other jobs and reruns them if they failed. It requires you to specify the names of the jobs that need to be rerun.

/*

This job step finds whether any of the jobs named "Backup_Full"
and "App_Job" failed in their previous run. If they failed during the
current day, this will rerun that job. Change the job names to match
the jobs in your environment.

*/
 
DECLARE
  @JobName   [sysname],
  @FinalDate [int]
 
-- Today's date in the integer yyyymmdd format used by sysjobhistory.run_date
SET @FinalDate = CAST(CONVERT([varchar](8), GETDATE(), 112) AS [int])

-- Find today's failed runs of the jobs "Backup_Full" and "App_Job"
DECLARE FailedJobs CURSOR FOR
SELECT DISTINCT 
  j.[name]
FROM 
  [msdb].[dbo].[sysjobhistory] h INNER JOIN
  [msdb].[dbo].[sysjobs] j ON h.[job_id] = j.[job_id] INNER JOIN 
  [msdb].[dbo].[sysjobsteps] s ON j.[job_id] = s.[job_id] AND h.[step_id] = s.[step_id]
WHERE 
  h.[run_status] = 0 AND          -- 0 = Failed
  h.[run_date] = @FinalDate AND 
  j.[name] IN ('Backup_Full', 'App_Job')
 
OPEN FailedJobs

FETCH NEXT FROM FailedJobs INTO @JobName

WHILE @@FETCH_STATUS = 0
BEGIN
  -- Restart the failed job
  EXECUTE [msdb].[dbo].[sp_start_job] @job_name = @JobName

  FETCH NEXT FROM FailedJobs INTO @JobName
END

CLOSE FailedJobs
DEALLOCATE FailedJobs
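Once this code is placed in a T-SQL job step of the monitoring job, the monitoring job only needs a schedule of its own, timed shortly after the monitored jobs. A minimal sketch, assuming the monitoring job is named "Rerun_Failed_Jobs" (a placeholder name):

-- Schedule the monitoring job to run daily at 02:30
EXEC msdb.dbo.sp_add_jobschedule
    @job_name          = N'Rerun_Failed_Jobs',  -- placeholder job name
    @name              = N'Daily 02:30 check',
    @freq_type         = 4,        -- daily
    @freq_interval     = 1,        -- every 1 day
    @active_start_time = 023000;   -- 02:30:00 (HHMMSS)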

There may be other methods as well, but the methods mentioned above are practical and simple and should serve the purpose.

This is applicable to the below versions of SQL Server:

SQL Server 2005
SQL Server 2008 R2
SQL Server 2012
SQL Server 2014

Hope this was helpful.

Thanks,
SQLServerF1 Team

Update Statistics Job Taking Long Time to Complete

Recently, I got a question from a client who had a SQL Server Agent job performing update statistics with full scan. The job was running successfully and had been running for a long time. Over the past few runs they started noticing that this job was taking longer than it used to. The run time of the job doubled over the last few runs and it continued taking longer to finish.

I verified and found that this job was created from a maintenance plan. The maintenance plan has an Update Statistics task which performs a statistics update of all objects (all table statistics, all column statistics, and all index statistics) with FULLSCAN.
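In effect, for every table in every selected database, the task issues something along the lines of the following (the table name here is just an example):

-- Reads every row of the table to rebuild its statistics
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;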

Some of the reasons for an increase in the duration of this job could be:

– The number of databases on the SQL Server instance has increased. As this job covers all databases, more databases can increase the duration of this job.

– An increase in the size of the data in a few or all of the databases. It is common for databases to grow over a period of time, and this increases the time taken for update statistics to complete, thus increasing the duration of the job run.

– Changes in CPU, memory or IO (reads/writes) usage patterns on a few or all databases, which can increase the time taken to complete the job.

– Changes in the access patterns of the data in a few or all databases, which may cause locking/blocking that affects the update statistics job.

– Other jobs running at the same time, so that update statistics and the new jobs have to share the available resources on the system, which can lead to an increase in job duration.

To understand which of the above is the cause of the increase in the run time of the job, we need historical data from when the job was running fast. The historical data we need would include performance monitor counters related to CPU, memory, disk and network; SQL trace files to understand what other queries were running while this job was running; and wait statistics, locking and blocking details to see whether the update statistics job was waiting on anything. Most often this data is not available unless a third party monitoring tool is installed or custom monitoring is configured.
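Even without a monitoring tool, the job's own run history in msdb shows how its duration has trended over time, which at least confirms when the slowdown started. A sketch, assuming the job is named "Update Statistics Job" (a placeholder name):

-- Recent run history of the statistics job; run_duration is an int in HHMMSS form
SELECT  h.run_date,
        h.run_time,
        h.run_duration,          -- e.g. 13542 = 1 hour 35 minutes 42 seconds
        h.run_status             -- 1 = Succeeded, 0 = Failed
FROM msdb.dbo.sysjobhistory AS h
    INNER JOIN msdb.dbo.sysjobs AS j ON j.job_id = h.job_id
WHERE j.name = N'Update Statistics Job'   -- placeholder job name
  AND h.step_id = 0                       -- job outcome rows only
ORDER BY h.run_date DESC, h.run_time DESC;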

Also, we can look into other options to improve the speed of the update statistics job by making changes to it. Since this job uses the maintenance plan Update Statistics task with FULLSCAN, we can look into moving from the maintenance plan to a customized solution, where we update statistics only if there have been changes to the row data in a table or column. Also, if the number of changes is small, we may perform the update statistics with sampling instead. There are already automated tools/scripts available to achieve this, one of which is the popular Ola Hallengren Index and Statistics Maintenance script. Test such a solution on a test server, review the results, and then implement the same on production if you are satisfied with the results.
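As a rough sketch of the “only update what changed” idea, on SQL Server 2008 R2 SP2 / 2012 SP1 and later the modification counter per statistics object can be read from sys.dm_db_stats_properties (on older builds, rowmodctr in sys.sysindexes is the approximate equivalent):

-- Statistics in the current database that have row modifications since their last update
SELECT  OBJECT_SCHEMA_NAME(s.object_id) AS schema_name,
        OBJECT_NAME(s.object_id)        AS table_name,
        s.name                          AS stats_name,
        sp.last_updated,
        sp.rows,
        sp.modification_counter
FROM sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE sp.modification_counter > 0
  AND OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1
ORDER BY sp.modification_counter DESC;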

I migrated the update statistics job from the maintenance plan solution to Ola Hallengren's statistics maintenance script and noticed that the run time of the job was reduced by 70%.
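For reference, a statistics-only run of Ola Hallengren's IndexOptimize procedure looks roughly like the following; the parameter names are taken from the script's documentation, so verify them against the version you install:

EXECUTE dbo.IndexOptimize
    @Databases              = 'USER_DATABASES',
    @FragmentationLow       = NULL,    -- skip index maintenance entirely
    @FragmentationMedium    = NULL,
    @FragmentationHigh      = NULL,
    @UpdateStatistics       = 'ALL',   -- index and column statistics
    @OnlyModifiedStatistics = 'Y';     -- only statistics with row modifications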

The same issue can also happen with other maintenance jobs like index rebuilds, or with application related jobs, and the slowness could be due to one of the above-mentioned causes. A similar troubleshooting approach should help in identifying the cause and resolving the issue, either by fixing the identified problem or by improving the maintenance job itself.

Hope this was helpful.

Thanks,
SQLServerF1 Team

Error While Editing the Backup Maintenance Plan in SQL Server 2005

Recently, while trying to modify a maintenance plan, editing the backup task using Management Studio resulted in an error. Below is the error message:


TITLE: Microsoft SQL Server Management Studio
Cannot show the editor for this task.
Value of ’27/6/2014 12:00:00 AM’ is not valid for ‘Value’. ‘Value’ should be between ‘MinDate’ and ‘MaxDate’.

After some research, I found that this has impacted other users as well, and a Connect bug has been filed for it.

No fix was released for this, but there is a workaround available which resolved the issue, after which I was able to edit the backup task in the maintenance plan.

Below is the workaround:

1. Modify the maintenance plan.
2. Click on the backup task.
3. Press F4 on the keyboard and the Properties window will appear on the right side of SSMS.
4. Change the value of “expiredate” by adding 10 years, or clear out the value in “expiredate”.

[Screenshot: Backup Task Maintenance Plan Properties]


5. Save the maintenance plan.
6. Reopen the maintenance plan and now you should be able to edit the backup task.

Happy Reading,
SQLServerF1 Team