I came across a weird situation: I received an alert that a SQL Server transaction log file was full and found log_reuse_wait_desc = REPLICATION in sys.databases.
Before getting to the troubleshooting steps, note that the setup of the database in question is a little complex because:
- The database is part of SQL Server replication where the publisher is on the vendor's side, and we don't have access to the publisher DB
- Oracle GoldenGate replication is configured from this database to an Oracle DB
- The database in question uses the SIMPLE recovery model
- Change Data Capture (CDC) was found to be enabled
These are the steps I took to troubleshoot and fix the issue:
- First, I ran a script to check log_reuse_wait_desc and is_cdc_enabled:
SELECT log_reuse_wait_desc, is_cdc_enabled, *
FROM sys.databases
The output showed that the log reuse wait was REPLICATION and the CDC bit was 1, which means CDC is enabled for this database.
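With many databases on an instance, the check above can be narrowed to the affected one. This is only a sketch: it assumes it is run in the context of the database in question, and the extra columns (recovery_model_desc, is_published) are my additions rather than part of the original script.

```sql
-- Run in the context of the affected database (assumption: not master).
SELECT name,
       recovery_model_desc,   -- SIMPLE in this case
       log_reuse_wait_desc,   -- REPLICATION in this case
       is_cdc_enabled,        -- 1 means CDC is enabled
       is_published           -- 1 if the database is itself a publication database
FROM sys.databases
WHERE name = DB_NAME();
```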
- Then I ran the following script to see which subscriptions were present on the database, as I don't have access to the publisher.
Note: run this query in the context of the database in question, not in master.
SELECT publisher, publisher_db, publication, time, distribution_agent, transaction_timestamp
FROM dbo.MSreplication_subscriptions
The output showed two subscriptions for this database. Judging by the time value, one was running fine and the other was 3 years old, which seemed strange.
- I worked with the vendor, who confirmed that the active subscription is the one they have on the publisher side, whereas the other one does not exist at their end. After I gave them the distribution_agent name, and after checking SQL Server Agent, it was confirmed that the job no longer exists, so that subscription is not valid.
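The SQL Server Agent check in the step above can be roughly sketched as a query. It assumes the Agent job name matches the distribution_agent value, which is the usual naming but not guaranteed, and that the agent job would live on this server at all (for a push subscription it runs at the distributor instead).

```sql
-- Run in the context of the subscriber database.
-- Assumption: the Agent job name equals the distribution_agent value.
SELECT s.publisher,
       s.publication,
       s.distribution_agent,
       s.time,
       CASE WHEN j.job_id IS NULL
            THEN 'no matching Agent job'
            ELSE 'Agent job exists'
       END AS agent_job_status
FROM dbo.MSreplication_subscriptions AS s
LEFT JOIN msdb.dbo.sysjobs AS j
       -- COLLATE avoids a collation conflict between the user database and msdb
       ON j.name COLLATE DATABASE_DEFAULT = s.distribution_agent COLLATE DATABASE_DEFAULT;
```

A row reporting no matching job is only a hint that the subscription may be orphaned; confirming with whoever owns the publisher, as I did, is still the safer route.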
- So I went ahead and deleted the inactive subscription, to which I have access, from the Replication > Local Subscriptions folder in the SSMS GUI.
Important: I could not execute sp_repldone or sp_removedbreplication here, because either may disrupt the active subscription as well. Reconfiguring the publication was not an option either, because I don't have access to the publisher and involving the vendor is a bit difficult. So I just deleted the inactive subscription, which went absolutely fine.
Note: the optional commands below remove the entire replication configuration from a database. They were not useful for me but may help someone reading this article.
-- Execute
EXEC sp_repldone @xactid = NULL, @xact_segno = NULL, @numtrans = 0, @time = 0, @reset = 1
-- Execute
EXEC sp_removedbreplication 'dbname'
- After deleting the inactive subscription, the t-log space should ideally have been released, but no: log_reuse_wait_desc was still REPLICATION.
- So I ran DBCC OPENTRAN, which gave the following output:
Replicated Transaction Information:
Oldest distributed LSN : (0:0:0)
Oldest non-distributed LSN : (10217:892:1)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
- You might have guessed by now that the cause could be CDC (Change Data Capture). To clarify, the CDC cleanup job was running absolutely fine, and I re-ran it to make sure anything pending got cleaned up, but no, it didn't work.
- After some more research I found that CDC should have two jobs, one to capture and one to clean up, and on this SQL Server only the cleanup job was present, with the job name cdc.dbname_cleanup.
- Disabling and re-enabling CDC was not an option, because GoldenGate seems to be using CDC. Touching it might affect GoldenGate, which would then need to be set up again, and that is quite a task in itself.
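For anyone wanting to check which CDC jobs exist without clicking through SQL Server Agent, something like the following should work. It is a sketch only: the LIKE pattern assumes the default cdc.<dbname>_capture / cdc.<dbname>_cleanup job names.

```sql
-- Run in the context of the CDC-enabled database.
-- Lists the capture and cleanup jobs defined for the current database;
-- in my case only the cleanup row would have come back.
EXEC sys.sp_cdc_help_jobs;

-- Cross-check against SQL Server Agent.
-- Assumption: the jobs use the default 'cdc.' name prefix.
SELECT j.name, j.enabled
FROM msdb.dbo.sysjobs AS j
WHERE j.name LIKE N'cdc.%';
```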
- So after some more research I decided to create the CDC capture job and see whether it helped, because it seemed less intrusive than the other available options.
-- Command to CREATE the capture job <run in the respective database context>
EXEC sys.sp_cdc_add_job 'capture'
GO
-- The job is created and reports: Job 'cdc.dbname_capture' started successfully.
-- Command to DROP the capture job <run in the respective database context>
EXEC sys.sp_cdc_drop_job 'capture'
GO
-- Just an FYI: the commands below are for those who also need to create the cleanup job, if CDC is enabled and neither job is found.
-- Command to CREATE the cleanup job <run in the respective database context>
EXEC sys.sp_cdc_add_job 'cleanup'
GO
-- Command to DROP the cleanup job <run in the respective database context>
EXEC sys.sp_cdc_drop_job 'cleanup'
GO
- Once the job 'cdc.dbname_capture' was created, it started automatically and kept running (I believe running this job continuously is the default setting for CDC).
- As the log was 50+ GB, I ran DBCC OPENTRAN again and found that the Replicated Transaction Information was now advancing, whereas it had been static when I first ran it:
Oldest active transaction:
SPID (server process ID): 67
UID (user ID) : -1
Name : user_transaction
LSN : (12779:42734:1)
Start time : Jul 15 2016 1:08:23:143PM
SID : 0x010500000000000515000000efaec1579fa8a196f757550fcb251500
Replicated Transaction Information:
Oldest distributed LSN : (0:0:0)
Oldest non-distributed LSN : (10458:4850:1)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
- After the capture job had run for about 10 minutes, I observed that the free space inside the t-log file started increasing, which is a good sign.
- After it had run for about 30-45 minutes, I observed that the t-log was 99% free, and confirmed in sys.databases that the log_reuse_wait_desc value was NOTHING. Good news!
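The progress described in the last few steps can also be watched with queries instead of re-running DBCC OPENTRAN. This is a sketch of what I would look at, not the exact commands from my session; the TOP (10) limit and the column selection are arbitrary choices.

```sql
-- Run in the context of the CDC-enabled database.
-- Recent log scan sessions of the capture job: a growing tran_count and
-- error_count = 0 suggest the capture job is working through the backlog.
SELECT TOP (10)
       session_id,
       start_time,
       end_time,
       scan_phase,
       error_count,
       tran_count,
       command_count,
       latency
FROM sys.dm_cdc_log_scan_sessions
ORDER BY start_time DESC;

-- Log size and percentage used for every database on the instance.
DBCC SQLPERF (LOGSPACE);

-- The wait reason should eventually change from REPLICATION to NOTHING.
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = DB_NAME();
```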
- So I shrank the log file to release the space and then ran the DROP capture job command (shown with the CREATE command earlier) to remove the job, as I am unsure whether it is needed. I will monitor going forward to see whether it actually needs to run continuously; creating it is an ONLINE operation anyway and can be done at any time.
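For completeness, the shrink itself can be sketched as below. The logical file name 'dbname_log' and the 1024 MB target are placeholders, not values from my environment; look up the real logical name first.

```sql
-- Run in the context of the affected database.
-- Find the logical name and current size of the log file
-- (size is stored in 8 KB pages, so dividing by 128 gives MB).
SELECT name AS logical_name,
       size / 128 AS size_mb
FROM sys.database_files
WHERE type_desc = N'LOG';

-- Placeholders: replace 'dbname_log' with the logical name returned above
-- and 1024 with the target size in MB.
DBCC SHRINKFILE (N'dbname_log', 1024);
```

Shrinking to a sensible working size rather than to the minimum avoids the log having to grow again straight away.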
The steps above resolved the issue in my environment; all or some of them may help readers of this blog solve the issue in theirs. Feel free to leave your comments, and see the reference links below, which I used for my case; some of them may help with understanding the concepts as well.
Adios for now!
Useful References:
http://www.sqlskills.com/blogs/paul/replication-preventing-log-reuse-but-no-replication-configured/
http://www.sqlservercentral.com/Forums/Topic695034-357-1.aspx
https://subhrosaha.wordpress.com/2011/12/17/sql-server-log_reuse_wait_desc-set-to-replication/
https://msdn.microsoft.com/en-us/library/cc645937(v=sql.105).aspx
https://msdn.microsoft.com/en-us/library/cc627396(v=sql.105).aspx
http://www.sqlservercentral.com/Forums/Topic1142599-391-1.aspx
https://www.brentozar.com/archive/2012/08/scary-sql-surprises-crouching-tiger-hidden-replication/
https://blogs.msdn.microsoft.com/repltalk/2010/11/17/how-to-cleanup-replication-bits/
