Over the last few days I have gone through the mill a bit with the Central Management Store (CMS) database, pool failover and replication. This prompted me to think about a situation that strikes fear into the heart of every Lync / Skype for Business administrator: what if my CMS becomes corrupt and I don’t have a backup?
In this blog post I intend to find out, by purposely deleting my CMS and then attempting to recover it, somehow…
First let’s have a look at my lab setup (please forgive the blacked out text, as I am using someone else’s lab for this). I have 2 Standard Edition servers in 2 sites.
A couple of screen shots of some custom policies and enabled users
So now, let me delete my CMS and see what happens. I do this by running the following command (Warning: please don’t do this!!)
Uninstall-CsDatabase -CentralManagementDatabase -SqlServerFqdn fe1.domain.local -SqlInstanceName RTC -Detach -Confirm:$false
Just to prove I really did this
Now let’s look at what damage we have caused
Running the Get-CsManagementStoreReplicationStatus command produces an alarming error saying that it cannot connect to the XDS database. Well, durrr, we know that, as we have just removed it.
Let’s try and download the topology in Topology Builder. Oh no, now I am in trouble…
Let’s see what else is missing from the control panel. OK, so we have lost the users
And we cannot retrieve the policies either
To name but a few.
What effect does this have on users? They will enter limited functionality mode.
Fixing and recovering the CMS
Now I am in a state where I have no CMS, users are complaining of loss of service, my heart rate is increasing and the sweat is beginning to pour. I quickly check my backups and, to my horror, I don’t have a CMS backup. Oh ****! What do I do now?
A moment’s pause…. A slow and deep intake of breath and hold……. And exhale…… Right, I am ready, let’s do this.
And then it comes to me! Each front end server copies the CMS to its RTCLOCAL SQL instance, right? So if I can somehow extract the data from the XDS database on the RTCLOCAL instance of one front end, then I should have the last copy of the CMS?
Luckily, there is a parameter on the Export-CsConfiguration cmdlet that lets you export the data from a front end’s local CMS replica. The command is
Export-CsConfiguration -FileName c:\localcms.zip -LocalStore
So now I have the local replica copy of the CMS
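Here is a minimal sketch of that export with a quick sanity check on the result. The export command and file path are the ones used above; the Test-Path / Get-Item checks are my own addition rather than part of the procedure, and the sketch assumes you are in the Management Shell on a front end whose RTCLOCAL replica is still intact.

```powershell
# Assumption: run in the Lync / Skype for Business Management Shell on a
# front end server whose local replica (RTCLOCAL) is still healthy.
$exportFile = 'C:\localcms.zip'

# -LocalStore reads from this server's local replica instead of the
# (now missing) central database.
Export-CsConfiguration -FileName $exportFile -LocalStore

# Sanity check (an addition, not part of the original steps):
# make sure a non-empty file was actually written before going any further.
if (-not (Test-Path -Path $exportFile)) {
    throw "Export failed: $exportFile was not created."
}
$file = Get-Item -Path $exportFile
if ($file.Length -eq 0) {
    throw "Export produced an empty file: $exportFile"
}
"Exported {0:N0} bytes to {1}" -f $file.Length, $file.FullName
```

It is worth copying the resulting zip somewhere off the server before touching anything else, since at this point it is the only copy of the configuration you have.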
Now to complete my recovery
- Run the command Install-CsDatabase -CentralManagementDatabase -SqlServerFqdn fe1.domain.local -SqlInstanceName RTC -Clean
- Once complete, import the data we extracted from the local CMS replica by running Import-CsConfiguration -FileName c:\localcms.zip
- Stop all services on the front end servers using the Stop-CsWindowsService -Force cmdlet
- Start the pool back up by running the Start-CsPool -PoolFqdn fe1.domain.local cmdlet
- Next, we need to run Install Local Configuration Store (step 1 of the deployment wizard) on all front end servers
- Once completed, we then need to run step 2 of the deployment wizard to install / remove components.
- Check that all services are still in a started state on all servers, especially the Backup Service, File Transfer Agent, Master Replicator Agent and Replica Replicator Agent services
- Now run an Invoke-CsManagementStoreReplication command
Wait about 5 minutes for this to complete before continuing
- Now run the Get-CsManagementStoreReplicationStatus -CentralManagementStoreStatus command, which should now return some healthy results. We have an active master and an active file transfer agent
- Now run Get-CsManagementStoreReplicationStatus and let’s make sure that we are replicating with all servers
- Now let’s try and download the topology again, do we see our expected topology?
This is a better sign…..
- Great… we now have our topology back.. Let’s check the control panel and see if we need to re-enable users, policies etc..
Our users are back!!!
- And the policies?
- And we are done………….. or are we??
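The shell part of the steps above can be sketched as one sequence. This is only an illustration under the assumptions of my lab (server name, instance name and file path as used in this post). The pause for the deployment wizard, the 10 minute timeout and the polling loop are my own additions, standing in for the "wait about 5 minutes" step. Remember that -Clean is destructive, so only run this when the CMS really is gone.

```powershell
# Values taken from the lab in this post; substitute your own.
$sqlFqdn    = 'fe1.domain.local'
$instance   = 'RTC'
$exportFile = 'C:\localcms.zip'

# 1. Recreate an empty CMS database (this also resets the LIS database).
Install-CsDatabase -CentralManagementDatabase -SqlServerFqdn $sqlFqdn -SqlInstanceName $instance -Clean

# 2. Import the configuration exported from the local replica.
Import-CsConfiguration -FileName $exportFile

# 3. Stop services (repeat on each front end), then start the pool again.
Stop-CsWindowsService -Force
Start-CsPool -PoolFqdn $sqlFqdn

# 4. The deployment wizard steps are manual, so pause here (an addition).
Read-Host 'Run steps 1 and 2 of the deployment wizard on every front end, then press Enter'

# 5. Trigger replication, then poll until every replica reports UpToDate.
#    The 10 minute timeout and 30 second interval are assumed values.
Invoke-CsManagementStoreReplication
$deadline = (Get-Date).AddMinutes(10)
do {
    Start-Sleep -Seconds 30
    $stale = @(Get-CsManagementStoreReplicationStatus | Where-Object { -not $_.UpToDate })
} while ($stale.Count -gt 0 -and (Get-Date) -lt $deadline)

# 6. Show the final state: CMS status first, then each replica.
Get-CsManagementStoreReplicationStatus -CentralManagementStoreStatus
Get-CsManagementStoreReplicationStatus | Select-Object ReplicaFqdn, UpToDate, LastStatusReport
```

If any replica still shows UpToDate as False once the loop gives up, go back and check the services on that server before assuming the restore has failed.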
We have successfully restored the CMS from the local cached copy on a front end server and restored full functionality to the users. Users have stopped complaining and everyone is happy. Now I am off to take a well-earned break and a cup of tea while I pat myself on the back for a job well done. Before I go, I just wanted to leave you with some food for thought:
- If you have lost the location database, or you have to install the CMS database using the -Clean switch, then this will overwrite the LIS database too. Therefore, if you can, take a copy of the LIS database first by running Export-CsLisConfiguration -FileName c:\lis.zip. If not, you will have to reconfigure E-911 and locations / subnets again manually.
- Please, please make sure you take regular backups of your Lync / Skype for Business servers, or at the very least a backup of the CMS. The single command Export-CsConfiguration -FileName c:\cms.zip could save your job one day. Alternatively, use a backup script designed for Skype for Business and run it as a scheduled task (here’s one I made earlier): https://gallery.technet.microsoft.com/Skype-for-Business-Backup-d7a3dfbd
- Don’t bury your head in the sand; make sure you proactively look after your deployment.
- If you are unfortunate enough to find yourself in this situation, all is not lost. There is a recovery path, so while it is an urgent issue, it is by no means impossible to recover from.
- Lastly, in the nicest possible way, I hope none of you will ever need to refer to this post in real life, as we all have backups….. right??
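For the backup advice above, a minimal sketch of what a scheduled export could look like. This is not the gallery script linked earlier, just an illustration using the two export cmdlets already mentioned in this post. The C:\Backups folder and the file naming are hypothetical choices of mine.

```powershell
# Assumption: C:\Backups is a hypothetical folder; pick a location that is
# itself backed up off the server.
$backupDir = 'C:\Backups'
$stamp     = Get-Date -Format 'yyyyMMdd-HHmmss'

# Create the folder on first run.
if (-not (Test-Path -Path $backupDir)) {
    New-Item -Path $backupDir -ItemType Directory | Out-Null
}

# Topology, policies and configuration from the CMS.
Export-CsConfiguration -FileName (Join-Path $backupDir "cms-$stamp.zip")

# Location Information Service data (E-911 locations, subnets and so on).
Export-CsLisConfiguration -FileName (Join-Path $backupDir "lis-$stamp.zip")
```

Saved as a .ps1 file and run from a scheduled task in a session where the Lync / Skype for Business module is available, this leaves you with a dated pair of zip files per run, which is all the recovery above would have needed.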