Thursday, 5 December 2013

Renaming a SharePoint Server 2013 machine and the issues that follow...

I have a SharePoint 2013 server set up and running, and I needed to rename it. Out of laziness, instead of actually renaming the machine, I pointed a DNS CNAME record at it and reconfigured the alternative names in SharePoint. Clever? Nope. I got hundreds of the following error events daily.

Machine 'xxx ' failed ping validation and has been unavailable since 'xxx'.

So I decided to actually rename it. Once I did that, a few services started to fail. After searching for a solution, I found that I also needed to run the PowerShell Rename-SPServer cmdlet. Running that cmdlet resolved almost everything except the Distributed Cache. OK, fine. I had faced this problem before. What I needed to do was run the following cmdlets:

Remove-SPDistributedCacheServiceInstance
Add-SPDistributedCacheServiceInstance

The first cmdlet removes the current cache and the second rebuilds it. You can find more information on managing the Distributed Cache here.
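For reference, the farm-side rename mentioned above could look roughly like the sketch below. 'OLDNAME' and 'NEWNAME' are placeholders of mine, not the real machine names, and the final Get-SPServer call is only a sanity check I would add, not something from the original steps.

```powershell
# Run from the SharePoint 2013 Management Shell as a farm administrator.
# 'OLDNAME' and 'NEWNAME' are placeholders; substitute your own machine names.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Update the farm configuration so it knows the server by its new name.
Rename-SPServer -Identity 'OLDNAME' -Name 'NEWNAME'

# Sanity check (my addition): the farm should now list the new name.
Get-SPServer | Select-Object Address, Role, Status
```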

I then checked the AppFabric Caching Service. It was running fine and I was happy with it. Then came a restart (due to Windows Update), and I started getting lots of warnings in the event log complaining that the cache was not working. When the cache is not working, browsing SharePoint is very slow. I opened Services and found that the AppFabric Caching Service was now disabled. I re-enabled it and started it, it ran fine, and I was happy. Then came the next restart and the issue reappeared. "Darn it", I cried out loud... in my mind, of course. I repeated the usual routine of destroying the cache and rebuilding it, and this time the AppFabric Caching Service would not get enabled automatically. That was when I knew something was really wrong.
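Instead of clicking through Services after every restart, the same check can be done from the shell. This is a minimal sketch: it assumes the Windows service behind the cache is named AppFabricCachingService and that the SharePoint instance has the type name 'Distributed Cache', so verify both on your own server.

```powershell
# Assumption: the underlying Windows service is named 'AppFabricCachingService'.
# StartMode shows whether Windows has it set to Auto, Manual or Disabled.
Get-CimInstance -ClassName Win32_Service -Filter "Name='AppFabricCachingService'" |
    Select-Object Name, State, StartMode

# What SharePoint itself thinks about the Distributed Cache on each server.
Get-SPServiceInstance |
    Where-Object { $_.TypeName -eq 'Distributed Cache' } |
    Select-Object TypeName, Status, Server
```

If the two views disagree (Windows says Disabled while SharePoint says Online, for example), that is a hint that the service instance is in an inconsistent state. Microsoft's guidance is to manage this service through SharePoint rather than through the Services console, which is probably why enabling it by hand did not survive a restart.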

It turned out that Remove-SPDistributedCacheServiceInstance had failed to remove the service instance completely. I couldn't find anything solid on this online, but I did find a hint. So I started the SharePoint 2013 Management Shell and ran the following cmdlet:
Get-SPServiceInstance | Select TypeName,Id
I found the GUID for the Distributed Cache and deleted the instance with the following cmdlets:

$s = Get-SPServiceInstance {GUID}
$s.delete()
Remove-SPDistributedCacheServiceInstance
Add-SPDistributedCacheServiceInstance
Replace {GUID} with the GUID returned by the first cmdlet. I checked everything and it all ran fine... until the next restart, of course.
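If you would rather not copy the GUID by hand, the instance can be picked out by filtering. This is only a sketch: it assumes the type name is 'Distributed Cache' and that the leftover instance is attached to the local server under its current name, which may not hold right after a rename. The Delete() call is left commented out so nothing is removed until you have looked at the output.

```powershell
# Find the Distributed Cache instance registered for this machine.
# Assumption: the stale instance is listed under the current computer name.
$instance = Get-SPServiceInstance |
    Where-Object { $_.TypeName -eq 'Distributed Cache' -and $_.Server.Name -eq $env:COMPUTERNAME }

if ($instance) {
    # Show what was matched before touching anything.
    $instance | Select-Object TypeName, Id, Status

    # Uncomment only after confirming the output above is the instance you expect.
    # $instance.Delete()
}
else {
    Write-Output 'No Distributed Cache service instance found for this server.'
}
```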

Now the World Wide Web Publishing Service failed to start. Without it there is no web server, and without a web server there is no SharePoint. I checked the event log and found:
The HTTP Service service failed to start due to the following error:
Cannot create a file when that file already exists.
That's not helpful at all. All it tells me is that the HTTP service can't start because it can't create a file. So I fired up ProcMon and dug through the log, but found nothing. Fed up, I was ready to restore from backup, but I opened Regedit just in case and browsed to:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\HTTP\Parameters
and figured that something here might be the culprit. I renamed the entries one at a time and tried to start the HTTP service after each change. It turned out that one of the entries under SslSniBindingInfo was the culprit: a port was registered twice with the same content. Removing one of the duplicate registrations solved the issue.
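In hindsight, I would back up the key before renaming anything and list the binding entries from the shell. The sketch below assumes SslSniBindingInfo is a subkey of Parameters with one child entry per binding. I have not confirmed that layout on other machines, so treat it as a starting point. The backup file name is just an example.

```powershell
# Run from an elevated PowerShell prompt.
$key = 'HKLM:\SYSTEM\CurrentControlSet\Services\HTTP\Parameters'

# Back up the whole Parameters key first (the file name is an arbitrary example).
reg.exe export 'HKLM\SYSTEM\CurrentControlSet\Services\HTTP\Parameters' "$env:TEMP\http-parameters-backup.reg" /y

# List everything under SslSniBindingInfo so duplicate registrations stand out.
# Assumption: SslSniBindingInfo is a subkey; if it is not, nothing is listed.
Get-ChildItem -Path "$key\SslSniBindingInfo" -Recurse -ErrorAction SilentlyContinue |
    Select-Object Name, SubKeyCount, ValueCount

# After removing the duplicate entry, try starting the HTTP service again.
net.exe start http
```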

Ah, finally I can rest...
