SQL Server properties changed on cluster resource failover

Leave a comment (0) Go to comments

One of my friend has recently had an issue on one of his SQL Server two node cluster. He was seeing different authentication modes for SQL Server depending on which node of the 2-node cluster that it was online on.

The only recent change to their SQL Server environment was that they applied Service Pack 3 couple of days back but they had not tested failover till this weekend – which is when they ran into this issue.

The Errorlogs document this behavior clearly as follows.

On Node 1 (P1), SQL Server Error log says

2010-06-08 08:17:40.12 Server Authentication mode is MIXED.
2010-06-08 08:17:40.60 spid4s The NETBIOS name of the local node that is running the server is ‘P1′.
2010-06-08 23:16:01.34 spid4s SQL Trace was stopped due to server shutdown. Trace ID = ‘1′.

after failing over SQL resource group to Node 2 (P2), SQL Server Error log says


2010-06-08 23:29:52.65 Server Authentication mode is WINDOWS-ONLY.
2010-06-08 23:29:53.27 spid5s The NETBIOS name of the local node that is running the server is ‘P2′.
2010-06-09 03:17:18.21 spid5s SQL Trace was stopped due to server shutdown. Trace ID = ‘1′

and if, he again failover the SQL resource group to Node 1 (P1) from Node 2 (P2) , SQL Server Error log says

2010-06-09 03:17:34.22 Server Authentication mode is MIXED.
2010-06-09 03:17:45.54 spid4s The NETBIOS name of the local node that is running the server is ‘P1’

they were unable to find the solution and a case is being logged with Microsoft.

The behavior clearly indicates that there is something amiss between the two nodes of the cluster. they had just started to compare the registry keys between the nodes, when Microsoft engineer suggested that this might be a problem with checkpoint not getting applied.

After some additional research Microsoft team found that Service Pack 3 relies on the checkpoint to be applied to the passive node on the first failover- hence making it a non -issue in most cases. In this case they needed to first determine whether the checkpoint file was even there to be applied to the passive node or not.

so to determine where the checkpoint files for SQL Server instance are located in a cluster. they start by first finding the GUID under HKEY_LOCAL_MACHINE\Cluster\Resources that corresponds to our SQL Server instance i.e. where the name=SQL Server

SQL Server properties changed on cluster resource failover sql server properties changed on failover SQL Server Instance properties changed on resource failover sql server configuration changed after resource failover SQL Server Clustered Instance changing authentication modes on resource failover SQL Server authentication mode is changed on resource failover sp configure output changed after failover authentication mode is changed on resource failover

Now expand the GUID folder and click on the RegSync key under this GUID and you should see seven keys  in it from 00000001 thru 00000007 as shown below.

SQL Server properties changed on cluster resource failover sql server properties changed on failover SQL Server Instance properties changed on resource failover sql server configuration changed after resource failover SQL Server Clustered Instance changing authentication modes on resource failover SQL Server authentication mode is changed on resource failover sp configure output changed after failover authentication mode is changed on resource failover

If you are curious as to where these files actually reside, they are under the Quorum drive in the folder Q:\MSCS\<SQL Server GUID>

SQL Server properties changed on cluster resource failover sql server properties changed on failover SQL Server Instance properties changed on resource failover sql server configuration changed after resource failover SQL Server Clustered Instance changing authentication modes on resource failover SQL Server authentication mode is changed on resource failover sp configure output changed after failover authentication mode is changed on resource failover

In this customer’s case we only had one key :- 00000007. Since the customer was clearly missing the checkpoint files, we had to recreate them by following these steps:-

a. Bring SQL Server online on working node with Mixed mode authentication

b. Take SQL Server resource group offline from within Cluster Administrator

c. Open regedit and add the following keys under HKEY_LOCAL_MACHINE\Cluster\Resources\<SQL Server GUID>\RegSync ( via Right click à New String Value in the right hand pane)

Value name Value Data
———  ———–
00000001   Software\Microsoft\Microsoft SQL Server\MSSQL.1\Replication
00000002   Software\Microsoft\Microsoft SQL Server\MSSQL.1\SQLserverAgent
00000003   Software\Microsoft\Microsoft SQL Server\MSSQL.1\Cluster
00000004   Software\Microsoft\Microsoft SQL Server\MSSQL.1\MSSQLSERVER
00000005   Software\Microsoft\Microsoft SQL Server\MSSQL.1\PROVIDERS
00000006   Software\Microsoft\Microsoft SQL Server\MSSQL.1\SQLServerSCP

d. Bring SQL Server resource group online from within Cluster Administrator

At this time we should see some new .CPT files created under Q:\MSCS\<GUID>\ indicating that the checkpointing is now working as expected. The next step is to attempt failover to the passive node, during which the checkpoint file shall get applied there and then SQL Server shall come up with Mixed Mode authentication on it as well.

They finally failed over the resource group and now both nodes are running on same authentication mode.

I was not supporting this environment, this was a experienced shared by one of my friend and believe me this is a great learning experience for me, specially on cluster. So I thought to put a small post on this incident.

I still suggest, if you ever get a chance to work on similar issues, do not apply this solution blindly, as these involvement of registry keys modifications on your production environment. We are not so much trained to solve these complicated issues. It is always good to take a support from Microsoft as they know inside of SQL Server architecture.

EOF - SQL Server properties changed on cluster resource failover, SQL Server 2012 , 2008 R2 , 2008 , 2005 , 2000

Leave a Reply

Your email address will not be published. Required fields are marked *


*


You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Notify me of followup comments via e-mail. You can also subscribe without commenting.