Testing a DR solution is very critical to ensure DR setup will be available in case there is actual disaster.
There is few point that should be considered while planning/ testing a DR site. These points assumes that your organization has already practicing ITIL practices to this points cover some Incident /Change processes as well.
Things to be covered prior to test DR site
Hardware Check List - Check you hardware ensure all setting are same and synchronized
- DISK (Number of disks, Space individual disk)
- MEMORY(Physical memory and its settings )
- CPU (Number of processors)
- NIC Card Setting (speed)
- Host BUS Adapter Card (HBA Card)
Note: As we do proactive check for the hardware we will be ensuring that the server is ready for DR Program.
Software side Checklist (Operating System and MSSQL SERVER) – Check you Windows / SQL Server settings to ensure all setting / configured items are same and synchronized
- Windows OS Patch
- SQL SERVER Service Pack (MSSQL Version)
- SQL SERVER Service accounts (Domainname\SQLServiceAccount)
- Different driver are installed on DR server (Oracle Listener-as it is used in Link server connection)
- Backup Software client version if used (**will not affect the DR** ) like SQL light etc..
Server should be part of the 3DNS or CNAME
- What 3 DNS Name of the application?
- What are the IP address of the servers?
- What should be mark up and what should be mark down (you can find this information from Enterprise Administrator)?
So I am assuming that you are aware of 3DNS technology, if not here is small description on 3DNS - The 3DNS Controller is a high availability, intelligent load balancing solution for geographically distributed Internet sites and data centers. 3DNS manages and distributes user requests across multiple, redundant server sites – regardless of the platform type or combination and without requiring additional software on the servers. User requests are distributed according to factors such as round trip time, number of active servers, packet loss, etc. and, the 3DNS Controller can be configured to make traffic distribution decisions according to what is most important for an organization’s network. 3DNS enables enterprises and e-commerce companies to easily and reliably extend the reach of their business-critical web sites to additional users – remote workers, vendors, partners, and customers – while saving money by leveraging the investment they already have in existing systems and adding additional server sites as needed."
Simply – it takes a request (for a web site for instance) and looks at the best route to get there. More Information on 3DNS
Checks and Fixes on SQL SERVER Side
- Check the database file placement is proper as per the Principal server.
- Push missing Maintenance Jobs to DR.
- Transfer missing application Jobs.
- Transfer Missing logins and have same server level permissions.
- Check Mirroring State to be in Synchronized mode (before doing failover).
- Transfer missing SSIS packages.
- Transfer Folder of Configuration files which is used by SSIS Package.
- Open SSIS packages and check the connection String for every connection manager.
- Check any file which is getting extracted by SSIS package or jobs is having proper network path so that it will be accessible by all.
- Keep list of jobs which are disabled. We need to keep these disabled after the failover as well.
- Legato backup client is working for backing up database
- Check any Link server dependencies (collect user name and password from the application and test it out well in advance on DR side).
- List out the Replication Dependencies and change the failover partner name
- Any TNS Name is required to get copied and put in the DR server.
- Patch compliance on both the server(Primary and DR Server).
Manual Failover and using GUI (SSMS)
Change the Database Mirroring to Synchronous mode.
Failover using Mirroring GUI
Manual failover of DBMs (SSMS)
-- Run on principal USE master GO ALTER DATABASE Mirroring_Demo_Sumner SET SAFETY FULL GO ALTER DATABASE Mirroring_Demo_Sumner SET PARTNER FAILOVER GO --Run on new principal USE master GO ALTER DATABASE Mirroring_Demo_Sumner SET SAFETY OFF GO
Things to be covered at the time of DR Failover
- If possible plan a Bridge to ensure you have ready support from other various verticals like sysadmin, SAN admin and Network admins, application SME etc in case there is any requirement.
- Check the name of application for which you are doing the failover.
- Get start time for Database failover from application .
- Again make sure all the database on the servers are in “SET PARTNER SAFETY FULL” and are fully in sync.
- Once you get message from application coodinator “can start Database Failover” proceed for Database failover using manual or using the GUI.
- As it is required to capture Date/Time DBA switchover started and Date/Time DBA switchover completed take a screenshot of the screen and share with the application team.(update the time in link also of the DR Program .
- As a DBA you need to capture following information.
- Date/Time DBA switchover started
- Date/Time DBA switchover completed
- Date/Time Backup Client Validated
- Date/Time Monitoring Set up on New Primary Database
- Date/Time DBA switchback started
- Date/Time DBA switchback completed
Things to be consider After Failover
- Check if database Mirroring state in is in Synchronized mode for all databases failed over.
- Ensure with application team that SQL Server Agent service account has required privileges to flat file shares, if used by SSIS packages.
- Fix orphaned users using sp_change_users_login. Listing orphaned users sp_change_users_login ‘report’
- If connectivity error occurs ensure 3 DNS is changed if app uses 3 DNS. Enterprise Adminstrators manages 3DNS.
select hostname,loginame, count(*) from master.dbo.sysprocesses with (nolock) group by hostname, loginame
- Monitor error log
- Ensure SCOM service is running, raise a dummy incident.
- Get the incident # for reference as it a part of DR Check
- Change the User DB Mirroring Mode back to Asynchronous Mode.
- Get a confirmation from application SME for application availability.