sql server alwayson availablity groups

first step is to set up windows server failover clustering (WSFC) on two or more nodes. the setup is fairly straight forward…install the feature on nodes you want participating in the cluster. shared storage is not actually required for alwayson availability groups, but you may want to configure a file share witness (~512MB) if you have an even number of nodes.

my setup looks like:
SQL1 – 192.168.1.10 (WSFC node 1 w/ standalone SQL server instance)
SQL2 – 192.168.1.11 (WSFC node 2 w/ standalone SQL server instance)
SQL-CLUSTER – 192.168.1.20 (the WSFC cluster object)
SQL-AG – 192.168.1.21 (availability group which applications should connect to)

important: The SQL-CLUSTER computer account needs to have the ability to create computer objects in the same OU it resides in so that it can create the AG object. by default this is usually the Computers container, but in my case it was not. if the permissions are not set up correctly then the create availability group listener step will fail.
related info:
https://support.microsoft.com/en-us/kb/2829783
https://ittreasure.wordpress.com/2012/08/14/hello-world/

failover clustering info:
http://technet.microsoft.com/en-us/library/cc770620(v=ws.10).aspx
http://blogs.technet.com/b/askpfeplat/archive/2012/06/27/clustering-what-exactly-is-a-file-share-witness-and-when-should-i-use-one.aspx

second, install .net framework 3.5.1 if you haven’t already (needed by sql server 2012)
save yourself a hassle and make sure to set the SQL server service to run as a domain user during the installation.

if you change it later from a local service account to a domain user you’ll have to fix the SPN by following the ADSIedit instructions here: http://support.microsoft.com/kb/811889

further info:
https://social.technet.microsoft.com/Forums/sqlserver/en-US/2d4d6a0f-de2c-4dc3-b813-46167f3ade0c/the-target-principal-name-is-incorrect-cannot-generate-sspi-context-microsoft-sql-server-error?forum=sqlsecurity
https://cmatskas.com/fixing-error-cannot-generate-sspi-context-after-changing-sql-service-account/

if you receive a “joining database on secondary replica resulted in an error” when running the AG wizard then run this against the master database on each node:
USE master
GO
CREATE LOGIN [mydomain\sql-service] FROM WINDOWS
GO
GRANT CONNECT on ENDPOINT::Hadr_endpoint TO [mydomain\sql-service]
GO
^lines other than GO require a semi-colon at the end. wordpress freaked out about including actual t-sql statements…
source: http://blogs.msdn.com/b/svarukala/archive/2014/03/31/sql-alwayson-failed-to-join-the-database-to-the-availability-group-error-35250.aspx

more resources:
http://blogs.technet.com/b/scvmm/archive/2012/10/24/how-to-configure-sql-2012-alwayson-availability-groups-in-system-center-2012-virtual-machine-manager-service-pack-1.aspx
http://blogs.technet.com/b/canitpro/archive/2013/08/20/step-by-step-creating-a-sql-server-2012-alwayson-availability-group.aspx
http://www.mssqltips.com/sqlservertip/2518/sql-server-alwayson-availability-groups–part-2-availability-groups-setup/
http://sharepointengineer.com/2014/02/17/sql-server-2012-step-by-step-create-the-alwayson-availability-group/
http://stevenpoitras.com/2014/02/configure-sql-db-availability-group/
http://www.sql-server-performance.com/2013/alwayson-clustering-failover/
http://www.davidklee.net/2012/04/10/sql-server-2012-alwayson-availability-group-creation/

update 9/23/15:
some differences in managing availability groups…
i had some ADFS servers complaining about the service broker feature not being enabled. unfortunately, you cannot enable the feature on a database that’s been added to an AG. the only solution i could find was to remove the database from the AG and enable the feature on the now standalone (non-synchronized) database. then finally re-add the database back to the AG.

something else i noticed was that AG requires the full recovery model be set on databases and this of course can result in rather large .ldf files. however, in a normal scenario a backup and shrink job usually reduces the .ldf size. i was not experiencing this with the AG database. i suppose i could have used a similar solution as mentioned above where i would remove the database from the AG and then shrink it and readd it to the AG. however, i didn’t consider that at the time so i ended up moving the affected DB to its own volume by removing it from the AG, detaching/reattaching and adding it back to the AG. i’d like to look into shrinking the .ldf again further when i have time. also new volumes that SQL DBs reside on will need to have full control NTFS permissions given to “nt service\mssqlserver”.

update 10/12/16:
i ran into a few issues recently when attempting to add another secondary replica. the third node mostly added to the WSFC cluster fine except i experienced an issue where the cluster decided to take ownership of the local storage on the nodes. this caused the disks to disappear from windows on each node. it’s possible i missed something within the wizard that would have prevented this or perhaps not? anyways, i resolved the issue by going to the failover cluster manager->storage->disks and removing the disks listed there from the cluster.

next, i was experiencing difficulty with adding the new replica with the AG “add replica” wizard. it took me longer than it should have to determine what the issue was (i wasn’t paying close enough attention to the error messages & logs). the issue was that the new node wasn’t agreeing with the encryption algorithm used by the other nodes.

i resolved this with the following (AES seems to be the standard for sql 2012+):
CREATE ENDPOINT [Hadr_endpoint]   
STATE = STARTED
AS TCP ( LISTENER_PORT = 5022 )
FOR DATABASE_MIRRORING (
AUTHENTICATION = WINDOWS NEGOTIATE,
ENCRYPTION = REQUIRED ALGORITHM AES,
ROLE = ALL)
^the last line should end with a semi-colon

if you’ve already created the endpoint you can use the following to delete it:
DROP ENDPOINT [Hadr_endpoint]                    <— semi-colon
GO

or even better – alter the settings of an existing endpoint:
USE [master]
GO
ALTER ENDPOINT [Hadr_endpoint]
STATE=STARTED
AS TCP (LISTENER_PORT = 5022, LISTENER_IP = ALL)
FOR DATA_MIRRORING (ROLE = ALL, AUTHENTICATION = WINDOWS NEGOTIATE
, ENCRYPTION = REQUIRED ALGORITHM AES)
GO

if recreating the endpoint from scratch you’ll need to repeat the step mentioned above:
USE master              <— semi-colon
GO
GRANT CONNECT on ENDPOINT::Hadr_endpoint TO [mydomain\sql-service]             <— semi-colon
GO

source:
https://sqlperformance.com/2013/04/system-configuration/troubleshooting-alwayson

once the encryption algorithm problem was resolved i was much closer. the next issue i encountered

This entry was written by resinblade , posted on Monday November 17 2014at 08:11 pm , filed under IT . Bookmark the permalink . Post a comment below or leave a trackback: Trackback URL.

Comments are closed.