SQL Server DR with AlwaysOn AVG between different Azure data centers
This is the first of a series of articles where you can find instructions to reproduce the lab environment I used for my demo @ Technical Cloud Day 2017 conference.
Other parts of this series are available here:
Environment implementation |
Securing lab environment |
Create an automation runbook to force a SQL Server AVG failover |
In my session I talked about SQL Server disaster recovery technologies available for hybrid scenarios, with main data center on premises and DR data center on Azure, focusing on AlwaysOn Availability Groups.
Since I was concerned about possible connectivity issues for an hybrid demo environment, I preferred to implement a cloud only disaster recovery scenario, with an AlwaysOn AVG cluster which span over two Azure regions; apart from implementation steps required to deploy a cloud only environment, the AlwaysOn experience have no relevant differences compared to the one you can observe on an hybrid infrastructure.
In order to make a demo environment easy to be reproduced by attendees, I decided to write a JSON template; I extended an existing one I found on GITHub in order to be able to deploy:
- One vnet on main data center (West Europe), which is connected via a vnet-to-vnet VPN to the disaster recovery data center (East US) vnet
- An Active directory infrastructure composed by two domain controllers, one for each data center
- An AlwaysOn AVG infrastructure composed by two sync replicas on main data center and one async replica on dr data center, with a listener which is able to move between regions
- A VM on main datacenter which expose a file share witness for the cluster described above.
I was thinking about a modular JSON,
able to generate a different kind of infrastructures based on customer needs, in order to make it easily reusable; I started to face some difficulties due to template limitations and PowerShell DSC debugging, so I ended up creating a template which provision all the resources and a PowerShell post-deployment script which finalize the environment configuration.
Since my main intent was to obtain a demo laboratory, I took some shortcuts in the deployment which are not so adherents with normally applied best practices for Azure infrastructures; i.e. I deployed all the resource in a single resource group, and used the same storage account for all the SQL Server machines in the same site. Please take this disclaimer in mind if you’d like to reuse my work for your prod environment.
You can find the template on my GitHub repo.
To deploy it you can download the package on your machine and invoke it via powershell, with the following statements:
Login-AzureRMAccount $resourceGroupName = <Assign a name here> New-AzureRmResourceGroupDeployment -ResourceGroupName $resourceGroupName -TemplateParameterFile .\azuredeploy.parameters.json -TemplateFile .\azuredeploy.json
As a simpler alternative, you can directly deploy it clicking on the following button:
Template will ask you for a lot of parameters, described in the following table:
Parameter | Description | Default Value |
---|---|---|
templatesBaseUrl | Repository URL where nested template reside. Normally you want to change it if you fork my original repo. | https://raw.githubusercontent.com/OmegaMadLab/SQLScripts/master/tcdit2017/sqlvm-alwayson-cluster/nested |
scriptsBaseUrl | Repository URL where DSC Extension reside. Normally you want to change it if you fork my original repo. | https://raw.githubusercontent.com/OmegaMadLab/SQLScripts/master/tcdit2017/sqlvm-alwayson-cluster/scripts |
mainDCLocation | Main data center location | West Europe |
drDCLocation | DR data center location | East US |
environmentPrefix | Prefix for all the resources created in the resource group | tcdit |
mainResSuffix | Suffix for all the resources created in the main data center | main |
drResSuffix | Suffix for all the resources created in the dr data center | dr |
sqlStorageAccountType | Type for SQL Server VM storage accounts | Standard_LRS |
adStorageAccountType | Type for AD VM storage accounts | Standard_LRS |
sqlStorageAccountName | Name for SQL Server VM storage accounts; it will be completed with prefix and appropriate suffix | sqlsa |
adStorageAccountName | Name for AD VM storage accounts; it will be completed with prefix and appropriate suffix | adsa |
fswStorageAccountName | Name for File Share Witness VM storage accounts; it will be completed with prefix and appropriate suffix | fswsa |
diagStorageAccountName | Name for diagnostic storage accounts; it will be completed with prefix and appropriate suffix | diagsa |
mainVirtualNetworkAddressRange | Address range for main data center vnet | 10.0.0.0/16 |
mainStaticSubnet | Subnet dedicated to AD on main data center vnet | 10.0.0.0/24 |
mainSqlSubnet | Subnet dedicated to SQL Server on main data center vnet | 10.0.1.0/24 |
mainGWSubnet | Subnet dedicated to Vnet Gateway on main data center vnet | 10.0.2.0/28 |
drVirtualNetworkAddressRange | Address range for DR data center vnet | 172.16.0.0/16 |
drStaticSubnet | Subnet dedicated to AD on DR data center vnet | 172.16.0.0/24 |
drSqlSubnet | Subnet dedicated to SQL Server on DR data center vnet | 172.16.1.0/24 |
drGWSubnet | Subnet dedicated to Vnet Gateway on DR data center vnet | 172.16.2.0/28 |
mainSqlLBIPAddress | Internal load balancer IP create on main data center | 10.0.1.9 |
drSqlLBIPAddress | Internal load balancer IP create on DR data center | 172.16.1.9 |
vpnSharedKey | Shared key for vnet-to-vnet VPN | |
adPDCNICIPAddress | IP for domain controller on main data center | 10.0.0.4 |
adBDCNICIPAddress | IP for domain controller on DR data center | 172.16.0.5 |
domainName | Domain name | tcdit17demo.local |
adminUsername | Admin username - it will be used also as Domain Admin | |
adminPassword | Password for Admin | |
adVMSize | VM Size for domain controllers - choose VM family accordingly to the type of storage account you choose | Standard_D1_v2 |
witnessVMSize | VM Size for File Share Witness server - choose VM family accordingly to the type of storage account you choose for SQL Server SA | Standard_D1_v2 |
sqlVMSize | VM Size for SQL Server - choose VM family accordingly to the type of storage account you choose | Standard_D2_v2 |
sqlServerVersion | SQL Server version | SQL2016SP1-WS2016 |
sqlServerEdition | SQL Server Edition | SQLDEV |
workloadType | Type of workload, as asked on standard SQL VM templates. It defines optimized startup parameters for the instance | OLTP |
sqlServerServiceAccountUserName | SQL Server service account name; it will also be sysadmin of the instance | |
sqlServerServiceAccountPassword | Password for SQL Server service account name | |
autoPatchingDay | Day for SQL autopatching, as asked on standard SQL VM templates | Sunday |
autoPatchingStartHour | Start hour for SQL autopatching, as asked on standard SQL VM templates | 2 |
For my demo environment, I chose to define standard storage VMs in order to contain costs; all of them were based on Windows Server 2016 images, while the SQL Server version I opted for was SQL Server 2016 SP1 Developer Edition.
Once started, the deployment should be completed in less than two hours, and it will deploy all the infrastructure components and the VMs with software prerequisites onboard.
You can gain access to the environment connecting via RDP to the public IPs assigned to domain controller, and then jump on the other servers from there. In another articles of this series I talk about how to add a minimum of security restrinctions on this kind of access.
To fully configure the environment, you must launch the post deployment configuration script you can find on the GitHub repo.
So, connect on one of the DCs, jump on the first SQL Server machine in the main data center (in my example tcdit-sql-main0) and go on with the following steps.
This PowerShell script will complete the following tasks:
- Configure AD topology
- Configure Windows Server Failover Cluster
- Configure SQL Server AlwaysOn Availability Group
- Configure Availability Group Listener.
Let’s see the macro steps above one by one.
First of all, you have to compile variables section, using the same parameter you already specified for JSON template
#========================================================================== # Variables #========================================================================== $mainDatacenterSubnet = '10.0.0.0/16' $drDatacenterSubnet = '172.16.0.0/16' $mainADDC = 'tcdit-addc-main' $drADDC = 'tcdit-addc-dr' $ilbIpMainSubnet = '10.0.1.9' $ilbIpDrSubnet = '172.16.1.9' $primaryReplicaName = 'tcdit-sql-main0' $secondaryReplicaName = 'tcdit-sql-main1' $drReplicaName = 'tcdit-sql-dr0' $clusterName = 'sqlhademo' $fswPath = '\\tcdit-fsw-main\cluster-fsw' $mainClusterNetName = 'Main datacenter network' $drClusterNetName = 'DR datacenter network' $mainClusterIpName = 'Main datacenter cluster IP' $drClusterIpName = 'DR datacenter cluster IP' $sqlNodes = @() $sqlNodes += $primaryReplicaName $sqlNodes += $secondaryReplicaName $sqlNodes += $drReplicaName $sqlAgent = 'SQLSERVERAGENT' $sqlSrv = 'MSSQLSERVER' $backupShare = "F:\Backup" $sqlSvcAcctn = "TCDIT17DEMO\sqlAdmin" $db = 'TCDIT17DEMO' $ag = 'AlwaysOn-AG' $agListener = 'aglistener'
Active Directory configuration
This section configure AD Topology, with different sites, subnets and related domain controller. Since we’re using the environment for demo purposes, I enabled the Change Notification between site in order to have intersite replication which act at the same speed of intrasite replication; I also lowered the schedule which govern the reloading of AD integrated zones on DNS; with this configuration the remote site should be able to receive DNS record changes in less then a minute.
#========================================================================== # Create AD Topology #========================================================================== $domainFqdn = (get-addomain).DNSRoot #Rename default-first-site-name pushd cd 'ad:CN=Sites,CN=Configuration*' ren cn=def* cn=MainDatacenter popd #Add subnet to main AD site $mainADSite = Get-ADReplicationSite New-ADReplicationSubnet -Name $mainDatacenterSubnet -Site $mainADSite -Server $mainADDC #Add dr AD site with appropriate subnet New-ADReplicationSite -Name 'DrDatacenter' -Server $mainADDC $drADSite = Get-ADReplicationSite -Identity 'DrDatacenter' -Server $mainADDC New-ADReplicationSubnet -Name $drDatacenterSubnet -Site $drADSite -Server $mainADDC #Create new site link and remove the default one New-ADReplicationSiteLink -Name Main2DR -Cost 100 -InterSiteTransportProtocol IP -ReplicationFrequencyInMinutes 15 -SitesIncluded $mainADSite,$drADSite Get-ADReplicationSiteLink -Identity DEFAULTIPSITELINK | Remove-ADReplicationSiteLink -Confirm:$false #Move DR ADDC to appropriate site Move-ADDirectoryServer -Identity $drADDC -Site $drADSite #Enable intersite Change Notification - Be carefull on production environment! $siteLink = Get-ADReplicationSiteLink -Identity Main2DR Get-adobject -Identity $siteLink.DistinguishedName -properties options | set-adobject –replace @{options=$($_.options –bor 1)} #Reduce DNS -> AD polling interval - Be carefull on production environment! invoke-command -Computername $mainADDC, $drADDC -scriptBlock{set-DnsServerDsSetting -PollingInterval 30|Restart-Service -Force dns}
Windows Server Failover Cluster configuration
This section generate a three node multi subnet failover cluster.
To avoid conflicts with Azure DHCP I extended the actual best practices for single subnet WSFC on Azure, assigning to the cluster network name a couple of Link Local IP Addresses, one for each subnet.
#========================================================================== # Create multisubnet WSFC #========================================================================== #Create multisubnet WSFC New-Cluster -Name $clusterName -Node $sqlNodes $cluster = Get-Cluster $clusterGroup = $cluster | Get-ClusterGroup $clusterNameRes = $clusterGroup | Get-ClusterResource "Cluster Name" $clusterNameRes | Stop-ClusterResource | Out-Null #Rename cluster networks and change IPs to Link Local IP addresses in order to avoid errors due to Azure DHCP, and bring cluster network name online $clusterNetworks = Get-ClusterNetwork ForEach($clusterNetwork in $clusterNetworks) { if($clusterNetwork.Address.contains($mainDatacenterSubnet.Substring(0,$mainDatacenterSubnet.IndexOf('.')))){ $clusterNetwork.Name = $mainClusterNetName } else { $clusterNetwork.Name = $drClusterNetName } } $clusterIpAddrRes = $clusterGroup | Get-ClusterResource | Where-Object { $_.ResourceType.Name -eq "IP Address"} ForEach($clusterIpAddr in $clusterIpAddrRes) { $clusterIpAddr | Stop-ClusterResource | Out-Null if(($clusterIpAddr | Get-ClusterParameter -Name 'Network').Value -eq $mainClusterNetName) { $clusterIpAddr.Name = $mainClusterIpName $IpAddr = '169.254.1.10' } else { $clusterIpAddr.Name = $drClusterIpName $IpAddr = '169.254.10.10' } $clusterIpAddr | Set-ClusterParameter -Multiple @{ "Address" = "$ipAddr" "SubnetMask" = "255.255.0.0" "EnableDhcp" = 0 "OverrideAddressMatch" = 1 } -ErrorAction Stop } Start-ClusterResource $clusterNameRes #Change cluster quorum configuration to node and file share majority Set-ClusterQuorum -FileShareWitness $fswPath #Remove vote from dr node to avoid unwanted failover due to remote site connectivity issues (Get-ClusterNode -Name $drReplicaName).NodeWeight = 0
SQL AlwaysOn Availability Group configuration
This section creates an availability group with a database inside, configuring also replicas properties (type of replication choosen, read only routing list, etc.).
#========================================================================== # Configure SQL AlwaysOn AG #========================================================================== #Enable TCP protocol and HADR feature on all instances $smo = 'Microsoft.SqlServer.Management.Smo.' $timeout = New-Object System.TimeSpan -ArgumentList 0, 0, 30 Add-Type -AssemblyName System.ServiceProcess ForEach ($sqlNode in $sqlNodes) { $wmi = new-object ($smo + 'Wmi.ManagedComputer')$sqlNode $uri = "ManagedComputer[@Name='" + $sqlNode + "']/ ServerInstance[@Name='MSSQLSERVER']/ServerProtocol[@Name='Tcp']" $Tcp = $wmi.GetSmoObject($uri) $Tcp.IsEnabled = $true $Tcp.Alter() Enable-SqlAlwaysOn ` -Path SQLSERVER:\SQL\$sqlNode\Default ` -NoServiceRestart $sqlSvc = get-service -ComputerName $sqlNode -Name $sqlSrv if($sqlSvc.Status -eq "Running") { $sqlSvc.Stop() } $sqlSvc.WaitForStatus("Stopped", $timeout) $sqlSvc.Start() $sqlSvc.WaitForStatus("Running", $timeout) } #Create backup share New-Item $backupShare -ItemType directory net share backup=$backupShare "/grant:$sqlSvcAcctn,FULL" icacls.exe "$backupShare" /grant:r ("$sqlSvcAcctn" + ":(OI)(CI)F") #Create database on every instance Invoke-SqlCmd -Query "CREATE database $db" Backup-SqlDatabase -Database $db -BackupFile "$backupShare\$db.bak" -ServerInstance $primaryReplicaName Backup-SqlDatabase -Database $db -BackupFile "$backupShare\$db.log" -ServerInstance $primaryReplicaName -BackupAction Log Restore-SqlDatabase -Database $db -BackupFile "\\$primaryReplicaName\backup\$db.bak" -ServerInstance $secondaryReplicaName -NoRecovery Restore-SqlDatabase -Database $db -BackupFile "\\$primaryReplicaName\backup\$db.log" -ServerInstance $secondaryReplicaName -RestoreAction Log -NoRecovery Restore-SqlDatabase -Database $db -BackupFile "\\$primaryReplicaName\backup\$db.bak" -ServerInstance $drReplicaName -NoRecovery Restore-SqlDatabase -Database $db -BackupFile "\\$primaryReplicaName\backup\$db.log" -ServerInstance $drReplicaName -RestoreAction Log -NoRecovery #Mirroring endpoint creation ForEach ($sqlNode in $sqlNodes) { $endpoint = New-SqlHadrEndpoint SqlHaEndpoint ` -Port 5022 ` -Path "SQLSERVER:\SQL\$sqlNode\Default" Set-SqlHadrEndpoint ` -InputObject $endpoint ` -State "Started" Invoke-SqlCmd -Query "CREATE LOGIN [$sqlSvcAcctn] FROM WINDOWS" -ServerInstance $sqlNode Invoke-SqlCmd -Query "GRANT CONNECT ON ENDPOINT::[SqlHaEndpoint] TO [$sqlSvcAcctn]" -ServerInstance $sqlNode } #Availability Replicas creation $primaryReplica = New-SqlAvailabilityReplica ` -Name $primaryReplicaName ` -EndpointURL "TCP://$primaryReplicaName.$domainFqdn`:5022" ` -AvailabilityMode "SynchronousCommit" ` -FailoverMode "Automatic" ` -Version 11 ` -AsTemplate $secondaryReplica = New-SqlAvailabilityReplica ` -Name $secondaryReplicaName ` -EndpointURL "TCP://$secondaryReplicaName.$domainFqdn`:5022" ` -AvailabilityMode "SynchronousCommit" ` -FailoverMode "Automatic" ` -Version 11 ` -AsTemplate $drReplica = New-SqlAvailabilityReplica ` -Name $drReplicaName ` -EndpointURL "TCP://$drReplicaName.$domainFqdn`:5022" ` -AvailabilityMode "AsynchronousCommit" ` -FailoverMode "Manual" ` -Version 11 ` -AsTemplate #AVG Creation New-SqlAvailabilityGroup ` -Name $ag ` -Path "SQLSERVER:\SQL\$primaryReplicaName\Default" ` -AvailabilityReplica @($primaryReplica,$secondaryReplica,$drReplica) ` -Database $db Join-SqlAvailabilityGroup ` -Path "SQLSERVER:\SQL\$secondaryReplicaName\Default" ` -Name $ag Add-SqlAvailabilityDatabase ` -Path "SQLSERVER:\SQL\$secondaryReplicaName\Default\AvailabilityGroups\$ag" ` -Database $db Join-SqlAvailabilityGroup ` -Path "SQLSERVER:\SQL\$drReplicaName\Default" ` -Name $ag Add-SqlAvailabilityDatabase ` -Path "SQLSERVER:\SQL\$drReplicaName\Default\AvailabilityGroups\$ag" ` -Database $db #ReadOnly Replicas and routing list $primaryReplica = Get-Item "SQLSERVER:\SQL\$primaryReplicaName\Default\availabilityGroups\$ag\availabilityReplicas\$primaryReplicaName" $secondaryReplica = Get-Item "SQLSERVER:\SQL\$primaryReplicaName\Default\availabilityGroups\$ag\availabilityReplicas\$secondaryReplicaName" $drReplica = Get-Item "SQLSERVER:\SQL\$primaryReplicaName\Default\availabilityGroups\$ag\availabilityReplicas\$drReplicaName" Set-SqlAvailabilityReplica -ConnectionModeInPrimaryRole "AllowAllConnections" -InputObject $primaryReplica Set-SqlAvailabilityReplica -ConnectionModeInSecondaryRole "AllowReadIntentConnectionsOnly" -InputObject $primaryReplica Set-SqlAvailabilityReplica -ReadOnlyRoutingConnectionUrl "TCP://$primaryReplicaName.$domainFqdn`:1433" -InputObject $primaryReplica Set-SqlAvailabilityReplica -ConnectionModeInPrimaryRole "AllowAllConnections" -InputObject $secondaryReplica Set-SqlAvailabilityReplica -ConnectionModeInSecondaryRole "AllowReadIntentConnectionsOnly" -InputObject $secondaryReplica Set-SqlAvailabilityReplica -ReadOnlyRoutingConnectionUrl "TCP://$secondaryReplicaName.$domainFqdn`:1433" -InputObject $secondaryReplica Set-SqlAvailabilityReplica -ConnectionModeInPrimaryRole "AllowAllConnections" -InputObject $drReplica Set-SqlAvailabilityReplica -ConnectionModeInSecondaryRole "AllowReadIntentConnectionsOnly" -InputObject $drReplica Set-SqlAvailabilityReplica -ReadOnlyRoutingConnectionUrl "TCP://$drReplicaName.$domainFqdn`:1433" -InputObject $drReplica Set-SqlAvailabilityReplica -ReadOnlyRoutingList "$secondaryReplicaName","$drReplicaName","$primaryReplicaName" -InputObject $primaryReplica Set-SqlAvailabilityReplica -ReadOnlyRoutingList "$primaryReplicaName","$drReplicaName","$secondaryReplicaName" -InputObject $secondaryReplica Set-SqlAvailabilityReplica -ReadOnlyRoutingList "$primaryReplicaName","$secondaryReplicaName","$drReplicaName" -InputObject $drReplica
AlwaysOn Listener configuration
This section creates the availability group listener, and configure its properties (i.e. RegisterAllProvidersIP, HostRecordTTL).
#========================================================================== # Configure SQL AlwaysOn AG Listener #========================================================================== #Client Access Point creation $agClusterGroup = Get-ClusterGroup -Name $ag #1. Add a network name named as the AG Listener $cap = Add-ClusterResource -Name $agListener -ResourceType "Network Name" -Group $agClusterGroup #2. Set paramters $cap | Set-ClusterParameter -Multiple @{ "DnsName" = "$agListener" "HostRecordTTL" = 30 "RegisterAllProvidersIP" = 1 } -ErrorAction Stop #3. Create two IP Addresses, assigning them the Azure ILB ip created on each vnet $capIpMain = Add-ClusterResource -Name "ILB IP main datacenter" -ResourceType "IP Address" -Group $agClusterGroup $capIpMain | Set-ClusterParameter -Multiple @{ "Address" = "$ilbIpMainSubnet" "SubnetMask" = "255.255.255.255" "Network" = "$mainClusterNetName" "EnableDHCP" = 0 "ProbePort" = "59999" } -ErrorAction Stop $capIpDr = Add-ClusterResource -Name "ILB IP dr datacenter" -ResourceType "IP Address" -Group $agClusterGroup $capIpDr | Set-ClusterParameter -Multiple @{ "Address" = "$ilbIpDrSubnet" "SubnetMask" = "255.255.255.255" "Network" = "$drClusterNetName" "EnableDHCP" = 0 "ProbePort" = "59999" } -ErrorAction Stop #4. Add dependencies on IPs to Network Name Set-ClusterResourceDependency -InputObject $cap -Dependency "[$capIpMain] or [$capIpDr]" #5. Add AG resource dependency on CAP network name $agResource = Get-ClusterResource -Name $ag $agResource | Stop-ClusterResource Add-ClusterResourceDependency -InputObject $agResource -Resource $cap $cap|Start-ClusterResource $agResource | Start-ClusterResource #6. Assign port 1433 to SQL AG Listener Set-SqlAvailabilityGroupListener -path "SQLSERVER:\SQL\$primaryReplicaName\Default\availabilityGroups\$ag\availabilityGroupListeners\$agListener" -Port 1433
That’s all for now; in a couple of hours you should be able to replicate my environment.
During next days I’ll try to review this article in order to make it more readable, and I’ll insert new section for failover runbook creation and so on.
0 Comments