SQL Server DR with AlwaysOn AVG between different Azure data centers

Published by Marco Obinu on

This is the first of a series of articles where you can find instructions to reproduce the lab environment I used for my demo @ Technical Cloud Day 2017 conference.

Other parts of this series are available here:

Environment implementation
Securing lab environment
Create an automation runbook to force a SQL Server AVG failover

In my session I talked about SQL Server disaster recovery technologies available for hybrid scenarios, with main data center on premises and DR data center on Azure, focusing on AlwaysOn Availability Groups.
Since I was concerned about possible connectivity issues for an hybrid demo environment, I preferred to implement a cloud only disaster recovery scenario, with an AlwaysOn AVG cluster which span over two Azure regions; apart from implementation steps required to deploy a cloud only environment, the AlwaysOn experience have no relevant differences compared to the one you can observe on an hybrid infrastructure.

In order to make a demo environment easy to be reproduced by attendees, I decided to write a JSON template; I extended an existing one I found on GITHub in order to be able to deploy:

  • One vnet on main data center (West Europe), which is connected via a vnet-to-vnet VPN to the disaster recovery data center (East US) vnet
  • An Active directory infrastructure composed by two domain controllers, one for each data center
  • An AlwaysOn AVG infrastructure composed by two sync replicas on main data center and one async replica on dr data center, with a listener which is able to move between regions
  • A VM on main datacenter which expose a file share witness for the cluster described above.

Lab environment schema

 

I was thinking about a modular JSON,

able to generate a different kind of infrastructures based on customer needs, in order to make it easily reusable; I started to face some difficulties due to template limitations and PowerShell DSC debugging, so I ended up creating a template which provision all the resources and a PowerShell post-deployment script which finalize the environment configuration.

Since my main intent was to obtain a demo laboratory, I took some shortcuts in the deployment which are not so adherents with normally applied best practices for Azure infrastructures; i.e. I deployed all the resource in a single resource group, and used the same storage account for all the SQL Server machines in the same site. Please take this disclaimer in mind if you’d like to reuse my work for your prod environment.

You can find the template on my GitHub repo.
To deploy it you can download the package on your machine and invoke it via powershell, with the following statements:

Login-AzureRMAccount

$resourceGroupName = <Assign a name here>

New-AzureRmResourceGroupDeployment -ResourceGroupName $resourceGroupName -TemplateParameterFile .\azuredeploy.parameters.json -TemplateFile .\azuredeploy.json

As a simpler alternative, you can directly deploy it clicking on the following button:

Template will ask you for a lot of parameters, described in the following table:

ParameterDescriptionDefault Value
templatesBaseUrlRepository URL where nested template reside. Normally you want to change it if you fork my original repo.https://raw.githubusercontent.com/OmegaMadLab/SQLScripts/master/tcdit2017/sqlvm-alwayson-cluster/nested
scriptsBaseUrlRepository URL where DSC Extension reside. Normally you want to change it if you fork my original repo.https://raw.githubusercontent.com/OmegaMadLab/SQLScripts/master/tcdit2017/sqlvm-alwayson-cluster/scripts
mainDCLocationMain data center locationWest Europe
drDCLocationDR data center locationEast US
environmentPrefixPrefix for all the resources created in the resource group tcdit
mainResSuffixSuffix for all the resources created in the main data centermain
drResSuffixSuffix for all the resources created in the dr data centerdr
sqlStorageAccountType Type for SQL Server VM storage accountsStandard_LRS
adStorageAccountType Type for AD VM storage accountsStandard_LRS
sqlStorageAccountName Name for SQL Server VM storage accounts; it will be completed with prefix and appropriate suffixsqlsa
adStorageAccountName Name for AD VM storage accounts; it will be completed with prefix and appropriate suffixadsa
fswStorageAccountName Name for File Share Witness VM storage accounts; it will be completed with prefix and appropriate suffixfswsa
diagStorageAccountName Name for diagnostic storage accounts; it will be completed with prefix and appropriate suffixdiagsa
mainVirtualNetworkAddressRange Address range for main data center vnet10.0.0.0/16
mainStaticSubnet Subnet dedicated to AD on main data center vnet10.0.0.0/24
mainSqlSubnet Subnet dedicated to SQL Server on main data center vnet10.0.1.0/24
mainGWSubnet Subnet dedicated to Vnet Gateway on main data center vnet10.0.2.0/28
drVirtualNetworkAddressRange Address range for DR data center vnet172.16.0.0/16
drStaticSubnet Subnet dedicated to AD on DR data center vnet172.16.0.0/24
drSqlSubnet Subnet dedicated to SQL Server on DR data center vnet172.16.1.0/24
drGWSubnet Subnet dedicated to Vnet Gateway on DR data center vnet172.16.2.0/28
mainSqlLBIPAddressInternal load balancer IP create on main data center10.0.1.9
drSqlLBIPAddressInternal load balancer IP create on DR data center172.16.1.9
vpnSharedKeyShared key for vnet-to-vnet VPN
adPDCNICIPAddressIP for domain controller on main data center10.0.0.4
adBDCNICIPAddressIP for domain controller on DR data center172.16.0.5
domainNameDomain nametcdit17demo.local
adminUsername Admin username - it will be used also as Domain Admin
adminPassword Password for Admin
adVMSizeVM Size for domain controllers - choose VM family accordingly to the type of storage account you chooseStandard_D1_v2
witnessVMSizeVM Size for File Share Witness server - choose VM family accordingly to the type of storage account you choose for SQL Server SAStandard_D1_v2
sqlVMSizeVM Size for SQL Server - choose VM family accordingly to the type of storage account you chooseStandard_D2_v2
sqlServerVersionSQL Server versionSQL2016SP1-WS2016
sqlServerEditionSQL Server EditionSQLDEV
workloadTypeType of workload, as asked on standard SQL VM templates. It defines optimized startup parameters for the instanceOLTP
sqlServerServiceAccountUserNameSQL Server service account name; it will also be sysadmin of the instance
sqlServerServiceAccountPasswordPassword for SQL Server service account name
autoPatchingDayDay for SQL autopatching, as asked on standard SQL VM templatesSunday
autoPatchingStartHourStart hour for SQL autopatching, as asked on standard SQL VM templates2

For my demo environment, I chose to define standard storage VMs in order to contain costs; all of them were based on Windows Server 2016 images, while the SQL Server version I opted for was SQL Server 2016 SP1 Developer Edition.

Once started, the deployment should be completed in less than two hours, and it will deploy all the infrastructure components and the VMs with software prerequisites onboard.

You can gain access to the environment connecting via RDP to the public IPs assigned to domain controller, and then jump on the other servers from there. In another articles of this series I talk about how to add a minimum of security restrinctions on this kind of access.

To fully configure the environment, you must launch the post deployment configuration script you can find on the GitHub repo.
So, connect on one of the DCs, jump on the first SQL Server machine in the main data center (in my example tcdit-sql-main0) and go on with the following steps.

This PowerShell script will complete the following tasks:

  • Configure AD topology
  • Configure Windows Server Failover Cluster
  • Configure SQL Server AlwaysOn Availability Group
  • Configure Availability Group Listener.

Let’s see the macro steps above one by one.

First of all, you have to compile variables section, using the same parameter you already specified for JSON template

#==========================================================================
# Variables
#==========================================================================

$mainDatacenterSubnet = '10.0.0.0/16'
$drDatacenterSubnet = '172.16.0.0/16'

$mainADDC = 'tcdit-addc-main'
$drADDC = 'tcdit-addc-dr'

$ilbIpMainSubnet = '10.0.1.9'
$ilbIpDrSubnet = '172.16.1.9'

$primaryReplicaName = 'tcdit-sql-main0'
$secondaryReplicaName = 'tcdit-sql-main1'
$drReplicaName = 'tcdit-sql-dr0'

$clusterName = 'sqlhademo'
$fswPath = '\\tcdit-fsw-main\cluster-fsw'
$mainClusterNetName = 'Main datacenter network'
$drClusterNetName = 'DR datacenter network'
$mainClusterIpName = 'Main datacenter cluster IP'
$drClusterIpName = 'DR datacenter cluster IP'

$sqlNodes = @()
$sqlNodes += $primaryReplicaName
$sqlNodes += $secondaryReplicaName
$sqlNodes += $drReplicaName

$sqlAgent = 'SQLSERVERAGENT'
$sqlSrv = 'MSSQLSERVER'
$backupShare = "F:\Backup"
$sqlSvcAcctn = "TCDIT17DEMO\sqlAdmin"
$db = 'TCDIT17DEMO'
$ag = 'AlwaysOn-AG'
$agListener = 'aglistener'

Active Directory configuration

This section configure AD Topology, with different sites, subnets and related domain controller. Since we’re using the environment for demo purposes, I enabled the Change Notification between site in order to have intersite replication which act at the same speed of intrasite replication; I also lowered the schedule which govern the reloading of AD integrated zones on DNS; with this configuration the remote site should be able to receive DNS record changes in less then a minute.

 

#==========================================================================
#        Create AD Topology
#==========================================================================

$domainFqdn = (get-addomain).DNSRoot

#Rename default-first-site-name
pushd
cd 'ad:CN=Sites,CN=Configuration*'
ren cn=def* cn=MainDatacenter
popd

#Add subnet to main AD site
$mainADSite = Get-ADReplicationSite
New-ADReplicationSubnet -Name $mainDatacenterSubnet -Site $mainADSite -Server $mainADDC

#Add dr AD site with appropriate subnet
New-ADReplicationSite -Name 'DrDatacenter' -Server $mainADDC
$drADSite = Get-ADReplicationSite -Identity 'DrDatacenter' -Server $mainADDC
New-ADReplicationSubnet -Name $drDatacenterSubnet -Site $drADSite -Server $mainADDC

#Create new site link and remove the default one
New-ADReplicationSiteLink -Name Main2DR -Cost 100 -InterSiteTransportProtocol IP -ReplicationFrequencyInMinutes 15 -SitesIncluded $mainADSite,$drADSite
Get-ADReplicationSiteLink -Identity DEFAULTIPSITELINK | Remove-ADReplicationSiteLink -Confirm:$false

#Move DR ADDC to appropriate site
Move-ADDirectoryServer -Identity $drADDC -Site $drADSite

#Enable intersite Change Notification - Be carefull on production environment!
$siteLink = Get-ADReplicationSiteLink -Identity Main2DR
Get-adobject -Identity $siteLink.DistinguishedName -properties options | set-adobject –replace @{options=$($_.options –bor 1)} 

#Reduce DNS -> AD polling interval - Be carefull on production environment!
invoke-command -Computername $mainADDC, $drADDC -scriptBlock{set-DnsServerDsSetting -PollingInterval 30|Restart-Service -Force dns}

Windows Server Failover Cluster configuration

This section generate a three node multi subnet failover cluster.
To avoid conflicts with Azure DHCP I extended the actual best practices for single subnet WSFC on Azure, assigning to the cluster network name a couple of Link Local IP Addresses, one for each subnet.

#==========================================================================
#        Create multisubnet WSFC
#==========================================================================

#Create multisubnet WSFC
New-Cluster -Name $clusterName -Node $sqlNodes
$cluster = Get-Cluster

$clusterGroup = $cluster | Get-ClusterGroup
$clusterNameRes = $clusterGroup | Get-ClusterResource "Cluster Name"
$clusterNameRes | Stop-ClusterResource | Out-Null

#Rename cluster networks and change IPs to Link Local IP addresses in order to avoid errors due to Azure DHCP, and bring cluster network name online
$clusterNetworks = Get-ClusterNetwork

ForEach($clusterNetwork in $clusterNetworks) {
   
    if($clusterNetwork.Address.contains($mainDatacenterSubnet.Substring(0,$mainDatacenterSubnet.IndexOf('.')))){
        $clusterNetwork.Name = $mainClusterNetName
    }
    else {
        $clusterNetwork.Name = $drClusterNetName
    }
}

$clusterIpAddrRes = $clusterGroup | Get-ClusterResource | Where-Object { $_.ResourceType.Name -eq "IP Address"}          

ForEach($clusterIpAddr in $clusterIpAddrRes) {
    
    $clusterIpAddr | Stop-ClusterResource | Out-Null
    
    if(($clusterIpAddr | Get-ClusterParameter -Name 'Network').Value -eq $mainClusterNetName) {
        $clusterIpAddr.Name = $mainClusterIpName
        $IpAddr = '169.254.1.10'
    }
    else {
        $clusterIpAddr.Name = $drClusterIpName
        $IpAddr = '169.254.10.10'
    }

    $clusterIpAddr | Set-ClusterParameter -Multiple @{
        "Address" = "$ipAddr"
        "SubnetMask" = "255.255.0.0"
        "EnableDhcp" = 0
        "OverrideAddressMatch" = 1
    } -ErrorAction Stop
}

Start-ClusterResource $clusterNameRes

#Change cluster quorum configuration to node and file share majority
Set-ClusterQuorum -FileShareWitness $fswPath

#Remove vote from dr node to avoid unwanted failover due to remote site connectivity issues
(Get-ClusterNode -Name $drReplicaName).NodeWeight = 0

SQL AlwaysOn Availability Group configuration

This section creates an availability group with a database inside, configuring also replicas properties (type of replication choosen, read only routing list, etc.).

#==========================================================================
#        Configure SQL AlwaysOn AG
#==========================================================================

#Enable TCP protocol and HADR feature on all instances
$smo = 'Microsoft.SqlServer.Management.Smo.'
$timeout = New-Object System.TimeSpan -ArgumentList 0, 0, 30
Add-Type -AssemblyName System.ServiceProcess

ForEach ($sqlNode in $sqlNodes) {
    $wmi = new-object ($smo + 'Wmi.ManagedComputer')$sqlNode
    $uri = "ManagedComputer[@Name='" + $sqlNode + "']/ ServerInstance[@Name='MSSQLSERVER']/ServerProtocol[@Name='Tcp']"
    $Tcp = $wmi.GetSmoObject($uri)
    $Tcp.IsEnabled = $true
    $Tcp.Alter()

    Enable-SqlAlwaysOn `
     -Path SQLSERVER:\SQL\$sqlNode\Default `
     -NoServiceRestart

    $sqlSvc = get-service -ComputerName $sqlNode -Name $sqlSrv
    if($sqlSvc.Status -eq "Running") {
        $sqlSvc.Stop()
    }
    $sqlSvc.WaitForStatus("Stopped", $timeout)
    $sqlSvc.Start()
    $sqlSvc.WaitForStatus("Running", $timeout)
}

#Create backup share
New-Item $backupShare -ItemType directory
net share backup=$backupShare "/grant:$sqlSvcAcctn,FULL"
icacls.exe "$backupShare" /grant:r ("$sqlSvcAcctn" + ":(OI)(CI)F")

#Create database on every instance
Invoke-SqlCmd -Query "CREATE database $db"
Backup-SqlDatabase -Database $db -BackupFile "$backupShare\$db.bak" -ServerInstance $primaryReplicaName
Backup-SqlDatabase -Database $db -BackupFile "$backupShare\$db.log" -ServerInstance $primaryReplicaName -BackupAction Log
Restore-SqlDatabase -Database $db -BackupFile "\\$primaryReplicaName\backup\$db.bak" -ServerInstance $secondaryReplicaName -NoRecovery
Restore-SqlDatabase -Database $db -BackupFile "\\$primaryReplicaName\backup\$db.log" -ServerInstance $secondaryReplicaName -RestoreAction Log -NoRecovery
Restore-SqlDatabase -Database $db -BackupFile "\\$primaryReplicaName\backup\$db.bak" -ServerInstance $drReplicaName -NoRecovery
Restore-SqlDatabase -Database $db -BackupFile "\\$primaryReplicaName\backup\$db.log" -ServerInstance $drReplicaName -RestoreAction Log -NoRecovery

#Mirroring endpoint creation
ForEach ($sqlNode in $sqlNodes) {
    $endpoint =
        New-SqlHadrEndpoint SqlHaEndpoint `
        -Port 5022 `
        -Path "SQLSERVER:\SQL\$sqlNode\Default"
    Set-SqlHadrEndpoint `
        -InputObject $endpoint `
        -State "Started"
    Invoke-SqlCmd -Query "CREATE LOGIN [$sqlSvcAcctn] FROM WINDOWS" -ServerInstance $sqlNode
    Invoke-SqlCmd -Query "GRANT CONNECT ON ENDPOINT::[SqlHaEndpoint] TO [$sqlSvcAcctn]" -ServerInstance $sqlNode
}

#Availability Replicas creation
$primaryReplica =
    New-SqlAvailabilityReplica `
    -Name $primaryReplicaName `
    -EndpointURL "TCP://$primaryReplicaName.$domainFqdn`:5022" `
    -AvailabilityMode "SynchronousCommit" `
    -FailoverMode "Automatic" `
    -Version 11 `
    -AsTemplate
$secondaryReplica =
    New-SqlAvailabilityReplica `
    -Name $secondaryReplicaName `
    -EndpointURL "TCP://$secondaryReplicaName.$domainFqdn`:5022" `
    -AvailabilityMode "SynchronousCommit" `
    -FailoverMode "Automatic" `
    -Version 11 `
    -AsTemplate
$drReplica =
    New-SqlAvailabilityReplica `
    -Name $drReplicaName `
    -EndpointURL "TCP://$drReplicaName.$domainFqdn`:5022" `
    -AvailabilityMode "AsynchronousCommit" `
    -FailoverMode "Manual" `
    -Version 11 `
    -AsTemplate

#AVG Creation
New-SqlAvailabilityGroup `
     -Name $ag `
     -Path "SQLSERVER:\SQL\$primaryReplicaName\Default" `
     -AvailabilityReplica @($primaryReplica,$secondaryReplica,$drReplica) `
     -Database $db
Join-SqlAvailabilityGroup `
    -Path "SQLSERVER:\SQL\$secondaryReplicaName\Default" `
    -Name $ag
Add-SqlAvailabilityDatabase `
    -Path "SQLSERVER:\SQL\$secondaryReplicaName\Default\AvailabilityGroups\$ag" `
    -Database $db
Join-SqlAvailabilityGroup `
    -Path "SQLSERVER:\SQL\$drReplicaName\Default" `
    -Name $ag
Add-SqlAvailabilityDatabase `
    -Path "SQLSERVER:\SQL\$drReplicaName\Default\AvailabilityGroups\$ag" `
    -Database $db

#ReadOnly Replicas and routing list
$primaryReplica = Get-Item "SQLSERVER:\SQL\$primaryReplicaName\Default\availabilityGroups\$ag\availabilityReplicas\$primaryReplicaName"
$secondaryReplica = Get-Item "SQLSERVER:\SQL\$primaryReplicaName\Default\availabilityGroups\$ag\availabilityReplicas\$secondaryReplicaName"
$drReplica = Get-Item "SQLSERVER:\SQL\$primaryReplicaName\Default\availabilityGroups\$ag\availabilityReplicas\$drReplicaName"

Set-SqlAvailabilityReplica -ConnectionModeInPrimaryRole "AllowAllConnections" -InputObject $primaryReplica
Set-SqlAvailabilityReplica -ConnectionModeInSecondaryRole "AllowReadIntentConnectionsOnly" -InputObject $primaryReplica
Set-SqlAvailabilityReplica -ReadOnlyRoutingConnectionUrl "TCP://$primaryReplicaName.$domainFqdn`:1433" -InputObject $primaryReplica

Set-SqlAvailabilityReplica -ConnectionModeInPrimaryRole "AllowAllConnections" -InputObject $secondaryReplica
Set-SqlAvailabilityReplica -ConnectionModeInSecondaryRole "AllowReadIntentConnectionsOnly" -InputObject $secondaryReplica
Set-SqlAvailabilityReplica -ReadOnlyRoutingConnectionUrl "TCP://$secondaryReplicaName.$domainFqdn`:1433" -InputObject $secondaryReplica

Set-SqlAvailabilityReplica -ConnectionModeInPrimaryRole "AllowAllConnections" -InputObject $drReplica
Set-SqlAvailabilityReplica -ConnectionModeInSecondaryRole "AllowReadIntentConnectionsOnly" -InputObject $drReplica
Set-SqlAvailabilityReplica -ReadOnlyRoutingConnectionUrl "TCP://$drReplicaName.$domainFqdn`:1433" -InputObject $drReplica

Set-SqlAvailabilityReplica -ReadOnlyRoutingList "$secondaryReplicaName","$drReplicaName","$primaryReplicaName" -InputObject $primaryReplica
Set-SqlAvailabilityReplica -ReadOnlyRoutingList "$primaryReplicaName","$drReplicaName","$secondaryReplicaName" -InputObject $secondaryReplica
Set-SqlAvailabilityReplica -ReadOnlyRoutingList "$primaryReplicaName","$secondaryReplicaName","$drReplicaName" -InputObject $drReplica

AlwaysOn Listener configuration

This section creates the availability group listener, and configure its properties (i.e. RegisterAllProvidersIP, HostRecordTTL).

#==========================================================================
#        Configure SQL AlwaysOn AG Listener
#==========================================================================

#Client Access Point creation
$agClusterGroup = Get-ClusterGroup -Name $ag
#1. Add a network name named as the AG Listener
$cap = Add-ClusterResource -Name $agListener -ResourceType "Network Name" -Group $agClusterGroup
#2. Set paramters
$cap | Set-ClusterParameter -Multiple @{
    "DnsName" = "$agListener"
    "HostRecordTTL" = 30
    "RegisterAllProvidersIP" = 1
} -ErrorAction Stop
#3. Create two IP Addresses, assigning them the Azure ILB ip created on each vnet
$capIpMain = Add-ClusterResource -Name "ILB IP main datacenter" -ResourceType "IP Address" -Group $agClusterGroup
$capIpMain | Set-ClusterParameter -Multiple @{
        "Address" = "$ilbIpMainSubnet"
        "SubnetMask" = "255.255.255.255"
        "Network" = "$mainClusterNetName"
        "EnableDHCP" = 0
        "ProbePort" = "59999"
    } -ErrorAction Stop

$capIpDr = Add-ClusterResource -Name "ILB IP dr datacenter" -ResourceType "IP Address" -Group $agClusterGroup
$capIpDr | Set-ClusterParameter -Multiple @{
        "Address" = "$ilbIpDrSubnet"
        "SubnetMask" = "255.255.255.255"
        "Network" = "$drClusterNetName"
        "EnableDHCP" = 0
        "ProbePort" = "59999"
    } -ErrorAction Stop
#4. Add dependencies on IPs to Network Name
Set-ClusterResourceDependency -InputObject $cap -Dependency "[$capIpMain] or [$capIpDr]"
#5. Add AG resource dependency on CAP network name
$agResource = Get-ClusterResource -Name $ag 
$agResource | Stop-ClusterResource
Add-ClusterResourceDependency -InputObject $agResource -Resource $cap
$cap|Start-ClusterResource
$agResource | Start-ClusterResource
#6. Assign port 1433 to SQL AG Listener
Set-SqlAvailabilityGroupListener -path "SQLSERVER:\SQL\$primaryReplicaName\Default\availabilityGroups\$ag\availabilityGroupListeners\$agListener" -Port 1433

That’s all for now; in a couple of hours you should be able to replicate my environment.
During next days I’ll try to review this article in order to make it more readable, and I’ll insert new section for failover runbook creation and so on.

Categories: ITSQL Server

Marco Obinu

Curious by nature, talkative geek who can speak in front of a public or a camera, in love with technology, especially SQL Server. I want to understand how things work and to solve problems by myself during work as in my many hobbies.

0 Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: