High availability for SAP HANA scale-out system with HSR on SUSE Linux Enterprise Server

This article describes how to deploy a highly available SAP HANA system in a scale-out configuration with HANA system replication (HSR) and Pacemaker on Azure SUSE Linux Enterprise Server virtual machines (VMs). The shared file systems in the presented architecture are NFS mounted and are provided by Azure NetApp Files or NFS share on Azure Files.

In the example configurations, installation commands, and so on, the HANA instance number is 03 and the HANA system ID is HN1.

Before you begin, refer to the following SAP notes and papers:

Overview

One method to achieve HANA high availability for HANA scale-out installations is to configure HANA system replication and protect the solution with a Pacemaker cluster that allows automatic failover. When an active node fails, the cluster fails over the HANA resources to the other site.
The presented configuration shows three HANA nodes on each site, plus a majority maker node to prevent a split-brain scenario. The instructions can be adapted to include more VMs as HANA database (DB) nodes.

In the presented architecture, you can deploy the HANA shared file system /hana/shared by using Azure NetApp Files or an NFS share on Azure Files. Each HANA node within the same HANA system replication site mounts the HANA shared file system via NFS. The /hana/data and /hana/log file systems are local file systems and aren't shared between the HANA DB nodes. You'll install SAP HANA in non-shared mode.

For recommended SAP HANA storage configurations, see SAP HANA Azure VMs storage configurations.

Important

If you deploy all HANA file systems on Azure NetApp Files, for production systems where performance is key, we recommend that you evaluate and consider using Azure NetApp Files application volume group for SAP HANA.

Warning

Deploying /hana/data and /hana/log on NFS on Azure Files isn't supported.

SAP HANA scale-out with HSR and Pacemaker cluster on SLES

In the preceding diagram, three subnets are represented within one Azure virtual network, following the SAP HANA network recommendations:

  • Client communication: client 10.23.0.0/24
  • Internal HANA inter-node communication: inter 10.23.1.128/26
  • HANA system replication: hsr 10.23.1.192/26

Because /hana/data and /hana/log are deployed on local disks, it isn't necessary to deploy a separate subnet and separate virtual network cards for communication to the storage.

If you're using Azure NetApp Files, the NFS volumes for /hana/shared are deployed in a separate subnet delegated to Azure NetApp Files: anf 10.23.1.0/26.

Prepare the infrastructure

In the following instructions, it's assumed that you've already created the resource group and the Azure virtual network with three subnets: client, inter, and hsr.

Deploy Linux virtual machines via the Azure portal

  1. Deploy the Azure VMs.

    For the configuration presented in this document, deploy seven virtual machines:

    • three virtual machines to serve as HANA DB nodes for HANA replication site 1: hana-s1-db1, hana-s1-db2 and hana-s1-db3
    • three virtual machines to serve as HANA DB nodes for HANA replication site 2: hana-s2-db1, hana-s2-db2 and hana-s2-db3
    • a small virtual machine to serve as majority maker: hana-s-mm

Deploy the VMs as SAP DB nodes using VM sizes certified for SAP HANA, as listed in SAP HANA certified IaaS platforms. Ensure that Accelerated Networking is enabled when deploying the HANA DB nodes.

For the majority maker node, you can deploy a small VM, because this VM doesn't run any of the SAP HANA resources. The majority maker VM is used in the cluster configuration to achieve an odd number of cluster nodes in a split-brain scenario. In this example, the majority maker VM only needs one virtual network interface in the client subnet.

Deploy local managed disks for /hana/data and /hana/log. The minimum recommended storage configuration for /hana/data and /hana/log is described in SAP HANA Azure VMs storage configurations.
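
As an illustration only, the following Azure CLI sketch attaches three new managed data disks to one HANA DB VM, matching the two-data-disk plus one-log-disk layout used later in the LVM example. The resource group name, disk names, sizes, and SKU are placeholders; size the disks per the storage recommendations.

# Example only - adjust resource group, VM names, disk sizes, SKU, and LUNs to your chosen layout
az vm disk attach --resource-group your-resource-group --vm-name hana-s1-db1 --name hana-s1-db1-data0 --new --size-gb 512 --sku Premium_LRS --lun 0
az vm disk attach --resource-group your-resource-group --vm-name hana-s1-db1 --name hana-s1-db1-data1 --new --size-gb 512 --sku Premium_LRS --lun 1
az vm disk attach --resource-group your-resource-group --vm-name hana-s1-db1 --name hana-s1-db1-log0 --new --size-gb 512 --sku Premium_LRS --lun 2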

Deploy the primary network interface for each VM in the client virtual network subnet.
When a VM is deployed via the Azure portal, the network interface name is automatically generated. In these instructions, for simplicity, we'll refer to the automatically generated primary network interfaces, which are attached to the client Azure virtual network subnet, as hana-s1-db1-client, hana-s1-db2-client, hana-s1-db3-client, and so on.

Important

  • Make sure that the OS you select is SAP-certified for SAP HANA on the specific VM types you're using. For a list of SAP HANA certified VM types and OS releases for those types, go to the SAP HANA certified IaaS platforms site. Click into the details of the listed VM type to get the complete list of SAP HANA-supported OS releases for that type.
  • If you choose to deploy /hana/shared on NFS on Azure Files, we recommend that you deploy on SUSE Linux Enterprise Server (SLES) 15 SP2 and later.
  2. Create six network interfaces, one for each HANA DB virtual machine, in the inter virtual network subnet (in this example, hana-s1-db1-inter, hana-s1-db2-inter, hana-s1-db3-inter, hana-s2-db1-inter, hana-s2-db2-inter, and hana-s2-db3-inter).

  3. Create six network interfaces, one for each HANA DB virtual machine, in the hsr virtual network subnet (in this example, hana-s1-db1-hsr, hana-s1-db2-hsr, hana-s1-db3-hsr, hana-s2-db1-hsr, hana-s2-db2-hsr, and hana-s2-db3-hsr).

  4. Attach the newly created virtual network interfaces to the corresponding virtual machines:

    1. Go to the virtual machine in the Azure portal.
    2. In the left pane, select Virtual Machines. Filter on the virtual machine name (for example, hana-s1-db1), and then select the virtual machine.
    3. In the Overview pane, select Stop to deallocate the virtual machine.
    4. Select Networking, and then attach the network interface. In the Attach network interface drop-down list, select the already created network interfaces for the inter and hsr subnets.
    5. Select Save.
    6. Repeat steps 2 through 5 for the remaining virtual machines (in our example, hana-s1-db2, hana-s1-db3, hana-s2-db1, hana-s2-db2, and hana-s2-db3).
    7. Leave the virtual machines in stopped state for now. Next, we'll enable accelerated networking for all newly attached network interfaces.
  5. Enable accelerated networking for the additional network interfaces for the inter and hsr subnets by doing the following steps:

    1. Open Azure Cloud Shell in the Azure portal.

    2. Execute the following commands to enable accelerated networking for the additional network interfaces, which are attached to the inter and hsr subnets.

      az network nic update --id /subscriptions/your subscription/resourceGroups/your resource group/providers/Microsoft.Network/networkInterfaces/hana-s1-db1-inter --accelerated-networking true
      az network nic update --id /subscriptions/your subscription/resourceGroups/your resource group/providers/Microsoft.Network/networkInterfaces/hana-s1-db2-inter --accelerated-networking true
      az network nic update --id /subscriptions/your subscription/resourceGroups/your resource group/providers/Microsoft.Network/networkInterfaces/hana-s1-db3-inter --accelerated-networking true
      az network nic update --id /subscriptions/your subscription/resourceGroups/your resource group/providers/Microsoft.Network/networkInterfaces/hana-s2-db1-inter --accelerated-networking true
      az network nic update --id /subscriptions/your subscription/resourceGroups/your resource group/providers/Microsoft.Network/networkInterfaces/hana-s2-db2-inter --accelerated-networking true
      az network nic update --id /subscriptions/your subscription/resourceGroups/your resource group/providers/Microsoft.Network/networkInterfaces/hana-s2-db3-inter --accelerated-networking true
      
      az network nic update --id /subscriptions/your subscription/resourceGroups/your resource group/providers/Microsoft.Network/networkInterfaces/hana-s1-db1-hsr --accelerated-networking true
      az network nic update --id /subscriptions/your subscription/resourceGroups/your resource group/providers/Microsoft.Network/networkInterfaces/hana-s1-db2-hsr --accelerated-networking true
      az network nic update --id /subscriptions/your subscription/resourceGroups/your resource group/providers/Microsoft.Network/networkInterfaces/hana-s1-db3-hsr --accelerated-networking true
      az network nic update --id /subscriptions/your subscription/resourceGroups/your resource group/providers/Microsoft.Network/networkInterfaces/hana-s2-db1-hsr --accelerated-networking true
      az network nic update --id /subscriptions/your subscription/resourceGroups/your resource group/providers/Microsoft.Network/networkInterfaces/hana-s2-db2-hsr --accelerated-networking true
      az network nic update --id /subscriptions/your subscription/resourceGroups/your resource group/providers/Microsoft.Network/networkInterfaces/hana-s2-db3-hsr --accelerated-networking true
      

      Note

      You don't have to install the Azure CLI package on your HANA nodes to run the az commands. You can run them from any machine that has the CLI installed, or use Azure Cloud Shell.

  6. Start the HANA DB virtual machines.

Configure Azure load balancer

During VM configuration, you have the option to create or select an existing load balancer in the networking section. Follow the steps below to set up a standard load balancer for the high-availability setup of the HANA database.

Note

  • For HANA scale-out, select the network interface for the client subnet when adding the virtual machines to the backend pool (see the sketch after this note).
  • The full set of commands in Azure CLI and PowerShell adds the VMs with the primary network interface to the backend pool.
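
As a sketch only (not the full command set), adding the client NIC of one HANA DB VM to the load balancer backend pool with the Azure CLI could look like the following. The resource group, load balancer name, backend pool name, and IP configuration name (ipconfig1 is the typical default for portal-created NICs) are placeholders for your environment.

az network nic ip-config address-pool add \
  --resource-group your-resource-group \
  --nic-name hana-s1-db1-client \
  --ip-config-name ipconfig1 \
  --lb-name hana-lb \
  --address-pool hana-backend-pool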

Follow the steps in Create load balancer to set up a standard load balancer for a high-availability SAP system by using the Azure portal. During the setup of the load balancer, consider the following points:

  1. Frontend IP Configuration: Create a front-end IP. Select the same virtual network and subnet name as your database virtual machines.
  2. Backend Pool: Create a back-end pool and add database VMs.
  3. Inbound rules: Create a load-balancing rule. Follow the same steps for both load-balancing rules.
    • Frontend IP address: Select a front-end IP.
    • Backend pool: Select a back-end pool.
    • High-availability ports: Select this option.
    • Protocol: Select TCP.
    • Health Probe: Create a health probe with the following details:
      • Protocol: Select TCP.
      • Port: For example, 625<instance-no.>.
      • Interval: Enter 5.
      • Probe Threshold: Enter 2.
    • Idle timeout (minutes): Enter 30.
    • Enable Floating IP: Select this option.

Note

The health probe configuration property numberOfProbes, otherwise known as Unhealthy threshold in the portal, isn't respected. To control the number of successful or failed consecutive probes, set the property probeThreshold to 2. It's currently not possible to set this property by using the Azure portal, so use either the Azure CLI or the PowerShell command.
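
For example, if your Azure CLI version supports the probe-threshold argument, a sketch like the following could set it on an existing probe. The load balancer and probe names are placeholders; verify the argument name against your CLI version.

az network lb probe update \
  --resource-group your-resource-group \
  --lb-name hana-lb \
  --name hana-health-probe \
  --probe-threshold 2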

Note

When VMs without public IP addresses are placed in the backend pool of an internal (no public IP address) Standard Azure load balancer, there's no outbound internet connectivity unless additional configuration is performed to enable routing to public endpoints. For details on how to configure outbound connectivity, see Public endpoint connectivity for Virtual Machines using Azure Standard Load Balancer in SAP high-availability scenarios.

Important

  • Don't enable TCP timestamps on Azure VMs placed behind Azure Load Balancer. Enabling TCP timestamps will cause the health probes to fail. Set the parameter net.ipv4.tcp_timestamps to 0 (a minimal sketch follows this list). For details, see Load Balancer health probes and SAP note 2382421.
  • To prevent saptune from changing the manually set net.ipv4.tcp_timestamps value from 0 back to 1, update saptune to version 3.1.1 or later. For more details, see saptune 3.1.1 – Do I Need to Update?.
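
A minimal sketch of setting the kernel parameter persistently; the drop-in file name is just an example:

# Create a sysctl drop-in file (example name) and reload all sysctl settings
echo "net.ipv4.tcp_timestamps = 0" | sudo tee /etc/sysctl.d/98-disable-tcp-timestamps.conf
sudo sysctl --system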

Deploy NFS

There are two options for deploying Azure native NFS for /hana/shared: an NFS volume on Azure NetApp Files or an NFS share on Azure Files. Azure Files supports the NFSv4.1 protocol; NFS on Azure NetApp Files supports both NFSv4.1 and NFSv3.

The next sections describe the steps to deploy NFS - you'll need to select only one of the options.

Tip

Decide whether to deploy /hana/shared on an NFS share on Azure Files or on an NFS volume on Azure NetApp Files, and follow only the corresponding section.

Deploy the Azure NetApp Files infrastructure

Deploy Azure NetApp Files volumes for the /hana/shared file system. You need a separate /hana/shared volume for each HANA system replication site. For more information, see Set up the Azure NetApp Files infrastructure.

In this example, the following Azure NetApp Files volumes were used:

  • volume HN1-shared-s1 (nfs://10.23.1.7/HN1-shared-s1)
  • volume HN1-shared-s2 (nfs://10.23.1.7/HN1-shared-s2)
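
For illustration only, a volume like HN1-shared-s1 could be created with the Azure CLI roughly as follows. The NetApp account, capacity pool, resource group, region, service level, and quota are placeholder assumptions; verify the exact arguments against your CLI version.

az netappfiles volume create \
  --resource-group your-resource-group \
  --location your-region \
  --account-name your-anf-account \
  --pool-name your-capacity-pool \
  --name HN1-shared-s1 \
  --file-path HN1-shared-s1 \
  --vnet your-vnet --subnet anf \
  --protocol-types NFSv4.1 \
  --service-level Premium \
  --usage-threshold 1024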

Deploy the NFS on Azure Files infrastructure

Deploy Azure Files NFS shares for the /hana/shared file system. You'll need a separate /hana/shared Azure Files NFS share for each HANA system replication site. For more information, see How to create an NFS share.

In this example, the following Azure Files NFS shares were used:

  • share hn1-shared-s1 (sapnfsafs.file.core.windows.net:/sapnfsafs/hn1-shared-s1)
  • share hn1-shared-s2 (sapnfsafs.file.core.windows.net:/sapnfsafs/hn1-shared-s2)
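
As a sketch, an NFS-enabled share like hn1-shared-s1 could be created in an existing premium (FileStorage) storage account with the Azure CLI; the resource group and quota are placeholders.

az storage share-rm create \
  --resource-group your-resource-group \
  --storage-account sapnfsafs \
  --name hn1-shared-s1 \
  --quota 1024 \
  --enabled-protocols NFS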

Operating system configuration and preparation

The instructions in the next sections are prefixed with one of the following abbreviations:

  • [A]: Applicable to all nodes, including majority maker
  • [AH]: Applicable to all HANA DB nodes
  • [M]: Applicable to the majority maker node only
  • [AH1]: Applicable to all HANA DB nodes on SITE 1
  • [AH2]: Applicable to all HANA DB nodes on SITE 2
  • [1]: Applicable only to HANA DB node 1, SITE 1
  • [2]: Applicable only to HANA DB node 1, SITE 2

Configure and prepare your OS by doing the following steps:

  1. [A] Maintain the host files on the virtual machines. Include entries for all subnets. The following entries were added to /etc/hosts for this example.

    # Client subnet
    10.23.0.19      hana-s1-db1
    10.23.0.20      hana-s1-db2
    10.23.0.21      hana-s1-db3
    10.23.0.22      hana-s2-db1
    10.23.0.23      hana-s2-db2
    10.23.0.24      hana-s2-db3
    10.23.0.25      hana-s-mm    
    
    # Internode subnet
    10.23.1.132     hana-s1-db1-inter
    10.23.1.133     hana-s1-db2-inter
    10.23.1.134     hana-s1-db3-inter
    10.23.1.135     hana-s2-db1-inter
    10.23.1.136     hana-s2-db2-inter
    10.23.1.137     hana-s2-db3-inter
    
    # HSR subnet
    10.23.1.196     hana-s1-db1-hsr
    10.23.1.197     hana-s1-db2-hsr
    10.23.1.198     hana-s1-db3-hsr
    10.23.1.199     hana-s2-db1-hsr
    10.23.1.200     hana-s2-db2-hsr
    10.23.1.201     hana-s2-db3-hsr
    
  2. [A] Create configuration file /etc/sysctl.d/ms-az.conf with Microsoft for Azure configuration settings.

    vi /etc/sysctl.d/ms-az.conf
    
    # Add the following entries in the configuration file
    net.ipv6.conf.all.disable_ipv6 = 1
    net.ipv4.tcp_max_syn_backlog = 16348
    net.ipv4.conf.all.rp_filter = 0
    sunrpc.tcp_slot_table_entries = 128
    vm.swappiness=10
    

    Tip

    Avoid setting net.ipv4.ip_local_port_range and net.ipv4.ip_local_reserved_ports explicitly in the sysctl configuration files to allow SAP Host Agent to manage the port ranges. For more information, see SAP note 2382421.
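
    The files in /etc/sysctl.d are applied at boot. To load the new settings immediately without a reboot, you can reload all sysctl configuration files:

    sudo sysctl --system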

  3. [AH] Prepare the VMs - apply the recommended settings per SAP note 2205917 for SUSE Linux Enterprise Server for SAP Applications.
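
    On SLES for SAP Applications, one common way to apply and check these recommendations is saptune; a minimal sketch, assuming the HANA solution profile fits your setup:

    sudo saptune solution apply HANA
    sudo saptune solution verify HANA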

Prepare the file systems

Depending on whether you chose to deploy the SAP shared directories on an NFS share on Azure Files or on an NFS volume on Azure NetApp Files, follow the corresponding section below.

Mount the shared file systems (Azure NetApp Files NFS)

In this example, the shared HANA file systems are deployed on Azure NetApp Files and mounted over NFSv4.1. Follow the steps in this section, only if you're using NFS on Azure NetApp Files.

  1. [AH] Prepare the OS for running SAP HANA on NetApp Systems with NFS, as described in SAP note 3024346 - Linux Kernel Settings for NetApp NFS. Create configuration file /etc/sysctl.d/91-NetApp-HANA.conf for the NetApp configuration settings.

    vi /etc/sysctl.d/91-NetApp-HANA.conf
    
    # Add the following entries in the configuration file
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 131072 16777216
    net.ipv4.tcp_wmem = 4096 16384 16777216
    net.core.netdev_max_backlog = 300000
    net.ipv4.tcp_slow_start_after_idle=0
    net.ipv4.tcp_no_metrics_save = 1
    net.ipv4.tcp_moderate_rcvbuf = 1
    net.ipv4.tcp_window_scaling = 1
    net.ipv4.tcp_sack = 1
    
  2. [AH] Adjust the sunrpc settings, as recommended in SAP note 3024346 - Linux Kernel Settings for NetApp NFS.

    vi /etc/modprobe.d/sunrpc.conf
    
    # Insert the following line
    options sunrpc tcp_max_slot_table_entries=128
    
  3. [AH] Create mount points for the HANA database volumes.

    mkdir -p /hana/shared
    
  4. [AH] Verify the NFS ___domain setting. Make sure that the ___domain is configured as the default Azure NetApp Files ___domain, that is, defaultv4iddomain.com, and that the mapping is set to nobody.
    This step is only needed if you're using Azure NetApp Files NFSv4.1.

    Important

    Make sure to set the NFS ___domain in /etc/idmapd.conf on the VM to match the default ___domain configuration on Azure NetApp Files: defaultv4iddomain.com. If there's a mismatch between the ___domain configuration on the NFS client (that is, the VM) and the NFS server (that is, the Azure NetApp Files configuration), then the permissions for files on Azure NetApp Files volumes that are mounted on the VMs will be displayed as nobody.

    sudo cat /etc/idmapd.conf
    # Example
    [General]
    Domain = defaultv4iddomain.com
    [Mapping]
    Nobody-User = nobody
    Nobody-Group = nobody
    
  5. [AH] Verify nfs4_disable_idmapping. It should be set to Y. To create the directory structure where nfs4_disable_idmapping is located, execute the mount command. You won't be able to manually create the directory under /sys/module, because access is reserved for the kernel and drivers.
    This step is only needed if you're using Azure NetApp Files NFSv4.1.

    # Check nfs4_disable_idmapping 
    cat /sys/module/nfs/parameters/nfs4_disable_idmapping
    # If you need to set nfs4_disable_idmapping to Y
    mkdir /mnt/tmp
    mount 10.23.1.7:/HN1-shared-s1 /mnt/tmp
    umount  /mnt/tmp
    echo "Y" > /sys/module/nfs/parameters/nfs4_disable_idmapping
    # Make the configuration permanent
    echo "options nfs nfs4_disable_idmapping=Y" >> /etc/modprobe.d/nfs.conf
    
  6. [AH1] Mount the shared Azure NetApp Files volumes on the SITE1 HANA DB VMs.

    sudo vi /etc/fstab
    # Add the following entry
    10.23.1.7:/HN1-shared-s1 /hana/shared nfs rw,nfsvers=4.1,hard,timeo=600,rsize=262144,wsize=262144,noatime,lock,_netdev,sec=sys  0  0
    # Mount all volumes
    sudo mount -a 
    
  7. [AH2] Mount the shared Azure NetApp Files volumes on the SITE2 HANA DB VMs.

    sudo vi /etc/fstab
    # Add the following entry
    10.23.1.7:/HN1-shared-s2 /hana/shared nfs rw,nfsvers=4.1,hard,timeo=600,rsize=262144,wsize=262144,noatime,lock,_netdev,sec=sys  0  0
    # Mount the volume
    sudo mount -a 
    
  8. [AH] Verify that the corresponding /hana/shared/ file systems are mounted on all HANA DB VMs with NFS protocol version NFSv4.1.

    sudo nfsstat -m
    # Verify that flag vers is set to 4.1 
    # Example from SITE 1, hana-s1-db1
    /hana/shared from 10.23.1.7:/HN1-shared-s1
     Flags: rw,noatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.23.0.19,local_lock=none,addr=10.23.1.7
    # Example from SITE 2, hana-s2-db1
    /hana/shared from 10.23.1.7:/HN1-shared-s2
     Flags: rw,noatime,vers=4.1,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.23.0.22,local_lock=none,addr=10.23.1.7
    

Mount the shared file systems (Azure Files NFS)

In this example, the shared HANA file systems are deployed on NFS on Azure Files. Follow the steps in this section, only if you're using NFS on Azure Files.

  1. [AH] Create mount points for the HANA database volumes.

    mkdir -p /hana/shared
    
  2. [AH1] Mount the shared Azure Files NFS shares on the SITE1 HANA DB VMs.

    sudo vi /etc/fstab
    # Add the following entry
    sapnfsafs.file.core.windows.net:/sapnfsafs/hn1-shared-s1 /hana/shared  nfs nfsvers=4.1,sec=sys  0  0
    # Mount all volumes
    sudo mount -a 
    
  3. [AH2] Mount the shared Azure Files NFS shares on the SITE2 HANA DB VMs.

    sudo vi /etc/fstab
    # Add the following entries
    sapnfsafs.file.core.windows.net:/sapnfsafs/hn1-shared-s2 /hana/shared  nfs nfsvers=4.1,sec=sys  0  0
    # Mount the volume
    sudo mount -a 
    
  4. [AH] Verify that the corresponding /hana/shared/ file systems are mounted on all HANA DB VMs with NFS protocol version NFSv4.1.

    sudo nfsstat -m
    # Example from SITE 1, hana-s1-db1
    sapnfsafs.file.core.windows.net:/sapnfsafs/hn1-shared-s1
     Flags: rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.23.0.19,local_lock=none,addr=10.23.0.35
    # Example from SITE 2, hana-s2-db1
    sapnfsafs.file.core.windows.net:/sapnfsafs/hn1-shared-s2
     Flags: rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.23.0.22,local_lock=none,addr=10.23.0.35
    

Prepare the data and log local file systems

In the presented configuration, the /hana/data and /hana/log file systems are deployed on managed disks that are locally attached to each HANA DB VM. You need to execute the steps to create the local data and log volumes on each HANA DB virtual machine.

Set up the disk layout with Logical Volume Manager (LVM). The following example assumes that each HANA virtual machine has three data disks attached, which are used to create two volumes.

  1. [AH] List all of the available disks:

    ls /dev/disk/azure/scsi1/lun*
    

    Example output:

    /dev/disk/azure/scsi1/lun0  /dev/disk/azure/scsi1/lun1  /dev/disk/azure/scsi1/lun2 
    
  2. [AH] Create physical volumes for all of the disks that you want to use:

    sudo pvcreate /dev/disk/azure/scsi1/lun0
    sudo pvcreate /dev/disk/azure/scsi1/lun1
    sudo pvcreate /dev/disk/azure/scsi1/lun2
    
  3. [AH] Create a volume group for the data files and a volume group for the log files:

    sudo vgcreate vg_hana_data_HN1 /dev/disk/azure/scsi1/lun0 /dev/disk/azure/scsi1/lun1
    sudo vgcreate vg_hana_log_HN1 /dev/disk/azure/scsi1/lun2
    
  4. [AH] Create the logical volumes.

    A linear volume is created when you use lvcreate without the -i switch. We suggest that you create a striped volume for better I/O performance, and align the stripe sizes to the values documented in SAP HANA VM storage configurations. The -i argument should be the number of the underlying physical volumes and the -I argument is the stripe size. In this document, two physical volumes are used for the data volume, so the -i switch argument is set to 2. The stripe size for the data volume is 256 KiB. One physical volume is used for the log volume, so no -i or -I switches are explicitly used for the log volume commands.

    Important

    Use the -i switch and set it to the number of underlying physical volumes when you use more than one physical volume for the data or log volumes. Use the -I switch to specify the stripe size when creating a striped volume.
    See SAP HANA VM storage configurations for recommended storage configurations, including stripe sizes and number of disks.

    sudo lvcreate -i 2 -I 256 -l 100%FREE -n hana_data vg_hana_data_HN1
    sudo lvcreate -l 100%FREE -n hana_log vg_hana_log_HN1
    sudo mkfs.xfs /dev/vg_hana_data_HN1/hana_data
    sudo mkfs.xfs /dev/vg_hana_log_HN1/hana_log
    
  5. [AH] Create the mount directories and copy the UUID of all of the logical volumes:

    sudo mkdir -p /hana/data/HN1
    sudo mkdir -p /hana/log/HN1
    # Write down the ID of /dev/vg_hana_data_HN1/hana_data and /dev/vg_hana_log_HN1/hana_log
    sudo blkid
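    # Optional convenience (assumption: the LV names created above) - print only the UUID values
    sudo blkid -s UUID -o value /dev/vg_hana_data_HN1/hana_data
    sudo blkid -s UUID -o value /dev/vg_hana_log_HN1/hana_log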
    
  6. [AH] Create fstab entries for the logical volumes and mount:

    sudo vi /etc/fstab
    

    Insert the following line in the /etc/fstab file:

    /dev/disk/by-uuid/UUID of /dev/mapper/vg_hana_data_HN1-hana_data /hana/data/HN1 xfs  defaults,nofail  0  2
    /dev/disk/by-uuid/UUID of /dev/mapper/vg_hana_log_HN1-hana_log /hana/log/HN1 xfs  defaults,nofail  0  2
    

    Mount the new volumes:

    sudo mount -a
    

Create a Pacemaker cluster

Follow the steps in Setting up Pacemaker on SUSE Linux Enterprise Server in Azure to create a basic Pacemaker cluster for this HANA server. Include all virtual machines, including the majority maker, in the cluster.

For a scale-out cluster, ensure the following parameters are set correctly:

  • Don't set quorum expected-votes to 2, because this isn't a two-node cluster.
  • Make sure that the cluster property concurrent-fencing=true is set, so that node fencing is deserialized.
  • The stonith-sbd resource should include the parameter pcmk_action_limit=-1 (unlimited) to allow deserialized stonith actions. A sketch of these settings follows this list.
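
A minimal sketch of how these two cluster settings could be applied with crmsh, assuming the SBD fencing resource is named stonith-sbd as in the examples later in this article:

sudo crm configure property concurrent-fencing=true
# One possible crmsh way to set the parameter on the existing fencing resource;
# alternatively, add pcmk_action_limit=-1 when defining the stonith-sbd primitive
sudo crm resource param stonith-sbd set pcmk_action_limit -1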

Installation

In this example for deploying SAP HANA in scale-out configuration with HSR on Azure VMs, we've used HANA 2.0 SP5.

Prepare for HANA installation

  1. [AH] Before the HANA installation, set the root password. You can disable the root password after the installation completes. Execute the command passwd as root.

  2. [1,2] Change the permissions on /hana/shared

    chmod 775 /hana/shared
    
  3. [1] Verify that you can log in via SSH to the HANA DB VMs in this site, hana-s1-db2 and hana-s1-db3, without being prompted for a password. If that isn't the case, exchange SSH keys as described in Enable SSH Access via Public Key.

    ssh root@hana-s1-db2
    ssh root@hana-s1-db3
    
  4. [2] Verify that you can log in via SSH to the HANA DB VMs in this site, hana-s2-db2 and hana-s2-db3, without being prompted for a password.
    If that isn't the case, exchange SSH keys.

    ssh root@hana-s2-db2
    ssh root@hana-s2-db3
    
  5. [AH] Install the additional packages that are required for HANA 2.0 SP4 and later. For more information, see SAP Note 2593824 for your SLES version.

    # In this example, using SLES12 SP5
    sudo zypper install libgcc_s1 libstdc++6 libatomic1
    

HANA installation on the first node on each site

  1. [1] Install SAP HANA by following the instructions in the SAP HANA 2.0 Installation and Update guide. In the instructions that follow, we show the SAP HANA installation on the first node on SITE 1.

    a. Start the hdblcm program as root from the HANA installation software directory. Use the internal_network parameter and pass the address space of the subnet that's used for the internal HANA inter-node communication.

    ./hdblcm --internal_network=10.23.1.128/26
    

    b. At the prompt, enter the following values:

    • For Choose an action: enter 1 (for install)
    • For Additional components for installation: enter 2, 3
    • For installation path: press Enter (defaults to /hana/shared)
    • For Local Host Name: press Enter to accept the default
    • For Do you want to add hosts to the system?: enter n
    • For SAP HANA System ID: enter HN1
    • For Instance number [00]: enter 03
    • For Local Host Worker Group [default]: press Enter to accept the default
    • For Select System Usage / Enter index [4]: enter 4 (for custom)
    • For Location of Data Volumes [/hana/data/HN1]: press Enter to accept the default
    • For Location of Log Volumes [/hana/log/HN1]: press Enter to accept the default
    • For Restrict maximum memory allocation? [n]: enter n
    • For Certificate Host Name For Host hana-s1-db1 [hana-s1-db1]: press Enter to accept the default
    • For SAP Host Agent User (sapadm) Password: enter the password
    • For Confirm SAP Host Agent User (sapadm) Password: enter the password
    • For System Administrator (hn1adm) Password: enter the password
    • For System Administrator Home Directory [/usr/sap/HN1/home]: press Enter to accept the default
    • For System Administrator Login Shell [/bin/sh]: press Enter to accept the default
    • For System Administrator User ID [1001]: press Enter to accept the default
    • For Enter ID of User Group (sapsys) [79]: press Enter to accept the default
    • For System Database User (system) Password: enter the system's password
    • For Confirm System Database User (system) Password: enter system's password
    • For Restart system after machine reboot? [n]: enter n
    • For Do you want to continue (y/n): validate the summary and if everything looks good, enter y
  2. [2] Repeat the preceding step to install SAP HANA on the first node on SITE 2.

  3. [1,2] Verify global.ini

    Display global.ini, and ensure that the configuration for the internal SAP HANA inter-node communication is in place. Verify the communication section. It should have the address space for the inter subnet, and listeninterface should be set to .internal. Verify the internal_hostname_resolution section. It should have the IP addresses for the HANA virtual machines that belong to the inter subnet.

      sudo cat /usr/sap/HN1/SYS/global/hdb/custom/config/global.ini
      # Example from SITE1 
      [communication]
      internal_network = 10.23.1.128/26
      listeninterface = .internal
      [internal_hostname_resolution]
      10.23.1.132 = hana-s1-db1
      10.23.1.133 = hana-s1-db2
      10.23.1.134 = hana-s1-db3
    
  4. [1,2] Prepare global.ini for installation in a non-shared environment, as described in SAP note 2080991.

     sudo vi /usr/sap/HN1/SYS/global/hdb/custom/config/global.ini
     [persistence]
     basepath_shared = no
    
  5. [1,2] Restart SAP HANA to activate the changes.

     sudo -u hn1adm /usr/sap/hostctrl/exe/sapcontrol -nr 03 -function StopSystem
     sudo -u hn1adm /usr/sap/hostctrl/exe/sapcontrol -nr 03 -function StartSystem
    
  6. [1,2] Verify that the client interface is using the IP addresses from the client subnet for communication.

    # Execute as hn1adm
    /usr/sap/HN1/HDB03/exe/hdbsql -u SYSTEM -p "password" -i 03 -d SYSTEMDB 'select * from SYS.M_HOST_INFORMATION'|grep net_publicname
    # Expected result - example from SITE 2
    "hana-s2-db1","net_publicname","10.23.0.22"
    

    For information about how to verify the configuration, see SAP Note 2183363 - Configuration of SAP HANA internal network.

  7. [AH] Change permissions on the data and log directories to avoid HANA installation errors.

     sudo chmod o+w -R /hana/data /hana/log
    
  8. [1] Install the secondary HANA nodes. The example instructions in this step are for SITE 1.

    a. Start the resident hdblcm program as root.

     cd /hana/shared/HN1/hdblcm
     ./hdblcm 
    

    b. At the prompt, enter the following values:

    • For Choose an action: enter 2 (for add hosts)
    • For Enter comma separated host names to add: hana-s1-db2, hana-s1-db3
    • For Additional components for installation: enter 2, 3
    • For Enter Root User Name [root]: press Enter to accept the default
    • For Select roles for host 'hana-s1-db2' [1]: 1 (for worker)
    • For Enter Host Failover Group for host 'hana-s1-db2' [default]: press Enter to accept the default
    • For Enter Storage Partition Number for host 'hana-s1-db2' [<<assign automatically>>]: press Enter to accept the default
    • For Enter Worker Group for host 'hana-s1-db2' [default]: press Enter to accept the default
    • For Select roles for host 'hana-s1-db3' [1]: 1 (for worker)
    • For Enter Host Failover Group for host 'hana-s1-db3' [default]: press Enter to accept the default
    • For Enter Storage Partition Number for host 'hana-s1-db3' [<<assign automatically>>]: press Enter to accept the default
    • For Enter Worker Group for host 'hana-s1-db3' [default]: press Enter to accept the default
    • For System Administrator (hn1adm) Password: enter the password
    • For Enter SAP Host Agent User (sapadm) Password: enter the password
    • For Confirm SAP Host Agent User (sapadm) Password: enter the password
    • For Certificate Host Name For Host hana-s1-db2 [hana-s1-db2]: press Enter to accept the default
    • For Certificate Host Name For Host hana-s1-db3 [hana-s1-db3]: press Enter to accept the default
    • For Do you want to continue (y/n): validate the summary and if everything looks good, enter y
  9. [2] Repeat the preceding step to install the secondary SAP HANA nodes on SITE 2.

Configure SAP HANA 2.0 System Replication

  1. [1] Configure System Replication on SITE 1:

    Back up the databases as hn1adm:

    hdbsql -d SYSTEMDB -u SYSTEM -p "passwd" -i 03 "BACKUP DATA USING FILE ('initialbackupSYS')"
    hdbsql -d HN1 -u SYSTEM -p "passwd" -i 03 "BACKUP DATA USING FILE ('initialbackupHN1')"
    

    Copy the system secure storage key files to the secondary site:

    scp /usr/sap/HN1/SYS/global/security/rsecssfs/data/SSFS_HN1.DAT hana-s2-db1:/usr/sap/HN1/SYS/global/security/rsecssfs/data/
    scp /usr/sap/HN1/SYS/global/security/rsecssfs/key/SSFS_HN1.KEY  hana-s2-db1:/usr/sap/HN1/SYS/global/security/rsecssfs/key/
    

    Create the primary site:

    hdbnsutil -sr_enable --name=HANA_S1
    
  2. [2] Configure System Replication on SITE 2:

    Register the second site to start the system replication. Run the following command as <hanasid>adm:

    sapcontrol -nr 03 -function StopWait 600 10
    hdbnsutil -sr_register --remoteHost=hana-s1-db1 --remoteInstance=03 --replicationMode=sync --name=HANA_S2
    sapcontrol -nr 03 -function StartSystem
    
  3. [1] Check replication status

    Check the replication status and wait until all databases are in sync.

    sudo su - hn1adm -c "python /usr/sap/HN1/HDB03/exe/python_support/systemReplicationStatus.py"
    
    # | Database | Host          | Port  | Service Name | Volume ID | Site ID | Site Name | Secondary     | Secondary | Secondary | Secondary | Secondary     | Replication | Replication | Replication    |
    # |          |               |       |              |           |         |           | Host          | Port      | Site ID   | Site Name | Active Status | Mode        | Status      | Status Details |
    # | -------- | ------------- | ----- | ------------ | --------- | ------- | --------- | ------------- | --------- | --------- | --------- | ------------- | ----------- | ----------- | -------------- |
    # | HN1      | hana-s1-db3   | 30303 | indexserver  |         5 |       1 | HANA_S1   | hana-s2-db3   |     30303 |         2 | HANA_S2   | YES           | SYNC        | ACTIVE      |                |
    # | SYSTEMDB | hana-s1-db1   | 30301 | nameserver   |         1 |       1 | HANA_S1   | hana-s2-db1   |     30301 |         2 | HANA_S2   | YES           | SYNC        | ACTIVE      |                |
    # | HN1      | hana-s1-db1   | 30307 | xsengine     |         2 |       1 | HANA_S1   | hana-s2-db1   |     30307 |         2 | HANA_S2   | YES           | SYNC        | ACTIVE      |                |
    # | HN1      | hana-s1-db1   | 30303 | indexserver  |         3 |       1 | HANA_S1   | hana-s2-db1   |     30303 |         2 | HANA_S2   | YES           | SYNC        | ACTIVE      |                |
    # | HN1      | hana-s1-db2   | 30303 | indexserver  |         4 |       1 | HANA_S1   | hana-s2-db2   |     30303 |         2 | HANA_S2   | YES           | SYNC        | ACTIVE      |                |
    #
    # status system replication site "2": ACTIVE
    # overall system replication status: ACTIVE
    #
    # Local System Replication State
    #
    # mode: PRIMARY
    # site id: 1
    # site name: HANA_S1
    
  4. [1,2] Change the HANA configuration so that communication for HANA system replication is directed through the HANA system replication virtual network interfaces.

    • Stop HANA on both sites

      sudo -u hn1adm /usr/sap/hostctrl/exe/sapcontrol -nr 03 -function StopSystem HDB
      
    • Edit global.ini to add the host mapping for HANA system replication: use the IP addresses from the hsr subnet.

      sudo vi /usr/sap/HN1/SYS/global/hdb/custom/config/global.ini
      #Add the section
      [system_replication_hostname_resolution]
      10.23.1.196 = hana-s1-db1
      10.23.1.197 = hana-s1-db2
      10.23.1.198 = hana-s1-db3
      10.23.1.199 = hana-s2-db1
      10.23.1.200 = hana-s2-db2
      10.23.1.201 = hana-s2-db3
      
    • Start HANA on both sites

      sudo -u hn1adm /usr/sap/hostctrl/exe/sapcontrol -nr 03 -function StartSystem HDB
      

    For more information, see Host Name resolution for System Replication.

Implement HANA resource agents

SUSE provides two different software packages for the Pacemaker resource agent to manage SAP HANA. The software packages SAPHanaSR-ScaleOut and SAPHanaSR-angi use slightly different syntax and parameters and aren't compatible. See the SUSE release notes and documentation for details and differences between SAPHanaSR-angi and SAPHanaSR-ScaleOut. This document covers both packages in separate tabs in the respective sections.

Warning

Don't replace the package SAPHanaSR-ScaleOut with SAPHanaSR-angi in an already configured cluster. Upgrading from SAPHanaSR to SAPHanaSR-angi requires a specific procedure. For more details, see SUSE's blog post How to upgrade to SAPHanaSR-angi.

  • [A] Install the SAP HANA high availability packages:

Note

SAPHanaSR-angi has a minimum version requirement of SAP HANA 2.0 SPS 05 and SUSE SLES for SAP Applications 15 SP4 or higher.

Run the following command on all cluster VMs, including the majority maker, to install the high availability packages:

sudo zypper install SAPHanaSR-angi
sudo zypper in -t pattern ha_sles

Set up SAP HANA HA/DR providers

The SAP HANA HA/DR providers optimize the integration with the cluster and improve detection when a cluster failover is needed. The main hook script is susHanaSR (for SAPHanaSR-angi) or SAPHanaSrMultiTarget (for the SAPHanaSR-ScaleOut package). Configuring the susHanaSR/SAPHanaSrMultiTarget Python hook is mandatory for cluster integration. For HANA 2.0 SPS 05 and later, we recommend that you implement both the susHanaSR/SAPHanaSrMultiTarget hook and the susChkSrv hook.

The susChkSrv hook extends the functionality of the main susHanaSR/SAPHanaSrMultiTarget HA provider. It acts when the HANA process hdbindexserver crashes. If a single process crashes, HANA typically tries to restart it. Restarting the indexserver process can take a long time, during which the HANA database isn't responsive.

With susChkSrv implemented, an immediate and configurable action is executed. The action triggers a failover in the configured timeout period instead of waiting for the hdbindexserver process to restart on the same node. In HANA scale-out, susChkSrv acts for every cluster node running HANA independently. The configured action kills HANA or fences the affected VM, which triggers a failover in the configured timeout period.

  1. [1,2] Stop HANA on both system replication sites. Execute as <sid>adm:

    sapcontrol -nr 03 -function StopSystem
    
  2. [1,2] Install the HANA HA provider hooks. The hooks must be installed on both HANA database sites.

    1. [1,2] Adjust global.ini on each cluster site. If the prerequisites for the susChkSrv hook aren't met, the entire [ha_dr_provider_suschksrv] block shouldn't be configured.
      You can adjust the behavior of susChkSrv with the parameter action_on_lost. Valid values are [ ignore | stop | kill | fence ].

      # add to global.ini on both sites. Do not copy global.ini between sites.
      [ha_dr_provider_sushanasr]
      provider = susHanaSR
      path = /usr/share/SAPHanaSR-angi
      execution_order = 1
      
      [ha_dr_provider_suschksrv]
      provider = susChkSrv
      path = /usr/share/SAPHanaSR-angi
      execution_order = 3
      action_on_lost = kill
      
      [trace]
      ha_dr_sushanasr = info
      ha_dr_suschksrv = info
      

      SUSE delivers the HA hooks by default in the /usr/share/SAPHanaSR-angi directory. Using the standard ___location ensures that OS package updates automatically update the Python hook code, and HANA uses the updated code at the next restart. Alternatively, you can specify your own path, such as /hana/shared/myHooks, to decouple OS updates from the hook version you use.
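
      If you decide to use a custom path such as /hana/shared/myHooks, a copy step like the following sketch could be used. The hook file names under /usr/share/SAPHanaSR-angi are an assumption based on the package layout; adjust the ownership to your <sid>adm user.

      # Example only - copy the delivered hooks to a custom ___location and point 'path' in global.ini to it
      mkdir -p /hana/shared/myHooks
      cp /usr/share/SAPHanaSR-angi/susHanaSR.py /hana/shared/myHooks/
      cp /usr/share/SAPHanaSR-angi/susChkSrv.py /hana/shared/myHooks/
      chown -R hn1adm:sapsys /hana/shared/myHooks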

    2. [AH] The cluster requires sudoers configuration on the cluster nodes for <sid>adm. In this example, that's achieved by creating a new file. Run the following command as root. Replace <sid> with the lowercase SAP system ID, <SID> with the uppercase SAP system ID, and <siteA/B> with the chosen HANA site names.

      cat << EOF > /etc/sudoers.d/20-saphana
      # SAPHanaSR-angi requirements for HA/DR hook scripts
      Cmnd_Alias SOK_SITEA    = /usr/sbin/crm_attribute -n hana_<sid>_site_srHook_<siteA> -v SOK   -t crm_config -s SAPHanaSR
      Cmnd_Alias SFAIL_SITEA  = /usr/sbin/crm_attribute -n hana_<sid>_site_srHook_<siteA> -v SFAIL -t crm_config -s SAPHanaSR
      Cmnd_Alias SOK_SITEB    = /usr/sbin/crm_attribute -n hana_<sid>_site_srHook_<siteB> -v SOK   -t crm_config -s SAPHanaSR
      Cmnd_Alias SFAIL_SITEB  = /usr/sbin/crm_attribute -n hana_<sid>_site_srHook_<siteB> -v SFAIL -t crm_config -s SAPHanaSR
      Cmnd_Alias HELPER_TAKEOVER  = /usr/bin/SAPHanaSR-hookHelper --sid=<SID> --case=checkTakeover
      Cmnd_Alias HELPER_FENCE     = /usr/bin/SAPHanaSR-hookHelper --sid=<SID> --case=fenceMe
      
      <sid>adm ALL=(ALL) NOPASSWD: SOK_SITEA, SFAIL_SITEA, SOK_SITEB, SFAIL_SITEB, HELPER_TAKEOVER, HELPER_FENCE
      EOF
      

      For details about implementing the SAP HANA system replication hook, see Set up HANA HA/DR providers.
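
      After creating the file, you can optionally validate its syntax with visudo's check mode, for example:

      sudo visudo -c -f /etc/sudoers.d/20-saphana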


  3. [1,2] Start SAP HANA on both replication sites. Execute as <sid>adm.
sapcontrol -nr 03 -function StartSystem 
  4. [1] Verify the hook installation. Run the following command as <sap-sid>adm on the active HANA system replication site:
cdtrace    
grep HADR.*load.*susHanaSR nameserver_*.trc | tail -3
# Example output
# nameserver_hana-s1-db1.30301.453.trc:[140145]{-1}[-1/-1] 2025-05-26 07:51:34.677221 i ha_dr_provider   HADRProviderManager.cpp(00083) : loading HA/DR Provider 'susHanaSR' from /usr/share/SAPHanaSR-angi
grep susHanaSR.*init nameserver_*.trc | tail -3
# Example output
# nameserver_hana-s1-db1.30301.453.trc:[140157]{-1}[-1/-1] 2025-05-26 07:51:34.724422 i ha_dr_susHanaSR  susHanaSR.py(00042) : susHanaSR.init() version 1.001.1
  5. [AH] Verify the susChkSrv hook installation. Run the following command as <sap-sid>adm on any HANA node:
cdtrace
egrep '(LOST:|STOP:|START:|DOWN:|init|load|fail)' nameserver_suschksrv.trc
# Example output
# 2023-01-19 08:23:10.581529  [1674116590-10005] susChkSrv.init() version 0.7.7, parameter info: action_on_lost=fence stop_timeout=20 kill_signal=9
# 2023-01-19 08:23:31.553566  [1674116611-14022] START: indexserver event looks like graceful tenant start
# 2023-01-19 08:23:52.834813  [1674116632-15235] START: indexserver event looks like graceful tenant start (indexserver started)

Create SAP HANA cluster resources

  1. [1] Create the HANA Topology resource. Make sure the cluster is in maintenance mode.
sudo crm configure property maintenance-mode=true

# Replace <placeholders> with your instance number and HANA system ID

sudo crm configure primitive rsc_SAPHanaTopology_<SID>_HDB<InstNum> ocf:suse:SAPHanaTopology \
  op monitor interval="50" timeout="600" \
  op start interval="0" timeout="600" \
  op stop interval="0" timeout="300" \
  params SID="<SID>" InstanceNumber="<InstNum>"

sudo crm configure clone cln_SAPHanaTopology_<SID>_HDB<InstNum> rsc_SAPHanaTopology_<SID>_HDB<InstNum> \
  meta clone-node-max="1" interleave="true"
  2. [1] Next, create the HANA instance resource.
# Replace <placeholders> with your instance number and HANA system ID

sudo crm configure primitive rsc_SAPHanaController_<SID>_HDB<InstNum> ocf:suse:SAPHanaController \
  op start interval="0" timeout="3600" \
  op stop interval="0" timeout="3600" \
  op promote interval="0" timeout="900" \
  op demote interval="0" timeout="320" \
  op monitor interval="60" role="Promoted" timeout="700" \
  op monitor interval="61" role="Unpromoted" timeout="700" \
  params SID="<SID>" InstanceNumber="<InstNum>" PREFER_SITE_TAKEOVER="true" \
  DUPLICATE_PRIMARY_TIMEOUT="7200" AUTOMATED_REGISTER="false" \
  HANA_CALL_TIMEOUT="120"

sudo crm configure clone mst_SAPHanaController_<SID>_HDB<InstNum> rsc_SAPHanaController_<SID>_HDB<InstNum> \
  meta clone-node-max="1" interleave="true" promotable="true"

Important

As a best practice, we recommend that you set AUTOMATED_REGISTER to false only while performing thorough failover tests, to prevent a failed primary instance from automatically registering as secondary. Once the failover tests have completed successfully, set AUTOMATED_REGISTER to true, so that system replication can resume automatically after takeover.

  3. [1] Create file system resource agents for /hana/shared.

SAPHanaSR-angi adds a new resource agent, SAPHanaFilesystem, to monitor read/write access to /hana/shared/SID. The OS statically mounts the /hana/shared/SID file system, with each host having entries in /etc/fstab. SAPHanaFilesystem and Pacemaker don't mount the file system for HANA.

# Replace <placeholders> with your instance number and HANA system ID

sudo crm configure primitive rsc_SAPHanaFilesystem_<SID>_HDB<InstNum> ocf:suse:SAPHanaFilesystem \
  op start interval="0" timeout="10" \
  op stop interval="0" timeout="20" \
  op monitor interval="120" timeout="120" \
  params SID="<SID>" InstanceNumber="<InstNum>" ON_FAIL_ACTION="fence"

sudo crm configure clone cln_SAPHanaFilesystem_<SID>_HDB<InstNum> rsc_SAPHanaFilesystem_<SID>_HDB<InstNum> \
  meta clone-node-max="1" interleave="true"

# Add a ___location constraint to not run filesystem check on majority maker VM
sudo crm configure ___location loc_SAPHanaFilesystem_not_on_majority_maker cln_SAPHanaFilesystem_<SID>_HDB<InstNum> -inf: hana-s-mm
  4. [1] Continue with cluster resources for virtual IPs and constraints.
# Replace <placeholders> with your instance number and HANA system ID, and respective IP address and load balancer port  

sudo crm configure primitive rsc_ip_<SID>_HDB<InstNum> ocf:heartbeat:IPaddr2 \
  op start timeout=60s on-fail=fence \
  op monitor interval="10s" timeout="20s" \
  params ip="10.23.0.27"
  
sudo crm configure primitive rsc_nc_<SID>_HDB<InstNum> azure-lb port=62503 \
  op monitor timeout=20s interval=10 \
  meta resource-stickiness=0
  
sudo crm configure group g_ip_<SID>_HDB<InstNum> rsc_ip_<SID>_HDB<InstNum> rsc_nc_<SID>_HDB<InstNum>

Create the cluster constraints

# Colocate the IP with primary HANA node
sudo crm configure colocation col_saphana_ip_<SID>_HDB<InstNum> 4000: g_ip_<SID>_HDB<InstNum>:Started \
  mst_SAPHanaController_<SID>_HDB<InstNum>:Promoted  
  
# Start HANA Topology before HANA  instance
sudo crm configure order ord_SAPHana_<SID>_HDB<InstNum> Optional: cln_SAPHanaTopology_<SID>_HDB<InstNum> \
  mst_SAPHanaController_<SID>_HDB<InstNum>
  
# HANA resources don't run on the majority maker node
sudo crm configure ___location loc_SAPHanaController_not_on_majority_maker mst_SAPHanaController_<SID>_HDB<InstNum> -inf: hana-s-mm
sudo crm configure ___location loc_SAPHanaTopology_not_on_majority_maker cln_SAPHanaTopology_<SID>_HDB<InstNum> -inf: hana-s-mm
  5. [1] Configure additional cluster properties.
sudo crm configure rsc_defaults resource-stickiness=1000
sudo crm configure rsc_defaults migration-threshold=50
  6. [1] Place the cluster out of maintenance mode. Make sure that the cluster status is OK and that all of the resources are started.
# Cleanup any failed resources - the following command is example 
sudo crm resource cleanup rsc_SAPHana_HN1_HDB03

# Place the cluster out of maintenance mode
sudo crm configure property maintenance-mode=false
  7. [1] Verify the communication between the HANA HA hook and the cluster. The output should show status SOK for the SID and both replication sites with status P(rimary) or S(econdary).
sudo SAPHanaSR-showAttr
Global cib-update dcid prim       sec        sid topology
----------------------------------------------------------
global 0.165361.0 7    HANA_S2 HANA_S1    HN1 ScaleOut

Resource                        promotable
-------------------------------------------
msl_SAPHanaController_HN1_HDB03 true
cln_SAPHanaTopology_HN1_HDB03

Site        lpt        lss mns     opMode    srHook srMode srPoll srr
----------------------------------------------------------------------
HANA_S2  1748611494 4   hana-s2-db1 logreplay PRIM   sync   PRIM   P
HANA_S1  10         4   hana-s1-db1 logreplay SOK    sync   SFAIL  S

Host     clone_state roles                        score  site       srah version     vhost
----------------------------------------------------------------------------------------------
hana-s1-db1  DEMOTED     master1:master:worker:master 100    HANA_S1 -    2.00.074.00 hana-s1-db1
hana-s1-db2  DEMOTED     slave:slave:worker:slave     -12200 HANA_S1 -    2.00.074.00 hana-s1-db2
hana-s1-db3  DEMOTED     slave:slave:worker:slave     -12200 HANA_S1 -    2.00.074.00 hana-s1-db3
hana-s2-db1  PROMOTED    master1:master:worker:master 150    HANA_S2 -    2.00.074.00 hana-s2-db1
hana-s2-db2  DEMOTED     slave:slave:worker:slave     -10000 HANA_S2 -    2.00.074.00 hana-s2-db2
hana-s2-db3  DEMOTED     slave:slave:worker:slave     -10000 HANA_S2 -    2.00.074.00 hana-s2-db3
hana-mm                                                                               hana-mm

Note

The timeouts in the above configuration are just examples and might need to be adapted to the specific HANA setup. For instance, you might need to increase the start timeout if it takes longer to start the SAP HANA database. SAPHanaSR-angi allows further options for quicker action during a cluster event. See the SUSE documentation for details about the SAPHanaController ON_FAIL_ACTION parameter, the optional agent SAPHanaSR-alert-fencing, and other options. Implementation should be followed by additional extensive cluster testing in your environment.

Test SAP HANA failover

Note

This article contains references to terms that Microsoft no longer uses. When these terms are removed from the software, we’ll remove them from this article.

  1. Before you start a test, check the cluster and SAP HANA system replication status.

    a. Verify that there are no failed cluster actions

    #Verify that there are no failed cluster actions
    crm status
    # Example 
    #7 nodes configured
    #24 resource instances configured
    #
    #Online: [ hana-s-mm hana-s1-db1 hana-s1-db2 hana-s1-db3 hana-s2-db1 hana-s2-db2 hana-s2-db3 ]
    #
    #Full list of resources:
    #
    # stonith-sbd    (stonith:external/sbd): Started hana-s-mm
    # Clone Set: cln_fs_HN1_HDB03_fscheck [fs_HN1_HDB03_fscheck]
    #     Started: [ hana-s1-db1 hana-s1-db2 hana-s1-db3 hana-s2-db1 hana-s2-db2 hana-s2-db3 ]
    #     Stopped: [ hana-s-mm ]
    # Clone Set: cln_SAPHanaTopology_HN1_HDB03 [rsc_SAPHanaTopology_HN1_HDB03]
    #     Started: [ hana-s1-db1 hana-s1-db2 hana-s1-db3 hana-s2-db1 hana-s2-db2 hana-s2-db3 ]
    #     Stopped: [ hana-s-mm ]
    # Master/Slave Set: msl_SAPHana_HN1_HDB03 [rsc_SAPHana_HN1_HDB03]
    #     Masters: [ hana-s1-db1 ]
    #     Slaves: [ hana-s1-db2 hana-s1-db3 hana-s2-db1 hana-s2-db2 hana-s2-db3 ]
    #     Stopped: [ hana-s-mm ]
    # Resource Group: g_ip_HN1_HDB03
    #     rsc_ip_HN1_HDB03   (ocf::heartbeat:IPaddr2):       Started hana-s1-db1
    #     rsc_nc_HN1_HDB03   (ocf::heartbeat:azure-lb):      Started hana-s1-db1
    

    b. Verify that SAP HANA system replication is in sync

    # Verify HANA HSR is in sync
    sudo su - hn1adm -c "python /usr/sap/HN1/HDB03/exe/python_support/systemReplicationStatus.py"
    #| Database | Host         | Port  | Service Name | Volume ID | Site ID | Site Name | Secondary    | Secondary | Secondary | Secondary | Secondary     | Replication | Replication | Replication    |
    #|          |              |       |              |           |         |           | Host         | Port      | Site ID   | Site Name | Active Status | Mode        | Status      | Status Details |
    #| -------- | ------------ | ----- | ------------ | --------- | ------- | --------- | ------------ | --------- | --------- | --------- | ------------- | ----------- | ----------- | -------------- |
    #| SYSTEMDB | hana-s1-db1  | 30301 | nameserver   |         1 |       1 | HANA_S1   | hana-s2-db1  |     30301 |         2 | HANA_S2   | YES           | SYNC        | ACTIVE      |                |
    #| HN1      | hana-s1-db1  | 30307 | xsengine     |         2 |       1 | HANA_S1   | hana-s2-db1  |     30307 |         2 | HANA_S2   | YES           | SYNC        | ACTIVE      |                |
    #| HN1      | hana-s1-db1  | 30303 | indexserver  |         3 |       1 | HANA_S1   | hana-s2-db1  |     30303 |         2 | HANA_S2   | YES           | SYNC        | ACTIVE      |                |
    #| HN1      | hana-s1-db3  | 30303 | indexserver  |         4 |       1 | HANA_S1   | hana-s2-db3  |     30303 |         2 | HANA_S2   | YES           | SYNC        | ACTIVE      |                |
    #| HN1      | hana-s1-db2  | 30303 | indexserver  |         5 |       1 | HANA_S1   | hana-s2-db2  |     30303 |         2 | HANA_S2   | YES           | SYNC        | ACTIVE      |                |
    #
    #status system replication site "1": ACTIVE
    #overall system replication status: ACTIVE
    #
    #Local System Replication State
    #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #
    #mode: PRIMARY
    #site id: 1
    #site name: HANA_S1
    
  2. We recommend that you thoroughly validate the SAP HANA cluster configuration by performing the tests documented in HA for SAP HANA on Azure VMs on SLES and in SLES Replication scale-out Performance Optimized Scenario.

  3. Verify the cluster configuration for a failure scenario, when a node loses access to the NFS share (/hana/shared).

    The SAP HANA resource agents depend on binaries stored on /hana/shared to perform operations during failover. File system /hana/shared is mounted over NFS in the presented configuration. A test that can be performed is to create a temporary firewall rule to block access to the /hana/shared NFS-mounted file system on one of the primary site VMs. This approach validates that the cluster will fail over if access to /hana/shared is lost on the active system replication site.

    Expected result: When you block access to the /hana/shared NFS-mounted file system on one of the primary site VMs, the monitoring operation that performs read/write operations on the file system will fail, because it can't access the file system, and it will trigger a HANA resource failover. The same result is expected when your HANA node loses access to the NFS share.

    You can check the state of the cluster resources by executing crm_mon or crm status. Resource state before starting the test:

    # Output of crm_mon
    #7 nodes configured
    #24 resource instances configured
    #
    #Online: [ hana-s-mm hana-s1-db1 hana-s1-db2 hana-s1-db3 hana-s2-db1 hana-s2-db2 hana-s2-db3 ]
    #
    #Active resources:
    #
    #stonith-sbd     (stonith:external/sbd): Started hana-s-mm
    # Clone Set: cln_fs_HN1_HDB03_fscheck [fs_HN1_HDB03_fscheck]
    #     Started: [ hana-s1-db1 hana-s1-db2 hana-s1-db3 hana-s2-db1 hana-s2-db2 hana-s2-db3 ]
    # Clone Set: cln_SAPHanaTopology_HN1_HDB03 [rsc_SAPHanaTopology_HN1_HDB03]
    #     Started: [ hana-s1-db1 hana-s1-db2 hana-s1-db3 hana-s2-db1 hana-s2-db2 hana-s2-db3 ]
    # Master/Slave Set: msl_SAPHana_HN1_HDB03 [rsc_SAPHana_HN1_HDB03]
    #     Masters: [ hana-s1-db1 ]
    #     Slaves: [ hana-s1-db2 hana-s1-db3 hana-s2-db1 hana-s2-db2 hana-s2-db3 ]
    # Resource Group: g_ip_HN1_HDB03
    #     rsc_ip_HN1_HDB03   (ocf::heartbeat:IPaddr2):       Started hana-s2-db1
    #     rsc_nc_HN1_HDB03   (ocf::heartbeat:azure-lb):      Started hana-s2-db1     
    

    To simulate failure for /hana/shared:

    • If using NFS on Azure NetApp Files, first confirm the IP address for the /hana/shared Azure NetApp Files volume on the primary site. You can do that by running df -kh|grep /hana/shared.
    • If using NFS on Azure Files, first determine the IP address of the private end point for your storage account.

    Then, set up a temporary firewall rule to block access to the IP address of the /hana/shared NFS file system by executing the following command on one of the primary HANA system replication site VMs.

    In this example, the command was executed on hana-s1-db1 for Azure NetApp Files volume /hana/shared.

    iptables -A INPUT -s 10.23.1.7 -j DROP; iptables -A OUTPUT -d 10.23.1.7 -j DROP
    

    The cluster resources are migrated to the other HANA system replication site.
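
    Once the failover has completed and you're done with the test, remove the temporary firewall rules again on the same VM (a sketch mirroring the command above):

    iptables -D INPUT -s 10.23.1.7 -j DROP; iptables -D OUTPUT -d 10.23.1.7 -j DROP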

    If you set AUTOMATED_REGISTER="false", you need to configure SAP HANA system replication on the secondary site after takeover. In this case, you can execute these commands to reconfigure SAP HANA as secondary.

    # Execute on the secondary 
    su - hn1adm
    # Make sure HANA is not running on the secondary site. If it is started, stop HANA
    sapcontrol -nr 03 -function StopWait 600 10
    # Register the HANA secondary site
    hdbnsutil -sr_register --name=HANA_S1 --remoteHost=hana-s2-db1 --remoteInstance=03 --replicationMode=sync
    # Switch back to root and cleanup failed resources
    crm resource cleanup SAPHana_HN1_HDB03
    

    The state of the resources, after the test:

    # Output of crm_mon
    #7 nodes configured
    #24 resource instances configured
    #
    #Online: [ hana-s-mm hana-s1-db1 hana-s1-db2 hana-s1-db3 hana-s2-db1 hana-s2-db2 hana-s2-db3 ]
    #
    #Active resources:
    #
    #stonith-sbd     (stonith:external/sbd): Started hana-s-mm
    # Clone Set: cln_fs_HN1_HDB03_fscheck [fs_HN1_HDB03_fscheck]
    #     Started: [ hana-s1-db1 hana-s1-db2 hana-s1-db3 hana-s2-db1 hana-s2-db2 hana-s2-db3 ]
    # Clone Set: cln_SAPHanaTopology_HN1_HDB03 [rsc_SAPHanaTopology_HN1_HDB03]
    #     Started: [ hana-s1-db1 hana-s1-db2 hana-s1-db3 hana-s2-db1 hana-s2-db2 hana-s2-db3 ]
    # Master/Slave Set: msl_SAPHana_HN1_HDB03 [rsc_SAPHana_HN1_HDB03]
    #     Masters: [ hana-s2-db1 ]
    #     Slaves: [ hana-s1-db1 hana-s1-db2 hana-s1-db3 hana-s2-db2 hana-s2-db3 ]
    # Resource Group: g_ip_HN1_HDB03
    #     rsc_ip_HN1_HDB03   (ocf::heartbeat:IPaddr2):       Started hana-s2-db1
    #     rsc_nc_HN1_HDB03   (ocf::heartbeat:azure-lb):      Started hana-s2-db1
    

Next steps