Virtualizing SAP HANA on vSphere 5 Best Practices

Today, we will explore an application that demands high-end hardware and the highest level of performance: SAP HANA.
SAP HANA is one of the most powerful data-processing platforms in the world. It’s an in-memory, column-oriented, relational database management system developed and marketed by SAP SE. SAP HANA was supported on vSphere 5.1 for non-production environments and became supported for production in Q2 2014 on vSphere 5.5.

The best practices here are gathered from the few sources I could find, which are listed in the References section below. I’ll follow the same scheme as in my previous posts and relate these best practices to our five Design Qualifiers (AMPRS: Availability, Manageability, Performance, Recoverability and Security), in addition to Scalability.

Availability:
1-) Leverage vMotion with your SAP HANA VMs. Make sure that the destination host has the resources required to run the migrated VMs.

2-) Make sure to enable DRS in Fully Automated mode on the cluster hosting SAP HANA VMs. SAP HANA supports migrating its VMs using vMotion.

3-) Use DRS anti-affinity rules to keep SAP HANA VMs apart from each other, and use VM-Host “should” affinity rules to keep SAP HANA VMs on their certified ESXi hosts only.

4-) Make sure to add the automatic SAP HANA start parameter to the SAP HANA configuration file so that SAP HANA restarts automatically after the VM reboots following an HA event.

5-) Try to leverage VM Monitoring to mitigate the risk of Guest OS failure. VMware Tools inside the SAP HANA VMs sends heartbeats to the HA driver on the host. If the heartbeats stop because of a Guest OS failure, the host monitors the I/O and network activity of the VM for a certain period. If there’s no activity either, the host restarts the VM. This adds an additional layer of availability for SAP HANA VMs. For more information, check this: vSphere HA VM Monitoring – Back to Basics | VMware vSphere Blog – VMware Blogs.

6-) Try to leverage the Symantec Application HA agent for SAP HANA VMs together with vSphere HA for maximum availability. With Application HA, the monitoring agent monitors the SAP HANA instance and its services and sends heartbeats to the HA driver on the ESXi host. In case of an application failure, it may restart the services and any dependent resources. If the Application HA agent can’t recover the application from that failure, it stops sending heartbeats and the host initiates a VM restart as an HA action. This adds another layer of availability to your SAP HANA instances. For more information, check this PDF from VMware.
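
The VM Monitoring and Application HA behavior described in the last two points boils down to a small decision procedure. Here is a minimal Python sketch of that logic. It is not VMware’s or Symantec’s implementation; the function name `ha_action` and its boolean inputs are hypothetical and chosen only for illustration.

```python
def ha_action(app_heartbeat_ok, tools_heartbeat_ok, io_or_network_activity):
    """Return the action an HA-enabled host would take for one VM.

    Illustrative model only; the name and inputs are hypothetical.
    """
    if not tools_heartbeat_ok:
        # Guest heartbeats stopped: the host falls back to the I/O and
        # network activity seen during the monitoring period.
        if io_or_network_activity:
            return "keep running"
        return "restart VM"
    if not app_heartbeat_ok:
        # The application agent could not recover SAP HANA and stopped
        # sending application heartbeats.
        return "restart VM"
    return "keep running"


print(ha_action(app_heartbeat_ok=True, tools_heartbeat_ok=False,
                io_or_network_activity=False))
```

The point of the two layers is visible in the branches: a guest that stops heartbeating but still shows I/O is left alone, while an application that cannot be recovered triggers a restart even though the guest itself is healthy.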

Performance:
1-) Follow all VMware best practices for Latency-Sensitive Applications in this pdf: Tuning Latency-Sensitive Workloads on vSphere.

2-) Configure the following BIOS Settings on each ESXi Host:

Setting | Recommended Value | Description
Virtualization Technology | Yes | Necessary to run 64-bit guest operating systems.
Turbo Mode | Yes | Balanced workload over unused cores.
Node Interleaving | No | Disables NUMA benefits if set to Yes.
VT-x, AMD-V, EPT, RVI | Yes | Hardware-based virtualization support.
C1E Halt State | No | Disable if performance is more important than saving power.
Power-Saving | No | Disable if performance is more important than saving power.
Virus Warning | No | Disables warning messages when writing to the master boot record.
Hyperthreading | Yes | For use with some Intel processors. Hyperthreading is always recommended with Intel’s newer Core i7 processors such as the Xeon 5500 series.
Video BIOS Cacheable | No | Not necessary for database virtual machine.
Wake On LAN | Yes | Required for VMware vSphere Distributed Power Management feature.
Execute Disable | Yes | Required for vMotion and VMware vSphere Distributed Resource Scheduler (DRS) features.
Video BIOS Shadowable | No | Not necessary for database virtual machine.
Video RAM Cacheable | No | Not necessary for database virtual machine.
On-Board Audio | No | Not necessary for database virtual machine.
On-Board Modem | No | Not necessary for database virtual machine.
On-Board Firewire | No | Not necessary for database virtual machine.
On-Board Serial Ports | No | Not necessary for database virtual machine.
On-Board Parallel Ports | No | Not necessary for database virtual machine.
On-Board Game Port | No | Not necessary for database virtual machine.

2-) Remove unnecessary services from the Guest OS (SUSE Linux), for example: iptables, autofs and cups.

3-) Turn off the SLES kernel dump function (kdump) unless it is needed for a specific reason, for example a root cause analysis.

4-) Configure the following SLES kernel parameter:
net.ipv4.tcp_slow_start_after_idle=0
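
To verify that the parameter is actually present in a configuration file, something like the following Python sketch could be used. It only parses text in the `key = value` form used by sysctl configuration files; the helper name `sysctl_value` and the sample text are assumptions made for illustration.

```python
def sysctl_value(conf_text, key):
    """Return the last value assigned to `key` in sysctl.conf-style text, or None."""
    value = None
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines ('#' or ';').
        if not line or line[0] in "#;":
            continue
        name, sep, val = line.partition("=")
        if sep and name.strip() == key:
            value = val.strip()
    return value


# Hypothetical excerpt of a sysctl configuration file.
sample = """
# network tuning for SAP HANA
net.ipv4.tcp_slow_start_after_idle = 0
"""
print(sysctl_value(sample, "net.ipv4.tcp_slow_start_after_idle") == "0")
```

The function returns the last assignment it finds, so a later duplicate line in the same text takes precedence over an earlier one.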

5-) Adhere to the shared memory settings as described below:

Deployment Size | Shmmni Value | Physical Memory Size
Small | 4GB | ≥ 24 GB and ≤ 64 GB
Medium | 64GB | ≥ 64 GB and ≤ 256 GB
Large | 53488 MB | > 256 GB

6-) Set VM settings to “Automatically Choose Best CPU/MMU Virtualization Mode”.

7-) CPU Sizing:
a- Assign vCPUs as required (using the Hot Add feature when more are needed) and don’t over-allocate vCPUs to the VM, to prevent CPU scheduling issues at the hypervisor level and high RDY time.
b- Don’t over-commit CPUs. It’s better to keep the virtual-to-physical core ratio close to 1:1 for mission-critical SAP HANA VMs. In some cases, like test environments, over-commitment is allowed after establishing a performance baseline.
c- Enable Hyperthreading when available. It won’t double the processing power, despite the ESXi host showing double the number of logical cores, but it’ll give a CPU processing boost of up to 10-20%. Don’t count it when calculating the virtual-to-physical core ratio.
d- The ESXi hypervisor is NUMA-aware and leverages the NUMA topology to gain a significant performance boost. Try to size your SAP HANA VMs to fit inside a single NUMA node to benefit from NUMA node locality.
e- For large SAP HANA VMs: SAP HANA is NUMA-aware, so enabling vNUMA on wide VMs (those that span multiple NUMA nodes) will give better performance. In addition, pin each vCPU to its NUMA node to prevent migrations from one physical NUMA node to another by setting the following advanced settings in the VM Configuration Parameters:
sched.vcpu0.affinity = "0-19"
sched.vcpu1.affinity = "0-19"
..
sched.vcpu9.affinity = "0-19"
sched.vcpu10.affinity = "20-39"
sched.vcpu11.affinity = "20-39"
..
sched.vcpu19.affinity = "20-39"
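
Typing these entries by hand gets tedious for wide VMs. As a minimal sketch, the following Python function generates the same kind of `sched.vcpuN.affinity` lines. The function name `affinity_lines` and the even split of vCPUs across nodes are assumptions; the logical-CPU range of each NUMA node must come from your own host.

```python
def affinity_lines(vcpus, nodes):
    """Build sched.vcpuN.affinity entries, spreading vCPUs evenly over NUMA nodes.

    `nodes` is a list of (first_cpu, last_cpu) logical-CPU ranges, one per
    physical NUMA node. These ranges are inputs the reader must supply.
    """
    if vcpus % len(nodes) != 0:
        raise ValueError("vCPU count must divide evenly across NUMA nodes")
    per_node = vcpus // len(nodes)
    lines = []
    for vcpu in range(vcpus):
        # Consecutive vCPUs are assigned to the same node until it is full.
        first, last = nodes[vcpu // per_node]
        lines.append(f'sched.vcpu{vcpu}.affinity = "{first}-{last}"')
    return lines


# 20 vCPUs over two nodes with logical CPUs 0-19 and 20-39, as in the example.
for line in affinity_lines(20, [(0, 19), (20, 39)]):
    print(line)
```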

8-) Memory Sizing:
a- Don’t over-commit memory, as SAP HANA is a memory-intensive application. If needed, reserve the configured memory to provide the required performance level. Keep in mind that memory reservation affects other aspects, such as HA slot size and the chances and duration of vMotion. In addition, reserving memory removes the VM swap files from the datastores, so that space becomes usable for more VMs. In some cases, like test environments, over-commitment is allowed to get higher consolidation ratios. Performance monitoring is mandatory in this case to maintain a baseline of normal-state utilization.
b- Use Large Memory Pages (the HugePages feature in SUSE Linux 11) to give a 10% performance boost to your SAP HANA VMs. It’s enabled by default since SUSE Linux 11 SP2.
c- As a Linux VM touches only the memory pages it needs when booting, setting a memory reservation for it won’t allocate all of the reserved memory during the boot process; only the touched memory is allocated and reserved. For SAP HANA Linux VMs, all configured memory should be pre-allocated using the following advanced settings in the VM Configuration Parameters:
sched.mem.prealloc=True
sched.swap.vmxSwapEnabled=False
d- In order to achieve the absolute lowest possible latency for SAP HANA, it is recommended to set the VM’s Latency Sensitivity advanced setting to High.
e- As SAP HANA instances usually need large memory reservations, don’t forget to calculate and account for the memory overhead. For large-memory VMs, the memory overhead can be several GBs.
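
A quick way to sanity-check the “don’t over-commit” and “account for overhead” points is to add up what the VMs need and compare it with the host. The following is a minimal Python sketch with a hypothetical function name and made-up figures; the per-VM overhead values are placeholders, and the sketch ignores the memory the hypervisor itself uses.

```python
def memory_headroom_gb(host_memory_gb, vms):
    """Return host memory left after VM memory plus per-VM overhead.

    `vms` is a list of (configured_gb, overhead_gb) pairs. The overhead
    figures are placeholders; real values depend on the VM configuration.
    A negative result means memory is over-committed.
    """
    needed = sum(configured + overhead for configured, overhead in vms)
    return host_memory_gb - needed


# Hypothetical host with 1024 GB of RAM and two SAP HANA VMs.
print(memory_headroom_gb(1024, [(512, 4), (256, 2)]))
```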

8-) Storage Sizing:
**Check the following link by Frank Denneman: Storage requirements of SAP HANA of vSphere 5.5
a- Place the disks of different SAP HANA VMs on different datastores (dedicated ones if needed) to avoid IOPS contention, as SAP HANA is an I/O-intensive application with many components, each with different IOPS requirements.
b- Provide at least 4 paths, through two HBAs, between each ESXi host and the storage array for maximum availability.
c- RDMs can be used in many cases, like P2V migration or to leverage a 3rd-party array-based backup tool. The choice between RDM disks and VMFS-based disks depends on your technical requirements. There is no performance difference between these two types of disks.
d- Don’t use IBM GPFS with your virtualized SAP HANA instances, as it won’t support the following:
– VMware vMotion, Distributed Resource Scheduler (DRS), Fault Tolerance (FT) and Cloning.
– N_Port ID virtualization (NPIV).
– Running on mixed VMware ESXi versions.
Keep in mind that IBM GPFS is only supported with physical-mode RDMs.
e- Use Paravirtual SCSI Driver in all of your SAP HANA VMs for max. performance, least latency and least CPU overhead.
f- Distribute the disks of each SAP HANA VM across the four allowed SCSI controllers for maximum parallelism and higher IOPS. It’s recommended to use eager-zeroed thick disks for the DB and log disks.
g- Partition alignment gives a performance boost to your backend storage, as the spindles won’t need two reads or writes to process a single request. Datastores created using the vSphere (Web) Client are natively aligned.
h- It’s recommended to use “NOOP Scheduler” as your IO scheduler in your SAP HANA Linux VMs. For more information: Linux 2.6 kernel-based virtual machines experience slow disk I/O performance.
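
On Linux, the active I/O scheduler of a block device is shown in square brackets in `/sys/block/<device>/queue/scheduler`. As a small sketch, the following Python helper extracts it from text in that format; the helper name `active_scheduler` and the sample strings are illustrative assumptions.

```python
import re


def active_scheduler(sysfs_text):
    """Return the scheduler name shown in brackets, or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else None


# Sample content in the format of /sys/block/<device>/queue/scheduler.
print(active_scheduler("noop deadline [cfq]"))
print(active_scheduler("[noop] deadline cfq"))
```

Running this against each data and log disk of a VM is a quick way to spot devices that are still using a scheduler other than NOOP.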

9-) Network Sizing:
a- Use the VMXNET3 vNIC in all SAP HANA VMs for maximum performance and throughput and the least CPU overhead.
b- Try to leverage the vSphere Distributed Switch (vDS) to keep your network configuration consistent across all ESXi hosts. vDS also provides many advanced features that don’t exist in the Standard Switch, like Private VLANs and NetFlow.

10-) Monitoring:
Try to establish a performance baseline for your SAP HANA VMs and virtual infrastructure by monitoring the following:
– ESXi Hosts and VMs counters:

Resource | Metric (esxtop/resxtop) | Metric (vSphere Client) | Description
CPU | %USED | Used | CPU used over the collection interval (%)
CPU | %RDY | Ready | CPU time spent in ready state
CPU | %CSTP | Co-Stop | Percentage of time a vCPU spent in ready, co-descheduled state. Only meaningful for SMP virtual machines.
CPU | %MLMTD | - | Percentage of time a vCPU was ready to run but was deliberately not scheduled due to CPU limits.
CPU | %SWPWT | - | Virtual machine waiting on swapped pages to be read from disk. This can indicate overcommitted memory.
CPU | %SYS | System | Percentage of time spent in the ESX/ESXi Server VMkernel
Memory | Swapin, Swapout | Swapinrate, Swapoutrate | Memory the ESX/ESXi host swaps in/out from/to disk (per virtual machine, or cumulative over host)
Memory | MCTLSZ (MB) | vmmemctl | Amount of memory reclaimed from resource pool by way of ballooning
Memory | N%L | - | If less than 80, the virtual machine is experiencing poor NUMA locality. If the virtual machine has a memory size greater than the amount of memory local to each processor, the ESXi scheduler does not attempt to use NUMA optimizations for that virtual machine.
Disk | READs/s, WRITEs/s | NumberRead, NumberWrite | Reads and writes issued in the collection interval
Disk | DAVG/cmd | deviceLatency | Average latency (ms) of the device (LUN)
Disk | KAVG/cmd | KernelLatency | Average latency (ms) in the VMkernel, also known as queuing time
Disk | ABRTS/s | - | Aborts are issued by the virtual machine because the storage is not responding. For Windows virtual machines, this happens after a 60-second default. This issue can be caused by path failure, or when the storage array is not accepting I/O.
Disk | RESET/s | - | The number of command resets per second.
Network | MbRX/s, MbTX/s | Received, Transmitted | Amount of data received/transmitted per second
Network | PKTRX/s, PKTTX/s | PacketsRx, PacketsTx | Received/transmitted packets per second
Network | %DRPRX, %DRPTX | DroppedRx, DroppedTx | Receive/transmit dropped packets per second
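
Once samples of these counters have been collected during normal operation, a baseline can be as simple as a mean and a standard deviation per metric. The following Python sketch shows one possible approach. The function names, the sample values and the three-standard-deviation tolerance are assumptions, not VMware guidance; only the N%L threshold of 80 comes from the table above.

```python
from statistics import mean, stdev


def baseline(samples):
    """Return (mean, standard deviation) for a list of metric samples."""
    return mean(samples), stdev(samples)


def is_outlier(value, base, tolerance=3.0):
    """Flag a value more than `tolerance` standard deviations from the mean."""
    avg, sd = base
    return abs(value - avg) > tolerance * sd


def poor_numa_locality(n_percent_l):
    """Apply the N%L rule from the table: below 80 indicates poor locality."""
    return n_percent_l < 80


# Hypothetical %RDY samples collected during normal operation.
ready = [1.0, 1.2, 0.9, 1.1, 1.3]
base = baseline(ready)
print(is_outlier(6.0, base), is_outlier(1.15, base), poor_numa_locality(72))
```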

Manageability:
1-) Virtualizing SAP HANA instances is supported for production with vSphere 5.5 and SAP HANA SPS 7. Check the support note in the References section below.

2-) SAP has released the use of parallel SAP HANA VMs on VMware vSphere 5.5 into controlled availability, allowing selected customers, depending on their scenarios and system sizes, to go live with this configuration.

3-) It’s recommended to use vSphere Host Profiles when configuring the ESXi hosts that will host SAP HANA instances. Host Profiles keep the configuration consistent across the ESXi hosts in the cluster, which is crucial for a cluster hosting SAP HANA instances to achieve high performance.

Recoverability:
1-) Use VMware Site Recovery Manager (SRM), if available, for disaster recovery. With SRM, automated failover to a replicated copy of the VMs in your DR site can be carried out in case of a disaster or even the failure of a single critical SAP HANA VM in your environment.

Security:
1-) All security procedures used to secure physical SAP HANA environments should also be applied in the virtual environment, for example a role-based access policy.

2-) Follow the VMware Hardening Guide (v5.1/v5.5) for more security procedures to secure both your VMs and vCenter Server.

Scalability:
1-) Try to leverage vSphere Templates in your environment. Create a Golden Template for every tier of your VMs. This reduces the time required for deploying or scaling your SAP HANA environment and preserves configuration consistency throughout your environment.

I hope that this small guide helps with virtualizing SAP HANA instances. Resources on this topic are very limited, and many SAP documents that may contain useful instructions and best practices are only available to SAP customers, and unfortunately I’m not one. All available documents are listed below.

References:
** SAP HANA on VMware – Best Practices Guide.
** SAP HANA on VMware – Support Note.
** vSphere Design Sybex 2nd Edition by Scott Lowe, Kendrick Coleman and Forbes Guthrie.

Update Log: 
** 06/02/2015: Added Frank Denneman’s Article to Storage Sizing Section
