Today, we will explore an application that demands high-end hardware to host it and the highest level of performance: SAP HANA.
SAP HANA is one of the most powerful data processing platforms in the world. It’s an in-memory, column-oriented, relational database management system developed and marketed by SAP SE. SAP HANA was supported on vSphere 5.1 for non-production environments and became supported for production in Q2 2014 on vSphere 5.5.
The best practices here are gathered from a few sources, which are listed in the References section below. I’ll follow the same scheme as my previous posts and relate these best practices to our five Design Qualifiers (AMPRS – Availability, Manageability, Performance, Recoverability and Security) in addition to Scalability.
Availability:
1-) Leverage vMotion with your SAP HANA VMs. Make sure that the destination host has the required resources to run the migrated VMs.
2-) Make sure to enable DRS in Fully Automated mode on the cluster hosting SAP HANA VMs. SAP HANA supports migration of its VMs using vMotion.
3-) Use DRS anti-affinity rules to keep SAP HANA VMs apart from each other, and use VM-Host “should” affinity rules to keep SAP HANA VMs on their certified ESXi Hosts only.
4-) Make sure to add the automatic SAP HANA start parameter to the SAP HANA configuration file, so that SAP HANA restarts automatically after a reboot caused by an HA event.
5-) Try to leverage VM Monitoring to mitigate the risk of Guest OS failure. VMware Tools inside the SAP HANA VMs sends heartbeats to the HA driver on the host. If the heartbeats stop because of a Guest OS failure, the host monitors the IO and network activity of the VM for a certain period. If there is no activity either, the host restarts the VM. This adds an additional layer of availability for SAP HANA VMs. For more information, check this: vSphere HA VM Monitoring – Back to Basics | VMware vSphere Blog – VMware Blogs.
6-) Try to leverage the Symantec Application HA Agent for SAP HANA VMs with vSphere HA for max. availability. Using Application HA, the monitoring agent monitors the SAP HANA instance and its services and sends heartbeats to the HA driver on the ESXi host. In case of an application failure, it may restart the services and any dependent resources. If the Application HA Agent can’t recover the application from that failure, it stops sending heartbeats and the host initiates a VM restart as an HA action. This adds another layer of availability to your SAP HANA instances. For more information, check this pdf from VMware.
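The VM Monitoring behaviour described in point 5 boils down to a two-stage check: missing heartbeats alone are not enough, the VM must also show no I/O activity. The following is only a toy model of that decision, not VMware code; the `VmSample` type, the `should_restart` function and the 30-second `failure_interval` are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class VmSample:
    """One observation of a VM as seen by the (simulated) host."""
    seconds_since_heartbeat: float  # time since VMware Tools last sent a heartbeat
    io_bytes_in_window: int         # disk + network bytes seen in the check window


def should_restart(sample: VmSample, failure_interval: float = 30.0) -> bool:
    """Return True only when heartbeats are missing AND there is no I/O activity.

    failure_interval is an assumed value; in a real cluster it follows from the
    VM Monitoring sensitivity configured in vSphere HA.
    """
    heartbeat_lost = sample.seconds_since_heartbeat > failure_interval
    if not heartbeat_lost:
        return False  # guest is still sending heartbeats
    # Heartbeats stopped: check I/O activity before declaring the guest dead.
    return sample.io_bytes_in_window == 0


healthy = VmSample(seconds_since_heartbeat=5, io_bytes_in_window=10_000)
busy_without_heartbeat = VmSample(seconds_since_heartbeat=90, io_bytes_in_window=4_096)
hung = VmSample(seconds_since_heartbeat=90, io_bytes_in_window=0)

for name, sample in [("healthy", healthy),
                     ("busy_without_heartbeat", busy_without_heartbeat),
                     ("hung", hung)]:
    print(name, "-> restart" if should_restart(sample) else "-> leave running")
```

The second stage is what keeps a busy VM with a stopped VMware Tools service from being restarted unnecessarily.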
Performance:
1-) Follow all VMware best practices for Latency-Sensitive Applications in this pdf: Tuning Latency-Sensitive Workloads on vSphere.
2-) Configure the following BIOS Settings on each ESXi Host:
|BIOS Setting||Recommended||Description|
|Virtualization Technology||Yes||Necessary to run 64-bit guest operating systems.|
|Turbo Mode||Yes||Balanced workload over unused cores.|
|Node Interleaving||No||Disables NUMA benefits if set to Yes.|
|VT-x, AMD-V, EPT, RVI||Yes||Hardware-based virtualization support.|
|C1E Halt State||No||Disable if performance is more important than saving power.|
|Power-Saving||No||Disable if performance is more important than saving power.|
|Virus Warning||No||Disables warning messages when writing to the master boot record.|
|Hyperthreading||Yes||For use with some Intel processors. Hyperthreading is always recommended with Intel’s newer Core i7 processors such as the Xeon 5500 series.|
|Video BIOS Cacheable||No||Not necessary for database virtual machine.|
|Wake On LAN||Yes||Required for VMware vSphere Distributed Power Management feature.|
|Execute Disable||Yes||Required for vMotion and VMware vSphere Distributed Resource Scheduler (DRS) features.|
|Video BIOS Shadowable||No||Not necessary for database virtual machine.|
|Video RAM Cacheable||No||Not necessary for database virtual machine.|
|On-Board Audio||No||Not necessary for database virtual machine.|
|On-Board Modem||No||Not necessary for database virtual machine.|
|On-Board Firewire||No||Not necessary for database virtual machine.|
|On-Board Serial Ports||No||Not necessary for database virtual machine.|
|On-Board Parallel Ports||No||Not necessary for database virtual machine.|
|On-Board Game Port||No||Not necessary for database virtual machine.|
2-) Remove unnecessary services from the Guest OS (SUSE Linux), for example: iptables, autofs and cups.
3-) Turn off the SLES kernel dump function (kdump) unless it is needed for a specific reason, for example a root cause analysis.
4-) Configure the SLES kernel parameter as described below:
5-) Adhere to the shared memory settings as described below:
|Deployment Size||Shmmni Value||Physical Memory Size|
|Small||4096||≥ 24 GB & ≤ 64 GB|
|Medium||65536||≥ 64 GB & ≤ 256 GB|
|Large||53488||> 256 GB|
6-) Set VM settings to “Automatically Choose Best CPU/MMU Virtualization Mode”.
7-) CPU Sizing:
a- Assign vCPUs as required (using the Hot Add feature) and don’t over-allocate vCPUs to the VM, to prevent CPU scheduling issues at the hypervisor level and high RDY time.
b- Don’t over-commit CPUs. It’s better to keep the Virtual:Physical core ratio close to 1:1 for mission-critical SAP HANA VMs. In some cases, like test environments, over-commitment is allowed after establishing a performance baseline.
c- Enable Hyperthreading when available. It won’t double the processing power, contrary to what the doubled number of logical cores shown on the ESXi host suggests, but it’ll give a CPU processing boost of up to 10-20%. Don’t count hyperthreads when calculating the Virtual:Physical core ratio.
d- The ESXi hypervisor is NUMA-aware and leverages the NUMA topology to gain a significant performance boost. Try to size your SAP HANA VMs to fit inside a single NUMA node to benefit from NUMA node locality.
e- SAP HANA is NUMA-aware, so for large SAP HANA VMs, enabling vNUMA on wide VMs (those that span multiple NUMA nodes) will give better performance. In addition, pin each vCPU to its NUMA node to prevent migrations from one physical NUMA node to another by setting the following adv. settings in the VM Configuration Parameters:
sched.vcpu0.affinity = "0-19"
sched.vcpu1.affinity = "0-19"
sched.vcpu9.affinity = "0-19"
sched.vcpu10.affinity = "20-39"
sched.vcpu11.affinity = "20-39"
sched.vcpu19.affinity = "20-39"
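Typing one affinity line per vCPU by hand is error-prone on wide VMs, so the full list can be generated. This is a minimal sketch, assuming vCPUs are split evenly across NUMA nodes and that each node owns one contiguous range of logical CPUs (as in the 0-19 / 20-39 layout above). The function name `affinity_lines` and its parameters are hypothetical; check the actual logical CPU numbering of your host before using the result.

```python
from typing import List, Tuple


def affinity_lines(vcpus_per_node: int,
                   node_cpu_ranges: List[Tuple[int, int]]) -> List[str]:
    """Build sched.vcpuN.affinity lines, assigning vCPUs to nodes in order.

    node_cpu_ranges holds (first, last) logical CPU numbers for each NUMA node.
    """
    lines = []
    vcpu = 0
    for first, last in node_cpu_ranges:
        for _ in range(vcpus_per_node):
            lines.append(f'sched.vcpu{vcpu}.affinity = "{first}-{last}"')
            vcpu += 1
    return lines


# A 20-vCPU VM spread over two NUMA nodes with 20 logical CPUs each.
for line in affinity_lines(10, [(0, 19), (20, 39)]):
    print(line)
```

With these inputs, vCPUs 0 to 9 are pinned to logical CPUs 0-19 and vCPUs 10 to 19 to logical CPUs 20-39, matching the pattern shown above.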
8-) Memory Sizing:
a- Don’t over-commit memory, as SAP HANA is a memory-intensive application. If needed, reserve the configured memory to provide the required performance level. Keep in mind that memory reservation affects other aspects, like HA slot size and vMotion chances and duration. In addition, reserving memory removes the VM swapfiles from the datastores, so that space becomes usable for adding more VMs. In some cases, like test environments, over-commitment is allowed to get higher consolidation ratios. Performance monitoring is mandatory in this case to maintain a baseline of normal-state utilization.
b- Use Large Memory Pages (the HugePages feature in SUSE Linux 11) to give a 10% performance boost to your SAP HANA VMs. It’s enabled by default since SUSE Linux 11 SP2.
c- As Linux VMs touch only the memory pages they need when booting, setting a memory reservation won’t allocate all the reserved memory during boot; only the touched memory is allocated and reserved. For SAP HANA Linux VMs, all configured memory should be pre-allocated using the following adv. setting in the VM Configuration Parameters:
d- To achieve the absolute lowest possible latency for SAP HANA, it’s recommended to set the latency sensitivity in the VM adv. settings.
e- As SAP HANA instances usually need large memory reservations, don’t forget to calculate and account for memory overhead. For large-memory VMs, the overhead can be several GBs of memory.
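The CPU and memory sizing rules above can be turned into a quick sanity check before placing VMs on a host. The sketch below is only an illustration under simple assumptions: NUMA nodes are equal in size, hyperthreads are left out of the core count, and the per-VM memory overhead is something you supply yourself. The `Host`, `Vm` and `check_sizing` names are hypothetical, and so are the numbers in the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    physical_cores: int   # hyperthreads are NOT counted here
    memory_gb: float
    numa_nodes: int


@dataclass
class Vm:
    name: str
    vcpus: int
    memory_gb: float
    overhead_gb: float = 0.0  # assumed value, supplied by you


def check_sizing(host: Host, vms: List[Vm], max_ratio: float = 1.0) -> List[str]:
    """Return human-readable warnings; an empty list means the rules hold."""
    warnings = []
    ratio = sum(vm.vcpus for vm in vms) / host.physical_cores
    if ratio > max_ratio:
        warnings.append(f"CPU over-commit: {ratio:.2f} vCPUs per physical core")
    needed = sum(vm.memory_gb + vm.overhead_gb for vm in vms)
    if needed > host.memory_gb:
        warnings.append(f"Memory over-commit: {needed:.0f} GB needed, "
                        f"{host.memory_gb:.0f} GB installed")
    cores_per_node = host.physical_cores // host.numa_nodes
    mem_per_node = host.memory_gb / host.numa_nodes
    for vm in vms:
        if vm.vcpus > cores_per_node or vm.memory_gb > mem_per_node:
            warnings.append(f"{vm.name} does not fit inside a single NUMA node")
    return warnings


host = Host(physical_cores=40, memory_gb=1024, numa_nodes=2)
vms = [Vm("hana1", vcpus=20, memory_gb=480, overhead_gb=4),
       Vm("hana2", vcpus=24, memory_gb=500, overhead_gb=4)]
for warning in check_sizing(host, vms):
    print(warning)
```

A VM flagged as not fitting a single NUMA node is not necessarily wrong; it is simply a wide VM, and point e of CPU Sizing applies to it.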
8-) Storage Sizing:
**Check the following link by Frank Denneman: Storage requirements of SAP HANA of vSphere 5.5
a- Separate the disks of different SAP HANA VMs onto different datastores (dedicated if needed) to avoid IOps contention, as SAP HANA is an IO-intensive application with many components, each with different IOps requirements.
b- Provide at least 4 paths, through two HBAs, between each ESXi host and the Storage Array for max. availability.
c- RDM can be used in many cases, like P2V migration or to leverage a 3rd-party array-based backup tool. The choice between RDM disks and VMFS-based disks depends on your technical requirements. There is no performance difference between these two types of disks.
d- Don’t use IBM GPFS with your virtualized SAP HANA instances, as it won’t support the following:
– VMware vMotion, Distributed Resource Scheduler (DRS), Fault Tolerance (FT) and Cloning.
– N_Port ID virtualization (NPIV).
– Running on mixed VMware ESXi versions.
Keep in mind that IBM GPFS only supports running with Physical-mode RDM.
e- Use Paravirtual SCSI Driver in all of your SAP HANA VMs for max. performance, least latency and least CPU overhead.
f- Distribute the disks of each SAP HANA VM across the four allowed SCSI controllers for max. parallelism and higher IOps. It’s recommended to use Eager-zeroed Thick disks for DB and Logs disks.
g- Partition alignment gives a performance boost to your backend storage, as spindles won’t need two reads or writes to process a single request. Datastores created using the vSphere (Web) Client are natively aligned.
h- It’s recommended to use “NOOP Scheduler” as your IO scheduler in your SAP HANA Linux VMs. For more information: Linux 2.6 kernel-based virtual machines experience slow disk I/O performance.
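To verify which IO scheduler a Linux guest is actually using, you can read `/sys/block/<device>/queue/scheduler`, where the active scheduler is the entry shown in square brackets. The script below is a minimal sketch that assumes the disks appear as `sd*` devices; the function names are my own. On newer multi-queue kernels the equivalent entry is called `none` rather than `noop`.

```python
from pathlib import Path
from typing import Dict, Optional


def active_scheduler(contents: str) -> Optional[str]:
    """Extract the bracketed (active) entry from text like 'noop deadline [cfq]'."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def report(sys_block: Path = Path("/sys/block")) -> Dict[str, Optional[str]]:
    """Map each sd* block device to its active IO scheduler."""
    result = {}
    for sched_file in sorted(sys_block.glob("sd*/queue/scheduler")):
        device = sched_file.parent.parent.name  # .../sda/queue/scheduler -> sda
        result[device] = active_scheduler(sched_file.read_text())
    return result


if __name__ == "__main__":
    for device, scheduler in report().items():
        note = "" if scheduler == "noop" else "  <- not noop"
        print(f"{device}: {scheduler}{note}")
```

The script only reports the current state and changes nothing; how to set the scheduler persistently is covered in the article linked above.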
9-) Network Sizing:
a- Use VMXNet3 vNIC in all SAP HANA VMs for max. performance and throughput and least CPU overhead.
b- Try to leverage vSphere Distributed Switch (vDS) to preserve consistency in your network configuration between all ESXi Hosts. vDS also provides many advanced features that don’t exist in the Standard Switch, like Private VLANs and NetFlow.
Try to establish a performance baseline for your SAP HANA VMs and virtual infrastructure by monitoring the following:
– ESXi Hosts and VMs counters:
|Resource||Metric (esxtop/resxtop)||Metric (vSphere Client)||Description|
|CPU||%USED||Used||CPU used over the collection interval (%)|
|CPU||%RDY||Ready||CPU time spent in ready state|
|CPU||%CSTP||Co-Stop||Percentage of time a vCPU spent in ready, co-descheduled state. Only meaningful for SMP virtual machines.|
|CPU||%MLMTD|| ||Percentage of time a vCPU was ready to run but was deliberately not scheduled due to CPU limits.|
|CPU||%SWPWT|| ||Virtual machine waiting on swapped pages to be read from disk. This can indicate overcommitted memory.|
|CPU||%SYS||System||Percentage of time spent in the ESX/ESXi Server VMkernel|
|Memory|| ||Swapinrate, Swapoutrate||Memory ESX/ESXi host swaps in/out from/to disk (per virtual machine, or cumulative over host)|
|Memory||MCTLSZ (MB)||vmmemctl||Amount of memory reclaimed from resource pool by way of ballooning|
|Memory||N%L|| ||If less than 80, the virtual machine is experiencing poor NUMA locality. If the virtual machine has a memory size greater than the amount of memory local to each processor, the ESXi scheduler does not attempt to use NUMA optimizations for that virtual machine.|
|Disk||READs/s, WRITEs/s||NumberRead, NumberWrite||Reads and Writes issued in the collection interval|
|Disk||DAVG/cmd||deviceLatency||Average latency (ms) of the device (LUN)|
|Disk||KAVG/cmd||KernelLatency||Average latency (ms) in the VMkernel, also known as queuing time|
|Disk||ABRTS/s|| ||Aborts are issued by the virtual machine because the storage is not responding. For Windows virtual machines, this happens after a 60-second default. This issue can be caused by path failure, or when the storage array is not accepting I/O.|
|Disk||RESET/s|| ||The number of command resets per second.|
|Network||MbRX/s, MbTX/s||Received, Transmitted||Amount of data received/transmitted per second|
|Network||PKTRX/s, PKTTX/s||PacketsRx, PacketsTx||Received/Transmitted Packets per second|
|Network||%DRPRX, %DRPTX||DroppedRx, DroppedTx||Receive/Transmit Dropped packets per second|
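Once you have collected samples of these counters during normal operation, a baseline can be as simple as a mean and a standard deviation per metric, with later readings flagged when they drift too far. The sketch below shows that idea only. The three-sigma threshold, the function names and the sample numbers are assumptions of mine; the only value taken from the table above is the N%L limit of 80.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def build_baseline(samples: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Return (mean, standard deviation) per metric from normal-state samples."""
    return {metric: (mean(values), pstdev(values))
            for metric, values in samples.items()}


def deviations(baseline: Dict[str, Tuple[float, float]],
               current: Dict[str, float],
               n_sigma: float = 3.0) -> List[str]:
    """List metrics whose current value is more than n_sigma deviations off."""
    flagged = []
    for metric, value in current.items():
        avg, sd = baseline[metric]
        if abs(value - avg) > n_sigma * sd:
            flagged.append(metric)
    return flagged


def poor_numa_locality(n_l_percent: float) -> bool:
    """N%L below 80 indicates poor NUMA locality (threshold from the table)."""
    return n_l_percent < 80


# Made-up sample values, for illustration only.
normal = {"%RDY": [1.0, 2.0, 1.5, 1.5], "DAVG/cmd": [4.0, 5.0, 6.0, 5.0]}
baseline = build_baseline(normal)
print(deviations(baseline, {"%RDY": 6.0, "DAVG/cmd": 5.5}))
print(poor_numa_locality(72.0))
```

A mean and standard deviation is the simplest possible baseline; percentiles or time-of-day profiles may suit workloads with strong daily patterns better.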
Manageability:
1-) SAP HANA instance virtualization is supported for production with vSphere 5.5 and SAP HANA SPS 7. Check the support note in the References section below.
2-) SAP has released the use of parallel SAP HANA VMs on VMware vSphere 5.5 into controlled availability, allowing selected customers, depending on their scenarios and system sizes, to go live with this configuration.
3-) It’s recommended to use vSphere Host Profiles while configuring the ESXi Hosts that will host SAP HANA instances. Host Profiles preserve configuration consistency between the ESXi Hosts in the cluster, which is crucial for a cluster hosting SAP HANA instances to achieve high performance.
Recoverability:
1-) Use VMware Site Recovery Manager (SRM), if available, for Disaster Recovery. With SRM, automated failover to a replicated copy of the VMs in your DR site can be carried out in case of a disaster, or even the failure of a single critical SAP HANA VM in your environment.
Security:
1-) All security procedures used for securing physical SAP HANA environments should also be applied in the virtual environment, for example a Role-based Access Policy.
2-) Follow the VMware Hardening Guide (v5.1/v5.5) for more security procedures to secure both your VMs and vCenter Server.
Scalability:
1-) Try to leverage vSphere Templates in your environment. Create a Golden Template for every tier of your VMs. This reduces the time required for deploying or scaling your SAP HANA environment and preserves consistency of configuration throughout your environment.
I hope that this small guide can help with virtualizing SAP HANA instances. Resources on this topic are very limited, and many SAP documents that may contain useful instructions and best practices are only available to SAP customers, and unfortunately I’m not one. All available documents are mentioned below.
** 06/02/2015: Added Frank Denneman’s Article to Storage Sizing Section