VMware stretched cluster and VM swap file placement

A few days ago I came across a stretched cluster implementation on VMware vSphere 5.5 that caught my attention.

Background

My customer uses the following setup:

  • 8 ESXi 5.5 hosts: 4 placed in Datacenter01 and 4 placed in Datacenter02
  • 2 storage systems: 1 storage node placed in Datacenter01 and 1 storage node placed in Datacenter02
  • NFS-based datastores presented in a uniform configuration, meaning all datastores are presented to all ESXi hosts

The picture below shows my customer's logical implementation.
[Image: ST01]
Don't pay any attention to the objects in the drawing. They do not represent my customer's environment, and I will not include any information about which storage system my customer is using.

My customer has chosen to perform failover manually and not to use the witness functionality available in the storage solution. This means (at least as I understand it) that the setup does not qualify as a vSphere Metro Storage Cluster.
The storage setup will not be discussed further in this blog post.

What caught my attention was one particular thing: the virtual machine (VM) swap file datastore. This datastore was presented by the storage node in Datacenter02 but used by VMs running on ESXi hosts in both Datacenter01 and Datacenter02.
The table below outlines the relationship between the datastores and the storage nodes.

Datastore        Storage Node01   Storage Node02
DS01-VM01        x
DS01-VM02        x
DS01-VM03        x
DS01-VM04        x
DS02-VM01                         x
DS02-VM02                         x
DS02-VM03                         x
DS02-VM04                         x
DS02-VMSWAP01                     x

This setup means that VMs running on ESXi hosts in Datacenter01 must use the communication link between the datacenters during normal operation. This is something that slipped under my customer's radar when they implemented their solution.
There are automated routines that keep the VMs (apart from the VM swap file) on datastores presented by the storage node in the same datacenter as the ESXi hosts running them.
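
The kind of check that would have caught this can be sketched in a few lines of Python. This is only an illustration under assumptions: the host names (esxi01, esxi05) and VM names (vm-a, vm-b) are made up, and the inventory is typed in by hand rather than read from vCenter.

```python
# Hypothetical inventory; host and VM names are invented for illustration.
# Only the swap datastore name and its site come from the setup described above.
DATASTORE_SITE = {"DS02-VMSWAP01": "Datacenter02"}
HOST_SITE = {"esxi01": "Datacenter01", "esxi05": "Datacenter02"}
VMS = [
    {"name": "vm-a", "host": "esxi01", "swap_datastore": "DS02-VMSWAP01"},
    {"name": "vm-b", "host": "esxi05", "swap_datastore": "DS02-VMSWAP01"},
]


def cross_site_swap(vms, host_site, datastore_site):
    """Return the names of VMs whose swap datastore is served from a
    different site than the ESXi host they run on."""
    flagged = []
    for vm in vms:
        site_of_host = host_site[vm["host"]]
        site_of_swap = datastore_site[vm["swap_datastore"]]
        if site_of_host != site_of_swap:
            flagged.append(vm["name"])
    return flagged


print(cross_site_swap(VMS, HOST_SITE, DATASTORE_SITE))
```

With this sample inventory only vm-a is flagged, because it runs in Datacenter01 while its swap file lives on a datastore served from Datacenter02.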

Question

So what will happen to the VMs running on ESXi hosts in Datacenter01 if the communication to Datacenter02 (where the VM swap files are located) is broken?

Tests

I decided to perform some testing with the following setup.

  • One Windows 2008 R2 VM with 1 vCPU and 4 GB of RAM.
  • The VM virtual disk was placed on datastore DS01-VM01.
  • The VM swap file was placed on DS02-VMSWAP01.
  • Disk.AutoRemoveOnPDL was set to 0 on the ESXi host.

During the tests described below, I ran the VM at 100% CPU load, created files and folders in the VM file system, and kept a constant ping test running against the VM.

  • Test I – Broke the communication to Datacenter02 so that DS02-VMSWAP01 was inaccessible from the ESXi host where the VM runs. No VM swap activity was created.
  • Test II – Broke the communication to Datacenter02 so that DS02-VMSWAP01 was inaccessible from the ESXi host where the VM runs.
    Then created VM swap activity by setting the VM Memory Limit to 100 MB while the VM was using a minimum of 500 MB RAM.
  • Test III – Created VM swap activity by setting the VM Memory Limit to 100 MB while the VM was using a minimum of 500 MB RAM.
    Then broke the communication to Datacenter02 so that DS02-VMSWAP01 was inaccessible from the ESXi host where the VM runs.

Test results

Here are the results:

  • Test I – The VM continues to run even though the ESXi host where the VM is running does not have access to the datastore where the VM swap file is located.
  • Test II – As soon as the swap activity starts, the VM stops replying to the ping requests and the RDP session freezes.
    However, the VM process is still running on the ESXi host, verified with the following command:
    esxcli vm process list
    The vmkernel.log file on the ESXi host contains many messages like the one below:
    WARNING: Swap: vm 36326: 5641: Failed to swap out XXXX pages. No connection
  • Test III – As soon as the swap activity starts, the VM stops replying to the ping requests and the RDP session freezes.
    However, the VM process is still running on the ESXi host, verified with the following command:
    esxcli vm process list
    The vmkernel.log file on the ESXi host contains many I/O related error messages.
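
To get a quick overview of which VMs are affected, the swap warnings can be counted per VM from a copy of vmkernel.log. The Python sketch below is only an illustration: the pattern is derived from the single message quoted in Test II, so other ESXi builds may log a different format, and the sample lines are the quoted message repeated, not real log data.

```python
import re
from collections import Counter

# Pattern derived from the one message quoted above (an assumption about the
# general format). The page count is matched as any token because it is not
# given in the quoted message.
SWAP_FAIL = re.compile(r"WARNING: Swap: vm (\d+): \d+: Failed to swap out (\S+) pages")


def swap_failures_per_vm(lines):
    """Count 'Failed to swap out' warnings per VM id found in the lines."""
    counts = Counter()
    for line in lines:
        match = SWAP_FAIL.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative input: the quoted message twice plus one unrelated line.
sample = [
    "WARNING: Swap: vm 36326: 5641: Failed to swap out XXXX pages. No connection",
    "some unrelated vmkernel message",
    "WARNING: Swap: vm 36326: 5641: Failed to swap out XXXX pages. No connection",
]
print(swap_failures_per_vm(sample))
```

Because the pattern is applied with search, any timestamp or CPU prefix in front of the message is ignored. To run it against a real log, pass the open file object instead of the sample list.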

Summary

If you place the VM swap file on a different datastore than the VM home directory, make sure both datastores deliver the same availability. In this particular case we will create a VM swap datastore (DS01-VMSWAP01) presented by Storage Node01. This avoids potentially affecting all VMs in Datacenter01 if the communication link between Datacenter01 and Datacenter02 becomes unavailable.
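
The intended end state, where every host swaps to a datastore served from its own datacenter, can be expressed as a small Python sketch. The host names are made up, and the function only models the mapping. It does not configure anything in vSphere.

```python
def assign_swap_datastores(host_site, swap_datastore_site):
    """Map each host to a swap datastore served from the host's own site.

    Raises ValueError if a host's site has no local swap datastore.
    """
    local_swap = {site: ds for ds, site in swap_datastore_site.items()}
    assignment = {}
    for host, site in host_site.items():
        if site not in local_swap:
            raise ValueError(f"no local swap datastore for {host} in {site}")
        assignment[host] = local_swap[site]
    return assignment


# Hypothetical host names; the datastore names are the ones used in this post.
hosts = {"esxi01": "Datacenter01", "esxi05": "Datacenter02"}
swap_datastores = {
    "DS01-VMSWAP01": "Datacenter01",
    "DS02-VMSWAP01": "Datacenter02",
}
print(assign_swap_datastores(hosts, swap_datastores))
```

With only DS02-VMSWAP01 defined, as in the current setup, the function raises an error for the Datacenter01 host, which is exactly the gap the new datastore closes.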

If anyone has seen different behavior under the circumstances I have described, please reach out to me via the Author page on the blog.
