VMFS Locking Mechanisms

In a shared storage environment, where multiple hosts access the same VMFS datastore, specific locking mechanisms are used. These locking mechanisms prevent multiple hosts from writing to the metadata concurrently and so protect against data corruption. This type of locking is called distributed locking.

distributed lock.png

To handle the case of an ESXi host crashing, distributed locks are lease based. An ESXi host that holds a lock on a datastore must renew the lease for that lock periodically; by doing so, the host signals that it is still alive. If an ESXi host has not renewed its lease for some time, the host is probably dead.

If the current owner of a lock does not renew the lease for a certain period of time, another host can break that lock. When an ESXi host wants to access a file that is locked by another host, it checks the timestamp of the lock on that file. If the timestamp has not been updated for a long enough time, the host can remove the stale lock and place its own.
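The lease idea described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `DiskLock` class, the function names, and the 16-second `LEASE_TIMEOUT` are hypothetical and chosen for illustration; they are not the on-disk VMFS format or its real timing.

```python
from dataclasses import dataclass
from typing import Optional

LEASE_TIMEOUT = 16.0  # seconds; illustrative value, not taken from VMFS


@dataclass
class DiskLock:
    owner: Optional[str] = None  # host currently holding the lock
    timestamp: float = 0.0       # last time the owner renewed its lease


def try_acquire(lock: DiskLock, host: str, now: float) -> bool:
    """Take the lock if it is free, already ours, or its lease has gone stale."""
    lease_expired = now - lock.timestamp > LEASE_TIMEOUT
    if lock.owner in (None, host) or lease_expired:
        lock.owner = host
        lock.timestamp = now
        return True
    return False


def renew(lock: DiskLock, host: str, now: float) -> bool:
    """The owner refreshes the timestamp to show that it is still alive."""
    if lock.owner != host:
        return False
    lock.timestamp = now
    return True


lock = DiskLock()
print(try_acquire(lock, "esxi-a", now=0.0))   # esxi-a takes the free lock
print(try_acquire(lock, "esxi-b", now=5.0))   # lease is fresh, esxi-b is refused
print(renew(lock, "esxi-a", now=10.0))        # esxi-a renews its lease
print(try_acquire(lock, "esxi-b", now=40.0))  # no renewal since t=10, stale lock is broken
```

A real implementation also has to make the "check and take" step atomic across hosts, which is what the locking mechanisms described later in this post provide.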

What Happens When a Host Crashes?

When an ESXi host is installed and configured, it is assigned a unique ID, known as the host UUID. When an ESXi host first mounts a VMFS volume, it is assigned a slot within the VMFS heartbeat region, where the host writes its own heartbeat record. This record is associated with the host UUID.

When an ESXi host in an HA cluster fails or crashes, it might leave behind stale locks on the VMFS datastore. These stale locks can prevent HA from restarting the failed VMs on surviving hosts in the cluster. For the restart to succeed, the host attempting to power on the VM does the following:

1. It checks the heartbeat region of the datastore for the lock owner’s ID.

2. A few seconds later, it checks whether the lock owner’s heartbeat record was updated. Because the lock owner crashed, it is not able to update its heartbeat record.

3. The recovery host ages the locks left by the crashed host. After this is done, other hosts in the cluster do not attempt to break the same stale locks.

4. The recovery host replays the heartbeat’s VMFS journal to clear and then acquire the locks.

5. When the crashed host is rebooted, it clears its own heartbeat record and acquires a new one (with a new generation number). As a result, it does not attempt to lock its original files because it is no longer the lock owner.
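The liveness check in steps 1 and 2 and the generation number in step 5 can be sketched as follows. This is a simplified model, not the VMFS on-disk layout: the `Heartbeat` and `FileLock` classes, the `counter` field, and the function names are hypothetical, and lock aging and journal replay (steps 3 and 4) are left out.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    host_uuid: str
    generation: int  # changes each time the host acquires a new heartbeat record
    counter: int     # advanced by a live host on every heartbeat update


@dataclass
class FileLock:
    owner_uuid: str
    owner_generation: int


def owner_is_alive(before: Heartbeat, after: Heartbeat) -> bool:
    """Compare two reads of the owner's heartbeat taken a few seconds apart."""
    return after.counter != before.counter


def try_recover(lock: FileLock, before: Heartbeat, after: Heartbeat,
                recovery: Heartbeat) -> bool:
    """Hand the lock to the recovery host only if the owner stopped heartbeating."""
    if owner_is_alive(before, after):
        return False
    lock.owner_uuid = recovery.host_uuid
    lock.owner_generation = recovery.generation
    return True


def still_owns(lock: FileLock, hb: Heartbeat) -> bool:
    """A host owns a lock only if both its UUID and its generation match."""
    return (lock.owner_uuid == hb.host_uuid
            and lock.owner_generation == hb.generation)


lock = FileLock(owner_uuid="uuid-a", owner_generation=1)
first_read = Heartbeat("uuid-a", generation=1, counter=100)
second_read = Heartbeat("uuid-a", generation=1, counter=100)  # unchanged: host A crashed
recovery_host = Heartbeat("uuid-b", generation=3, counter=57)

print(try_recover(lock, first_read, second_read, recovery_host))  # lock moves to host B

rebooted_a = Heartbeat("uuid-a", generation=2, counter=0)  # new generation after reboot
print(still_owns(lock, rebooted_a))  # host A no longer claims its old lock
```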

Types of Locking Mechanisms

There are 2 types of locking mechanisms:

Atomic Test and Set (ATS) Mechanism: ATS-only locking is used on all newly formatted VMFS5 datastores if the underlying storage supports it. ATS supports discrete locking per disk sector rather than locking the whole device. The SCSI reservation locking mechanism is never used for those datastores. ATS locking is also known as hardware-assisted locking.

When a VMFS5 volume is created on a LUN located on a storage array that supports the ATS primitive, the ATS Only attribute is written to the volume after the first ATS operation succeeds. From that point on, any host sharing the volume always uses ATS.

ATS+SCSI Mechanism: A VMFS datastore that supports the ATS+SCSI mechanism is configured to use ATS and attempts to use it whenever possible. If ATS fails, the VMFS datastore reverts to SCSI reservations.

SCSI reservations are used on storage devices that do not support hardware acceleration. The SCSI reservations lock an entire storage device while an operation that requires metadata protection is performed. After the operation completes, VMFS releases the reservation and other operations can continue. 

Because this lock is exclusive, excessive SCSI reservations by a host can cause performance degradation on other hosts that are accessing the same VMFS.

SCSI-Reservation.png

The table below shows which locking mechanism is used in which situation:

vmfl-1.PNG

How Does ATS Locking Work?

1. The ESXi host needs to acquire an on-disk lock for a specific VMFS resource or resources.

2. It reads the block address on which it needs to write the lock record on the array.

3. If the lock is free, it atomically writes the lock record.

4. If the host receives an error because another host may have beaten it to the lock, it retries the operation.

5. If the array returns an error, the host falls back to the standard VMFS locking mechanism, which uses SCSI-2 reservations.
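Steps 2 to 4 above amount to a compare-and-write on a single block. The sketch below models that with a toy in-memory device. Everything in it is hypothetical: `AtsDevice`, `ats_lock`, the empty byte string used as the "free" marker, and the retry count are illustrative assumptions, and the SCSI-2 fallback from step 5 is not modeled.

```python
import threading

FREE = b""  # assumed marker for "no lock record written"


class AtsDevice:
    """Toy block device offering an atomic compare-and-write per block."""

    def __init__(self) -> None:
        self._blocks = {}
        # Stands in for the atomicity that the storage array guarantees.
        self._mutex = threading.Lock()

    def read(self, lba: int) -> bytes:
        with self._mutex:
            return self._blocks.get(lba, FREE)

    def compare_and_write(self, lba: int, expected: bytes, new: bytes) -> bool:
        """Write `new` only if the block still holds `expected`."""
        with self._mutex:
            if self._blocks.get(lba, FREE) != expected:
                return False  # miscompare: someone else changed the block
            self._blocks[lba] = new
            return True


def ats_lock(dev: AtsDevice, lba: int, host: str, retries: int = 3) -> bool:
    """Try to write this host's lock record into the block at `lba`."""
    for _ in range(retries):
        current = dev.read(lba)                 # step 2: read the lock block
        if current != FREE:
            return False                        # lock is held by another host
        if dev.compare_and_write(lba, current, host.encode()):
            return True                         # step 3: atomic write succeeded
        # step 4: lost the race, so re-read and retry
    return False


dev = AtsDevice()
print(ats_lock(dev, 42, "esxi-a"))  # esxi-a writes its lock record
print(ats_lock(dev, 42, "esxi-b"))  # esxi-b finds the block already taken
print(dev.read(42))                 # the record left by esxi-a
```

Because only one block is compared and written, other hosts can keep working on the rest of the device, unlike a SCSI reservation, which locks the entire device.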

View VMFS Locking Information:

To see which locking mechanism each datastore currently uses, run the following command (esxcli storage vmfs lockmode list):

dslock-2.PNG

When the ATS+SCSI mechanism is in use, the output might look like this:

dslock.png

Upgrade the locking Mechanism to ATS-Only

If the backend storage array supports ATS and VMFS is currently using the ATS+SCSI locking mechanism, you can upgrade the locking mode to ATS-only. Before upgrading the locking mechanism, do the following:

  • Upgrade all hosts which are accessing VMFS5 datastores to the latest version.
  • Determine whether the datastore is eligible to be upgraded. See the ‘ATS compatible’ column in the screenshot above.
  • Determine if online/offline modes are acceptable.

Note: If the datastore is part of a datastore cluster, place the datastore in maintenance mode and disable Storage DRS (SDRS) for that datastore before doing an online upgrade.

Run the following command to upgrade locking method to ATS-Only:

# esxcli storage vmfs lockmode set --ats --volume-label="Name of Datastore"

Then unmount and remount the datastore.

Downgrade to ATS+SCSI:

# esxcli storage vmfs lockmode set --scsi --volume-label="Name of Datastore"

Disabling ATS Usage at the ESXi Host Level

1: Disabling ATS for Non ATS-Only VMFS datastores

# esxcli system settings advanced set --int-value 0 --option /VMFS3/HardwareAcceleratedLocking
  • This option is effective immediately and does not need any reboot or mount/unmount of volumes.
  • Disable the option on all ESXi hosts that access the VMFS datastores, so that every host uses the same locking mode.

2: Disabling ATS for ATS-Only VMFS datastores

# esxcli system settings advanced set --int-value 0 --option /VMFS3/HardwareAcceleratedLocking
  • The new value is shown immediately, but it does not take effect until the datastore is unmounted and remounted.
  • This option needs to be set on all the ESXi hosts in the cluster.
  • When complete, unmount and remount the datastores on all ESXi hosts, one host at a time.

Disabling ATS Usage Only for VMFS Heartbeats

Starting with vSphere 5.5 U2, VMFS uses ATS to update its heartbeat, rather than the plain SCSI writes used earlier.

To disable ATS heartbeat, run the following ESXCLI command:

# esxcli system settings advanced set -i 0 -o /VMFS3/UseATSForHBOnVMFS5

Note: It is not recommended to disable ATS heartbeat unless you see log entries like the following in the vmkernel.log file: “ATS Miscompare detected between test and set HB images at offset XXX on vol YYY”
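One way to check for the message quoted above is to scan the log text for it. The snippet below is a minimal sketch: the function name `find_ats_miscompares` is hypothetical, and the sample lines are made up for illustration, not real vmkernel.log output.

```python
from typing import Iterable, List

MARKER = "ATS Miscompare detected"


def find_ats_miscompares(lines: Iterable[str]) -> List[str]:
    """Return every log line that contains the ATS miscompare message."""
    return [line.rstrip("\n") for line in lines if MARKER in line]


# Hypothetical sample lines, not copied from a real host.
sample = [
    "hypothetical entry: heartbeat updated normally\n",
    "hypothetical entry: ATS Miscompare detected between test and set HB "
    "images at offset XXX on vol YYY\n",
]

for hit in find_ats_miscompares(sample):
    print(hit)
```

To run it against a real log, pass an open file object for vmkernel.log instead of the sample list.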


I hope you find this post informative. Feel free to share it on social media if you think it is worth sharing. Be sociable 🙂