Virtual machines residing on NFS storage become unresponsive during a snapshot removal operation

Article ID: VMW0002

When a backup operation is performed using HotAdd mode, virtual machines (VMs) that reside on NFS storage might stop responding during a snapshot removal operation.

Symptom

When a backup is performed using HotAdd mode for VMs residing on NFS storage, the target VMs become unresponsive for approximately 30 seconds during snapshot removal, and the snapshot removal itself takes a long time.

Cause

This is a known VMware issue that occurs with VMs on NFSv3 storage when the target virtual machine and the backup appliance reside on different hosts.

For more information, see VMware Knowledge Base article 2010953:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2010953

The issue is caused by the NFSv3 locking mechanism and how it is implemented in VMware products. NFSv4 resolves these locking issues.

Resolution

To resolve this issue, take one of the following actions:

  • Install the Virtual Server Agent (VSA) on a virtual machine on each ESXi host that hosts target VMs. The issue does not occur if the VSA is on a virtual machine on the same host where the HotAdd operation is performed.
  • Switch to iSCSI storage. Native VMFS datastores are not affected by this issue.
  • Configure the SkipHotAddForNFSLock additional setting on the VSA proxy and set the value to true. With this setting, the backup switches to NBD when this scenario is detected.
  • Move to vSphere 6.x or later and upgrade to NFSv4.
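
The conditions described in this article can be summarized as a small decision sketch. The following Python is illustrative only and is not part of any product API: the BackupContext class, the datastore labels ("NFSv3", "NFSv4", "VMFS"), the host names, and the function names are hypothetical. Only the SkipHotAddForNFSLock setting name comes from this article. The sketch assumes the issue applies when the datastore is NFSv3 and the target VM and the VSA proxy are on different hosts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupContext:
    """Hypothetical description of one backup job (not a real API type)."""
    datastore_type: str  # illustrative labels: "NFSv3", "NFSv4", "VMFS"
    vm_host: str         # ESXi host running the target VM
    proxy_host: str      # ESXi host running the VSA proxy VM


def hotadd_at_risk(ctx: BackupContext) -> bool:
    """Return True when the scenario from the Cause section applies:
    NFSv3 datastore, with the target VM and the proxy on different hosts."""
    return ctx.datastore_type == "NFSv3" and ctx.vm_host != ctx.proxy_host


def choose_transport(ctx: BackupContext, skip_hotadd_for_nfs_lock: bool) -> str:
    """Return "nbd" when the risky scenario is detected and the
    SkipHotAddForNFSLock setting is true; otherwise return "hotadd"."""
    if skip_hotadd_for_nfs_lock and hotadd_at_risk(ctx):
        return "nbd"
    return "hotadd"
```

For example, choose_transport(BackupContext("NFSv3", "esx-a", "esx-b"), True) returns "nbd", while the same call with the proxy on "esx-a" returns "hotadd" because the VM and the proxy share a host.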

For Best Practices using NFS datastores: