Multipathing and its Techniques

Multipathing: Multipathing means having more than one path from your server to its storage devices. At any given time, more than one path can be used to reach the LUNs on a storage device. Multipathing provides:

  • Redundancy
  • Path Management (Failover)
  • Bandwidth Aggregation

Native Multipathing Plugin (NMP)

This is the default multipathing plugin. It is provided by VMware and included in the ESXi installer ISO image. NMP has two kinds of sub-plugins:

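  1. Storage Array Type Plugin (SATP): This plugin tracks the state of all available paths to an array and handles array-specific operations such as activating a passive path.
  2. Path Selection Plugin (PSP): The PSP applies the path selection policy, that is, it decides which path is used for I/O based on the multipathing technique configured.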

VMware SATPs

Storage Array Type Plug-Ins (SATPs) run in conjunction with the VMware NMP and are responsible for array specific operations. ESX/ESXi offers a SATP for every type of array that VMware supports. It also provides default SATPs that support non-specific active-active and ALUA storage arrays, and the local SATP for direct-attached devices.

Each SATP accommodates special characteristics of a certain class of storage arrays and can perform the array specific operations required to detect path state and to activate an inactive path. As a result, the NMP module itself can work with multiple storage arrays without having to be aware of the storage device specifics.

After the NMP determines which SATP to use for a specific storage device and associates the SATP with the physical paths for that storage device, the SATP implements the tasks that include the following:

  • Monitors the health of each physical path.
  • Reports changes in the state of each physical path.
  • Performs array-specific actions necessary for storage fail-over. For example, for active-passive devices, it can activate passive paths.
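
The three SATP tasks listed above can be pictured with a small Python model. This is only a sketch under stated assumptions, not VMware's implementation: the `SatpSketch` class, its method names, and the rule that a recovered path rejoins as standby are all hypothetical, and the path names merely follow the ESXi runtime naming style.

```python
from enum import Enum


class PathState(Enum):
    ACTIVE = "active"
    STANDBY = "standby"
    DEAD = "dead"


class SatpSketch:
    """Toy model of the three SATP tasks (hypothetical, for illustration)."""

    def __init__(self, paths):
        self.paths = dict(paths)   # path name -> PathState
        self.events = []           # reported state changes

    def _set(self, name, new):
        # Report: record every change in a path's state.
        old = self.paths[name]
        if new is not old:
            self.paths[name] = new
            self.events.append((name, old.value, new.value))

    def probe_result(self, name, healthy):
        # Monitor: record the outcome of a health probe on one path.
        if not healthy:
            self._set(name, PathState.DEAD)
        elif self.paths[name] is PathState.DEAD:
            # Assumption: a recovered path rejoins as standby.
            self._set(name, PathState.STANDBY)

    def fail_over(self):
        # Array-specific action: activate a standby path if none is active.
        if PathState.ACTIVE in self.paths.values():
            return None
        for name, state in self.paths.items():
            if state is PathState.STANDBY:
                self._set(name, PathState.ACTIVE)
                return name
        return None


satp = SatpSketch({
    "vmhba1:C0:T0:L0": PathState.ACTIVE,
    "vmhba2:C0:T0:L0": PathState.STANDBY,
})
satp.probe_result("vmhba1:C0:T0:L0", healthy=False)
new_active = satp.fail_over()
print(new_active, satp.events)
```

In this run the active path is marked dead and the standby path is promoted, which mirrors the active-passive example in the last bullet.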

Important Note: When NMP is used, the ESXi host identifies the type of array by checking it against the /etc/vmware/esx.conf file and then associates an SATP with that array based on the array's make and model.
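
The make-and-model matching described in the note can be sketched as a first-match lookup. The Python below is a simplified illustration, not the actual ESXi logic: the rule table, the `pick_satp` helper, the "ACME" vendor and model strings, and the choice of fallback are all assumptions. Only the SATP names themselves are ones that ship with ESXi.

```python
from typing import NamedTuple, Optional


class ClaimRule(NamedTuple):
    vendor: Optional[str]   # None matches any vendor
    model: Optional[str]    # None matches any model
    satp: str


# Hypothetical rule table; the vendor and model strings are invented.
RULES = [
    ClaimRule("ACME", "FastArray", "VMW_SATP_ALUA"),
    ClaimRule("ACME", None, "VMW_SATP_DEFAULT_AP"),
]

# Assumption: unmatched devices fall back to the generic active-active SATP.
DEFAULT_SATP = "VMW_SATP_DEFAULT_AA"


def pick_satp(vendor: str, model: str) -> str:
    """Return the SATP of the first rule that matches vendor and model."""
    for rule in RULES:
        if rule.vendor not in (None, vendor):
            continue
        if rule.model not in (None, model):
            continue
        return rule.satp
    return DEFAULT_SATP


print(pick_satp("ACME", "FastArray"))
print(pick_satp("ACME", "SlowArray"))
print(pick_satp("OtherCo", "Anything"))
```

Ordering the table from most specific to least specific keeps the lookup a simple first-match scan.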

What does NMP do?

  • Manages physical path claiming and unclaiming.
  • Registers and de-registers logical devices.
  • Associates physical paths with logical devices.
  • Processes I/O requests to logical devices:
    • Selects an optimal physical path for the request (load balance)
    • Performs actions necessary to handle failures and request retries.
  • Supports management tasks such as abort or reset of logical devices.
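
The I/O handling bullets above (select a path, handle failures, retry) can be illustrated with a few lines of Python. This is a minimal sketch under assumptions: `send_with_retry` and `fake_send` are hypothetical helpers, and a path failure is modelled simply as an `OSError`.

```python
def send_with_retry(paths, send):
    """Try paths in order; return (path, result) for the first that succeeds.

    `send` is a hypothetical callable that raises OSError on a path failure.
    """
    failed = []
    for path in paths:
        try:
            return path, send(path)
        except OSError:
            failed.append(path)   # remember the failure, retry on the next path
    raise OSError(f"all paths failed: {failed}")


def fake_send(path):
    # Stand-in transport: pretend every path through vmhba1 is down.
    if path.startswith("vmhba1"):
        raise OSError("link down")
    return "ok"


used, result = send_with_retry(["vmhba1:C0:T0:L0", "vmhba2:C0:T0:L0"], fake_send)
print(used, result)
```

The request fails on the first path and is retried on the second, so the caller sees a successful I/O rather than an error.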

We can also use third-party multipathing plugins (MPPs), which are provided by storage vendors. Multiple third-party MPPs can run in parallel with the VMware NMP. When installed, the third-party MPPs replace the behavior of the NMP and take complete control of the path failover and the load-balancing operations for specified storage devices.

Pluggable Storage Architecture (PSA): PSA is a special VMkernel layer that gives the ESXi host the ability to use third-party multipathing software. Storage vendors provide their own multipathing plugins (MPPs) which, when installed on ESXi, run alongside the NMP so that failover and load balancing for that storage array can be optimized.

When coordinating the VMware NMP and any installed third-party MPPs, the PSA performs the following tasks:

  • Loads and unloads multipathing plug-ins.
  • Hides virtual machine specifics from a particular plug-in.
  • Routes I/O requests for a specific logical device to the MPP managing that device.
  • Handles I/O queuing to the logical devices.
  • Implements logical device bandwidth sharing between virtual machines.
  • Handles I/O queuing to the physical storage HBAs.
  • Handles physical path discovery and removal.
  • Provides logical device and physical path I/O statistics.
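
Three of the PSA tasks above (loading plug-ins, routing I/O to the plug-in that manages a device, and keeping per-device statistics) can be sketched in Python. Everything here is a hypothetical model: the `PsaSketch` and `PluginSketch` classes, the "VENDOR_MPP" plug-in, and the device names are assumptions made for illustration.

```python
class PluginSketch:
    """Stand-in for a multipathing plug-in (the NMP or a vendor MPP)."""

    def __init__(self, name):
        self.name = name
        self.requests = []

    def submit(self, device, request):
        self.requests.append((device, request))
        return self.name


class PsaSketch:
    def __init__(self):
        self.plugins = {}    # loaded plug-ins by name
        self.owner = {}      # logical device -> name of managing plug-in
        self.io_count = {}   # logical device -> number of routed requests

    def load(self, plugin):
        self.plugins[plugin.name] = plugin

    def claim(self, device, plugin_name):
        self.owner[device] = plugin_name

    def route(self, device, request):
        # Route the request to whichever plug-in manages this device.
        plugin = self.plugins[self.owner[device]]
        self.io_count[device] = self.io_count.get(device, 0) + 1
        return plugin.submit(device, request)


psa = PsaSketch()
psa.load(PluginSketch("NMP"))
psa.load(PluginSketch("VENDOR_MPP"))   # hypothetical third-party plug-in
psa.claim("lun-a", "NMP")
psa.claim("lun-b", "VENDOR_MPP")
handled_by = [
    psa.route("lun-a", "read"),
    psa.route("lun-b", "write"),
    psa.route("lun-b", "read"),
]
print(handled_by, psa.io_count)
```

Because routing is keyed by device, the NMP and a vendor MPP can be loaded side by side while each device is handled by exactly one of them.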

Multipathing Techniques: There are three main multipathing techniques, listed below:

1: Most Recently Used (MRU): MRU selects the first working path discovered at boot time. If that path fails, the ESXi host switches to an alternative path and keeps using it for as long as it stays available. If the original path comes back online, the ESXi host does not fail back to it. MRU is used when LUNs are presented from an Active/Passive array.

2: Fixed: In this technique the host uses the preferred path designated by the administrator; if none is designated, it uses the first working path discovered at boot time. If the preferred path becomes unavailable, the ESXi host switches to another available path, but as soon as the preferred path comes back online, the host fails back to it. This technique is mostly used when LUNs are presented from an Active/Active storage array.
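
The only behavioural difference between MRU and Fixed is what happens when the original path recovers. The Python sketch below makes that difference concrete. It is an illustrative model, not ESXi code: `next_path` and `simulate` are hypothetical helpers, and the administrator-defined path is called `preferred` here.

```python
def next_path(policy, current, preferred, alive):
    """Pick the path for the next I/O.

    policy:    "MRU" or "FIXED"
    current:   path in use right now
    preferred: administrator-defined path (only FIXED looks at it)
    alive:     ordered list of paths that are currently working
    """
    if not alive:
        raise RuntimeError("all paths are down")
    if policy == "FIXED" and preferred in alive:
        return preferred          # Fixed always returns to the preferred path
    if current in alive:
        return current            # otherwise stay on the path in use
    return alive[0]               # fail over to the first working path


def simulate(policy):
    current = "A"
    history = []
    # Path A is up, then fails, then recovers.
    for alive in (["A", "B"], ["B"], ["A", "B"]):
        current = next_path(policy, current, "A", alive)
        history.append(current)
    return history


print("MRU:  ", simulate("MRU"))
print("FIXED:", simulate("FIXED"))
```

Both policies move to path B when A fails. After A recovers, MRU stays on B while Fixed returns to A.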

3: Round Robin: In the Round Robin technique, the ESXi host rotates I/O across all available paths to the LUNs, which distributes load among the configured paths. This technique can be used for both Active/Active and Active/Passive storage arrays.

  • For Active/Active arrays all the paths are used.
  • For Active/Passive arrays, only the paths that connect to the active controller are used.
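
The rotation and the two array cases above can be sketched with a Python generator. This is a simplified model under assumptions: the `round_robin` helper, the single-letter path names, and the `ios_per_path` switching threshold are hypothetical, and each path carries a flag saying whether it leads to an active controller.

```python
from itertools import cycle, islice


def round_robin(paths, ios_per_path=1):
    """Yield the path for each successive I/O.

    paths: list of (name, leads_to_active_controller) tuples.
    Only paths to an active controller take part in the rotation.
    """
    usable = [name for name, active in paths if active]
    if not usable:
        raise RuntimeError("no usable paths")
    for name in cycle(usable):
        for _ in range(ios_per_path):
            yield name


# Active/Active array: every controller is active, so every path is used.
aa_paths = [("A", True), ("B", True), ("C", True)]
aa = list(islice(round_robin(aa_paths), 4))

# Active/Passive array: path C leads to the passive controller and is skipped.
ap_paths = [("A", True), ("B", True), ("C", False)]
ap = list(islice(round_robin(ap_paths, ios_per_path=2), 5))

print(aa)
print(ap)
```

Raising `ios_per_path` keeps more consecutive I/Os on one path before the rotation moves on.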

Apart from these three techniques, there is one more multipathing mechanism, discussed below:

ALUA (Asymmetric Logical Unit Access): Asymmetric arrays can process I/O requests through both controllers at the same time, but each individual LUN is managed by a particular controller. If an I/O request for a LUN arrives at a controller other than its managing controller, that controller proxies the traffic to the managing controller.

The ALUA SATP is used for asymmetric arrays. When an ESXi host is connected to an ALUA-capable array, the host learns that the array has multiple storage processors and which paths lead directly to a LUN’s managing controller. This allows the ESXi host to make better load-balancing and failover decisions. There are two ALUA transition modes that an array can advertise:

  • Implicit: The array itself can assign and change the managing controller for each LUN.
  • Explicit: The ESXi host can change a LUN’s managing controller.
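
How a host can use ALUA information to prefer direct paths can be sketched in Python. This is an illustration under assumptions, not ESXi code: the `path_states` and `paths_to_use` helpers, the path names, and the "SP-A"/"SP-B" controller labels are hypothetical. The two state names follow the ALUA terms active/optimized (direct) and active/non-optimized (proxied).

```python
def path_states(paths, owner):
    """Map each path to an ALUA-style state.

    paths: {path_name: controller the path lands on}
    owner: controller that currently manages the LUN
    """
    return {
        name: "active/optimized" if controller == owner else "active/non-optimized"
        for name, controller in paths.items()
    }


def paths_to_use(paths, owner):
    """Prefer direct (optimized) paths; fall back to proxied ones if none exist."""
    states = path_states(paths, owner)
    direct = [name for name, state in states.items() if state == "active/optimized"]
    return direct or list(states)


paths = {"path-1": "SP-A", "path-2": "SP-A", "path-3": "SP-B"}
before = paths_to_use(paths, owner="SP-A")
after = paths_to_use(paths, owner="SP-B")   # ownership has moved to SP-B
print(before, after)
```

Whether the ownership change was implicit (made by the array) or explicit (requested by the host), the host reacts the same way in this model: it recomputes the path states and shifts I/O to the paths that are now direct.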