VUM orchestrated upgrades allow you to upgrade the objects in your vSphere inventory in a two-step process: host upgrades followed by virtual machine upgrades. If you want the upgrade process to be fully automated, you can configure it at the cluster level, or you can configure it at the individual host or virtual machine level for more granular control.
Before going ahead with an orchestrated upgrade, we have to ensure that baseline groups are created for hosts as well as VMs. I will talk more about this later in the post.
In an orchestrated upgrade, we first remediate the cluster against the host upgrade baseline (we covered creation/remediation in our last post). Once the hosts are upgraded, we remediate the same cluster against a virtual machine upgrade baseline group containing the VM Hardware Upgrade to Match Host and VMware Tools Upgrade to Match Host baselines.
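The two-step flow above can also be driven from PowerCLI using the Update Manager cmdlets (VMware.VumAutomation module). This is a minimal sketch, assuming an existing Connect-VIServer session; the cluster and baseline names are placeholders for your environment:

```powershell
# Sketch only: assumes a live vCenter connection and the VUM PowerCLI module.
$cluster = Get-Cluster -Name 'Prod-Cluster'   # placeholder cluster name

# Step 1: remediate the cluster against the host upgrade baseline
$hostUpgrade = Get-Baseline -Name 'ESXi Host Upgrade'   # placeholder baseline name
Attach-Baseline -Baseline $hostUpgrade -Entity $cluster
Remediate-Inventory -Entity $cluster -Baseline $hostUpgrade -Confirm:$false

# Step 2: remediate the same cluster against the VM upgrade baselines
$vmBaselines = Get-Baseline -Name 'VMware Tools Upgrade to Match Host',
                                  'VM Hardware Upgrade to Match Host'
Attach-Baseline -Baseline $vmBaselines -Entity $cluster
Remediate-Inventory -Entity $cluster -Baseline $vmBaselines -Confirm:$false
```

Since these cmdlets act on live infrastructure, test them first against a non-production cluster.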
How does an orchestrated upgrade work?
- Orchestrated upgrade of ESXi Hosts
Patches/extensions and upgrades can be applied to an ESXi host by using a host baseline group. If the baseline group also contains an upgrade baseline, VUM first upgrades the ESXi hosts in the cluster and then applies the patch and/or extension baselines. This order ensures patches are not lost during the host upgrade, since patches are applicable to a specific host version.
- Orchestrated upgrade of Virtual Machines
If you are trying to upgrade both VMware Tools and the virtual hardware of all the VMs present in the vCenter inventory, you should have a baseline group with the following baselines:
- VM Hardware upgrade to match host
- VMware Tools upgrade to match host
If both baselines are present in the baseline group, the upgrade operation follows this order: VMware Tools first, then the virtual hardware of the VM.
Note: During the VMware Tools upgrade, the VMs must be powered on. If the inventory has VMs in another power state (powered off/suspended), VUM powers them on, executes the upgrade, and then restores each VM's original power state. During the virtual hardware upgrade, the VMs must be powered off. If there are running VMs in the inventory, VUM shuts them down, upgrades the virtual hardware, and powers them back on.
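Before remediating, it can help to see which VMs actually need either upgrade. A minimal PowerCLI sketch, assuming an existing vCenter connection (the HardwareVersion property requires a recent PowerCLI release):

```powershell
# Sketch only: list each VM's power state, hardware version, and Tools status
# so you can gauge the scope of the VM upgrade pass before remediation.
Get-VM |
    Select-Object Name, PowerState, HardwareVersion,
        @{Name = 'ToolsStatus'; Expression = { $_.ExtensionData.Guest.ToolsStatus }}
```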
ESXi Host Remediation
By default, remediation of ESXi hosts happens sequentially: VUM starts at the top of the list and remediates one host at a time; when the process completes for one host, VUM moves on and starts remediating the next host in the list.
We can change the cluster settings to enable parallel remediation so that more than one host can be remediated at a time. This is possible when you have adequate failover capacity in your cluster. If VUM encounters an error while remediating a host during parallel remediation, that host is skipped and the process continues with the other hosts in the cluster. By default, VUM continuously evaluates the maximum number of hosts it can remediate concurrently without disrupting DRS settings. Alternatively, an administrator can define a limit on the number of hosts that can be remediated in parallel in the remediation wizard.
If you are going for sequential remediation and any host fails to enter maintenance mode during the process, VUM reports an error and the remediation process stops and fails. The hosts that have already been remediated remain at the updated level.
Important: If your vCenter Server or VUM server is running as a VM that resides on one of the hosts VUM is trying to remediate, and DRS is enabled on the cluster, DRS will first try to migrate these VMs to another host so that remediation of that host can succeed. If DRS is unable to do so, remediation of that host fails, but the process does not stop: VUM picks the next host from the list.
Remediation of ESXi Hosts in a vSAN Cluster
If the ESXi hosts are part of a vSAN cluster, the remediation process is always sequential, even if you have selected parallel remediation in the remediation wizard. This is because only one host from a vSAN cluster can be in maintenance mode at any time; this is how vSAN is designed.
Things to do before performing remediation
For the remediation process to work smoothly and without any hassles, it is advisable to temporarily disable certain cluster features such as DPM, HA admission control, and Fault Tolerance. Also disconnect any removable devices from the virtual machines so that DRS does not run into issues while migrating VMs to another host.
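HA admission control, for example, can be toggled from PowerCLI around the remediation window. A minimal sketch, assuming an existing vCenter connection and a placeholder cluster name:

```powershell
# Sketch only: disable HA admission control before remediation...
Get-Cluster -Name 'Prod-Cluster' |
    Set-Cluster -HAAdmissionControlEnabled:$false -Confirm:$false

# ...then re-enable it once remediation has completed
Get-Cluster -Name 'Prod-Cluster' |
    Set-Cluster -HAAdmissionControlEnabled:$true -Confirm:$false
```

Remember to re-enable whatever you disabled; leaving admission control off silently removes the cluster's failover capacity guarantee.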
If you have a big environment with thousands of VMs and you are not sure how many of them have removable media attached, you can either use RVTools or write a PowerCLI script to generate a list of such VMs.
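A minimal PowerCLI sketch for that list, assuming an existing vCenter connection; it reports VMs with a connected CD/DVD drive, a common DRS migration blocker:

```powershell
# Sketch only: find VMs whose CD/DVD drive is currently connected
Get-VM |
    Get-CDDrive |
    Where-Object { $_.ConnectionState.Connected } |
    Select-Object @{Name = 'VM'; Expression = { $_.Parent.Name }}, IsoPath

# Floppy drives can be checked the same way with Get-FloppyDrive
```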
Also, while running the remediation wizard, you can use Generate Reports to verify that there are no inconsistent settings at the host/VM level that could cause the remediation process to fail.
I hope you find this post informational. Feel free to share this on social media if it is worth sharing. Be sociable 🙂