VUM orchestrated vSphere upgrades

VUM orchestrated upgrades let you upgrade the objects in your vSphere inventory in a two-step process: host upgrades followed by virtual machine upgrades. If you want the upgrade process to be fully automated, you can configure it at the cluster level; for granular control, you can configure it at the individual host or virtual machine level.

Before going ahead with an orchestrated upgrade, we have to ensure that baseline groups have been created for hosts as well as VMs. I will talk more about this later in the post.

Configuring vSphere Update Manager

In the last post we learned how to configure UMDS and how to enable VUM to use a shared repository for downloading patches. If you are new to VUM/UMDS and have landed directly on this page, I would encourage you to read about them first from the links below:

1: Installing vSphere Update Manager and Update Manager Download Service

2: Configure Update Manager Download Service

I have also written a blog post in the past on creating ESXi host baselines and remediating hosts. You can read that post from here.

Configure Update Manager Download Service for VUM

Last year I wrote a post on how to install and configure VUM and UMDS, but I never got a chance to connect UMDS to VUM and ended up downloading patches directly on the VUM server via the internet.

Once again I am playing with UMDS in the lab, and in this post we will cover why we need UMDS and how to configure it.

I am not covering the steps for installing VUM/UMDS here because they are pretty straightforward. If you are new to these things, you can read the installation instructions from here.

What is Update Manager Download Service?

Update Manager Download Service (UMDS) is an optional component which you can deploy with Update Manager. Using UMDS we can download upgrades for virtual appliances, patch metadata, patch binaries and notifications.

Why do we need UMDS when VUM is there?

It is an obvious question to ask: why do we need UMDS when VUM is capable of downloading and installing patches on ESXi hosts/vApps? The answer lies in the 2 use cases discussed below:

  • If the security policies in your environment deny Internet access for the Update Manager VM(s), you can configure UMDS on a server that has Internet access, set up a web server on the VM on which UMDS is installed, and automate the export and transfer of files from UMDS to the Update Manager server.
  • There is a one-to-one mapping between VUM and vCenter. If you have multiple vCenter servers in your environment, you can save each of the ‘n’ VUM servers from downloading its own copy of the patches: configure a single repository in UMDS and point all the VUM servers to that central repository, thus saving space/resources.

After you download patch data and notifications with UMDS, and export the downloads so that they become available to the Update Manager server, Update Manager deletes the recalled patches and displays the notifications on the Notifications tab.

Exploring UMDS

Post installation of UMDS, you can use the vmware-umds command to configure the UMDS server. This executable is located in the installation directory of UMDS, which defaults to C:\Program Files (x86)\VMware\Infrastructure\Update Manager.

To list the current configuration of UMDS, run the ‘vmware-umds -G’ command:

PS C:\Program Files (x86)\VMware\Infrastructure\Update Manager> .\vmware-umds -G
Configured URLs
URL Type Removable URL
HOST NO
HOST NO https://hostupdate.vmware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml
HOST NO https://hostupdate.vmware.com/software/VUM/PRODUCTION/csco-main/csco-depot-index.xml
VA NO http://vapp-updates.vmware.com/vai-catalog/index.xml
Patch store location : C:\Patch-Store
Export store location :
Proxy Server : Not configured
Host patch content download: enabled
Host Versions for which patch content will be downloaded:
embeddedEsx-6.0.0-INTL
embeddedEsx-5.0.0-INTL
embeddedEsx-5.1.0-INTL
embeddedEsx-5.5.0-INTL
Virtual appliance content download: disabled

Split vCenter Servers configured in an Enhanced Linked Mode

Yesterday, while reading about Enhanced Linked Mode, I stumbled across this blog post by William Lam where he has demonstrated how to split vCenters which are configured in linked mode.

I thought I would give it a try in my lab as well, since these days I am playing around with PSCs, repointing and ELM.

In my lab I have 2 PSC nodes and 2 vCenter server nodes, each pointing to one of the PSCs. Both PSC nodes are in the same SSO domain/site.

vcentersrv02:~ # /usr/lib/vmware-vmafd/bin/vmafd-cli get-ls-location --server-name localhost

https://psc04.alex.local:443/lookupservice/sdk

vcentersrv03:~ # /usr/lib/vmware-vmafd/bin/vmafd-cli get-ls-location --server-name localhost

https://psc06.alex.local/lookupservice/sdk
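When checking several vCenter nodes it can be handy to reduce the lookup service URL to just the PSC host name. The snippet below is a minimal sketch of that idea; the function name ls_url_host is my own, and it assumes the URL has the same shape as the vmafd-cli output shown above.

```shell
#!/bin/sh
# Extract the PSC host name from a lookup service URL as printed by
# "vmafd-cli get-ls-location". The function name is illustrative only.
ls_url_host() {
    # Strip the scheme, then everything from the first ":" or "/" onwards.
    printf '%s\n' "$1" | sed -e 's#^[a-zA-Z]*://##' -e 's#[:/].*$##'
}

ls_url_host "https://psc04.alex.local:443/lookupservice/sdk"
ls_url_host "https://psc06.alex.local/lookupservice/sdk"
```

On a vCenter appliance the argument would come from the vmafd-cli command itself, for example by wrapping that command in $( ).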

Both PSCs are replicating with each other. I have also verified that there are no stale entries for any PSC nodes left over from my earlier lab activities.

psc04:~ # /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showservers -h psc04.alex.local -u administrator -w SSO-Admin-Pwd
cn=psc04.alex.local,cn=Servers,cn=BLR-DC3,cn=Sites,cn=Configuration,dc=alex,dc=lab
cn=psc06.alex.local,cn=Servers,cn=BLR-DC3,cn=Sites,cn=Configuration,dc=alex,dc=lab

psc04:~ # /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartners -h localhost -u administrator -w SSO-Admin-Pwd
ldap://psc06.alex.local

psc06:~ # /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartners -h localhost -u administrator -w SSO-Admin-Pwd
ldap://psc04.alex.local

vCenter Server Advanced Settings Configuration

vCenter Advanced Settings are used to modify the vpxd.cfg configuration file. To view the configuration options available under Advanced Settings, log in to the Web Client, select the vCenter server from the vCenter inventory list and navigate to Manage > Settings > Advanced Settings.

You can use Advanced Settings to add/edit entries in the vpxd.cfg file, but you cannot delete them. A user needs the Global.Settings privilege to make any configuration change from here.

For example, to see the list of available options for certificate-related settings, type certmgmt in the search box and hit enter.

Configure Linked Mode in vSphere 6

Linked Mode was first introduced in vSphere 4.x and it has come a long way with vSphere 6.0.

Enhanced Linked Mode (ELM) allows administrators to manage multiple vCenter servers from one place using the vSphere Web Client. vCenter servers in ELM can replicate roles, permissions, licenses and policies between them.

ELM also enables Cross vCenter vMotion, i.e. you can migrate virtual machines across clusters on separate vCenter instances, subject to network limitations.

Previously, linked mode configuration was only possible with Windows-based vCenter servers, as ADAM was used as the replication engine between them.

Reconfigure Embedded vCenter to External PSC

Prior to vSphere 6.0 U1 it was only possible to repoint a vCenter Server deployed with an external PSC to another PSC in the same SSO domain. With vSphere 6.0 U1, you can now reconfigure an embedded vCenter server deployment to an external deployment.

The PSC components that reside in the embedded node are demoted, and the vCenter server is repointed to an external PSC node which resides in the same Single Sign-On (SSO) domain as the source embedded node.

VMware made this possible by introducing a utility named cmsso-util, which has two main uses:

Reconfigure

  • Reconfigure is used when you want to point your vCenter server from embedded PSC to an externally deployed PSC.
  • The source and target PSC should be in same SSO domain.

Repoint

  • This is used when a vCenter is deployed with external PSC and you have one more external PSC and you want to move vCenter from source PSC to target PSC.
  • The target PSC node must be a replication member in the same SSO domain as the original PSC.

Note: You cannot repoint a VC node to a PSC node in a different SSO domain.
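The rules above can be condensed into a small decision helper. This is only a sketch of the logic described in this post, not something cmsso-util provides: the function name cmsso_action and its arguments are my own, and it assumes you already know the deployment type and both SSO domain names.

```shell
#!/bin/sh
# Decide which cmsso-util operation applies.
#   $1: current deployment, "embedded" or "external"
#   $2: SSO domain of the current PSC
#   $3: SSO domain of the target PSC
cmsso_action() {
    if [ "$2" != "$3" ]; then
        echo "unsupported"      # source and target are in different SSO domains
    elif [ "$1" = "embedded" ]; then
        echo "reconfigure"
    elif [ "$1" = "external" ]; then
        echo "repoint"
    else
        echo "unknown"
    fi
}

cmsso_action embedded alexlab.local alexlab.local
```

The domain check comes first because neither operation is possible across SSO domains.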

This post is focused on using the reconfigure option for the embedded deployment. If you are new to repointing, you can check out my previous blog posts:

How to repoint vCenter Server 6.x between External PSC within a site

Repointing vCenter Server 6.0 to External PSC’s across sites

Lab Setup:

I have one vCenter server (vcentersrv05) with an embedded PSC and one external PSC which is in the same SSO domain/site as the embedded PSC. Both the vCenter server and the external PSC have been joined to the AD domain alex.local.

The SSO domain name is alexlab.local. I have verified that the health status of both the vCenter and PSC nodes is good.

Reconfigure using cmsso-util

VMware KB-2148924 outlines the steps for this process.

Note: The reconfiguration of a vCenter Server is a one-way process, so take snapshots of the external PSC node and of the vCenter server on which you are running the reconfigure operation. Better safe than sorry.

Step 1: Log in to the vCenter Server Appliance as the root user using SSH.

Step 2: Run this command to verify that all PSC services are running:

# service-control --status --all

Step 3: Run this command for the reconfigure operation:

# /bin/cmsso-util reconfigure --repoint-psc psc_fqdn --username administrator --domain-name domain_name --passwd password

For example:

# /bin/cmsso-util reconfigure --repoint-psc psc05.alex.local --username administrator --domain-name alexlab.local --passwd SSO-Admin-Pwd

If all goes well then you should see a message similar to:

The vCenter Server has been successfully reconfigured and repointed to the external Platform Services Controller psc05.alex.local.

Step 4: Log in to the vCenter Server instance by using the vSphere Web Client and verify that the vCenter Server is running and can be managed.

Also verify which PSC your vCenter server is now pointing to.

Regenerate Certificates

Once vCenter has been reconfigured to use the new PSC, we have to regenerate the certificates, because the certificates that were issued by the old PSC no longer exist. In my lab I am not using any complex setup for certs and all certs are issued by VMCA.

In vCSA certificates can be managed using the Certificate-Manager utility:  /usr/lib/vmware-vmca/bin/certificate-manager 

I ran the certificate-manager utility and selected option 3 to replace the machine SSL certificate with a VMCA certificate. The process failed immediately after I entered the administrator credentials:

You are going to regenerate Machine SSL cert using VMCA
Continue operation : Option[Y/N] ? : Y
Status : 0% Completed [Replacing Machine SSL Cert...]
Using config file : /var/tmp/vmware/MACHINE_SSL_CERT.cfg
Error: 382312514, VMCAGetSignedCertificatePrivate() failed
Status : Failed
Error Code : 382312514
Error Message : Failed to connect to the remote host, reason = rpc_s_connect_rejected (0x16c9a042).
Status : 0% Completed [Operation failed, performing automatic rollback]

Configure Identity Sources for Single Sign-On

VMware introduced SSO with vSphere 5.1, and it has matured considerably over subsequent releases. SSO can now be connected to multiple authentication domains, such as Active Directory and LDAP, so that it can exchange authentication for tokens which are used to access multiple vSphere services.

An Identity Source is a collection of user and group data, which is stored in either Active Directory, OpenLDAP or locally in the OS.

At the time of PSC/vCenter deployment we create one identity source (the SSO domain). After the vCenter installation is completed, only the users defined under this SSO domain or localos can log in to vCenter. This identity source is internal to vCenter SSO.

Remove PSC from SSO Domain

In this post we will learn how to decommission/remove a PSC from an SSO domain. I am covering the steps needed for VCSA in this post. The steps for a Windows-based vCenter server are very similar and are explained in VMware KB-2106736.

Why do I need to do so?

In my lab I was doing a lot of new things with PSC deployments and repointing my vCenter server from one PSC to other. If you are new to how to repoint a vCenter server amongst PSC’s, please read below 2 articles:

1: How to repoint vCenter Server 6.x between External PSC within a site

2: Repointing vCenter Server 6.0 to External PSC’s across sites

At present I have 3 PSCs, namely psc02.alex.local, psc03.alex.local and psc04.alex.local. I have one vCenter server which was originally deployed with psc02 as its external PSC. First I moved my vCenter server from psc02 to psc03 (they were in the same domain/site) and then I moved it from psc03 to psc04 (they were in the same domain but in different sites).

The output of the command below lists the PSCs in the SSO domain along with their sites:

psc02:~ # /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showservers -h psc03.alex.local -u administrator -w SSO-Admin-Pwd

cn=psc02.alex.local,cn=Servers,cn=BLR-DC2,cn=Sites,cn=Configuration,dc=alex,dc=lab
cn=psc03.alex.local,cn=Servers,cn=BLR-DC2,cn=Sites,cn=Configuration,dc=alex,dc=lab

cn=psc04.alex.local,cn=Servers,cn=BLR-DC3,cn=Sites,cn=Configuration,dc=alex,dc=lab

Currently the VC is pointing to psc04:

vcentersrv02:~ # /usr/lib/vmware-vmafd/bin/vmafd-cli get-ls-location --server-name localhost
https://psc04.alex.local:443/lookupservice/sdk

Repointing vCenter Server 6.0 to External PSC’s across sites

In my last post I demonstrated how to move a vCenter server from one PSC to another. In this article we will learn how to repoint vCenter Server 6.0 between Platform Services Controllers (PSC) which are in the same domain but in different sites.

Before vSphere 6.0 U1, it was not possible to repoint a vCenter server between PSCs that were in the same domain but not in the same site. With vSphere 6.0 U1, VMware made this possible by introducing a new utility called cmsso-util.

VMware KB-2131191 outlines the steps for achieving this goal. The steps outlined in the KB are for a vCenter server with an external PSC deployment architecture.

Note: If you have an embedded vCenter 6.0, then you can use cmsso-util to change the embedded deployment model to an external PSC model. The old PSC will be decommissioned during this process. Go ahead with this configuration only if you have no plans to use your old PSC again.

This article has all the steps for doing so.

What is the difference between an SSO domain and an SSO site?

A vSphere SSO domain is similar to an Active Directory domain, and an SSO site is similar to a site within Active Directory.

An SSO domain is the boundary within which vCenter Server/PSC nodes replicate with each other. If you are using the external deployment model for PSC nodes and they are in the same SSO domain, Enhanced Linked Mode (ELM) is enabled by default and you can log into any one of the vCenter servers and manage the other vCenter servers in the same SSO domain.

You can organize the PSCs of a domain into logical sites. A site in the VMware Directory Service is a logical container for grouping Platform Services Controller instances within a vCenter Single Sign-On domain. An SSO site represents a single “instance” that is not geographically dispersed.

Building Topology Information

Before going ahead with doing the vCenter server repoint, it is important to collect the topology information about SSO site name, vCenter pointing to which PSC etc. We can use the following commands to discover the SSO topology

SSO Site

psc03:~ # /usr/lib/vmware-vmafd/bin/vmafd-cli get-site-name --server-name localhost
BLR-DC2

psc04:~ # /usr/lib/vmware-vmafd/bin/vmafd-cli get-site-name --server-name localhost

BLR-DC3

You can also use vdcrepadmin command to fetch this info as shown below:

psc03:~ # /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showservers -h psc03.alex.local -u administrator -w psc-admin-pwd

cn=psc03.alex.local,cn=Servers,cn=BLR-DC2,cn=Sites,cn=Configuration,dc=alex,dc=lab

cn=psc04.alex.local,cn=Servers,cn=BLR-DC3,cn=Sites,cn=Configuration,dc=alex,dc=lab
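The site name is embedded in each DN returned by showservers, so a short filter can turn that output into a PSC-to-site listing. The following is a minimal sketch; the function name psc_sites is my own, and it assumes the DN layout matches the output shown above (cn=host,cn=Servers,cn=site,cn=Sites,...).

```shell
#!/bin/sh
# Print "<psc-fqdn> <site>" for each server DN read from standard input.
# Lines that do not look like a server DN are ignored.
psc_sites() {
    sed -n 's/^cn=\([^,]*\),cn=Servers,cn=\([^,]*\),cn=Sites,.*$/\1 \2/p'
}

# Sample input copied from the showservers output in this post.
psc_sites <<'EOF'
cn=psc03.alex.local,cn=Servers,cn=BLR-DC2,cn=Sites,cn=Configuration,dc=alex,dc=lab
cn=psc04.alex.local,cn=Servers,cn=BLR-DC3,cn=Sites,cn=Configuration,dc=alex,dc=lab
EOF
```

In practice the vdcrepadmin output would be piped into psc_sites instead of the sample text.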

SSO Domain

psc03:~ # /usr/lib/vmware-vmafd/bin/vmafd-cli get-domain-name --server-name localhost
alex.lab

psc04:~ # /usr/lib/vmware-vmafd/bin/vmafd-cli get-domain-name --server-name localhost
alex.lab

How to repoint vCenter Server 6.x between External PSC within a site

In this post we will learn how to repoint a vCenter server with an external PSC to a new PSC. Before doing that, let’s first understand PSC high availability.

As we know, VMware introduced the concept of the PSC with vSphere 6.0. The PSC deals with identity management for administrators and applications that interact with the vSphere platform. It contains common infrastructure services such as vCenter Single Sign-On (SSO), VMware Certificate Authority (VMCA) and licensing.

To know more about PSC please read VMware KB-2113115

Since these important services live within the PSC, it is very important to make sure the PSC is highly available. A PSC can be made highly available by deploying 2 nodes and configuring a load balancer in front of them, so that in case of failure connections can be switched to the other node.

Now what if you don’t have a load balancer to configure failover? Don’t be disheartened, as VMware has a solution for this as well. The idea is to deploy one PSC node and configure the domain on it, then deploy the second PSC in the same domain and same site as the first PSC.

Instructions for doing so have been laid out in this Article

The only disadvantage of not having a load balancer is that in case of an active PSC node failure, the failover does not happen automatically and you have to manually repoint your vCenter server to the other PSC node.

Even with a load balancer for PSC HA, you are not actually getting a true load balancing. William has explained this nicely in his blog post. I was really surprised to read about load balancer’s affinity to just a single PSC node.

Limitation with PSC repointing feature

Prior to 6.0U1, you had the ability to repoint a VC node to another PSC within the same vSphere SSO site.

With 6.0 U1, some more options were made available to users. These options are:

  • Reconfigure an embedded deployment to an external deployment
  • Repoint the VC node in an external deployment to another PSC within the same SSO domain, whether it is in the same site or not

With vSphere 6.0 U2, the limitation for repointing a VC node to another PSC is still within the same vSphere SSO domain.

In vSphere 6.5 the ability to repoint a VC server to a PSC in another vSphere SSO site is not supported. See this post for details

It means that if you are running vSphere 6.5 or a build prior to vSphere 6.0 U1, you can’t repoint vCenter between PSCs which are in the same domain but in different sites.
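The version rules above can be written down as a small lookup. This is only a sketch that restates what this post says: the function name and the version labels are my own, and any version not mentioned here is reported as unknown rather than guessed.

```shell
#!/bin/sh
# Report whether repointing to a PSC in a different SSO site (same SSO
# domain) is possible, using only the version rules listed in this post.
#   $1: version label such as 6.0, 6.0U1, 6.0U2 or 6.5 (labels are my own)
cross_site_repoint() {
    case "$1" in
        6.0U1|6.0U2) echo "supported" ;;
        6.0|6.5)     echo "not supported" ;;
        *)           echo "unknown" ;;   # not covered by this post
    esac
}

cross_site_repoint 6.0U1
cross_site_repoint 6.5
```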

Things to know before going ahead with vCenter repointing

To which PSC is my vCenter server pointing?

There are 2 ways of doing so.

1: Using vmafd-cli command as shown below:

vcentersrv02:~ # /usr/lib/vmware-vmafd/bin/vmafd-cli get-ls-location --server-name localhost

https://psc02.alex.local/lookupservice/sdk

2: From vCenter Web-Client

In Web Client select your vCenter server from vCenter inventory list and navigate to Manage > Advanced Settings and search for string “config.vpxd.sso.admin.uri” 

What is the SSO domain/site name?

If you have many PSCs and vCenters deployed in your environment and each PSC/vCenter has its own domain/site name, then it is very difficult to remember these details. The SSO domain name can be retrieved with the command below; to retrieve the SSO site name, use get-site-name in place of get-domain-name:

vcentersrv02:~ # /usr/lib/vmware-vmafd/bin/vmafd-cli get-domain-name --server-name localhost

alex.lab

Finding all deployed PSCs

If you need to locate all available PSCs in your environment, you have a couple of options: the Web Client and the command line.

In Web Client navigate to Home > Administration > System Configuration > Nodes

It will list all deployed PSCs and vCenter Servers.

Alternatively, SSH to one of your PSC nodes and run the command below:

psc02:~ # /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showservers -h psc02.alex.local -u administrator -w psc-administrator-passwd
cn=psc02.alex.local,cn=Servers,cn=BLR-DC2,cn=Sites,cn=Configuration,dc=alex,dc=lab
cn=psc03.alex.local,cn=Servers,cn=BLR-DC2,cn=Sites,cn=Configuration,dc=alex,dc=lab

System Swap / Scratch Configuration in vSphere 6

When a host boots from Auto Deploy, it is very common to see the following alarms triggered on the ESXi host:

These alarms are triggered because the host booted in a diskless environment and there is no place where the system can store logs and similar data.

In this post we will focus on how to fix these issues. This article mainly covers configuring/changing the ESXi host swap and scratch partition configuration. We will start with system swap.

About System Swap

System swap is a memory reclamation process that can take advantage of unused memory resources across an entire system. In a memory contention situation, system swap allows ESXi to reclaim certain parts of memory that are not used for virtual machines. The reclaimed memory is written to a storage location.

When swap is enabled, you have a tradeoff between the impact of reclaiming the memory from another process and the ability to assign the memory to a virtual machine that can use it. Since accessing data from storage is slower than accessing data from memory, we should be very careful when determining where to store the swapped data so that the performance impact is minimal.

The ESXi host determines automatically where the system swap should be stored and marks that as the Preferred swap file location. This decision can be aided by selecting certain options. If the ESXi host does not find a feasible option, the system swap is not activated. These options are:

Datastore – Allows the use of a specific datastore to store the system swap

Host Cache – Allows the use of part of the host cache

Preferred swap file location – Allows the use of the preferred location configured for the host

Note: We need a minimum of 1 GB of free space to configure swap.

How to configure system swap:

Swap can be configured via the Web Client, host profiles and PowerCLI. We will discuss the Web Client and host profile methods here.

Configuring Swap from Web-Client

To configure/change system swap settings, log in to the vCenter Web Client and navigate to Hosts and Clusters. Select the ESXi host and go to Manage > Settings > System > System Swap. Click on the Edit button to specify the swap location.

Select the “Enabled” check box to activate system swap and check the “use host cache” and “Can use datastore specified by host” options.

If you want to specify a particular datastore where swapped data will be stored, you can check the “can use datastore” option and then select a datastore from the drop-down menu.

Configuring Swap via Host Profile

Edit your host profile, expand ‘General System Settings’ and select the sub-profile “System Swap Settings”. Enable swap and select the appropriate options as per your environment’s needs. To specify a particular datastore to be used for storing swapped data, we need to enable “Datastore Option” and provide the datastore name.

If you are a PowerCLI fan and want to do some scripting to enable swap on all hosts, then please check this article from Aaron Margeson. vBrown-Bag has also mentioned a one-liner PowerCLI script.

Scratch Partition Configuration

The scratch partition is simply a partition used to store the vm-support bundle that is generated during troubleshooting. Although it is not mandatory to have a scratch partition, it is recommended to have it configured because it is useful for troubleshooting purposes.

During ESXi installation, the installer creates a 4 GB VFAT scratch partition if one is not present on another disk. When the host boots, the system tries to find a suitable partition on a local disk to create the scratch partition. If no scratch partition exists, then the support bundle is stored on the host’s ramdisk.

In a normal situation this doesn’t seem to be a big deal, but in a memory contention situation, having a scratch partition starts to seem very important. Also, if support bundles are stored in the ramdisk, they will disappear after a host reboot.

A minimum of 5.2 GB of free space is required on the installation disk for the scratch partition to be created.

Before you start setting up the scratch location you need to decide which datastore you are going to use for it. You need to create a dedicated folder for each host so that hosts do not overwrite each other’s data.

You can create these folders either via doing a SSH to a host and then navigating to /vmfs/volumes/datastore and then use mkdir command to create folders for each host or you can do this from vSphere Web Client by browsing datastores.

Note: Datastore chosen for creating per/host folder should be visible to all ESXi hosts.
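Creating the per-host folders is easy to script if you have many hosts. Below is a minimal sketch using only mkdir -p; the function name make_scratch_dirs and the <host>/scratch folder layout are my own choices, so adapt them to your environment.

```shell
#!/bin/sh
# Create <datastore-path>/<host>/scratch for every host name given and
# print each folder path that was created.
#   $1: datastore path, e.g. /vmfs/volumes/<datastore>
#   remaining arguments: host names (used as folder names)
make_scratch_dirs() {
    base=$1
    shift
    for host in "$@"; do
        mkdir -p "$base/$host/scratch" || return 1
        echo "$base/$host/scratch"
    done
}

# Example call (not executed here):
#   make_scratch_dirs /vmfs/volumes/iSCSI-1 Esxi01 Esxi02
```

Running it once from any host that can see the shared datastore is enough, since the folders live on the datastore itself.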

There are various methods by which we can create/configure the scratch partition for ESXi hosts. The most commonly used are the Web Client, the SSH command line and PowerCLI. We will discuss all three methods one by one.

Configuring scratch partition using vSphere Web Client

Log in to the vCenter Web Client and navigate to Hosts and Clusters. Select the ESXi host, go to Manage > Settings > Advanced System Settings and type scratch in the search box.

Select the first option ‘ScratchConfig.ConfiguredScratchLocation’ and edit it by clicking on the pencil button.

Enter the full path of the folder which you have created for your host. Remember to enter the path using the datastore UUID and not the datastore name.

Reboot the ESXi host for the changes to take effect. After the host reboots, verify that the configuration is persistent.

Configuring scratch partition using Tech-Support Mode

The information about scratch partition configuration is written to the host’s /etc/vmware/locker.conf configuration file for use during the next boot.

Out of curiosity I checked what the locker.conf file looks like and what info is stored there.

[root@esxi01:~] cat /etc/vmware/locker.conf
/vmfs/volumes/5916bead-baa2874b-367f-0050560346b9 1

Now let’s proceed with creating/configuring the scratch partition using the command line.

Create a dedicated folder for your Esxi host

[root@esxi02:~] cd /vmfs/volumes/iSCSI-1/
[root@esxi02:/vmfs/volumes/591ac3ec-cc6af9a9-47c5-0050560346b9] mkdir -p Esxi02/scratch
[root@esxi02:/vmfs/volumes/591ac3ec-cc6af9a9-47c5-0050560346b9] cd Esxi02/scratch

Make a note of the full path to the folder created

/vmfs/volumes/591ac3ec-cc6af9a9-47c5-0050560346b9/Esxi02/scratch

Review the current scratch configuration

[root@esxi01:~] vim-cmd hostsvc/advopt/view ScratchConfig.ConfiguredScratchLocation
(vim.option.OptionValue) [
   (vim.option.OptionValue) {
      key = "ScratchConfig.ConfiguredScratchLocation",
      value = "/vmfs/volumes/5916bead-baa2874b-367f-0050560346b9"
   }
]
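If you want to compare the configured scratch location across several hosts, it helps to pull just the value out of the vim-cmd output. The sketch below does that with sed; the function name advopt_value is my own, and it assumes the output has the key/value layout shown above.

```shell
#!/bin/sh
# Print the text between the quotes on the 'value = "..."' line of
# "vim-cmd hostsvc/advopt/view" output read from standard input.
advopt_value() {
    sed -n 's/^.*value = "\(.*\)".*$/\1/p'
}

# Sample input copied from the output in this post.
advopt_value <<'EOF'
(vim.option.OptionValue) [
   (vim.option.OptionValue) {
      key = "ScratchConfig.ConfiguredScratchLocation",
      value = "/vmfs/volumes/5916bead-baa2874b-367f-0050560346b9"
   }
]
EOF
```

On a host, the vim-cmd command would be piped into advopt_value in place of the sample text.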

Configuring Syslog Settings on Edge Gateway in vCloud Air via Rest API

Recently I deployed a syslog server in my vCloud lab and was looking for a way to send Edge gateway logs to it. This post is focused on how to configure edge gateway syslog settings.

VMware vCloud® Air supports the ability for customers to collect information about traffic coming to and from their edge gateway through the use of a syslog server. By configuring the edge gateway to transfer log data to your syslog server, you can then set up alerts or notifications and build reports with your preferred tools.

If you do not have an ANS subscription in vCloud Air, then the only way to configure syslog settings on the Edge gateway is via the vCloud API. There is no option available in the GUI when you open the edge gateway properties from within the vCloud Director interface.

When it comes to using the REST API, we have a variety of REST clients to choose from. Some of the common clients include curl, Postman and the Mozilla REST client.

I personally prefer curl and Postman, and in this post I will demonstrate the curl option.

Requirements to Configure Syslog on Edge Gateway:

1: A REST client.

2: vCloud Air credentials.

3: vCloud Air Endpoint/Org name.

4: Configured syslog server and IP address.

Obtaining vCloud Air Endpoint/Org name

You can obtain the endpoint details by logging into vCloud Air portal and navigating to your Org/vDC.

Obtaining vCloud Air supported API versions

The list of supported API versions that can be used with vCloud Air can be obtained by running the command below.

# curl -sik -H "Accept:application/*+xml;version=5.6" -u "mjha@vmware.com" -X GET https://au-south-1-15.vchs.vmware.com/api/versions

You will get a long list of versions as output. Select any one of the versions. Also make a note of the login URL.

<VersionInfo>
 <Version>9.0</Version>
 <LoginUrl>https://au-south-1-15.vchs.vmware.com/api/compute/api/sessions</LoginUrl>
</VersionInfo>

Obtaining Auth Code for vCloud API Login

You need 4 things for generating Auth code for API login

A: Login URL (copy from previous output)

B: API Version: (copy from previous output)

C: Custom Header: Accept:application/*+xml;version=9.0

D: vCloud Air Credentials in format: username@domain-name@org-name

When you have all 4 pieces of information handy, run the API query below to obtain the Auth code:

# curl -sik -H "Accept:application/*+xml;version=9.0" -u "mjha@vmware.com@bdd75fd4-a319-47d5-b4f2-77aad691488f" -X GET https://au-south-1-15.vchs.vmware.com/api/compute/api/sessions | grep auth

Enter host password for user 'mjha@vmware.com@bdd75fd4-a319-47d5-b4f2-77aad691488f':

x-vcloud-authorization: 1e95dc1064aa4083ae79bb617221853e
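Since every later call needs this token, it is convenient to capture it in a variable instead of copying it by hand. Here is a minimal sketch of extracting it from the response headers; the function name vcloud_token is my own, and the sample headers stand in for real curl output.

```shell
#!/bin/sh
# Read HTTP response headers on standard input and print the value of the
# x-vcloud-authorization header. Carriage returns are stripped first
# because HTTP header lines end in CRLF.
vcloud_token() {
    tr -d '\r' | awk -F': *' 'tolower($1) == "x-vcloud-authorization" { print $2; exit }'
}

# Sample headers standing in for the output of "curl -sik ...".
printf 'HTTP/1.1 200 OK\r\nx-vcloud-authorization: 1e95dc1064aa4083ae79bb617221853e\r\n\r\n' | vcloud_token
```

With real output, the curl command would be piped into vcloud_token and the result stored with $( ), then passed to later calls in the x-vcloud-authorization header.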

Now use following API queries in sequence

Find Org Href

# curl -sik -H "Accept:application/*+xml;version=5.6" -H "x-vcloud-authorization:1e95dc1064aa4083ae79bb617221853e" -X GET https://au-south-1-15.vchs.vmware.com/api/org/ | grep bdd75fd4-a319-47d5-b4f2-77aad691488f

Note: bdd75fd4-a319-47d5-b4f2-77aad691488f is my org name

<Org href="https://au-south-1-15.vchs.vmware.com/api/compute/api/org/4f5feba5-bb82-456e-8898-95d4970f2624" name="bdd75fd4-a319-47d5-b4f2-77aad691488f" >
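Each step in this sequence feeds the href from one response into the next request, so a tiny helper for pulling the href out of a matching line saves some copy and paste. This is only a sketch: the function name extract_href is my own, and it is a plain text filter rather than a real XML parser.

```shell
#!/bin/sh
# Print the href attribute value from the first input line that has one.
# If a line carries several href attributes, the last one on that line
# is returned, because the leading ".*" in the pattern is greedy.
extract_href() {
    sed -n 's/^.*href="\([^"]*\)".*$/\1/p' | head -n 1
}

# Sample line copied from the Org response in this post.
echo '<Org href="https://au-south-1-15.vchs.vmware.com/api/compute/api/org/4f5feba5-bb82-456e-8898-95d4970f2624" name="bdd75fd4-a319-47d5-b4f2-77aad691488f" >' | extract_href
```

The same helper works for the vDC and edge gateway responses, as long as grep has already narrowed the output to the line of interest.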

Find vDC Href

# curl -sik -H "Accept:application/*+xml;version=5.6" -H "x-vcloud-authorization:1e95dc1064aa4083ae79bb617221853e" -X GET https://au-south-1-15.vchs.vmware.com/api/compute/api/org/4f5feba5-bb82-456e-8898-95d4970f2624 | grep vdc

<href="https://au-south-1-15.vchs.vmware.com/api/compute/api/vdc/e89232de-3507-4b66-98d7-8ec25e99c826" name="Manish-VCAP-LAB" >
 

Find Edge Gateway Href

# curl -sik -H "Accept:application/*+xml;version=5.6" -H "x-vcloud-authorization:1e95dc1064aa4083ae79bb617221853e" -X GET https://au-south-1-15.vchs.vmware.com/api/compute/api/vdc/e89232de-3507-4b66-98d7-8ec25e99c826 | grep edge

<href="https://au-south-1-15.vchs.vmware.com/api/compute/api/admin/vdc/e89232de-3507-4b66-98d7-8ec25e99c826/edgeGateways" >

Troubleshooting You must be a member of SystemConfiguration.Administrators group issue

Today, while working in the lab, I ran into a situation where I had to enable/start a service, and when I logged into the Web Client with a user that has administrative privileges I saw this error:

You must be a member of SystemConfiguration.Administrators group in vcenter Single Sign-On to access System Configuration

This error was not new to me: I have encountered it several times in the lab, and I had been working around it by logging into the Web Client as administrator@vsphere.local. I never tried to find out why I was getting this error when my other user was part of the administrators group.

Configure Core Dump Settings On vSphere 6 Hosts

In this post we will look into how to configure core dump settings on ESXi hosts. But before doing that, let’s talk a bit about what a core dump is.

What is Core Dump?

A core dump is the state of the working memory of an ESXi host at the time of a host failure such as a Purple Screen of Death (PSOD). In the event of a PSOD, the state of the VMkernel memory is sent to the server where the dump collector service is running. This server is typically your vCenter server.

Core dump information is very important when it comes to identifying and troubleshooting the issue which caused the ESXi host to show a purple screen.

By default, a core dump is saved to the local disk. You can use ESXi Dump Collector to keep core dumps on a network server for use during debugging. The core dump resides in a diagnostic partition, and in order to create that partition we require at least 100 MB of free space on a local or remotely available disk.

Some facts about core dump:

1: The core dump server service can be configured on a UDP port in the range 1025-9999 and uses port 6500 by default.

2: The network dump collector will not work if the management VMkernel port has been configured to use EtherChannel/LACP.

3: The protocol used for sending core dumps from a failed ESXi host to the dump collector service is called netdump.

4: Core dump collector is not supported over IPv6; it only supports IPv4.

The network traffic is not encrypted, and there is no authentication mechanism to ensure the integrity and validity of the data received by the dump collector service.

How to configure Core Dump on ESXi hosts?

There are various ways of configuring core dump settings on an ESXi host, including the esxcli command, host profiles, the Web Client, PowerCLI and other scripting methods. In this post I will only discuss the esxcli and host profile methods. Let’s get started.

Before running any commands on ESXi hosts to enable and configure the coredump service, we first have to start the dump collector service on the network server (vCenter Server) to which the ESXi hosts will send their core dumps.

To do so, log in to the vCenter Web Client, navigate to Home > Administration > System Configuration > Services, select the ESXi Dump Collector service and use the Actions menu to enable the service.

Once the coredump service has been enabled, you will see an option to start the service under the Actions menu.

Configuring Core Dump using esxcli utility

Available options for the coredump network namespace

[root@esxi01:~] esxcli system coredump network

Usage: esxcli system coredump network {cmd} [cmd options]
Available Commands:
 check    Check the status of the configured network dump server
 get      Get the currently configured parameters for network coredump, if enabled.
 set      Set the parameters used for network core dump

Verify whether the coredump service is enabled on the ESXi host

[root@esxi01:~] esxcli system coredump network check
Network coredump not enabled

Retrieve current configuration for coredump service

[root@esxi01:~] esxcli system coredump network get
   Enabled: false
   Host VNic:
   Is Using IPv6: false
   Network Server IP:
   Network Server Port: 0
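The set and check subcommands listed above are typically combined as in the sketch below. It only prints the commands, so you can review them before pasting them into an ESXi shell. The flags --interface-name, --server-ipv4, --server-port and --enable are the ones I know from VMware's esxcli documentation for ESXi 5.x/6.0, so confirm them with esxcli system coredump network set --help on your build. The function name, vmk0 and 192.0.2.10 are placeholder assumptions.

```shell
# Print (do not run) the esxcli commands that point a host at a network dump collector.
# $1 = VMkernel interface, $2 = collector IPv4 address, $3 = UDP port (defaults to 6500)
coredump_network_cmds() {
    vmk=$1
    server=$2
    port=${3:-6500}
    echo "esxcli system coredump network set --interface-name $vmk --server-ipv4 $server --server-port $port"
    echo "esxcli system coredump network set --enable true"
    echo "esxcli system coredump network check"
}

# Placeholder values: management vmknic vmk0, collector at 192.0.2.10, default port.
coredump_network_cmds vmk0 192.0.2.10
```

Printing instead of executing keeps the helper safe to run on a workstation. The last command is the same check used earlier to verify the result.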

Configure Centralized Logging on ESXi 6 Hosts

In this post we will learn how to configure ESXi 6 hosts to send their logs to a centralized syslog server.

Purpose of configuring syslog server?

As per VMware KB-2003322

ESXi 5.0 and higher hosts run a syslog service (vmsyslogd) that provides a standard mechanism for logging messages from the VMkernel and other system components. By default in ESXi, these logs are placed on a local scratch volume or a ramdisk.

To preserve the logs further, ESXi can be configured to place these logs to an alternate storage location on disk and to send the logs across the network to a syslog server.

Retention, rotation, and splitting of logs received and managed by a syslog server are fully controlled by that syslog server. ESXi cannot configure or control log management on a remote syslog server.

How to configure ESXi hosts for centralized logging?

There are various ways to configure syslog settings on ESXi hosts. These include:

1: Using the esxcli command on the ESXi host.

2: Using the vSphere Web Client.

3: Using the vSphere thick client.

4: Using PowerCLI.

5: Using Host Profiles.

We will look at each available method one by one. Let’s get started.

Before configuring ESXi hosts to send logs to a syslog server, we need to have a syslog server in our environment. I have configured my syslog server on a CentOS 6 box following the instructions illustrated here.

I added 2 additional lines at the bottom of the rsyslog.conf file so that each host gets its logs in its own folder:

$template TmplAuth, "/var/log/%HOSTNAME%/%PROGRAMNAME%.log"
*.* ?TmplAuth

Configuring Syslog Using esxcli utility

The command to configure syslog settings on ESXi hosts is esxcli system syslog config

Let’s first see what options are available with this command.

[root@esxi01:~] esxcli system syslog config set --help

With this command we have the following options available:

--check-ssl-certs          Verify remote SSL certificates against the local CA Store
--default-rotate=<long>    Default number of rotated local logs to keep
--default-size=<long>      Default size of local logs before rotation, in KiB
--default-timeout=<long>   Default network retry timeout in seconds if a remote server fails to respond
--drop-log-rotate=<long>   Number of rotated dropped log files to keep
--drop-log-size=<long>     Size of dropped log file before rotation, in KiB
--logdir=<str>             The directory to output local logs to
--logdir-unique            Place logs in a unique subdirectory of logdir, based on hostname
--loghost=<str>            The remote host(s) to send logs to
--queue-drop-mark=<long>   Message queue capacity after which messages are dropped
--reset=<str>              Reset values to default
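To show how the --loghost option is typically used, here is a sketch that prints the usual command sequence for pointing a host at a remote syslog server. The reload and firewall commands follow VMware KB 2003322 as I remember it, so verify them on your build. The function name and the address udp://192.0.2.20:514 are placeholder assumptions.

```shell
# Print (do not run) the commands that send ESXi logs to a remote syslog server.
# $1 = log host URL, for example udp://192.0.2.20:514
syslog_cmds() {
    loghost=$1
    echo "esxcli system syslog config set --loghost='$loghost'"
    echo "esxcli system syslog reload"
    echo "esxcli network firewall ruleset set --ruleset-id=syslog --enabled=true"
    echo "esxcli network firewall refresh"
}

# Placeholder syslog server listening on UDP 514.
syslog_cmds udp://192.0.2.20:514
```

The firewall lines are there because outbound syslog traffic is blocked on the host until the syslog ruleset is enabled.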

Using Host Profile With Auto Deploy

Last week I wrote a post on Auto Deploy configuration in vSphere 6 and deployed an ESXi host using Auto Deploy. In this post we will learn about using host profiles with Auto Deploy to customize ESXi hosts that are installed via Auto Deploy.

But before we begin creating host profiles, let’s have a brief introduction to what a host profile is and what challenges we solve by using it.

Host Profile Issue – Cluster Non Compliant – FT logging is not enabled

Recently, while working with host profiles in my lab, I faced so many issues that I got frustrated and decided to pen down my frustration. Using host profiles was not new to me, but I had not used them in the last 2 years and so had forgotten a bit about them.

The issue was that I had 2 hosts deployed via Auto Deploy and customized via a host profile, and both hosts were showing as compliant with the attached profile. It was the cluster that was unhappy, complaining about “FT is not supported” and “FT logging not enabled”.

Auto Deploy Configuration in vSphere 6

Auto Deploy is used for PXE booting and installing ESXi over the network. When a host is deployed using Auto Deploy, the state information is loaded into memory upon boot; by default the state is not permanently stored on the physical host.

Cannot Redeploy Edge Gateway “VSM response error (10020): Failed to deploy edge appliance vse-XXXX-0. The name ‘vse-XXXX-0’ already exists”

This post is very similar to the issue described in my last post. The only difference is that this time I was not able to redeploy the edge gateway to get rid of stubborn Org Networks, whereas in the previous case an edge redeploy fixed the issue quite comfortably.

Let me start with a little background on how this issue was discovered and what challenges I faced. I was investigating a failed deprovision when this issue was discovered. Deprovision tasks in our environment are fully automated: they arrive in a portal, and there is a Resume button which, when clicked, kicks off the deprovision process.

When the Resume button is clicked, the portal initiates API calls to vCD and starts deleting things. It starts by deleting vApps and vApp Templates, then proceeds to Org Network deletion, then the edge gateway, and finally deletes the Org vDC and the Org.

Sometimes objects at the vCD level are in an inconsistent state, so the API calls are unable to delete them and the deprovision is halted in the portal.

During my investigation I checked the logs and found that the API calls were unable to remove one of the Org Networks.

The following errors were visible in the vCD UI for the network deletion failure:

[ 695e10af-1677-4c64-bbe1-42250b6c249d ] Cannot delete organization VDC network default-routed (0694f25a-78b9-45b0-be44-e5c8ccda4b91)
Failed to delete interface of edge gateway urn:uuid:5286e85d-afb0-4821-b4f4-db87b390ba11

- Failed to delete interface of edge gateway urn:uuid:5286e85d-afb0-4821-b4f4-db87b390ba11
 
- com.vmware.vcloud.fabric.nsm.error.VsmException: VSM response error (202): The requested object : vm-3768 could not be found. Object identifiers are case sensitive.

From the logs it was very clear that there were issues with the edge backing VMs. I went ahead with an edge gateway redeploy without checking the status of the edge VMs in vCenter. My thinking was that a redeploy fixes this kind of issue 9 out of 10 times, so I should just give it a shot.

To my surprise, the edge gateway redeploy also failed. I also observed that the redeploy task ran for around 20 minutes (it usually takes 5-7 minutes) before it eventually timed out.

The errors for the failed edge redeploy task were:

[ e04b76e6-7bb1-4d97-a85c-0df2813a06be ] Cannot redeploy edge gateway M738162563-11503 (urn:uuid:5286e85d-afb0-4821-b4f4-db87b390ba11)
com.vmware.vcloud.fabric.nsm.error.VsmException: VSM response error (10020): Failed to deploy edge appliance vse-xxxxx-0. (The name 'vse-xxxxx-0' already exists.)

- com.vmware.vcloud.fabric.nsm.error.VsmException: VSM response error (10020): Failed to deploy edge appliance vse-xxxxx-0. (The name 'vse-xxxxx-0' already exists.)

- VSM response error (10020): Failed to deploy edge appliance vse-xxxxx-0. (The name 'vse-xxxxx-0' already exists.)
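When a stack like this repeats the same message several times, it helps to reduce it to the distinct VSM error codes before searching for them. Here is a small sketch, assuming a grep that supports -o. The function name vsm_error_codes is hypothetical and not part of any vCD tooling.

```shell
# Print each distinct "VSM response error (NNNN)" code found on stdin, in numeric order.
vsm_error_codes() {
    grep -o 'VSM response error ([0-9]*)' | sed 's/[^0-9]//g' | sort -n -u
}

# Example on two canned messages shaped like the errors in this post.
printf '%s\n' \
    'VsmException: VSM response error (10020): Failed to deploy edge appliance' \
    'VsmException: VSM response error (202): The requested object could not be found' \
    | vsm_error_codes
```

You could feed it a copy of the error text from the vCD UI or a vCD log file in the same way.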

Edge Gateway Network deletion failed with error “Failed to communicate with NSX Edge vm. Error code VIX_E_PROGRAM_NOT_STARTED was returned by VIX API”

Today, while working on a production issue, I came across an incident where I was unable to delete one of the Org Networks in vCloud Director. I observed the following errors in the vCD UI for the Org Network deletion failure:

On checking vcloud-container-debug.log I observed log entries similar to those seen in the vCD UI.

This was an entirely new error for me, so I started googling around and unfortunately did not find a helpful article. The only article I found for this error was this one, but it was of no use to me.

Detaching and Deleting Independent Disks in vCloud Director via REST API

Yesterday, while working on a production issue where we had to deprovision a tenant environment in vCloud Air, I noticed that independent disks were preventing the automated deprovision of the environment, and the error messages were loud and clear in the log files.

It was a new issue for me, so I started reading about independent disks in vCloud Director and want to share a few things about them.

First of all, the independent disk feature in vCD is completely different from an independent disk in vSphere. Independent disks can be shared across multiple vApps/VMs in vCloud Director. This feature was first introduced in vCD v5.1.

The following quote from the vCloud Architecture Toolkit document explains independent disks well:

The use of independent disks with vCloud Director allows updates of virtual machines without impacting the underlying data.

The feature is designed to enable users to create virtual disks which can be attached to and detached from virtual machines. There is no way to control this feature from the vCD UI; it can be controlled only via the API.

When you create an independent disk, it is associated with an organization vDC but not with a virtual machine. After the disk has been created, the disk owner or an administrator can attach it to any virtual machine deployed in that vDC, detach it from a virtual machine, and remove it from the vDC.

The presence of independent disks in vCD can be seen by navigating to Org > Administration > Org vDC > Independent Disks tab. If you right-click on any of the disks, no action menu opens.

In this post I am going to demonstrate how we can detach and delete independent disks from a VM via API calls. Let’s get started.

For the sake of this demonstration, I have used some hypothetical names for the Org and Org vDC.

Step 1: Obtain vCD Auth token code

# curl -sik -H "Accept:application/*+xml;version=5.6" -u "admin@system" -X POST https://vCD-FQDN/api/sessions | grep auth

Enter host password for user 'admin@system':

x-vcloud-authorization: Auth
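Rather than copying the token by hand, you can capture it in a shell variable. Here is a minimal sketch that reads the response headers and prints only the token value. The function name extract_vcd_token and the variable TOKEN are my own hypothetical names. The header name is the one shown above.

```shell
# Read HTTP response headers on stdin and print the x-vcloud-authorization value.
# Carriage returns are stripped because HTTP header lines end in CRLF.
extract_vcd_token() {
    tr -d '\r' | grep -i '^x-vcloud-authorization:' | head -n 1 | sed 's/^[^:]*:[[:space:]]*//'
}

# Intended use (commented out so that nothing contacts a server here):
# TOKEN=$(curl -sik -H "Accept:application/*+xml;version=5.6" -u "admin@system" \
#     -X POST https://vCD-FQDN/api/sessions | extract_vcd_token)

# Example on canned headers.
printf 'HTTP/1.1 200 OK\r\nx-vcloud-authorization: abc123\r\n' | extract_vcd_token
# prints abc123
```

The later calls can then pass -H "x-vcloud-authorization: $TOKEN" instead of a pasted value.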

Step 2: Locate your Org 

# curl -sik -H "Accept:application/*+xml;version=5.6" -H "x-vcloud-authorization: Auth" -X GET https://vCD-FQDN/api/org/

The above API call returns an href for every Org present in vCD. For your next query, choose the href of the Org where the independent disks reside.

<Org href="https://vCD-FQDN:443/api/org/08356307-2939-42d3-a2a2-aeccef6478e4" name="ABC" type="application/vnd.vmware.vcloud.org+xml"/>

<Org href="https://vCD-FQDN:443/api/org/2b729e6f-588e-49c4-964f-89b2e744c075" name="DEF" type="application/vnd.vmware.vcloud.org+xml"/>

<Org href="https://vCD-FQDN:443/api/org/fc432145-f1f3-42f6-a26f-eeb3d306a405" name="GHI" type="application/vnd.vmware.vcloud.org+xml"/>
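With many Orgs in the list, picking the right href by eye gets error-prone. The sketch below selects the href by Org name. It assumes each element sits on one line with href before name, as in the output above, and that the name contains no characters that are special to sed. The function name href_by_name is hypothetical.

```shell
# Print the href of the first element on stdin whose name attribute equals $1.
# Assumes href appears before name on the same line.
href_by_name() {
    sed -n "s/.*href=\"\([^\"]*\)\"[^>]*name=\"$1\".*/\1/p" | head -n 1
}

# Example on a canned line taken from the output above.
printf '%s\n' '<Org href="https://vCD-FQDN:443/api/org/fc432145-f1f3-42f6-a26f-eeb3d306a405" name="GHI" type="application/vnd.vmware.vcloud.org+xml"/>' | href_by_name GHI
# prints https://vCD-FQDN:443/api/org/fc432145-f1f3-42f6-a26f-eeb3d306a405
```

The same helper works on the vDC links returned in the next step, since they follow the same href-then-name layout.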

Step 3: Locate your Org vDC

# curl -sik -H "Accept:application/*+xml;version=5.6" -H "x-vcloud-authorization: Auth" -X GET https://vCD-FQDN:443/api/org/fc432145-f1f3-42f6-a26f-eeb3d306a405 | grep vdc

<Link rel="down" href="https://vCD-FQDN:443/api/vdc/adf0929b-a107-4671-9f85-b629b744c2b7" name="VDC1" type="application/vnd.vmware.vcloud.vdc+xml"/>

Replacing vCD SSL Certificates in a Multi Cell Environment

After a long wait I finally got a chance to work on vCloud Director SSL certificates. This was the only component in my lab that was still using self-signed certs, and that encouraged me to do something new in the lab.

A note on vCD SSL certificates

vCloud Director, like any other VMware product, needs a certificate installed that it uses for communication with other products. By default vCD uses a self-signed certificate. If you have a certificate authority in your environment, you can get the certs created in advance, before installing vCloud Director, and save yourself the pain of messing with certificates at later stages.

Troubleshooting Failed Org Network Creation in vCloud Director

Today, while working in my lab, I observed that creating a new VDC in vCD was failing because the org network failed to create.

On navigating to the Org VDC list and clicking on the error, the message was loud and clear: the org VDC network can’t be created.

On navigating to the Org VDC Networks section and clicking on the error, I was able to identify what had caused the network creation failure.

The error stack read as follows:

[ 114db22d-fc14-4c87-9030-36d2316aff8b ] Cannot deploy organization VDC network (f1514426-647e-4a03-a5a9-fafa4d73bb58)
com.vmware.vcloud.api.presentation.service.InternalServerErrorException: Cannot create network "dvs.VCDVSRouted-NW-9ab02973-9ded-4c4b-8826-4a52bdf2d6cf" from VXLAN network pool "urn:uuid:5c9de104-0f40-4cec-898f-985ee1fce1d6". Make sure vShield Manager infrastructure is properly configured and there are segment IDs available.