Manual vCNS / vShield Edge HA Little Gem!

Recently, I have been doing lots with vCNS and manual creation / manipulation of vShield Edge devices (posts coming soon). One thing that drove me crazy is the tiny little thing that prompted me to write this quick Little Gem – ‘Edge HA’ sat on my to-do list, and gloated at me…

When creating a manual vShield Edge device in vCNS, there is the usual opportunity to create a pair of appliances to run in High Availability mode. Trouble is, the options for deployment are limited and not very clear. (This might be clear / obvious to some, but it wasn’t to me!)

When creating an HA pair, you edit the Edge device in question in the vShield Manager console and, under Settings, the HA Configuration gives few options: essentially ‘Enabled’ or ‘Disabled’, vNIC, Declared Dead Time and Management IPs. Here’s where my confusion lay: Management IPs. So many questions…!

The option for Management IPs is even outlined: two IP entry boxes, and note text: ‘You can specify pair of IPs (in CIDR format) with /30 subnet. Management IPs must not overlap with any vnic subnets’.
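To make that note a bit more concrete, here is a rough Python sketch of what it seems to be asking for: two addresses written in CIDR notation, both in the same /30, and not overlapping any subnet already used by a vNIC. The function name and all the addresses are made up for illustration, and this only reflects my reading of the note text, not anything vShield Manager actually runs.

```python
import ipaddress


def check_management_ips(ip_a, ip_b, vnic_subnets):
    """Return a list of problems with a proposed pair of HA management IPs.

    Hypothetical helper: it only encodes the rules quoted in the UI note
    (CIDR format, /30 subnet, no overlap with existing vNIC subnets).
    """
    problems = []
    a = ipaddress.ip_interface(ip_a)
    b = ipaddress.ip_interface(ip_b)

    # Both addresses must be given with a /30 prefix.
    for iface in (a, b):
        if iface.network.prefixlen != 30:
            problems.append(f"{iface} is not a /30")

    # The pair should sit in the same subnet and be two different hosts.
    if a.network != b.network:
        problems.append("the two IPs are not in the same subnet")
    elif a.ip == b.ip:
        problems.append("the two IPs are identical")

    # Neither address may overlap a subnet already used by a vNIC.
    for subnet in vnic_subnets:
        net = ipaddress.ip_network(subnet)
        for iface in (a, b):
            if iface.network.overlaps(net):
                problems.append(f"{iface} overlaps vNIC subnet {net}")

    return problems


# Example with invented lab addresses: an empty list means no problems found.
print(check_management_ips("192.168.100.1/30", "192.168.100.2/30", ["10.0.0.0/24"]))
```

An empty list means the pair passes these three checks; anything else is a plain-English description of what looks wrong.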

OK, so I need Management IPs to manually create an HA pair. What /30 address range do I need to specify? Can the IP range share an existing vNIC, or does the Edge device need another interface or uplink? Where do I define the /30 addresses? Do they need their own VLANs? Must I create a whole new private address range specifically for HA heartbeat? Like I said – so many questions. Scouring the documentation and Googling ‘vShield Edge Management IPs’ produced no helpful results. So – to the LAB!

Turns out, you don’t need Management IPs at all. Simply change the HA Status to ‘Enable’, select a vNIC to support HA heartbeat, and add a second Edge appliance via the green plus symbol (it will prompt for the parameters) to deploy the HA pair! When both report as ‘Deployed’, HA is configured and your Edge device is protected.

Sigh. Like I said. This might seem obvious to some, but it wasn’t to me. ‘Edge HA’ is no longer on my to-do list!

New Book – VMware vSphere Design (2nd Edition)

Forbes Guthrie and Scott Lowe have been busy. I very much enjoyed the first edition of the VMware vSphere Design book and now the second edition is up on Amazon for download in Kindle format or pre-order for print copies. In this edition, there’s also a chapter on vCloud design by Kendrick Coleman.

Besides being a good read in and of itself, the first book was good to help with VCAP4-DCD preparation. I imagine that this edition will be equally useful for VCAP5-DCD preparation. I look forward to reading it. Well, I will when my copy shows up (family rule: I’m not allowed to buy anything for myself in the month of my birthday).

HA Failover Errors in vSphere

In a production system running vSphere 5.1, I’ve noticed that on occasion an error appears at cluster level about ‘HA initiating virtual machine failover’.


Checking the host and vCenter logs shows no host reboots or isolation. In this case, there is fortunately a simple fix that seems to rid the cluster of the spurious error message.

Disable HA on the cluster (Right-click cluster > Edit Settings > Uncheck HA checkbox). The hosts will automatically disable HA as part of the cluster settings, with no effect on the VMs running on the hosts.

Re-enable HA on the cluster (reverse the settings above). The cluster will hold an election, choose a new HA master and configure the remaining hosts as slaves.

Voila! The error should now disappear and your cluster should be back to full health.

Quick Tip: Adding Active Directory to vSphere 5 SSO

If, like me, you are installing, burning, re-installing your lab set-up at the moment with the latest and greatest VMware releases, you’ll no doubt be going into battle with installing vSphere 5.1 and Single Sign-On.

There are several little idiosyncrasies with SSO that are well documented across the interweb, but one I came across hasn’t had much written about it. It concerns adding Active Directory.

If you install vSphere SSO as a local user, the domain you are connected to doesn’t automatically get interrogated and added to the SSO configuration, and needs to be added subsequently as an authentication domain. You can do this through the vSphere Web Client.

The quick tip I have for this is: When adding the authentication credentials to connect to the AD domain, use the credentials in the format ‘user@domain’ rather than ‘domain\user’, as the build I was installing when I came across this only allowed the former and not the latter format.

This foxed me for a little while until I worked it out – not least because the resulting error messages from the vSphere Web Client when trying the latter format are not exactly crystal clear as to where the error resides with the information being entered.
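If you keep credentials in scripts or notes in the ‘domain\user’ style, a tiny helper like the one below can flip them into the ‘user@domain’ style before you type them in. This is only a sketch: the function name is my own, the lab names are invented, and it assumes you know the DNS domain name, since the short name in front of the backslash is often not the same thing.

```python
def to_upn(login, dns_domain=None):
    """Convert 'DOMAIN\\user' to 'user@domain'; leave 'user@domain' untouched.

    Hypothetical helper. dns_domain is an optional override for cases where
    the short (NetBIOS-style) domain name differs from the DNS domain name.
    """
    if "@" in login:
        # Already in user@domain form.
        return login
    if "\\" in login:
        domain, user = login.split("\\", 1)
        return f"{user}@{dns_domain or domain}"
    raise ValueError(f"cannot tell which part of {login!r} is the domain")


# Example with made-up lab names.
print(to_upn(r"LAB\administrator", dns_domain="lab.local"))
```

Passing the DNS domain explicitly is the safer choice; without it the helper simply reuses whatever came before the backslash.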

To get around this problem, either use the former format, or instead install vSphere as a domain connected user, as mentioned in the VMware vSphere documentation – vSphere Pre-Requisites:

http://pubs.vmware.com/vsphere-51/topic/com.vmware.vsphere.install.doc/GUID-C6AF2766-1AD0-41FD-B591-75D37DDB281F.html

Hope this saves someone some time if they come up against the same scenario I did!

“Failed to login to NFC server” errors in vSphere

I came across a strange little issue with vSphere the other morning whilst moving VMDK files around the network.

The infrastructure I’m working with is all vSphere 5.1, with ESXi hosts and VMFS-5 datastores (VMFS v5.58).

The error occurred when uploading a VMDK file from the local vCenter desktop to a LUN via the vSphere client datastore browser, resulting in a ‘Failed to login to NFC server’ error.

Log files gave few clues, so a little searching around VMware Support and Google resulted in the following possible solutions:

- DNS issues with the client hosting vSphere client.

- Issues with the ESXi host providing connectivity to the datastore (and specifically the VMware configuration file).

- An issue with the VPXA agent (for ESX 3 hosts).

Solution:

Any of the below KBs or Communities threads may be relevant to your particular issue. However, the first suspect should always be DNS! Check this first and resolve any issues before delving further into support and troubleshooting.

- KB: 1017196

- KB: 1007336

- Communities Search Results (Google)
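Since DNS is the first thing to rule out, here is a minimal Python sketch of a forward and reverse lookup check you could run from the machine hosting the vSphere client. The function name and the host name in the example are hypothetical; pass fully qualified names, because the comparison is a simple string match. The two lookup functions are parameters only so the check can be tried out without a live DNS server.

```python
import socket


def check_dns(hostname, resolve=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Do a forward then reverse lookup and report whether they agree.

    Hypothetical helper: 'resolve' and 'reverse' default to the standard
    socket lookups and can be swapped for stand-ins when testing.
    """
    result = {"host": hostname, "ip": None, "reverse": None, "ok": False, "detail": ""}
    try:
        result["ip"] = resolve(hostname)
    except OSError as exc:
        result["detail"] = f"forward lookup failed: {exc}"
        return result
    try:
        # gethostbyaddr returns (primary_name, aliases, addresses).
        result["reverse"] = reverse(result["ip"])[0]
    except OSError as exc:
        result["detail"] = f"reverse lookup for {result['ip']} failed: {exc}"
        return result
    # Compare names case-insensitively, ignoring any trailing dot.
    same = result["reverse"].lower().rstrip(".") == hostname.lower().rstrip(".")
    result["ok"] = same
    result["detail"] = "forward and reverse match" if same else "reverse name differs"
    return result


if __name__ == "__main__":
    # Replace with the FQDNs of your own ESXi hosts and vCenter server.
    for name in ["esxi01.lab.local"]:
        print(check_dns(name))
```

Run it against each ESXi host and the vCenter server; any entry where `ok` is False is worth fixing before working through the KBs.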