vSphere ‘Invalid configuration for device '0'’ Error Solution

Using vSphere 4 hosts (in this case a legacy, unpatched host that was being migrated off and decommissioned), we came across an interesting and ambiguous error: ‘Invalid configuration for device '0'’, plus a note of the time, the target object and the vCenter Server.

In this case, I was trying to migrate a powered-off VM to different storage – resulting in the error. I also found that the issue was related to the second disk attached to the VM. Editing the VM showed the size as 0MB, but removing this disk also threw the error in vCenter.

The solution was to follow these steps:

  • Remove the VM from the vCenter inventory.
  • Update the VM's VMX file. There are 2 ways to do this – SSH to the host / datastore using a tool like PuTTY, or use the datastore browser to download the VMX file, edit it in Notepad, then upload it again.
  • Inside the VMX file, look for the following entries:

scsi0:1.present = "true"

scsi0:1.fileName = "vmname.vmdk"

Update these entries to the following:

scsi0:1.present = "false"

scsi0:1.fileName = "vmname.vmdk"

  • Re-add the VM to the vCenter inventory, either through the GUI or using ‘vmware-cmd -s register /path/to/your/vm.vmx’ (note that it is the VMX file that gets registered, not the VMDK).
  • Check the VM properties. The offending drive should now be missing from the VM, and it can be re-added from the datastore.
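If you have more than one VMX file to fix, the edit itself can be scripted. The following is only a minimal sketch in Python, working on the text of a downloaded VMX file. The function name set_disk_present and the sample contents are my own illustration, not anything VMware ships, and it assumes the entry is written as one key = "value" pair per line, as in the example above.

```python
import re


def set_disk_present(vmx_text: str, device: str, present: bool) -> str:
    """Return vmx_text with `<device>.present` set to "true" or "false".

    `device` is the VMX device prefix, e.g. "scsi0:1" (hypothetical helper).
    """
    value = "true" if present else "false"
    # Match only the `.present` line for this device; keep the key and
    # spacing as they were and swap just the quoted value.
    pattern = re.compile(
        r'^([ \t]*' + re.escape(device) + r'\.present[ \t]*=[ \t]*)"[^"\n]*"[ \t]*$',
        re.IGNORECASE | re.MULTILINE,
    )
    new_text, count = pattern.subn(lambda m: f'{m.group(1)}"{value}"', vmx_text)
    if count == 0:
        raise ValueError(f"no {device}.present entry found")
    return new_text


# Sample VMX fragment (illustrative only).
sample = (
    'scsi0:0.present = "true"\n'
    'scsi0:1.present = "true"\n'
    'scsi0:1.fileName = "vmname.vmdk"\n'
)

patched = set_disk_present(sample, "scsi0:1", False)
print(patched)
```

Only the scsi0:1.present line changes; the fileName entry and the other disk are left alone. Keep a copy of the original VMX file before overwriting it.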

Manual vCNS / vShield Edge HA Little Gem!

Recently, I have been doing lots with vCNS and manual creation / manipulation of vShield Edge devices (posts coming soon). One thing that drove me crazy is a tiny little thing that prompted me to write this quick Little Gem – ‘Edge HA’ sat on my to-do list, and gloated at me...

When creating a manual vShield Edge device in vCNS, there is the usual opportunity to create a pair of appliances to run in High Availability mode. Trouble is, the options for deployment are limited and not very clear. (This might be clear / obvious to some, but it wasn’t to me!)

When creating an HA pair, editing the Edge device in the vShield Manager console shows few options under Settings – HA Configuration: essentially ‘Enabled’ or ‘Disabled’, vNIC, Declared Dead Time and Management IPs. That last one is where my confusion started. So many questions...!

The option for Management IPs is even outlined: 2 IP entry boxes, and note text: ‘You can specify pair of IPs (in CIDR format with /30 subnet). Management IPs must not overlap with any vnic subnets’.

OK, so I need Management IPs to manually create an HA pair. What /30 address range do I need to specify? Can the IP range share an existing vNIC, or does the Edge device need another interface or uplink? Where do I define the /30 addresses? Do they need their own VLANs? Must I create a whole new private address range specifically for HA heartbeat? Like I said – so many questions. Scouring the documentation and Googling ‘vShield Edge Management IPs’ produced no helpful results. So – to the LAB!

Turns out, you don’t need Management IPs at all. Simply change the HA Status to ‘Enable’, select a vNIC to support HA heartbeat, and add a second Edge appliance via the green plus symbol (it will prompt for the parameters) to deploy the HA pair! When both report as ‘Deployed’, HA is configured and your Edge device is protected.

Sigh. Like I said. This might seem obvious to some, but it wasn’t to me. ‘Edge HA’ is no longer on my to-do list!
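As a postscript: if you do decide to fill in the Management IPs, the rules in that note text can be sanity-checked before typing anything into vShield Manager. This is just a sketch using Python's standard ipaddress module. The function name, the example addresses and the idea that both IPs should come from the same /30 are my own reading of the note, not documented vCNS behaviour.

```python
import ipaddress


def check_management_pair(ip_a: str, ip_b: str, vnic_subnets: list[str]) -> list[str]:
    """Return a list of problems with a proposed pair of management IPs.

    ip_a / ip_b are in CIDR form, e.g. "192.168.255.1/30".
    vnic_subnets are the subnets already used by the Edge vNICs.
    """
    problems = []
    a = ipaddress.ip_interface(ip_a)
    b = ipaddress.ip_interface(ip_b)

    if a.network.prefixlen != 30 or b.network.prefixlen != 30:
        problems.append("both addresses must use a /30 prefix")
    if a.network != b.network:
        # Assumption: the pair is meant to share one /30.
        problems.append("both addresses should sit in the same /30 subnet")
    if a.ip == b.ip:
        problems.append("the two addresses must be different")

    for subnet in vnic_subnets:
        net = ipaddress.ip_network(subnet)
        if net.overlaps(a.network) or net.overlaps(b.network):
            problems.append(f"overlaps vNIC subnet {net}")
    return problems


# Example values only; substitute your own ranges.
vnics = ["10.0.0.0/24", "192.168.1.0/24"]
print(check_management_pair("192.168.255.1/30", "192.168.255.2/30", vnics))
print(check_management_pair("10.0.0.1/30", "10.0.0.2/30", vnics))
```

The first pair comes back clean; the second is flagged because 10.0.0.0/30 falls inside one of the vNIC subnets.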

Windows 2008 R2 Domain Trust – Fixing The Security DB Trust Relationship

I came across an interesting problem recently. A customer is trialling child DNS domains for their servers in their lab set-up, and the problem appeared when changing the FQDN of a server.

In this instance, we were playing with using dcpromo to promote and demote the domain controller (DC) into a single domain. At one point, we got out of step with the reboots and got the following message when logging in to the domain:

The Security database on the server does not have a computer account for this workstation trust relationship.

Interesting. Somehow, we managed to get the domain out of sync with the FQDN of the server. We found 3 ways to resolve this issue, 2 simple (which felt a little like a hack) and 1 that permanently fixed the issue:

  1. Log in to the DC with local admin credentials, and in Computer / Properties: change the FQDN.local name to just the machine name, then reboot the server.
  2. Update the DC FQDN in the domain DNS system, so the current DC FQDN and that in the DNS server match.
  3. Update the computer account in the domain to reflect the FQDN of the DC. To do this:
    1. Open your domain Users & Computers console.
    2. Under View, select Advanced Features.
    3. Find the computer account of your DC. (Depending on what your policy is, this might be under <AD Host> / <Domain> / Domain Controllers, or elsewhere in another OU. If in doubt, use the domain search for the machine name, or ask the domain admin!)
    4. Open Properties of the computer account, then select “Attribute Editor”. (If you can’t see it, re-check step 2 above).
    5. Look for attributes “dNSHostName” and “servicePrincipalName” (and anywhere else the FQDN of the DC is found).
    6. If they are incorrect (they should be, given the error on the DC), update them to the correct values.
    7. Reboot the DC and try connecting again. If the error persists, re-check the attributes in step 5 for any missed values that need resetting. Also bear in mind that the DCs may need some time to replicate if your domain has more than one DC.
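Steps 5 and 6 are easy to get wrong by eye, because servicePrincipalName usually holds several values. Once you have the attribute values copied out of the Attribute Editor, a small check like the one below can point out which ones still carry the old name. It is a plain Python sketch: the function name, the host names and the SPN values are made up for illustration, and it does not talk to AD at all.

```python
def find_stale_values(attributes: dict[str, list[str]], old_fqdn: str) -> list[tuple[str, str]]:
    """Return (attribute, value) pairs that still reference old_fqdn.

    SPN values look like "service/host" or "service/host/domain", so each
    value is split on "/" and every part is compared case-insensitively.
    A port suffix such as "host:1433" is not handled in this sketch.
    """
    old = old_fqdn.lower()
    stale = []
    for name, values in attributes.items():
        for value in values:
            parts = [part.lower() for part in value.split("/")]
            if old in parts:
                stale.append((name, value))
    return stale


# Hypothetical attribute values copied from the DC's computer account.
computer_account = {
    "dNSHostName": ["dc01.lab.local"],
    "servicePrincipalName": [
        "HOST/dc01.lab.local",
        "HOST/DC01",
        "ldap/dc01.child.lab.local",
    ],
}

for attribute, value in find_stale_values(computer_account, "dc01.lab.local"):
    print(f"{attribute}: {value} still uses the old FQDN")
```

Anything it lists is a candidate for updating in step 6; values that only contain the short machine name or the new FQDN are left out.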

Also worth pointing out that these errors don’t always require a reboot, so if you are doing this under a change window or are limited for time then it’s possible the fix will be applied fine without a reboot. But then hey – your DCs are all virtual, so will reboot in about 10 secs anyway. Right?

HA Failover Errors in vSphere

In a production system running vSphere 5.1, I’ve noticed that on occasion an error appears at cluster level about ‘HA initiating virtual machine failover’.

Checking the host and vCenter logs shows no host reboots or isolation. In this case, there fortunately is a simple fix that seems to rid the cluster of the erroneous error message.

Disable HA on the cluster (Right-click cluster > Edit Settings > Uncheck HA checkbox). The hosts will automatically disable HA as part of the cluster settings, with no effect on the running VMs on the hosts.

Re-enable HA on the cluster (reverse the settings above). The HA cluster will have an election and elect a new HA master and configure the remainder of the hosts as slaves.

Voila! The error should now disappear and your cluster should be back to full health.

SQL Express: Quick results using SQLCMD

I’ve been playing with some of VMware’s new products in my homelab, and have been using SQL Express 2008 R2 as part of the set-up / installations. In building and burning installs, I’ve often found the need to build and delete databases quickly.

Now – instead of using SQL Management Studio Express Edition to manage this (mostly DB drop) activity, I’ve found a much quicker method: Use SQLCMD.

To connect to a local SQL Express instance (.\SQLEXPRESS):

C:\> sqlcmd -S .\SQLEXPRESS

To list existing databases within the instance:

1> sp_databases
2> go

To ‘use’ a particular database (e.g. DISCO):

1> use DISCO
2> go

To delete a particular database (e.g. DISCO):

1> drop database [DISCO]
2> go

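(With the last delete command, note the square brackets around the DB name. They are only strictly needed when the name contains spaces, hyphens or other special characters, but without them such names will result in an error, so it is a good habit to use them every time.)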
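If you are dropping databases over and over, the same thing can be done non-interactively. The sketch below builds a one-shot sqlcmd command line in Python. It relies on sqlcmd's -S (server), -E (trusted connection) and -Q (run query and exit) switches; the helper names are my own, and the instance name is the one used above.

```python
def quote_identifier(name: str) -> str:
    """Wrap a database name in square brackets, doubling any ']' inside it."""
    return "[" + name.replace("]", "]]") + "]"


def build_drop_command(database: str, server: str = r".\SQLEXPRESS") -> list[str]:
    """Build (but do not run) a sqlcmd command that drops one database."""
    query = f"DROP DATABASE {quote_identifier(database)}"
    return ["sqlcmd", "-S", server, "-E", "-Q", query]


command = build_drop_command("DISCO")
print(command)

# To actually run it (this deletes the database!):
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing the command as a list avoids shell quoting problems, and the bracket quoting means names with hyphens or spaces are handled the same way as simple ones.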

These might seem simple commands, but if you’ve not used SQL or SQL Express before, or you do lots of building in your homelab with SQL Express, this might just save some time!