SQL VM CPU Spikes

Some people still aren’t convinced by virtualisation. It’s true that there are situations it isn’t especially suited to, but in my experience they are relatively few. One of the sceptics I know is a SQL DBA, and there are times when she has a point. I thought that this might be one of them until I started poking around.

Initially I was asked what was causing a SQL VM to respond slowly and use 100% CPU. I had a look in vCenter and, while the VM looked slightly busy, it didn’t seem overworked. The vCenter performance graph showed it using only about half of the 3GHz of CPU it had access to. I should explain at this point that my client’s practice is to provision VMs with a single vCPU and add more only if they are required. If this was normal load for the VM, 1 vCPU should have been enough.

Looking back through the VM’s performance history I could see nothing particularly wrong either: occasional CPU spikes, possibly indicating reboots or overnight processing. Oddly, though, there were random plateaus of activity lasting several hours at a time. Overnight the VM would mostly idle along using practically no CPU, but during the day there were long periods where it looked a lot like the activity above. Time to look at the guest OS.

The picture inside Windows is slightly different. Opening Task Manager shows frequent bursts of 100% CPU usage (see below). Actually, you could call them regular. And, more worryingly, it transpires that the server is not yet in production; it’s still being configured.

[Screenshot: Task Manager showing regular bursts of 100% CPU usage]
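One plausible reading of the gap between the two views is averaging: a performance chart that reports one value per sampling window will flatten short bursts into a moderate-looking figure. The sketch below is only an illustration of that arithmetic. The burst length, the repeat period and the 20-second sampling window are all assumed numbers, not measurements from this VM.

```python
def average_utilisation(burst_seconds, period_seconds, window_seconds):
    """Mean CPU % over one sampling window, for a CPU that runs at 100%
    for burst_seconds out of every period_seconds and idles otherwise."""
    busy = 0
    for second in range(window_seconds):
        # Position within the repeating cycle decides busy vs idle.
        if second % period_seconds < burst_seconds:
            busy += 1
    return 100.0 * busy / window_seconds


# Hypothetical figures: 5-second bursts repeating every 10 seconds,
# averaged over an assumed 20-second sampling window.
print(average_utilisation(burst_seconds=5, period_seconds=10, window_seconds=20))
```

With those assumed figures the window average comes out at half, even though every individual burst pins the CPU.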

The offending process is services.exe, so it’s not immediately obvious what the issue is. Purely by coincidence, I asked the DBA if she could log off for a while so that I could look into what was going on. When she did, the strangest thing happened:

[Screenshot: Task Manager after the DBA logged off, with CPU usage back at idle]

See how the CPU usage dropped back down to idle and stayed there. That raised the question: “What were you running?”

It turns out that the culprit was none other than SQL Management Studio. When open and connected, it polls the server’s status every 10 seconds. Strangely, though, instead of polling just the SQL services it polls all services on the server (this can be seen using Process Monitor), which seems a bit excessive to me. Because of the way hypervisors share resources, what would be a small blip on a physical host is somewhat magnified within a VM. Microsoft have acknowledged that this happens but, to my knowledge, haven’t done much about it. There is a registry value that can be modified to adjust Management Studio’s behaviour. For SQL 2005 SP1 onwards (it’s not available before that) it is:

HKLM\Software\Microsoft\Microsoft SQL Server\90\Tools\Shell\PollingInterval

Setting it to 600 will reduce the frequency of polls to once a minute. Alternatively, just don’t leave SQL Management Studio open longer than you have to and wait for Microsoft to fix it.
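If you would rather script the change than edit the registry by hand, something along these lines could generate a .reg file to import on the affected machine. Treat it as a sketch: it assumes PollingInterval is a DWORD value under the Shell key quoted above, and the function name render_reg_file is made up for this example. Check the value type on your own system before importing anything.

```python
# Key path taken from the article; .reg files need the full hive name.
REG_KEY = r"HKEY_LOCAL_MACHINE\Software\Microsoft\Microsoft SQL Server\90\Tools\Shell"


def render_reg_file(polling_interval):
    """Return the text of a .reg file that sets PollingInterval.

    Assumption: the value is stored as a DWORD (32-bit unsigned integer).
    """
    if not 0 <= polling_interval <= 0xFFFFFFFF:
        raise ValueError("polling_interval must fit in a 32-bit DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{REG_KEY}]",
        # .reg files write DWORDs as eight hexadecimal digits.
        f'"PollingInterval"=dword:{polling_interval:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # 600 is the value suggested in the article.
    print(render_reg_file(600))
```

Save the output to a file with a .reg extension and review it before importing; Management Studio will probably need restarting before it notices the change.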

Michael is a Senior Consultant for Xtravirt. If it's got buttons or flashy lights on it then it'll probably be on his radar. When not "mending computers" (it's sometimes easier than explaining "cloud" to relatives), Michael provides essential education, entertainment and trampoline services to his two children.

Comments

  1. says

    How many servers did the DBA in question have “snapped-in” to the management console?

    I’ve just built a fresh SQL 2005 VM and thought I’d see if I could replicate the issue, but to no avail. The only thing I can think of is that I only had a single server attached. If I had a lot of them attached I could see the polling taking a lot longer.
