Today I played catch-up with Mike Dipetrillo’s blog, which contained a post on VMware’s Distributed Power Management (DPM), an “experimental” feature in VMware’s VI 3.5 software. This is an interesting concept, and one that competitors such as Virtual Iron offer as well. While such features make for a nice story and blog post, I have to question how relevant they are to the typical enterprise. Don’t get me wrong. Power management is critical, and the virtualization platform vendors will have to play a role in adding power management to the hypervisor; however, I’m not convinced that fully shutting down physical hosts is the solution. Sure, the mobility offered by virtualization mitigates the risk of component failure that may result from frequent power cycles: if one physical host doesn’t come up, the VMs can run on the remaining nodes in the cluster. But is the approach offered by DPM a pill most enterprises are ready to swallow?

This year alone, I have asked hundreds of IT folks about power management, and I can count on one hand those who would allow software to dynamically shut down and power up their production systems. Why? Practically every one of them has cold booted a server and had it not come back up. So the idea of scheduling such an activity to occur daily scares the (insert your favorite adjective here) out of them. Driven by these concerns, Burton Group approached every major server independent hardware vendor (IHV) this year and asked whether they had tested the impact of distributed power management (shutting down servers nightly) on mean time between failures (MTBF). Across the board, the answer was “No.”

Some IHVs aren’t even interested in testing such a scenario because they don’t think it will ever fly in most enterprises. Instead, they feel that adding more power management features (e.g., powering down unneeded CPUs, memory, or PCI devices) to their server platforms is the right path to take. Of course, this means that the hypervisor vendors will need to update their software to take advantage of such features. The IHVs will also still need to conduct MTBF testing to ensure that any new power management features do not significantly degrade MTBF. As embedded hypervisors (e.g., ESXi, XenServer OEM) continue to evolve, we’ll reach a point where physical drives are no longer needed in servers. Instead, the hypervisor will simply load from flash. Getting the hard disk out of the server will improve power efficiency, but we also need better power management between the hypervisor and the server.

Microsoft has already announced that advanced power management will be included in the Hyper-V update coming in Windows Server 2008 R2. VMware, Citrix, and Virtual Iron (among others) will need to follow a similar path. Until we have advanced power management in the server and hypervisor, and have MTBF tests that allow enterprises to adopt such features with confidence, let’s hold off on propping up science-project features like DPM. Using such features in development, test, or training environments may make sense if you feel the good (improved energy efficiency) outweighs the bad (reduced server or component life). In production environments, stay away from features such as DPM until your preferred IHV stands behind a particular power management solution and has the test data to back it up.

Note: Originally posted to Burton Group’s Data Center Strategies blog.