Recently, while I was meeting with a vendor, the rep expressed some skepticism about our shop's decision to go with Hyper-V from Microsoft over a virtualization solution from VMware. He cited memory over-commit as his reason for preferring VMware, which instantly put me on the defensive. While I admittedly work for the "Windows Group", I try to stay as OS-agnostic as possible. As with all things in life, it's all about the right tool for the job - and I have never claimed, and never will claim, that Microsoft software is the "end-all, be-all" fix for what ails you.
But this particular comment about memory over-commit was preceded by the statement “you guys are pretty much the only folks we know of who are doing Hyper-V”, so on this occasion I was particularly sensitive to the argument. I responded with what seemed to me the obvious point: “I would never want to put into production a system that had a potential to blow past the top of the available memory space in the event of an unplanned failover, as that to me is a single point of failure.” Naturally the vendor rep rebutted with standard lines about “getting the most out of the hardware” and more drivel, which I completely ignored – in much the same way that the rep ignored my point. It’s unfortunate that many of us in IT who claim to be agnostic still fall into our partisan ways when put on the spot, myself included.
Today I came across an article on the Hyper-V vs. VMware topic by one of my favorite bloggers, James O’Neill. Mr. O’Neill (who does work for Microsoft, and thus wears his bias on his sleeve) made the technical argument for my point more eloquently and coherently than anything else I’ve seen.
“Memory over-commit. Vmware's advice is don't do it. Deceiving a virtualized OS about the amount of memory at its disposal means it makes bad decisions about what to bring into memory - with the virtualization layer paging blindly - not knowing what needs to be in memory and what doesn’t. That means you must size your hardware for more disk operations, and still accept worse performance.” - http://blogs.technet.com/jamesone/archive/2009/12/21/drilling-into-reasons-for-not-switching-to-hyper-v.aspx
So I’ll promptly be inserting this exact language into my defense of our decision. Aside from the built-in goodness of high availability (driven by proven Windows Failover Clustering technology, which, BTW, is free: check out Hyper-V Server!), the lack of over-commit is a huge win when comparing platforms. It prevents novice admins from making a major mistake when designing a virtual infrastructure, and it forces admins to provide adequate memory on hosts for the virtualized workloads in question. We in higher education understand better than most the pains of slashed budgets and “do more with less” directives… But when it comes to the pursuit of five-nines, in my book you can’t leave a critical factor like memory availability to chance and expect great results.
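To make the sizing argument concrete, here is a rough Python sketch of the check I have in mind: for each host in a cluster, pretend it has failed and see whether its VMs can be restarted on the survivors without assigning more memory than physically exists. The `Host` class, the function names, and the numbers are all hypothetical and purely for illustration. The sketch ignores the memory the parent partition and hypervisor need for themselves, and it places VMs with a simple first-fit heuristic, so it can report a failure in a tight cluster where a cleverer placement would still fit.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    memory_gb: int                                     # physical RAM available to guests
    vm_memory_gb: list = field(default_factory=list)   # RAM assigned to each VM


def survives_failure(hosts, failed_name):
    """True if every VM on the failed host fits on a surviving host
    without exceeding that host's physical memory (no over-commit)."""
    failed = next(h for h in hosts if h.name == failed_name)
    # Free memory on each surviving host before the failover.
    free = {h.name: h.memory_gb - sum(h.vm_memory_gb)
            for h in hosts if h.name != failed_name}
    # First-fit decreasing: place the largest displaced VMs first.
    for vm in sorted(failed.vm_memory_gb, reverse=True):
        target = next((name for name, gb in free.items() if gb >= vm), None)
        if target is None:
            return False        # nowhere to restart this VM
        free[target] -= vm
    return True


def tolerates_any_single_failure(hosts):
    """True if the cluster can lose any one host and still fit every VM."""
    return all(survives_failure(hosts, h.name) for h in hosts)


# Hypothetical three-node cluster with headroom on every node.
cluster = [
    Host("node1", 64, [16, 16, 8]),
    Host("node2", 64, [16, 8]),
    Host("node3", 64, [32, 8]),
]
print(tolerates_any_single_failure(cluster))
```

If a check like this fails, the honest answer on a platform without over-commit is more RAM or another node, which is exactly the discipline I'm arguing for.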
Besides, I would posit that no professional system administrator worth his weight in DVI dongles would ever take advantage of memory over-commit in a production system. And as a vendor, if your go-to argument for VMware over Hyper-V is memory over-commit, you’d better do your homework - next time I’ll be much better prepared to make my point. After all, as Mr. O’Neill notes, VMware’s own advice is “don’t do it”.
PS: For the record, I am in the process of bringing up two VMware ESXi 4.0 nodes for some non-Windows workloads. It's the right tool for that job, at least as of December 2009.