Well, that was short-lived. Two new, pretty serious issues have now popped up for vSphere 6.
KB2143943 is a bug that breaks vMotion when upgrading from 5.0 or 5.1 to 5.5u3 or 6.0u1, which means your VMs can’t be moved to the upgraded hosts unless you power them off and back on again. 5.5u3 has a patch that fixes this, but 6.0 does not. The big problem is that it’s hard to tell whether this affects everyone or just some implementations. So unless you want to power off all your VMs and manually move them to the upgraded host, trying to upgrade might put you in a tough situation.
KB2144657 is a bug that can keep VMware from failing over to other paths after one path has a transient issue. Essentially, it marks the first path as permanent device loss and doesn’t fail over to the others, so the LUN can no longer be accessed. So if a path fails over during a storage controller upgrade, switch maintenance, or anything else of that sort, the whole LUN may die. There is currently no fix; you just have to reboot the host.
If you run VMware (as most people do) and you follow their updates, you’ve probably noticed that the last few iterations have had some issues. It has widely been regarded in the industry that 5.5 update 2 was the most recent “stable” release of VMware, and that none of the 5.5 and 6.0 versions released since have qualified. Finally, though, we appear to have (mostly) issue-free versions of VMware in the form of 5.5 update 3b and 6.0 update 1b.
Below is a slightly tongue-in-cheek account of the issues and problems, but the short of it is, as always, never assume that even a very mature software company always puts out good releases (Microsoft, anyone?). Unfortunately, testing in your lab often isn’t enough to discover all the issues. Sometimes you just have to wait it out and watch the industry, unless an update has a critical fix you just HAVE to have. Of special note, in IT we’re often very tempted, once we finally get through an upgrade cycle, to ignore the newly upgraded software completely until we HAVE to touch it again. In the cases below, that could lead to serious issues. There is no safe time to ignore updates (or at least their release notes).
Let’s Start with 5.5
5.5 update 3 has a serious issue where a VM might randomly fail when a snapshot is removed or consolidated. That is bad enough for the snapshots you make yourself (especially since most of us do it without even thinking about it), but a LOT of backup products also use snapshots during backups. That means during backup windows, loads of VMs could start dying. KB2133118 describes this problem. The fix, of course, is to upgrade to 5.5 update 3a.
Unfortunately, there was still a big SSLv3 POODLE vulnerability … so this was also not a safe harbor version for production. Luckily, 5.5 update 3b then came out, which finally gave us a safe place to dock our virtual fleet.
It is kind of expected that a new major release of ANYTHING will have some serious problems. VMware 6.0 had a unique problem, though. After some period of time (or never, depending on your environment), the entire host would lose network connectivity. No management, no vMotion, nothing. Basically, you could gracefully shut down the VMs running on it (if you could figure out what they were), and then kill the box. KB2124669 describes this problem. The fix was to upgrade to 6.0 update 1.
Then, with update 1, we had a new problem. Changed Block Tracking (CBT), used by many backup products to make backups faster and more efficient, broke. It acts like it runs, but you just can’t trust the data it gives your backup software. Your options are to disable CBT for every server and do full backups, or to downgrade to 5.5 update 2 (since update 3 still had big problems). KB2136854 documents what has become a SNAFU for VMware.
Finally, 6.0 update 1b fixed the CBT issue for good, though you do have to reset CBT for every host after upgrading to get known good data.
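For anyone who wants to keep the upgrade chain above straight, here is a minimal sketch of it as a lookup table. It is only an illustration: the names KNOWN_ISSUES, issues_for and safe_target are made up for this example, the data comes solely from this post (not from an authoritative VMware list), and it leaves out the two newer KBs mentioned at the top because no fixed-in release is given for them.

```python
# Known issues per release, as described in this post only.
# Each entry: (KB number or None, short description, release that fixes it).
KNOWN_ISSUES = {
    "5.5 update 3": [
        ("KB2133118", "VM may fail when a snapshot is removed or consolidated",
         "5.5 update 3a"),
    ],
    "5.5 update 3a": [
        (None, "SSLv3 POODLE vulnerability", "5.5 update 3b"),
    ],
    "6.0": [
        ("KB2124669", "host loses all network connectivity", "6.0 update 1"),
    ],
    "6.0 update 1": [
        ("KB2136854", "CBT returns data that cannot be trusted", "6.0 update 1b"),
    ],
}


def issues_for(version):
    """Return the issues this post lists for a release (empty list if none)."""
    return KNOWN_ISSUES.get(version, [])


def safe_target(version):
    """Follow the 'fixed in' chain until a release with no listed issue is reached."""
    seen = set()
    while version in KNOWN_ISSUES and version not in seen:
        seen.add(version)  # guard against an accidental loop in the table
        version = KNOWN_ISSUES[version][0][2]
    return version


if __name__ == "__main__":
    for release in ("5.5 update 3", "6.0"):
        print(release, "->", safe_target(release))
```

Running it prints "5.5 update 3 -> 5.5 update 3b" and "6.0 -> 6.0 update 1b", which matches the two releases called out above as the (mostly) issue-free ones. A release the table doesn’t know about is returned unchanged, which only means this post lists nothing against it.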
So, with that being said, if you’re on 5.5 or 6.0, you might want to go ahead and apply the updates, but keep a keen eye out when new updates arrive: wait just long enough to see whether any issues arise, but don’t wait too long.