Update: The hotfix mentioned below (2796995) has been superseded and is now included in the following hotfix: http://support.microsoft.com/kb/2870270/en-us
This is not how I envisioned kicking off my very first blog post. I had this grand idea that I would write up some of the cool stuff I am currently working on with Server 2012 Hyper-V and System Center 2012 – Virtual Machine Manager around converged fabric or something awesome like that. That said, I have been dealing with some real-world issues for the last couple of days that have kept me from making a grand entrance :-). Instead, I will share a quick bit of information to keep you from temporarily ruining your life!
@scevans and I were doing what we would normally be doing on any Tuesday night: holed up in a conference room, eating a rather large bag of pistachios for dinner while trying to reproduce an issue for Microsoft PSS (an issue unrelated to the ODX data destruction described in this post). To reproduce that issue I needed to move a few virtual machines to a CSV (Cluster Shared Volume) and make the VMs highly available. Since the VMs were already running, I figured I would just Storage Live Migrate them from the non-CSV LUN to the CSV LUN and then add them as cluster resources. Because the Hyper-V lab is running Server 2012 and the Compellent code is at 06.03.02.016.01, which supports ODX, this move operation did not hit my iSCSI NICs at all; instead it was handled by the Compellent’s copy manager via ODX. Once the Live Migration completed I got this rather alarming error on all of the VMs that were moved in this fashion:
Once a VM rebooted, I was presented with this absolutely horrifying error:
A disk read error occurred Press Ctrl+Alt+Del to restart
At this point, you either explain to your pistachio-eating colleague that you are going to jump out of the conference room window (one-story building) or you man up and start searching for a root cause that can be directly related to the fact that you are using extremely bleeding-edge technology (limited-release Compellent code to support ODX, the RTM version of Server 2012, etc.).
What we quickly stumbled upon was KB article 2796995, which explains that the hotfix addresses an issue where “the copy operation freezes at 99 percent.” That does not describe the exact issue we were experiencing, as our live migration completed successfully. If you look through the following TechNet article, which details a Hyper-V specific update list for Windows Server 2012, you will see that it references the same KB to resolve “ODX causes corruption during Live Migration.” This is definitely more like the issue we were seeing.
Hyper-V: Update List for Windows Server 2012
2796995 has been superseded and is now included in the following hotfix: http://support.microsoft.com/kb/2870270/en-us
Once this hotfix was applied to the Hyper-V hosts, I went ahead and restored the VMs via the Compellent replays (snapshots). I was then able to try the Storage Live Migration again, and everything worked perfectly! In addition to the Storage Live Migration via ODX no longer causing corruption, ODX also appeared to work much faster. I will need to do more research and testing to verify whether the hotfix actually brings a speed increase.
The hotfix can be requested from the following URL (you have to request an email link for this hotfix – it took about 5 minutes for Microsoft to email the link to me):
http://support.microsoft.com/kb/2796995 – Update: http://support.microsoft.com/kb/2870270/en-us
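Before trying an ODX-backed Storage Live Migration again, it is worth confirming that each Hyper-V host actually has one of the two KBs from this post installed. The following is only a rough sketch of how I might script that check in Python, not an official tool: the function names are made up for illustration, it assumes PowerShell is on the PATH of the host, and it assumes the update shows up in the output of Get-HotFix, which may not be true for every kind of update.

```python
import subprocess

# KB numbers taken from this post: the original hotfix and the one that superseded it.
ODX_FIX_KBS = {"KB2796995", "KB2870270"}


def installed_odx_fixes(hotfix_ids):
    """Return, sorted, the ODX-related KBs found in an iterable of hotfix IDs."""
    # Normalize whitespace and case so " kb2870270 " still matches.
    normalized = {h.strip().upper() for h in hotfix_ids if h.strip()}
    return sorted(normalized & ODX_FIX_KBS)


def query_installed_hotfixes():
    """Ask PowerShell for the installed hotfix IDs (hypothetical helper, Windows only)."""
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", "(Get-HotFix).HotFixID"],
        capture_output=True,
        text=True,
        check=True,
    )
    # One hotfix ID per line of output.
    return result.stdout.splitlines()
```

On a host you would call installed_odx_fixes(query_installed_hotfixes()); an empty list would suggest that neither hotfix is present and that the host should be patched before any more VMs are moved via ODX.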