What To Do In A Hyper-V Guest If An IP Address Cannot Be Removed

November 25, 2009

My company recently had an issue at our datacenter in MA. To make a really long story short, somebody else's equipment started smoking and the power to the facility got cut as a precautionary measure. Although the datacenter has two generators (a main power generator and a backup generator) that are designed to power the facility indefinitely in the event of an electrical outage, neither could help, because the fire department flipped the EPO (emergency power off) switch, which cuts all power to the facility.

In defense of the fire department, the bottom line is that the EPO switch got flipped because they determined there was a risk of physical harm to people inside the facility. As I understand it, EPOs are put in place so that a fire department can lessen the risk of electrical shock when attempting to put out a fire. The place was so smoky that nobody could see, but at the same time they wanted to avoid having the fire suppression system kick in or spraying equipment down with hoses. So, in perspective, the temporary outage seems to have prevented a much larger issue for everybody involved.

This leads me to the subject of this post. When the power got cut, our servers went down immediately. A portion of our infrastructure runs in Hyper-V instances. One particular server instance runs IIS, and has a number of websites on it, and each has a distinct IP address. There is a primary IP bound in the "General" IP settings on the primary NIC, and all the other IPs are set under the "Advanced TCP/IP Settings" dialog area.

When the server came online later in the night, only 3 of the websites were working. After numerous conversations with one of the datacenter engineers (who was exceptionally helpful), I was able to restore partial functionality by making random changes in the various IP areas of the NIC settings. The weird thing was that no matter how many times I removed the primary IP (which was still not functioning in a serviceable way), it would just keep coming back when I went back into the NIC settings.

I was finally able to resolve the issue once and for all by using the "netsh" command from the command prompt. I was unfamiliar with this command until this incident. After fiddling around for a bit, it began to remind me of the process for configuring an interface in Cisco IOS.

Here is what I had to do to finally restore normal functionality:

C:\Users\John>netsh interface ipv4 delete address "Local Area Connection" addr=192.168.1.10 gateway=all

C:\Users\John>netsh interface ipv4 add address "Local Area Connection" 192.168.1.10 255.255.255.0

After I ran the two commands above, I went back into the NIC settings and re-added the gateway, and everything went back to normal.
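
I re-added the gateway through the GUI, but it should also be possible to do it from the same prompt. This is only a sketch and not what I actually ran: 192.168.1.1 is a made-up gateway address and the metric of 1 is an assumption, so substitute your own values. It needs to be run from an elevated command prompt.

```bat
rem Assumption: 192.168.1.1 is a placeholder for the real default gateway.
rem Adds a default gateway to the interface without touching its IP addresses.
netsh interface ipv4 add address "Local Area Connection" gateway=192.168.1.1 gwmetric=1
```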

Before I arrived at the commands above, I tried the delete command without the "gateway=all", but it did not delete the address correctly (when I tried the subsequent add command, it said the address still existed).
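
A quicker way to find out whether a delete really took effect would probably be to ask netsh what it thinks is bound to the interface before trying the add. A minimal sketch, assuming the interface is still named "Local Area Connection":

```bat
rem List the IPv4 addresses currently bound to the interface.
netsh interface ipv4 show addresses "Local Area Connection"

rem Show the full IP configuration for the interface, including the default gateway.
netsh interface ipv4 show config "Local Area Connection"
```

If 192.168.1.10 still shows up in that output after a delete, the address was not actually removed.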

So, what I think ultimately happened was that some kind of binding got messed up when the instance went down, which may or may not have affected the NIC's internal route table in some low-level way. The GUI just didn't do the trick. At the end of the day I guess the CLI still rules.
