Mark Minasi's Tech Forum


DennisMCSE

Senior Member
Registered:
Posts: 185
Reply with quote  #1 
We are replacing two ESXi 5.1 hosts (48GB RAM each, with failover configured between the two) with two ESXi 5.5 hosts (120GB RAM each, also with failover configured). Both sets of hosts are connected to vSphere with HA configured. The network configuration on the new infrastructure has not been finalized yet.

We are upgrading the ESXi servers because the VMs need more memory (18 VMs using about 76GB RAM) than a single old host has (only 48GB RAM), so one host could not carry the full load in a failover. We wanted to upgrade before any issues happened.
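To put numbers on that, here is a rough back-of-the-envelope check using only the figures from this post. It is a minimal sketch that assumes a two-host cluster losing one host, and it ignores hypervisor overhead, reservations, and memory-saving features; the function names are made up for illustration.

```python
# Figures taken from the post: 18 VMs using about 76GB, two hosts per cluster.
TOTAL_VM_RAM_GB = 76
OLD_HOST_RAM_GB = 48
NEW_HOST_RAM_GB = 120
HOSTS_PER_CLUSTER = 2


def survives_one_host_failure(host_ram_gb, hosts, demand_gb):
    """True if the surviving hosts can back all VM RAM with physical memory."""
    surviving_capacity_gb = host_ram_gb * (hosts - 1)
    return surviving_capacity_gb >= demand_gb


def overcommit_ratio(host_ram_gb, hosts, demand_gb):
    """VM RAM in use divided by the physical RAM left after one host fails."""
    return demand_gb / (host_ram_gb * (hosts - 1))


old_ok = survives_one_host_failure(OLD_HOST_RAM_GB, HOSTS_PER_CLUSTER, TOTAL_VM_RAM_GB)
new_ok = survives_one_host_failure(NEW_HOST_RAM_GB, HOSTS_PER_CLUSTER, TOTAL_VM_RAM_GB)
old_ratio = overcommit_ratio(OLD_HOST_RAM_GB, HOSTS_PER_CLUSTER, TOTAL_VM_RAM_GB)
new_ratio = overcommit_ratio(NEW_HOST_RAM_GB, HOSTS_PER_CLUSTER, TOTAL_VM_RAM_GB)

print(f"old cluster survives one failure without overcommit: {old_ok} (ratio {old_ratio:.2f})")
print(f"new cluster survives one failure without overcommit: {new_ok} (ratio {new_ratio:.2f})")
```

On the old pair, the surviving host would have to run at roughly 1.58 times its physical RAM, while a single new host would sit at about 0.63.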

Guess what: the issue happened before the new infrastructure was configured. Of course.

One of the old ESXi hosts had a hard drive fail, and the system halted (purple screen). Since there was not enough RAM on the remaining old host, the VMs were moved to the new infrastructure (the one without a finalized network configuration). Once they were running there, the VMs lost network connectivity because the network configuration was not finalized, and one of those VMs was our DNS server. That brought our whole network down.

So what is supposed to happen to the VMs on a failed host when the failover host doesn't have enough RAM to host all of them? Do the VMs not migrate? Do they migrate but use less than their configured RAM? Do they just shut down?

wobble_wobble

Associate Troublemaker Apprentice
Registered:
Posts: 940
Reply with quote  #2 
A best effort will be made to power up all guests.
Then memory ballooning, memory compression, and memory sharing will kick in.
Once the ESXi host or vCenter says it's too full, guests will stop getting started.

Needless to say, it will be sending mail like a spammer, asking for help.
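The best-effort behaviour described above can be pictured with a toy model. This is only a sketch and not VMware's actual placement logic: the `max_overcommit` ceiling, the VM names, and the RAM sizes are all hypothetical, and a real host decides "too full" from far more than configured RAM.

```python
def best_effort_power_on(vms, host_ram_gb, max_overcommit=1.5):
    """Toy model: start VMs in the given order until the host is 'too full'.

    vms is a list of (name, configured_ram_gb) tuples in restart order.
    max_overcommit is a hypothetical ceiling standing in for whatever the
    host tolerates once ballooning, compression, and sharing are in play.
    Returns (started, skipped) lists of VM names.
    """
    limit_gb = host_ram_gb * max_overcommit
    used_gb = 0.0
    started, skipped = [], []
    for name, ram_gb in vms:
        if used_gb + ram_gb <= limit_gb:
            started.append(name)
            used_gb += ram_gb
        else:
            # Too full for this guest: it is left powered off.
            skipped.append(name)
    return started, skipped


# Hypothetical guests on one surviving 48GB host (limit is 48 * 1.5 = 72GB).
example_vms = [("dns", 4), ("sql", 32), ("file", 16), ("test", 40)]
started, skipped = best_effort_power_on(example_vms, host_ram_gb=48)
print("started:", started)
print("skipped:", skipped)
```

In this made-up example the first three guests fit under the ceiling and the last one is left off, which is the "stop getting started" case.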


__________________
Have you tried turning it off and walking away? The next person can fix it!

Infradeploy

Senior Member
Registered:
Posts: 186
Reply with quote  #3 
You configure those things in HA, including which VMs have priority. To my knowledge RAM is not often the limiter; there are a bunch of options so that, when needed, the VMs will cram together like visitors to a fast-food convention in an elevator. That's what HA is for. Most people confuse a cluster with HA, but what you actually do is tell the cluster what to do in an emergency so that the critical systems keep running.

You would also configure the HA isolation option to ping an address so the VMs stay put when something happens. The VMs would not start on a host without network if that host can't ping the isolation address.

__________________
Have SpaceSuit, Will Travel

DennisMCSE

Senior Member
Registered:
Posts: 185
Reply with quote  #4 
So it looks like we had a #PF Exception 14 error. From looking through some log files, it appears a VM was migrating to another host and, in the middle of the migration, a hard drive failed, causing the Page Fault Exception 14 error. I believe an Exception 14 error occurs when a requested page has not been successfully loaded into memory.

I also checked the configuration. HA is configured to restart the VMs on another node in the cluster, and to power up VMs even if there is not enough memory for optimal operation (the VMs would just run slower). DRS is also configured to give resource priority to critical systems, while lower-priority systems get the memory workarounds (memory ballooning, memory compression, memory sharing).
