Last week I had a discussion with a customer regarding the Failback setting in the vSphere Distributed Port Group Teaming and Failover configuration, when using:
- Load balancing configuration “Route based on physical NIC load”, also known as Load Based Teaming (LBT).
- All available dvUplinks configured as Active uplinks.
We discussed whether the Failback setting makes a difference to virtual machine network availability and ESXi host vmnic usage.
Yes, I know we are talking about approximately one missing echo reply when a virtual machine fails over from one ESXi host vmnic to another, with the Teaming and Failover Notify Switches setting set to “Yes”.
I have used the Failback configuration “Yes” based on the following:
- The only recommendation I have seen to set the Failback configuration to “No” is for the ESXi host management connection when using:
- 2 ESXi host vmnics.
- The 2 ESXi host vmnics in an Active uplinks/Standby uplinks configuration.
- It is the default configuration when creating a vSphere Distributed Port Group.
The vSphere documentation describes the Teaming and Failover configuration options “Load Balancing – Route based on physical NIC load” and “Failback” as follows:
Route based on physical NIC load
Choose an uplink based on the current loads of physical NICs.
Failback
This option determines how a physical adapter is returned to active duty after recovering from a failure. If failback is set to Yes (default), the adapter is returned to active duty immediately upon recovery, displacing the standby adapter that took over its slot, if any. If failback is set to No, a failed adapter is left inactive even after recovery until another currently active adapter fails, requiring its replacement.
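For reference, both settings can be inspected and changed with PowerCLI. A minimal sketch, assuming a port group named “dvPortGroup01” (hypothetical name) and the cmdlet and parameter names of the PowerCLI VDS module as I recall them (verify with Get-Help Set-VDUplinkTeamingPolicy):

```powershell
# Assumes an existing vCenter connection (Connect-VIServer)
# Show the current Teaming and Failover settings of the port group
Get-VDPortgroup -Name "dvPortGroup01" |
    Get-VDUplinkTeamingPolicy |
    Select-Object LoadBalancingPolicy, FailbackEnabled, NotifySwitches,
                  ActiveUplinkPort, StandbyUplinkPort

# Select "Route based on physical NIC load" (LBT) as the load balancing policy
Get-VDPortgroup -Name "dvPortGroup01" |
    Get-VDUplinkTeamingPolicy |
    Set-VDUplinkTeamingPolicy -LoadBalancingPolicy LoadBalanceLoadBased
```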
Preparation
- Create one distributed switch – dvSwitch01 (a PowerCLI sketch of these preparation steps follows the list):
- 2 dvUplinks
- Network I/O Control enabled, using the default configuration.
- Add one ESXi host to the newly created distributed switch using two vmnics.
- Create the Distributed Port Group using the Teaming and Failover configuration described above (LBT, all dvUplinks active, Failback “Yes”).
- Create 4 virtual machines:
- w2k8-01
- w2k8-02
- w2k8-03
- w2k8-04
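Here is the PowerCLI sketch of the preparation steps mentioned above. The inventory names (datacenter DC01, host esx01.lab.local) are hypothetical, and Network I/O Control is enabled through the vSphere API method since I am not aware of a dedicated cmdlet for it:

```powershell
# Create the distributed switch with 2 uplinks (hypothetical datacenter name)
$dc  = Get-Datacenter -Name "DC01"
$vds = New-VDSwitch -Name "dvSwitch01" -Location $dc -NumUplinkPorts 2

# Enable Network I/O Control via the vSphere API method
$vds.ExtensionData.EnableNetworkResourceManagement($true)

# Add the ESXi host (hypothetical name) and two of its vmnics to the switch
$vmhost = Get-VMHost -Name "esx01.lab.local"
Add-VDSwitchVMHost -VDSwitch $vds -VMHost $vmhost
$pnics = Get-VMHostNetworkAdapter -VMHost $vmhost -Physical |
    Where-Object { $_.Name -in "vmnic2", "vmnic3" }
Add-VDSwitchPhysicalNetworkAdapter -DistributedSwitch $vds -VMHostPhysicalNic $pnics -Confirm:$false

# Create the Distributed Port Group; teaming is configured as shown earlier
New-VDPortgroup -VDSwitch $vds -Name "dvPortGroup01"
```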
During my testing I:
- Did not put any load on the ESXi host vmnics, since LBT only moves virtual machines between vmnics based on network load; with no load, any move observed during the tests is caused by failover or failback, not by load balancing.
- Used the ESXi host built-in tool “esxtop” to list the virtual machine to ESXi host vmnic mapping (a remote alternative is sketched below).
During my tests the ESXi host vmnic2 is mapped to the distributed switch dvUplink1 and the ESXi host vmnic3 is mapped to the distributed switch dvUplink2.
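In esxtop, pressing “n” switches to the network view, where the TEAM-PNIC column shows which vmnic each virtual machine port currently uses. The same mapping can also be pulled remotely through the esxcli interface exposed by PowerCLI; a sketch, assuming the hypothetical host name esx01.lab.local and the field names as esxcli reports them:

```powershell
# Remote equivalents of "esxcli network vm list" and "esxcli network vm port list"
$esxcli = Get-EsxCli -VMHost (Get-VMHost -Name "esx01.lab.local") -V2

# List the networking world IDs of the running virtual machines
$esxcli.network.vm.list.Invoke() | Select-Object WorldID, Name

# Show the ports of one virtual machine (hypothetical world ID);
# the TeamUplink field is the vmnic currently in use
$esxcli.network.vm.port.list.Invoke(@{worldid = 1234567}) |
    Select-Object PortID, TeamUplink
```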
Test I
- Verified the virtual machine to ESXi host vmnic assignment.
- Started an ongoing ping test to all four virtual machines using “ping -t IP-address”
- Unplugged the physical network cable for ESXi host vmnic2 and verified that all virtual machines were moved to ESXi host vmnic3 (an administrative alternative to pulling the cable is sketched after this test).
During the failover from ESXi host vmnic2 to ESXi host vmnic3 I lost one echo reply/ping each to the virtual machines w2k8-01 and w2k8-03.
- Plugged the physical network cable back in for ESXi host vmnic2, listed the virtual machine to vmnic mapping and found that both w2k8-01 and w2k8-03 were moved back to ESXi host vmnic2.
During the failback of w2k8-01 and w2k8-03 from ESXi host vmnic3 to ESXi host vmnic2 I lost one echo reply/ping to each virtual machine.
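If you cannot pull cables in your environment, the link state can also be brought down administratively; a sketch using esxcli through PowerCLI (same hypothetical host name as above), which should trigger the same failover and failback behavior:

```powershell
# Administrative alternative to unplugging the cable
$esxcli = Get-EsxCli -VMHost (Get-VMHost -Name "esx01.lab.local") -V2

# Take vmnic2 down and watch the virtual machines fail over to vmnic3
$esxcli.network.nic.down.Invoke(@{nicname = "vmnic2"})

# Bring vmnic2 back up to observe the failback behavior
$esxcli.network.nic.up.Invoke(@{nicname = "vmnic2"})
```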
Test II
- Changed the Failback option to “No” (a PowerCLI one-liner for this follows the test).
- Verified the virtual machine to ESXi host vmnic mapping.
- Started an ongoing ping test to all four virtual machines using “ping -t IP-address”
- Unplugged the physical network cable for ESXi host vmnic2 and verified all virtual machines were moved to vmnic3.
During the failover from ESXi host vmnic2 to ESXi host vmnic3 I lost one echo reply/ping each to the virtual machines w2k8-01 and w2k8-03.
- Plugged the physical network cable back in for ESXi host vmnic2 and verified the virtual machine to ESXi host vmnic mapping.
We can see that w2k8-01 and w2k8-03, which belonged to ESXi host vmnic2 before we started Test II, did not move back to ESXi host vmnic2. Instead, the virtual machines w2k8-02 and w2k8-04 were moved to ESXi host vmnic2 when it came back online.
When w2k8-02 and w2k8-04 were moved from ESXi host vmnic3 to ESXi host vmnic2 I lost one echo reply/ping to each virtual machine.
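For completeness, the Failback change at the start of Test II can be scripted as well; a one-liner sketch, again assuming the hypothetical port group name and the -FailbackEnabled parameter name:

```powershell
# Set the Teaming and Failover option Failback to "No"
Get-VDPortgroup -Name "dvPortGroup01" |
    Get-VDUplinkTeamingPolicy |
    Set-VDUplinkTeamingPolicy -FailbackEnabled $false
```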
Conclusion
Whether the Failback configuration is set to “Yes” or “No”:
- All (both in my case) ESXi host vmnics (mapped to distributed switch dvUplinks) will be used whenever they are online.
- The virtual machines running on the ESXi host will be redistributed across the available ESXi host vmnics (mapped to distributed switch dvUplinks) when a failed vmnic recovers from a failure.
- There will be a potential, very limited virtual machine network outage when a failed ESXi host vmnic (distributed switch dvUplink) recovers from a failure.
I will keep using the default Failback configuration, “Yes”, since I do not see any major advantage in changing it to “No”.
To prevent the virtual machines of a specific vSphere Distributed Port Group from being distributed across all available ESXi host vmnics when a failed vmnic recovers, you must use a Teaming and Failover configuration with Active uplinks and Standby uplinks, as sketched below.
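A sketch of such an Active/Standby configuration in PowerCLI, with dvUplink1 active and dvUplink2 on standby (same hypothetical port group name as before):

```powershell
# Pin the port group to dvUplink1; dvUplink2 only takes over on failure
Get-VDPortgroup -Name "dvPortGroup01" |
    Get-VDUplinkTeamingPolicy |
    Set-VDUplinkTeamingPolicy -ActiveUplinkPort "dvUplink1" -StandbyUplinkPort "dvUplink2"
```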