Common Questions About the Load Balancing, Link Backup(Failover) & Online Detection

2023-11-16 05:48:24 - last edited 2024-03-11 01:25:34

Background: 

 

We receive more and more questions about Load Balancing. This article covers the most frequently asked ones and answers them with explanations.

 

This Article Applies to:

 

All routers with the Load Balancing function.

 

Application Scenario:

 

 

Contents:

 

>Common Issues and Explanations

 

1. Load Balancing does not switch clients back to Primary WAN properly.

2. Backup WAN shows "Offline".

3. Backup WAN has traffic flow while it is inactive.

4. How does Online Detection work? Where do I set the Online Detection in Controller mode?

5. Why are there "unauthorized" DNS resolutions and pings? Where do they come from?

6. Can I add up or aggregate my Internet speed by Load Balancing?

7. How can I achieve a cascade Backup? Primary > Secondary > Pis Aller(Last resort)

8. Is the Load Balancing effective on WIFI?

>Note
>QA
>Update Log

 

Common Issues and Explanations:

 

1. Load Balancing does not switch clients back to Primary WAN properly.

 

First, please make sure you have configured your router and Load Balancing correctly. Please refer to the guide: Troubleshooting Online Detection and Link Backup (Failover) Don't Take Effect

With Load Balancing and Online Detection configured correctly, your network should switch back to the primary quickly. How long a switch to the backup or back to the primary takes depends on your Online Detection interval.

 

If your Load Balancing and Failover cannot switch back to or prioritize your Primary WAN, please follow the steps below and paste your screenshots into a New Thread so I can discuss and troubleshoot with you.

 

Steps for verification:

 

1. Set up the Link Backup properly. (This applies to the Controller as well. The mechanism is the same.)

 

2. Make sure both of your WANs work so that the Online Detection can work properly. Run tracert in Command Prompt (CMD) on a computer to see your route. You can unplug the WANs one at a time and verify the Internet connection and routing tables.

 

 

3. Simulate the Primary WAN (WAN/LAN4) going down by setting an incorrect IP address on the Primary WAN.

(We do not recommend unplugging the Ethernet cable, because that does not test the Online Detection. We want to make sure the Online Detection can detect the failure and make your router switch to the Backup WAN.)

 

The yellow marker shows the incorrect WAN IP address. And the tracert shows my connection has been switched to the backup.

 

 

4. Set the Primary WAN (WAN/LAN4) back to the correct Connection Type, and monitor the connection with tracert again. Note that I ran tracert in two CMD windows to rule out errors.
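
If you save the tracert output from before and after the switch, comparing the hops by eye can be tedious. Below is a minimal Python sketch that pulls the hop addresses out of Windows tracert output and tells you whether the path changed. The function names hop_addresses and path_changed are hypothetical helpers written for this post, not part of any TP-Link tool, and the sketch assumes the usual IPv4 tracert layout.

```python
import re

# A hop line starts with the hop number and ends with an IPv4 address,
# optionally wrapped in [ ] when tracert also prints a hostname.
HOP_LINE = re.compile(r"^\s*\d+\s+.*?\[?(\d{1,3}(?:\.\d{1,3}){3})\]?\s*$")


def hop_addresses(tracert_output):
    """Return the IPv4 address of every hop that replied, in order."""
    hops = []
    for line in tracert_output.splitlines():
        match = HOP_LINE.match(line)
        if match:  # timed-out hops and header lines do not match
            hops.append(match.group(1))
    return hops


def path_changed(output_before, output_after):
    """True when the two traces went through different hops."""
    return hop_addresses(output_before) != hop_addresses(output_after)
```

The first hop is normally your router's LAN address and stays the same. A different second hop usually means the traffic left through the other WAN.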

 

 

This is the end of the verification. If you experience an issue with this, please perform and follow the steps above. Screenshot your configuration and steps. Start a New Thread and our team will follow it up.

 

2. Backup WAN shows "Offline".

 

If you have enabled the Link Backup, AKA Failover, your backup WAN will remain inactive so it shows as "Offline" in the status.

 

Please note that the backup WAN still counts packets while it is inactive, because the statistics count packets rather than user traffic. While inactive, a small number of packets still flow, such as DHCP discovery and request (for maintaining a healthy WAN connection), DNS resolution, and time syncing.

 

                        Internet Status    Connection Status
Ethernet Plugged In     Up                 Online
Ethernet Plugged In     Down               Offline
Ethernet Plugged Out    -                  Link Down

 

 

3. Backup WAN has traffic flow while it is inactive.

 

As mentioned previously:

Please note that the backup WAN still counts packets while it is inactive, because the statistics count packets rather than user traffic. While inactive, a small number of packets still flow, such as DHCP discovery and request (for maintaining a healthy WAN connection), DNS resolution, and time syncing.

 

While inactive, the backup WAN carries very little traffic. The router should not use up your data plan or consume an unusual amount of data.

ICMP is used here to detect Internet availability.

DHCP discovery and request packets are ordinary lease traffic.

 

Note that these Wireshark results may differ from one network to another. The pictures are annotated differently.

 

  • 23 hours ago, it showed this much data:

 

winservice.tp-link.com is a local service based on our local network. You should not see it in yours.

 

 

  • 23 hours later:

 

 

 

If you encounter an issue with this where your metered LTE plan was used up or consumed an unusual amount of data, please follow the steps below and paste your screenshots into a New Thread so I can discuss and troubleshoot with you.

 

Steps for troubleshooting:

 

1. Check whether your Primary WAN went down earlier and the backup kept the Internet flowing without you noticing. Please screenshot the Log entries that show the WAN going Up or Down and post them in your new thread. Alternatively, our team will create a ticket for you, ask you to export your log, and review it against the timeline of when you noticed the issue.

 

 

2. Log in to your LTE modem (router) and check whether its WIFI is enabled. If it is, disable it. You can view the DHCP client list to check whether any clients are using the WIFI from your LTE modem.

3. Capture packets with Wireshark. Refer to the guides: How to capture packets using Wireshark on SMB router or switch and How to Use Port Mirror to Capture Packets in the Controller

  • Mirror the WAN port and monitor your traffic from the WAN.
  • Leave your computer for hours or days to check if there is any traffic from the backup WAN(while inactive) on the Omada router. Share the screenshots in your newly created thread.

 

4. How does Online Detection work? Where do I set the Online Detection in Controller mode?

 

Online Detection is the name of the Internet availability test in Standalone mode. In the Controller mode, it is called Echo Server.

For the Internet detection IP or domain, please set a public one. We do NOT recommend using a private IP address, even though the field accepts one.

The router continuously pings the IP (server) or domain you've set from the WAN. If it receives ICMP replies, the Internet is considered available; if it does not, the Internet is considered unavailable.
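
To illustrate the idea, here is a minimal Python sketch of a ping-based availability check that you could run from a computer. It is not the router's firmware logic. The names ping_once and wan_is_online and the attempts parameter are assumptions made for this example, and a computer pings through whichever WAN it is currently routed to rather than through a specific WAN port.

```python
import platform
import subprocess


def ping_once(target, timeout_s=2):
    """Send one ICMP echo request with the system ping command."""
    windows = platform.system() == "Windows"
    count_flag = "-n" if windows else "-c"
    # Windows ping takes the timeout in milliseconds (-w);
    # Linux ping takes it in seconds (-W). Other systems may differ.
    if windows:
        timeout_args = ["-w", str(timeout_s * 1000)]
    else:
        timeout_args = ["-W", str(timeout_s)]
    cmd = ["ping", count_flag, "1", *timeout_args, target]
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 3,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def wan_is_online(target, probe=ping_once, attempts=3):
    """Online if any attempt gets a reply (the retry count is an assumption)."""
    return any(probe(target) for _ in range(attempts))
```

For example, wan_is_online("8.8.8.8") returns True when at least one echo reply comes back. The probe argument lets you swap in a different check without changing the decision logic.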

 

 

5. Why are there "unauthorized" DNS resolutions and pings? Where do they come from?

 

As mentioned above, if you have set the Online Detection (Echo Server in Controller mode), the router will ping the server IP you've set. So, you'll see ICMP packets to that IP address in Wireshark.

 

 

It is normal to see only replies here: the router itself sends the requests, and I am capturing with Wireshark from the LAN. Since ICMP works by Request and Reply, seeing the replies means the Internet is actually working.

 

 

If you did not set the Online Detection manually, yet you have enabled the Link Backup, this will automatically enable the Online Detection.

In this situation, the router will use the default address to ping and test your Internet availability.

 

Additionally, the system has four built-in domains that it resolves to detect whether your network is online: www.google.com, www.tp-link.com, www.ieee.org, www.w3.org.

 

6. Can I add up or aggregate my Internet speed by Load Balancing?

 

Load Balancing does NOT add up or aggregate your download or upload speed for a single connection. The exception is when the server you are downloading from supports multiple sessions (terminology), you have enabled Load Balancing, and your multiple WANs are active (status shown as Online). In this situation, the background Load Balancing algorithm will balance the sessions between the WANs.

Again, the prerequisite is that the server you are downloading from supports multiple sessions. This is the only scenario in which you may see an aggregated speed.

If you simply want to test your multiple connections, you should refer to this FAQ: Why failing to achieve bandwidth aggregation of multiple WAN ports by Speedtest on SMB router
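
The session point can be shown with a small Python sketch. It assumes a simple round-robin assignment purely for illustration, since the router's actual balancing algorithm is not described here. The names assign_sessions and download_speed and the speeds used are hypothetical.

```python
from itertools import cycle


def assign_sessions(session_ids, wans):
    """Round-robin each session onto one WAN; a session never spans WANs."""
    rotation = cycle(wans)
    return {session_id: next(rotation) for session_id in session_ids}


def download_speed(assignment, wan_speed_mbps):
    """Upper bound on combined speed: the sum over WANs carrying a session."""
    wans_in_use = set(assignment.values())
    return sum(wan_speed_mbps[wan] for wan in wans_in_use)


speeds = {"WAN1": 100, "WAN2": 50}  # example speeds in Mbps, made up

# One session stays on one WAN, so it is limited to that WAN's speed.
single = assign_sessions(["s1"], ["WAN1", "WAN2"])

# Two sessions can land on different WANs, so the totals can add up.
multi = assign_sessions(["s1", "s2"], ["WAN1", "WAN2"])
```

Here download_speed(single, speeds) is 100 and download_speed(multi, speeds) is 150, which is why a single-session download never exceeds one WAN's speed.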

 

7. How can I achieve a cascade Backup? Primary > Secondary > Pis Aller(Last resort)

 

The default system does not support this failover setup. But we have a workaround to achieve this.

For example, suppose you have three WANs.

1. Set WAN1 and WAN2 as the primary WANs. Set the backup WAN as WAN3.

 

 

2. Go to Policy Routing and add a rule that routes ALL networks to WAN1, with the mode set to Priority.
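
The outcome this workaround aims for can be written as a short Python sketch: use the first WAN in the priority order that is online. This only models the intended result, not how the router implements it, and the name pick_wan is hypothetical.

```python
def pick_wan(priority_order, online):
    """Return the first online WAN in priority order, or None if all are down."""
    for wan in priority_order:
        if online.get(wan, False):
            return wan
    return None


order = ["WAN1", "WAN2", "WAN3"]  # Primary > Secondary > Last resort

# WAN1 is down, so traffic should move to WAN2 and leave WAN3 idle.
choice = pick_wan(order, {"WAN1": False, "WAN2": True, "WAN3": True})
```

In this example choice is "WAN2". WAN3 is only picked when both WAN1 and WAN2 are offline.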

 

 

8. Is the Load Balancing effective on WIFI?

 

Yes, of course. There is no reason it would exclude WIFI clients. Load Balancing is set on the router and, like most features set on the router, it is Layer 3 (IP) based. Your network routes according to the IP, whether the client is wired or wireless.

So, if you have trouble with this, please follow the tracert steps above to verify it from a WIFI client, too.

 

Note:

 

1. We cannot make the failover seamless (unnoticeable) when the network fails and traffic switches to the other WAN. Due to the differences between TCP and UDP, this is not possible. Sometimes, you have to reset the session by refreshing the page.

2. In Standalone mode, unlike Controller mode, you cannot set the Online Detection period. In this case, you may face a few seconds of downtime before the router detects the offline state and switches over. We have optimized this in previous firmware updates, and we consider the remaining delay acceptable.

 

QA:

 

Q1: In Online Detection Auto mode, does it detect the online status by pinging the DNS server or by sending DNS lookup queries? What are the domain names being queried?

 

A1: In Auto mode, detection is done by sending DNS lookup queries. The following 4 domain names are queried: www.google.com, www.tp-link.com, www.ieee.org, and www.w3.org. A successful resolution means your network is online; if none of them resolve, it is considered offline.

 

Q2: How many times does the router send detection packets in Online Detection Auto mode before considering the WAN port disconnected?

A2: In each detection, each WAN port individually sends DNS lookup queries for the 4 domains. If there are no replies, the WAN port is considered disconnected.
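
As a rough illustration of A1 and A2, here is a Python sketch of the same check run from a computer: the link counts as online if any of the four domains resolves. It uses the computer's own DNS settings rather than a specific WAN port, and the names resolves and auto_mode_online are hypothetical.

```python
import socket

# The four domains listed in A1.
DETECTION_DOMAINS = (
    "www.google.com",
    "www.tp-link.com",
    "www.ieee.org",
    "www.w3.org",
)


def resolves(domain):
    """True if the system resolver returns an address for the domain."""
    try:
        socket.gethostbyname(domain)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def auto_mode_online(resolver=resolves, domains=DETECTION_DOMAINS):
    """Online if at least one domain resolves; offline if none do."""
    return any(resolver(domain) for domain in domains)
```

Calling auto_mode_online() performs real lookups. Passing your own resolver function lets you try the decision logic without touching the network.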

 

Q3: How long does it take for the router to switch to Backup WAN after determining that Primary WAN is disconnected?

A3: Based on current test results, it takes about 30s-60s (Standalone mode). This duration is related to the number of WAN ports and the interval of the online checks. With the Link Backup function, the backup WAN stays offline, and an online check must first detect the Primary WAN as offline before the Backup WAN is enabled.

 

Q4: Can the Ping test and DNS query (detection) be performed simultaneously in Online Detection Manual mode? What mechanism determines if a WAN port is online or offline? Is it necessary for both Ping and DNS queries to fail in order to consider a connection down?

 

A4: In Manual mode, the Ping test and DNS query can be performed simultaneously. Set a valid Ping IP address and a DNS Lookup server, and it will work as expected.

 

The priority is given to the DNS domain queries. If all 4 domain names fail during a DNS query, then the Ping test will be conducted to cross-check.

If any of the 4 domain names gets a successful response during the DNS query, the Ping test will not be carried out.

 

For the DNS check - if an IP address for any one of the 4 domain names can be resolved successfully, the WAN is considered online. If none of them resolve, the DNS check fails, and the router then starts the ping to check the online status.

4 domain names have higher priority than the Ping test.

For ping - if the IP address can be reached through the ping command successfully, then it's considered online; otherwise, it's considered offline.
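
The order described in A4 can be sketched in a few lines of Python: the DNS check runs first, and the ping only runs as a cross-check when the DNS check fails. This is a reading of the description above, not the firmware code. The name manual_mode_online is hypothetical, and dns_ok and ping_ok are placeholder functions you supply.

```python
def manual_mode_online(dns_ok, ping_ok):
    """Decide online/offline from two zero-argument checks.

    dns_ok:  returns True if any of the 4 domain names resolved.
    ping_ok: returns True if the configured IP answered the ping.
    """
    if dns_ok():
        # DNS has priority: on success the ping test is skipped.
        return True
    # All DNS queries failed, so the ping decides the final status.
    return ping_ok()
```

For example, manual_mode_online(lambda: False, lambda: True) returns True, because the ping cross-check succeeded after the DNS check failed.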

 

Q5: How many times does Online Detection Manual mode perform tests before considering a connection dropped?

A5: In Manual mode, tests run at the interval configured for the online check. In each test, when either the specified IP address cannot be pinged or the DNS lookup of the default domain names fails, the WAN port is set to offline directly.

If the Internet disconnects between two tests, the WAN is only set to offline at the next test, which might cause a delay.

 

Q6: In Online Detection Manual mode, can we specify a DNS query domain name instead of just a DNS server address?

A6: Currently, it does not support specifying a DNS query domain name (DoH or DoT). The setting only accepts a DNS Lookup server specified by IP address.

 

Q7: What are the use cases for Always Online mode?

A7: One use case is an internal LAN environment where DNS queries cannot be resolved. We have also observed that some ISPs block specific DNS queries made by the user gateway, such as for the domain name www.google.com, which causes the WAN to be detected as offline.

 

Q8: Does Backup WAN perform Online Detection when Primary WAN is online? After switching to Backup WAN due to Primary being disconnected, does Primary continue with Online Detection and automatically switch back once it detects that Primary is online again?

A8: Backup WAN does NOT perform Online Detection when Primary WAN is online.

After switching to Backup WAN due to the disconnection of Primary WAN, Primary WAN continues with the Online Detection test. Once it detects that Primary WAN is back online, automatic switchback occurs.

 

Q9: Are there any differences in Online Detection and Link Backup mechanisms between Controller mode and Standalone mode?

A9:

In Controller mode:

1. The time interval for Online Detection can be modified on the Controller.

2. In Controller mode, the ping test (AKA Echo Server) of Online Detection accepts an IP address or a hostname; in Standalone mode, it currently accepts only an IP address.

3. In Controller mode, it is not possible to specify a DNS query server for DNS checks; only the ping test (Echo Server) is available.

4. The "Always Online" option is NOT available on the controller.

5. There are two backup modes (referring to Link Backup) that are available on the controller but not in a standalone setup.

 

Mode 1:

 

Mode 2:

 

Update Log:

 

Dec 20th, 2023:

Add QA section.

Update Contents.

 

Jan 10th, 2024:

Update the title.

 

Jan 11th, 2024:

Update the format for a better reading experience.

Optimize the QA for better referring.

 

Mar 11th, 2024:

Update the format.

 

Recommended Threads:

 

Troubleshooting Online Detection and Link Backup (Failover) Don't Take Effect

Troubleshooting Custom DDNS Is Not Working

 

Feedback:

 

  • If this was helpful, feel free to give us Kudos by clicking the upward triangle below.
  • If there is anything unclear in this solution post, please feel free to comment below.
  • If you encounter such an issue, please follow the troubleshooting above to check your settings. Besides, ensure your Omada Controller and Gateway are running with the latest firmware.
  • If the issue still exists after you try the suggestion above, please feel free to comment below or contact our support team with a detailed description of your issue and the steps you have tried.

 

Thank you in advance for your valuable feedback!

 

------------------------------------------------------------------------------------------------

Have other off-topic issues to report? 

Welcome to > Start a New Thread < and elaborate on the issue for assistance.

 
