ER8411 - Dual WAN with failover, primary WAN isn't prioritized after reconnection
Hello again!
I have configured my ER8411 with Link Backup. I have two WANs with the following characteristics:
- WAN1: 10 Gbps, PPPoE, IPv4+IPv6 (both dynamic)
- WAN3: 1 Gbps, IPoE, IPv4 (dynamic)
Load Balancing > Link Backup settings:
- Primary WAN: WAN1
- Secondary WAN: WAN3
- Mode: Failover (any, tried both)
- Status: Enable
Routing > Static Route settings:
- Static route to X.X.X.0/24, next hop 0.0.0.0, via WAN3 (for SIP, works fine).
When my primary WAN1 fails, the secondary WAN3 gets prioritized, as expected. But after WAN1 reconnects, the routing table doesn't prioritize WAN1 again:
ID | Destination IP | Subnet Mask | Next Hop | Interface | Metric |
1 | X.X.X.0 | 255.255.255.0 | 0.0.0.0 | SFP WAN/LAN3 | 0 |
2 | 0.0.0.0 | 0.0.0.0 | 188.X.X.X | SFP WAN/LAN3 | 0 |
3 | 0.0.0.0 | 0.0.0.0 | 10.0.0.1 | SFP+ WAN1 | 0 |
4 | 10.0.0.1 | 255.255.255.255 | 0.0.0.0 | SFP+ WAN1 | 0 |
5 | 87.X.X.X | 255.255.255.255 | 188.X.X.X | SFP WAN/LAN3 | 0 |
6 | 87.X.X.X | 255.255.255.255 | 188.X.X.X | SFP WAN/LAN3 | 0 |
7 | 100.X.X.X | 255.255.255.255 | 10.0.0.1 | SFP+ WAN1 | 0 |
8 | 100.X.X.X | 255.255.255.255 | 10.0.0.1 | SFP+ WAN1 | 0 |
9 | 188.X.X.0 | 255.255.252.0 | 0.0.0.0 | SFP WAN/LAN3 | 0 |
10 | 192.168.0.0 | 255.255.252.0 | 0.0.0.0 | LAN | 0 |
Normally, when the primary WAN1 is active, rows 2 and 3 are swapped, with WAN1 first. I wouldn't have specified which WAN is the primary if I didn't care.
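To make the observation concrete, here is a minimal Python sketch of why the row order matters when both default routes share metric 0. It assumes, as the table suggests but nothing in this thread confirms, that the router breaks ties between equal-metric routes by table order. The `Route` class and `pick_route` helper are hypothetical, and 198.51.100.1 is a documentation-range placeholder for the masked WAN3 gateway 188.X.X.X.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Route:
    network: str    # destination in CIDR form
    next_hop: str
    interface: str
    metric: int = 0


def pick_route(table, destination):
    """Longest prefix wins; ties go to the lower metric, then to the earlier row.

    The "earlier row" tie-break is an assumption about the router, not documented behaviour.
    """
    dest = ip_address(destination)
    candidates = [
        (index, route) for index, route in enumerate(table)
        if dest in ip_network(route.network)
    ]
    if not candidates:
        return None
    _, best = min(
        candidates,
        key=lambda item: (-ip_network(item[1].network).prefixlen, item[1].metric, item[0]),
    )
    return best


# Simplified table as observed after WAN1 reconnects (WAN3 default route listed first).
after_recovery = [
    Route("0.0.0.0/0", "198.51.100.1", "SFP WAN/LAN3"),
    Route("0.0.0.0/0", "10.0.0.1", "SFP+ WAN1"),
    Route("192.168.0.0/22", "0.0.0.0", "LAN"),
]
# Same rows in the order seen while WAN1 is healthy.
expected = [after_recovery[1], after_recovery[0], after_recovery[2]]

print(pick_route(after_recovery, "203.0.113.10").interface)
print(pick_route(expected, "203.0.113.10").interface)
```

Under that tie-break assumption, the first call selects WAN3 and the second selects WAN1, which matches the symptom described here.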
As always, I'll answer any question about the issue.
Best regards!
Hi @EmuAGR
Thanks for posting in our business forum.
Are you in Controller mode? There is an option. Have you tried this?
Have you ruled out the possibility that a session was established on WAN3 while the primary was down and simply continues on WAN3 after the recovery?
Thanks for the reply @Clive_A ,
It seems that "Backup Mode" addresses exactly my issue. I'm using standalone mode, though: I see both "Mode" options (plus a third, timed one) but no "Backup Mode".
I'll test this again, focusing on whether these are old sessions, as you suggested.
@Clive_A I can confirm that new sessions are still being sent through the secondary WAN3 several minutes (edit: still an hour) after the primary WAN1 comes back online.
I tested with both speedtest (<1 Gbps) and traceroute (the second hop goes to the WAN3 ISP).
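For anyone who wants to repeat the traceroute check, here is a rough Python sketch that reads the second hop from `traceroute -n` style output and maps it to a WAN. The helper names, the gateway map, and the sample output are all made up for illustration. 10.0.0.1 is WAN1's next hop in the routing table above, and 198.51.100.1 is a placeholder for the masked WAN3 gateway.

```python
import re

# Matches lines such as " 2  198.51.100.1  2.104 ms ..." and captures hop number and address.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(\S+)")


def hop_address(traceroute_output, hop_number):
    """Return the address reported for a hop, or None if it is missing or timed out."""
    for line in traceroute_output.splitlines():
        match = HOP_LINE.match(line)
        if match and int(match.group(1)) == hop_number:
            address = match.group(2)
            return None if address == "*" else address
    return None


def wan_for_new_session(traceroute_output, gateways):
    """Map the second hop (the first hop past the router) to a WAN name."""
    return gateways.get(hop_address(traceroute_output, 2), "unknown")


# Hypothetical gateway map; replace with the real ISP gateways.
GATEWAYS = {"10.0.0.1": "WAN1", "198.51.100.1": "WAN3"}

# Made-up output in the style of `traceroute -n`, not captured from the router.
SAMPLE = """traceroute to 203.0.113.10 (203.0.113.10), 30 hops max, 60 byte packets
 1  192.168.0.1  0.412 ms  0.377 ms  0.350 ms
 2  198.51.100.1  2.104 ms  2.087 ms  2.061 ms
 3  * * *
"""

print(wan_for_new_session(SAMPLE, GATEWAYS))
```

Running the check on a freshly opened session after WAN1 recovers shows whether new traffic, and not just old sessions, still leaves through WAN3.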
EDIT:
To be honest, keeping old sessions seems reasonable, but right now it's switching to the secondary WAN3 indefinitely. I'd rather have the choice to terminate sessions and switch to primary (good for LTE WAN) or keep old sessions and switch new ones to primary (good for connectivity if asymmetry isn't too high).
The behaviour I would expect:
- No load balancing, no failover:
  - Use a random WAN statically; in case of failure, remove that WAN's row from the routing table and let whatever happens happen.
- Load balancing:
  - Randomly distribute sessions between WANs. [Worst]
  - Weighted Round-Robin between them considering fixed throughput [this is what I think the router does currently], or...
  - Even better, use the WAN with the fastest throughput available (approx.) when deciding, keeping track of the current usage against a fixed speed. [Possible enhancement!]
- Failover:
  - Same priority:
    - Failure of a primary WAN switches to another random primary WAN and keeps all traffic on the new WAN indefinitely.
  - Different priority:
    - If any/all primary WANs are down (selectable as it is now), switch to the secondary WAN and route back to the primary when it recovers, with two options:
      - Keep old sessions on the secondary to avoid connectivity interruptions. [Good for unmetered/fiber secondary.]
      - Kill old sessions and route everything back to the primary. [Good for metered/LTE/slow secondary.]
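The two failback options for the different-priority case could be modelled roughly as follows. This is only a sketch of the behaviour I'm asking for, not how the ER8411 firmware works. The `LinkBackup` class, the `Failback` enum, and the session bookkeeping are all hypothetical.

```python
from enum import Enum


class Failback(Enum):
    KEEP_OLD_SESSIONS = "keep"   # suits an unmetered/fiber secondary
    KILL_OLD_SESSIONS = "kill"   # suits a metered/LTE/slow secondary


class LinkBackup:
    """Toy model of primary/secondary failover with a selectable failback policy."""

    def __init__(self, primary, secondary, failback):
        self.primary = primary
        self.secondary = secondary
        self.failback = failback
        self.primary_online = True
        self.sessions = {}  # session id -> WAN carrying it

    def active_wan(self):
        return self.primary if self.primary_online else self.secondary

    def open_session(self, session_id):
        # New sessions always follow the currently preferred WAN.
        self.sessions[session_id] = self.active_wan()
        return self.sessions[session_id]

    def primary_failed(self):
        self.primary_online = False
        # Sessions on the dead link cannot continue, so drop them.
        self.sessions = {s: w for s, w in self.sessions.items() if w != self.primary}

    def primary_recovered(self):
        self.primary_online = True
        if self.failback is Failback.KILL_OLD_SESSIONS:
            # Terminate sessions on the secondary so they re-establish on the primary.
            self.sessions = {s: w for s, w in self.sessions.items() if w != self.secondary}


router = LinkBackup("WAN1", "WAN3", Failback.KEEP_OLD_SESSIONS)
router.primary_failed()
router.open_session("download")          # opened during the outage, lands on WAN3
router.primary_recovered()
print(router.sessions)                   # the old session stays on WAN3
print(router.open_session("speedtest"))  # the new session goes to WAN1
```

With `Failback.KILL_OLD_SESSIONS`, the session table would be empty after `primary_recovered()`. Either way, new sessions return to the primary, which is the part that isn't happening for me.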
@Clive_A I have the same problem, can you follow up on the bug?
Thanks for posting in our business forum.
I tested it and found nothing wrong with it. In standalone mode, I am able to prioritize my primary WAN.
I verified that both WANs were working correctly and could forward traffic.
To simulate my primary WAN going down, I set a manual IP address on it so that it went offline while staying physically connected.
Failover took effect and switched me to the backup WAN, 172.30.30.1.
When I reconnected the primary WAN by changing it back to the correct IP and gateway, the primary came online again.
I ran a traceroute immediately after the reconnection and was still on the backup WAN.
After waiting 30 seconds, I was switched back to the primary.
Hello @Clive_A !
I see you enabled load balancing. That's a completely different test case from the one I described in my first post: with failover and without load balancing, the WANs aren't used alternately (unlike what you've shown) unless one fails.
I'm not using load balancing because my WANs are very asymmetrical and the secondary doesn't have dual stack.
Hi @EmuAGR
The very first post did not mention this. If you don't have Load Balancing enabled, then the primary WAN isn't going to be prioritized after a failover, and I don't see why it should switch back. This is just failover without load balancing.
If you look up the correct steps to enable failover, Load Balancing is required. So there is nothing wrong with the feature; your use case is special and the feature does not fit your purpose.
@Clive_A I'll test it as you suggested, then, with load balancing enabled.
But I think that load balancing with a primary/secondary setup shouldn't use the secondary until the primary is down (as the wording of the option implies).
Hi @EmuAGR
When failover is enabled, the backup WAN stays offline until Online Detection fails to ping through the primary and tells the router to switch over. It's been like this ever since the Omada router was released years ago.
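As a rough illustration of that logic, here is a small Python sketch in which the backup only carries traffic after several consecutive failed probes on the primary, and the primary takes over again after several consecutive successes. The `OnlineDetection` class below and its `fail_after` and `recover_after` thresholds are assumptions made for the example, not the router's actual parameters.

```python
class OnlineDetection:
    """Toy model: decide which WAN carries traffic from probe results on the primary."""

    def __init__(self, fail_after=3, recover_after=3):
        self.fail_after = fail_after        # hypothetical: failed probes before switching over
        self.recover_after = recover_after  # hypothetical: good probes before switching back
        self.primary_online = True
        self._streak = 0  # consecutive probe results that contradict the current state

    def record_probe(self, success):
        """Feed one probe result for the primary; return the WAN that should carry traffic."""
        contradicts = success != self.primary_online
        self._streak = self._streak + 1 if contradicts else 0
        threshold = self.recover_after if success else self.fail_after
        if contradicts and self._streak >= threshold:
            # Enough consecutive evidence: flip the state and start counting afresh.
            self.primary_online = success
            self._streak = 0
        return "primary" if self.primary_online else "backup"


detector = OnlineDetection()
probes = [True, False, False, False, True, True, True]
print([detector.record_probe(ok) for ok in probes])
```

In this model the backup is selected on the third failed probe and the primary is restored on the third successful one, so a short delay after the primary recovers is expected before traffic moves back.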