all clients disconnect and reconnect across network of multiple TP Link devices
Every client connected to my network is disconnecting and reconnecting from the network and I am looking for some help to debug whether this is a product bug or a misconfiguration.
I've updated every switch and WAP firmware and rebooted all of them, but the problem persists.
Any advice on what to do?
Not shown in the diagram, I have a 2 port LAG between brain and switch-core and a 2 port LAG between switch-core and switch-poe on the fibre ports.
I have tried the following additional things:
- unplugged switch-core for 5m
- disconnected one of the LAG links between switch-core and switch-poe for a few minutes
- disconnected switch-basement from the network for a few minutes
I still encounter constant disconnect/reconnect of clients (wired or wireless) regardless of these changes.
Hi @BubbaBubba
Have you enabled Loop Detection on your core switch? (like the following screenshot)
In the log, is there anything like "xxxx port x has been blocked"?
I suspect there is a loop on your network, so some ports are being blocked by the switch automatically.
To figure out where the loop is, you may need to check the cabling physically.
In reply to Hank21:
It appears that I had already enabled Loop Detection on my core switch. I assume that any detected loops would show up in the logs.
When I search for "blocked" in the logs I do not get any hits.
I will take your advice and physically disconnect all core switch ports and, assuming there are no problems, add them back one by one.
My problem persists when only brain is connected to switch-core and I've physically removed all other links. It persists regardless of whether one physical link of the LAG is up or both are.
So the problem is somewhere between switch-core and my Fedora-based router/firewall/DNS/DHCP server (brain).
Here's my routing table on brain:
root@brain ~
# route -n !10003
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         96.234.217.1    0.0.0.0         UG    100    0        0 enp15s0
10.10.20.0      0.0.0.0         255.255.255.0   U     0      0        0 bond0.20
10.10.30.0      0.0.0.0         255.255.255.0   U     0      0        0 bond0.30
10.10.40.0      0.0.0.0         255.255.255.0   U     0      0        0 bond0.40
96.234.217.0    0.0.0.0         255.255.255.0   U     100    0        0 enp15s0
169.254.0.0     0.0.0.0         255.255.0.0     U     1009   0        0 bond0.20
169.254.0.0     0.0.0.0         255.255.0.0     U     1010   0        0 bond0.30
169.254.0.0     0.0.0.0         255.255.0.0     U     1011   0        0 bond0.40
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.0.0     192.168.1.1     255.255.0.0     UG    425    0        0 br0
192.168.1.0     0.0.0.0         255.255.255.0   U     425    0        0 br0
193.1.0.0       192.168.1.1     255.255.0.0     UG    425    0        0 br0
I'll continue debugging tomorrow when I'm refreshed.
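One way to sanity-check a routing table like the one above is to look for prefixes nested inside each other and for public prefixes routed through a private gateway. The following is a minimal Python sketch, not something taken from the poster's setup: the `ROUTES` list is copied by hand from the non-default rows of the `route -n` output above, and the helper names `find_nested` and `public_via_private_gateway` are made up for illustration. Nested prefixes are normal with longest-prefix matching, so a hit here is something to review rather than proof of a fault.

```python
import ipaddress

# Non-default rows copied by hand from the `route -n` output above:
# (destination, gateway, genmask, interface)
ROUTES = [
    ("10.10.20.0", "0.0.0.0", "255.255.255.0", "bond0.20"),
    ("10.10.30.0", "0.0.0.0", "255.255.255.0", "bond0.30"),
    ("10.10.40.0", "0.0.0.0", "255.255.255.0", "bond0.40"),
    ("96.234.217.0", "0.0.0.0", "255.255.255.0", "enp15s0"),
    ("169.254.0.0", "0.0.0.0", "255.255.0.0", "bond0.20"),
    ("169.254.0.0", "0.0.0.0", "255.255.0.0", "bond0.30"),
    ("169.254.0.0", "0.0.0.0", "255.255.0.0", "bond0.40"),
    ("172.17.0.0", "0.0.0.0", "255.255.0.0", "docker0"),
    ("192.168.0.0", "192.168.1.1", "255.255.0.0", "br0"),
    ("192.168.1.0", "0.0.0.0", "255.255.255.0", "br0"),
    ("193.1.0.0", "192.168.1.1", "255.255.0.0", "br0"),
]


def to_network(dest, mask):
    """Build an IPv4Network from a destination and a dotted netmask."""
    return ipaddress.ip_network(f"{dest}/{mask}")


def find_nested(routes):
    """Return (outer, inner) pairs where one prefix strictly contains another."""
    nets = [to_network(dest, mask) for dest, _gw, mask, _iface in routes]
    pairs = []
    for i, a in enumerate(nets):
        for b in nets[i + 1:]:
            if a == b:
                continue  # identical prefixes (e.g. the 169.254.0.0 rows)
            if b.subnet_of(a):
                pairs.append((a, b))
            elif a.subnet_of(b):
                pairs.append((b, a))
    return pairs


def public_via_private_gateway(routes):
    """Return routes for public prefixes whose next hop is a private address."""
    flagged = []
    for dest, gw, mask, iface in routes:
        if gw == "0.0.0.0":
            continue  # directly connected, no next hop
        net = to_network(dest, mask)
        if not net.is_private and ipaddress.ip_address(gw).is_private:
            flagged.append((net, gw, iface))
    return flagged


if __name__ == "__main__":
    for outer, inner in find_nested(ROUTES):
        print(f"{inner} is nested inside {outer}")
    for net, gw, iface in public_via_private_gateway(ROUTES):
        print(f"public prefix {net} is routed via {gw} on {iface}")
```

On the routes listed above, this reports 192.168.1.0/24 as nested inside 192.168.0.0/16, and 193.1.0.0/16 (a public range) as routed via 192.168.1.1 on br0. Whether either of those is intentional is something only the owner of brain can say.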
Hi @BubbaBubba
Do you have anything like a Sonos speaker on your network?
If you disable LAG/LACP and leave only a normal Ethernet connection between your switches, is there any improvement?
In your screenshot I can see that many records are related to "brain(VLAN20 gateway)". If possible, could you please share your VLAN settings on "brain" with us?
The switch uplink port (to brain) has the profile "All", correct?
In reply to Hank21:
I do not have Sonos speakers, but I do have a lot of devices and other networks like Z-Wave and Zigbee. I initially thought the Zigbee network could be an issue because it's relatively new and, IIRC, I've read that it can interfere with 2.4 GHz Wi-Fi. The first thing I did was to disconnect the Zigbee controller (Home Assistant SkyConnect), but it didn't make a difference.
I have not disabled LAG/LACP yet. After physically disconnecting all other Ethernet cables from switch-core, I disconnected one of the LAG Ethernet cables, waited 20m, and still saw the disconnects/reconnects. I then reconnected that cable and disconnected the cable for the second LAG port, and still observed the problem. Tomorrow I will disable LAG/LACP completely, go to a single normal Ethernet connection, and see how that goes.
The switch uplink LAG port does have the profile "All".
For the VLANs on brain, I'm not exactly sure what info is the most relevant, but I'd be happy to share more. I brought the link down on all of the VLAN interfaces on brain about 30m ago, but that doesn't seem to be helping either.
root@brain ~
# ip addr show dev bond0.20 ; ip addr show dev bond0.30 ; ip addr show dev bond0.40 !10048
9: bond0.20@bond0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether ba:92:f7:ae:50:d1 brd ff:ff:ff:ff:ff:ff
    inet 10.10.20.1/24 brd 10.10.20.255 scope global bond0.20
       valid_lft forever preferred_lft forever
10: bond0.30@bond0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether ba:92:f7:ae:50:d1 brd ff:ff:ff:ff:ff:ff
    inet 10.10.30.1/24 brd 10.10.30.255 scope global bond0.30
       valid_lft forever preferred_lft forever
11: bond0.40@bond0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether ba:92:f7:ae:50:d1 brd ff:ff:ff:ff:ff:ff
    inet 10.10.40.1/24 brd 10.10.40.255 scope global bond0.40
       valid_lft forever preferred_lft forever
Note that with the VLAN interfaces down my routing table looks like this:
root@brain ~
# route -n !10049
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         xx.xx.xx.xx     0.0.0.0         UG    100    0        0 enp15s0
xx.xx.xx.xx     0.0.0.0         255.255.255.0   U     100    0        0 enp15s0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.0.0     192.168.1.1     255.255.0.0     UG    425    0        0 br0
192.168.1.0     0.0.0.0         255.255.255.0   U     425    0        0 br0
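Since the LAG between brain and switch-core keeps coming up in this thread, the Linux side of the bond is worth a look too. The kernel bonding driver exposes per-bond status in `/proc/net/bonding/<bond>`, including the bonding mode and, per member, the MII status, link failure count and (in 802.3ad mode) aggregator ID. Below is a minimal Python sketch that summarises that file. The `SAMPLE` text and the interface names `enp1s0f0`/`enp1s0f1` are invented for illustration and are not from brain, the helper names are hypothetical, and the 802.3ad check assumes the switch side of the LAG is configured for LACP.

```python
from pathlib import Path

# Invented sample in the style of /proc/net/bonding/bond0 (NOT from brain).
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

Slave Interface: enp1s0f0
MII Status: up
Link Failure Count: 0
Aggregator ID: 1

Slave Interface: enp1s0f1
MII Status: up
Link Failure Count: 7
Aggregator ID: 2
"""

SLAVE_KEYS = ("MII Status", "Link Failure Count", "Aggregator ID")


def parse_bond_status(text):
    """Collect the bonding mode and a few per-member fields."""
    status = {"mode": None, "slaves": []}
    current = None  # member section currently being read, if any
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Slave Interface":
            current = {"name": value}
            status["slaves"].append(current)
        elif current is not None and key in SLAVE_KEYS:
            current[key] = value
    return status


def lag_warnings(status):
    """Return human-readable warnings about the bond."""
    warnings = []
    mode = status["mode"]
    if mode is None or "802.3ad" not in mode:
        # Only a concern if the switch side of the LAG uses LACP.
        warnings.append(f"bond mode is {mode!r}, not 802.3ad (LACP)")
    aggregators = {s["Aggregator ID"] for s in status["slaves"] if "Aggregator ID" in s}
    if len(aggregators) > 1:
        warnings.append(f"members are in different aggregators: {sorted(aggregators)}")
    for s in status["slaves"]:
        if s.get("MII Status") != "up":
            warnings.append(f"{s['name']}: MII status is {s.get('MII Status')!r}")
        if int(s.get("Link Failure Count", "0")) > 0:
            warnings.append(f"{s['name']}: link failure count is {s['Link Failure Count']}")
    return warnings


if __name__ == "__main__":
    path = Path("/proc/net/bonding/bond0")
    text = path.read_text() if path.exists() else SAMPLE
    for warning in lag_warnings(parse_bond_status(text)):
        print(warning)
```

Run on brain itself, it reads the live file. Anywhere else it falls back to the invented sample. Members sitting in different aggregators, or a link failure count that keeps climbing, would point at the bond rather than at the VLAN interfaces.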
There are a few things I'm a little unsure of.
1)
In Omada -> Settings -> Wired Networks -> LAN, I have the following settings. Is it correct for the VLAN "Purpose" to be VLAN and not Interface? I'm not doing any routing on the switches, so I think this is correct: the "Interface" settings want subnet/gateway information, which I think is only relevant if I'm using an Omada gateway/router. I did try both "Purposes" and neither has resolved my issue.
2)
In Insights -> Routing Table -> Switch
I am not sure why there are routes with next hop 192.168.0.1. There isn't anything at that IP, and I don't know what is putting those next hops into the table. My current thinking is that this isn't a problem, because I'm not doing routing on these switches and so this routing table is irrelevant, but I don't understand why there's a routing table at all.
Also, the routing table changes every few minutes: some of the routes for some of the switches disappear and then reappear. In the screenshots below you can see the routes for switch-familyroom disappear and reappear. At this point I'm not sure whether it's always switch-familyroom or whether different switches' routing entries are being removed and added. As stated above, I'm unsure whether this is related, as I'm not actually doing any routing on the switches; that's all done on brain.
Hi @BubbaBubba
To better assist you, I've created a support ticket via your registered email address, and escalated it to our support engineer to look into the issue. The ticket ID is TKID230327125, please check your email box and ensure the support email is well received. Thanks!