EAP653 frequently getting stuck re-adopting
After moving into my own home, a three-storey building, I have bought into the Omada ecosystem with the following TP-Link hardware:
- ER605 v1.0 @ firmware 1.3.1
- SG2218 v1.20 @ firmware 1.20.0
- TL-SG2210MP v3.0 @ firmware 3.0.6
- 3x EAP653(EU) v1.0 @ firmware 1.0.9
The TL-SG2210MP plugs directly into the ER605. The three EAP653 are connected with Cat7a cables and powered via PoE by the TL-SG2210MP. The SG2218 is connected to the TL-SG2210MP through their SFP slots.
All of these are coordinated using the Omada controller software (5.13.22) running in an LXC container on a Proxmox virtualisation server plugged into the ER605. Initial setup and provisioning worked like a charm and I've set up a couple of VLANs, though I have not yet configured any firewall rules. All TP-Link devices are currently on the default "LAN" network at 192.168.0.X and get their IP addresses assigned through DHCP.
The setup worked fine for a couple of days, but for about a week now I have been constantly experiencing issues with the EAP653 access points. An access point becomes listed as "ADOPTING" in the controller web interface but never completes the adoption. The UI shows no further status change, so the AP stays stuck in the "ADOPTING" state. The TL-SG2210MP shows the affected port as "active", with a PoE power output roughly the same as that of the "CONNECTED" APs (about 6 watts). The log accessible through the controller web interface shows multiple alerts indicating that the AP "was isolated". I can get the AP unstuck by power cycling it, either by unplugging and re-inserting the network cable or by triggering the "PoE recovery" on the relevant port of the TL-SG2210MP.
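To pin down when exactly an AP drops out, I am considering running something like the following from another host on the LAN. It is only a minimal sketch: the AP names and IP addresses are hypothetical placeholders (mine come from DHCP), and the `ping` flags assume the Linux iputils version. It logs a timestamped line whenever an AP changes between reachable and unreachable, so the times can be compared with the "was isolated" alerts.

```python
import subprocess
import time
from datetime import datetime

# Hypothetical names and addresses: replace with the actual DHCP leases
# of the three EAP653.
ACCESS_POINTS = {
    "eap-floor0": "192.168.0.11",
    "eap-floor1": "192.168.0.12",
    "eap-floor2": "192.168.0.13",
}


def is_reachable(ip, timeout_s=2):
    """Send one ICMP echo request using the system ping (Linux iputils flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def state_changes(previous, current):
    """Return (name, "UP" or "DOWN") for every AP whose state differs from before."""
    changes = []
    for name, up in current.items():
        if previous.get(name) != up:
            changes.append((name, "UP" if up else "DOWN"))
    return changes


def monitor(interval_s=30, probe=is_reachable):
    """Poll all APs forever and print one line per state change."""
    previous = {}
    while True:
        current = {name: probe(ip) for name, ip in ACCESS_POINTS.items()}
        for name, state in state_changes(previous, current):
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {name} {state}")
        previous = current
        time.sleep(interval_s)
```

Calling `monitor()` starts the loop. If a stuck AP still answers pings, that would suggest the link itself is fine and the problem lies between the AP and the controller rather than in the cabling or PoE.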
The wireless network configuration is relatively simple: a single SSID on 2.4 and 5 GHz, secured using PPSK without RADIUS. I have configured three VLANs (each a /24 with its own DHCP), each with its own passphrase. All other configuration options are left at their defaults.
All three EAP653 are affected although they typically don't fail at the same time. The failure rate is about once a day, meaning that currently I have to power cycle at least one EAP653 per day.
I don't really know where to go from here. The Omada controller doesn't offer much in terms of observability or logging, which makes debugging nearly impossible for a newcomer. Can anyone more experienced provide any hints on what to check?
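One check I can think of myself is whether the controller's management ports stay reachable from the LAN while an AP is stuck. Below is a rough sketch of that. The controller IP is a hypothetical placeholder, and the TCP port list (29811 to 29814) is my assumption based on my reading of TP-Link's port documentation for controller 5.x, so please correct me if those are the wrong ports.

```python
import socket

# Hypothetical address of the LXC container running the Omada controller.
CONTROLLER_IP = "192.168.0.50"

# Assumption: TCP ports used for AP management/adoption by controller 5.x.
# Verify against TP-Link's port list for the installed version.
CANDIDATE_TCP_PORTS = [29811, 29812, 29813, 29814]


def tcp_port_open(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def check_ports(host, ports, probe=tcp_port_open):
    """Map each port to whether the probe could connect to it."""
    return {port: probe(host, port) for port in ports}
```

Running `print(check_ports(CONTROLLER_IP, CANDIDATE_TCP_PORTS))` from a wired client while an AP is stuck in "ADOPTING" should show whether the controller is still listening at that moment.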