Two EAP245 (v1 + v3), Omada controller, different behavior

This thread has been locked for further replies. You can start a new thread to share your ideas or ask questions.

2020-07-02 23:49:10
Model: EAP245  
Hardware Version: V3
Firmware Version: 2.4.0 Build 20200117 Rel. 39932

Hello,

 

I have two EAP245 APs, one old model v1 (firmware 1.4.0 Build 20180323 Rel. 32551) and one recently bought v3 (2.4.0 Build 20200117 Rel. 39932). Configured using Omada controller 3.2.10. I have two VLANs (and two SSIDs) configured with the ids of 1 and 7.

 

I have a Netgear GS108PEv3 managed switch, which I installed when I added the new AP (the v3 one). The switch is also running the latest firmware.

 

Shortly after configuring everything, I noticed that I could no longer access the web UI of the switch from my laptop. Odd. I restarted the switch while moving around with the laptop, and it started working again, for a while. This led me down the wrong path in my investigation. Long story short, I finally found that it is not the switch: it depends on which of these two APs I am connected to!

 

It turned out that if I am connected to the old v1, everything works smoothly and I can access the switch web UI. Once my laptop roams to the v3, I lose access: I cannot ping the switch and cannot browse to it. However, everything else keeps working at the same time; I can connect to any device in the network or outside. Any but the switch itself!

 

Finally I got some time and did the ultimate test. I plugged only the v1 AP into the switch and confirmed that I can access the UI and ping the switch. Then I unplugged the v1 AP and plugged the v3 into the SAME port. No access. Again, everything else works; I can access any device on the network but that switch. So I am convinced that it is neither the switch nor its configuration. It also cannot be the AP configuration, since I manage both via the Omada controller. Also, the network is rather simple. Oh, I have also tried plugging my laptop directly into the switch with a cable: I can access the UI of the switch without any issues.

 

The same behavior shows up on all wireless devices, so it is clearly not a client problem either.

 

Just to add some details to it. The APs are configured with two SSIDs and two VLANs, 1 and 7. 1 is the main VLAN.

 

APs receive the IP addresses via DHCP. One is 192.168.8.103, another one is 192.168.8.105. The switch is 192.168.8.104.
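As a quick sanity check, the sketch below confirms that the three addresses sit in one subnet, so traffic between a wireless client and the switch stays on layer 2 and never passes a router. It assumes the 192.168.8.0/24 network mentioned later in the thread; the names `ap_a` and `ap_b` are placeholders, since the post does not say which AP has which address.

```python
import ipaddress

# Assumption: the /24 prefix comes from the 192.168.8.0/24 network
# mentioned later in the thread.
network = ipaddress.ip_network("192.168.8.0/24")

# Placeholder names: the post does not say which AP holds which address.
hosts = {
    "ap_a": ipaddress.ip_address("192.168.8.103"),
    "ap_b": ipaddress.ip_address("192.168.8.105"),
    "switch": ipaddress.ip_address("192.168.8.104"),
}

# True when every device is inside the same subnet.
same_subnet = all(addr in network for addr in hosts.values())
print(same_subnet)
```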

 

Netgear switch port configuration (concentrating on one port - all my last tests were done using that particular port):

- 802.1Q mode is used (not port-based)

- APs are connected to port #1

- port #1 has VLANs 1 and 7 enabled

- on port #1 VLANs 1 and 7 both use tagged mode

- PVID 1 is configured for all ports by default

 

 

I am really confused. Given the kind of test I have done, I have eliminated all possibilities except something being different in the traffic from the EAP245 v3. And thinking about what can be different, the only word that comes to my mind is VLAN. But what can be different in the traffic sent by ONE OF THESE TWO APs that makes it impossible to connect to the UI (so, essentially, to the IP stack of the switch)?

 

The only test I have left before I start plugging things into the computer and capturing traffic is trying a 3rd AP. I have just received an EAP225-Outdoor, and my plan was to plug it into the same switch and have 3 APs.

#1
Re:Two EAP245 (v1 + v3), Omada controller, different behavior
2020-07-03 00:58:32 - last edited 2020-07-03 00:58:51

@ngrigoriev,

 

unfortunately you did not describe which VLAN/SSID you connect to when doing the test and what frame types this VLAN uses.

 

I guess the different behavior comes from a bug in older firmware, which leaked tagged frames to an SSID not assigned to a VLAN. There were several reports claiming that after this bug was fixed, their VLAN network no longer worked as it did before the update.

 

In other words: if your VLAN setup depends on leaking frames from a VLAN to an SSID not assigned to this VLAN, then it won't work with the latest firmware, which fixed this leakage starting with version 2.3.0 of 2019-07-31 for EAP245 V3. From the release notes:

 

 

Since EAP245 V1 is not supported anymore, its firmware has not been fixed.

 

It's just a guess (since your post lacks important information about the tagging of VLANs and the SSID mapping), but at least you now know a difference in behavior between the two hardware versions.

 

The above bug fix affects all networks which use both untagged and tagged frames on the trunk to the EAP and utilize the »Default VLAN 1« to process untagged frames. It does not affect networks which use only tagged frames on a trunk, so a work-around is possible even for EAP245 V1.

༺ 0100 1101 0010 10ཏ1 0010 0110 1010 1110 ༻
#2
Re:Two EAP245 (v1 + v3), Omada controller, different behavior
2020-07-03 01:49:39

@R1D2 

 

 

The WiFi devices I test from connect to the SSID that is associated with VLAN 1. And, as I said, everything works well with the _old_ v1 AP, but not with the new v3 one. So, indeed, I saw these fixes in the change log for the recent firmware and, in fact, I was thinking that they are probably to blame for this. Or other bugs introduced and not yet fixed between the two firmware versions, the latest for v1 and the latest for v3.

 

Another SSID (associated with VLAN 7) is not really in question - it is used by some IoT stuff, has completely different network etc.

 

So, to make my description simple, I connect to the SSID that is associated with VLAN 1 (and get the IP address from the same 192.168.8.0/24 network). In Omada I have "Wireless VLAN: [x] Enable" and "Wireless VLAN ID: 1" for this WLAN. So my outgoing traffic should be tagged with VLAN 1 by either EAP and then arrive at the switch port that is configured as "tagged" for VLAN 1. The response should go back through the same port of the switch. So, basically, in this scenario, we have only two devices (switch + new v3 AP) that create a non-working configuration.

 

I could not find any information about how exactly the IP stack of the managed switch is "connected" to the switch itself. There are no settings for this. So, the tagged frame arrives at the switch, and the Ethernet destination is the MAC address of the switch. Does it get forwarded to an internal "port" of some sort? And when the reply is sent, it goes back to the MAC address of my wireless device. The switch surely won't tag it with VLAN 1 at the source. But I assume that when it arrives at port #1 to be sent back to the AP, it would either have a VLAN 1 tag or, if not, the PVID would be used (native VLAN) and a VLAN 1 tag would be assigned to the untagged frame. So, either way, the frame sent back would be tagged with VLAN 1.
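A minimal sketch of how an 802.1Q port usually treats frames, to make the assumption above checkable. The `Port` class and its field names are hypothetical and not taken from the Netgear firmware, and the model assumes ingress filtering is enabled. In this model the PVID only classifies untagged frames on ingress; whether a frame leaves tagged depends on the port's membership type for that VLAN.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Port:
    """One switch port in 802.1Q mode (hypothetical model, not Netgear's API)."""
    pvid: int
    tagged: Set[int] = field(default_factory=set)    # VLANs sent with a tag
    untagged: Set[int] = field(default_factory=set)  # VLANs sent without a tag

    def ingress(self, tag: Optional[int]) -> Optional[int]:
        """Return the VLAN a received frame is classified into, or None if dropped."""
        if tag is None:
            return self.pvid  # the PVID applies to untagged frames only
        # Assumption: ingress filtering drops tags the port is not a member of.
        return tag if tag in self.tagged | self.untagged else None

    def egress(self, vid: int) -> Optional[str]:
        """Return how a frame of VLAN `vid` leaves the port, or None if not a member."""
        if vid in self.tagged:
            return "tagged"
        if vid in self.untagged:
            return "untagged"
        return None


# Port #1 as described in the first post: tagged member of VLANs 1 and 7, PVID 1.
port1_tagged = Port(pvid=1, tagged={1, 7})
# Alternative: untagged member of VLAN 1, tagged member of VLAN 7, PVID 1.
port1_untagged = Port(pvid=1, tagged={7}, untagged={1})

print(port1_tagged.ingress(1), port1_tagged.egress(1))
print(port1_untagged.ingress(None), port1_untagged.egress(1))
```

With the same PVID of 1, the two ports send VLAN 1 frames differently, so the PVID alone does not decide whether the reply to the AP carries a tag.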

 

#3
Re:Two EAP245 (v1 + v3), Omada controller, different behavior
2020-07-03 03:57:33

ngrigoriev wrote

 Oh, I have also tried plugging my laptop directly into the switch with a cable: I can access the UI of the switch without any issues.

@ngrigoriev 

Hi, but I am not sure whether port 1 of the switch is tagged or not, and you didn't say whether the laptop is configured for VLAN 1.

 

Regarding your test with the laptop plugged directly into port 1 of the switch: if port 1 is a tagged member of VLAN 1, then packets leaving port 1 would be tagged with VLAN 1.

 

As we know, an ordinary laptop is unable to handle tagged packets. If the laptop is not specially configured with VLAN1, but you can access the UI of the switch, then it's likely that the Port1 is not tagged.
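For reference, a tagged frame differs from an untagged one only by a 4-byte 802.1Q header (TPID 0x8100 plus a 16-bit TCI) inserted after the source MAC address. A minimal sketch of telling the two apart; the function name `vlan_id` and the sample frames are made up for illustration.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def vlan_id(frame: bytes):
    """Return the VLAN ID of an Ethernet frame, or None if it is untagged."""
    if len(frame) < 16:
        return None  # too short to hold two MAC addresses plus a tag
    # Bytes 12..15 follow the destination and source MAC addresses.
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None
    return tci & 0x0FFF  # the low 12 bits of the TCI are the VLAN ID


# Hypothetical frames: zeroed MAC addresses, IPv4 EtherType, no payload.
macs = bytes(12)
untagged = macs + struct.pack("!H", 0x0800)
tagged = macs + struct.pack("!HHH", TPID_8021Q, 1, 0x0800)

print(vlan_id(untagged), vlan_id(tagged))
```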

 

I think the different behavior might be due to a bug that lets untagged packets be forwarded to SSIDs with different VLANs. It was fixed for V3 but not for V1.

 

#4
Re:Two EAP245 (v1 + v3), Omada controller, different behavior
2020-07-03 08:57:53 - last edited 2020-07-04 01:15:59

@ngrigoriev,

 

if you use a common broadcast domain over more than one VLAN, you probably use an asymmetric VLAN setup.

There is nothing wrong with asymmetric VLAN setups, except that asymmetric VLANs can't work with VLAN-mapped Multi-SSIDs.

 

The bug in older firmwares was a VLAN leak, which magically made asymmetric VLANs work with VLAN-mapped Multi-SSIDs by accident, but which did screw up VLAN isolation in real (symmetric) VLAN setups with separate broadcast domains. Thus, it had to be fixed to make VLAN isolation work again.

 

I suggest either posting a network diagram with enough information to be able to help (IPs, switch port VLAN memberships, PVID settings, SSID to VLAN mappings, see diagram below) or comparing your VLAN setup with the setup required for Multi-SSIDs as shown in this diagram of a typical VLAN-mapped Multi-SSID network:

 

 

Notes:

  • You can use VLAN 1 instead of VLAN 200 for mgmt.
  • You can even use untagged ports for the mgmt VLAN, but not for VLAN-mapped SSIDs.
  • If you use untagged ports for the mgmt VLAN and want to reach this VLAN from an SSID, the SSID must not be assigned to a VLAN.
  • You can use VLAN 10 for mgmt to access it through SSID »Private«, but switch ports 1/0/3-1/0/4 need to be tagged members of VLAN 10.
  • You cannot use an asymmetric VLAN setup with VLAN-mapped SSIDs.

 

Just for the record: Laptops running a modern OS can indeed process tagged frames nowadays.

༺ 0100 1101 0010 10ཏ1 0010 0110 1010 1110 ༻
#5
Re:Two EAP245 (v1 + v3), Omada controller, different behavior
2020-07-04 19:28:47

Hello,

 

Here is a simple diagram to better explain the situation. By the way, I have added the 3rd AP (EAP225 outdoor) and it demonstrates exactly the same behavior as the EAP245 v3: I can't access the switch when connected to it.

 

#6
Re:Two EAP245 (v1 + v3), Omada controller, different behavior
2020-07-05 09:00:55

@ngrigoriev,

 

what's the »Management VLAN« setting in OC200? If unset, what's the PVID of ports 1/2 of the Netgear switch? What's the PVID of port 5?

 

༺ 0100 1101 0010 10ཏ1 0010 0110 1010 1110 ༻
#7
Re:Two EAP245 (v1 + v3), Omada controller, different behavior
2020-07-06 01:30:55

@R1D2 

 

I run open-source Omada, not OC200 device, but, I guess, that won't make any difference. Management VLAN is not enabled.

 

All PVIDs are set to 1 except port #7 (where it is set to 3).

#8
Re:Two EAP245 (v1 + v3), Omada controller, different behavior
2020-07-06 09:27:06 - last edited 2020-07-06 18:06:34

 

ngrigoriev wrote

I run open-source Omada, not OC200 device, but, I guess, that won't make any difference. Management VLAN is not enabled.

 

All PVIDs are set to 1 except port #7 (where it is set to 3). 

 

Ok, then there is something really weird either with your Netgear switch or with the topology shown in post #6.

 

Scenario #1:

 

These are the settings you described in post #6:

 

1) Management VLAN in SW controller is not enabled.

2) Traffic from controller uses VLAN 1 due to PVID=1 of port #5 (I assume this PVID, you still didn't tell us).

3) Traffic from the EAP itself is untagged due to setting 1).

4) Port #1 is tagged member of VLANs 1 & 7 according to your diagram. PVID is 1.

5) SSID1 is mapped to VLAN 1.

 

The following are the effects of those settings with the latest EAP firmware:

 

  • Traffic from the EAPs themselves (not from the SSIDs!) is untagged and gets assigned VLAN ID 1 on the switch due to the port's PVID=1.
  • Replies from the controller arrive at the EAP tagged with VLAN ID 1 and will be ignored by the EAP, since it only processes untagged frames.
  • No management is possible in the SW controller; the EAPs will become disconnected from the controller.
  • The switch is in VLAN 1. Traffic to/from SSID1 uses VLAN 1-tagged frames, so it is possible to communicate with the switch using a client associated with SSID1.

 

 

Scenario #2:

 

Let's assume the following settings.

 

1) Management VLAN in SW controller is not enabled.

2) Traffic from SW controller uses VLAN 1.

3) Traffic from the EAP itself is untagged due to setting 1).

4) Port #1 is untagged member of VLAN 1. PVID is 1.

5) SSID1 is mapped to VLAN 1.

 

 

These are the effects with the latest firmware:

 

  • Traffic from the EAPs themselves (not from the SSIDs!) is untagged and gets assigned VLAN ID 1 on the switch due to PVID=1.
  • Replies from the controller arrive at the EAP untagged and will be processed by the EAP.
  • Management of the EAPs is possible in SW controller, EAPs are connected to the controller.
  • The switch is in VLAN 1. Traffic from SSID1 uses VLAN 1 tagged by the EAP, but replies from the switch use untagged frames, so it is not possible to communicate with the switch using a client associated with SSID1.

 

Do you see what happens if the EAPs themselves use untagged VLAN 1 traffic and SSID1 uses tagged VLAN 1 traffic? It cannot work this way.

BTW: that's why I always tell people never to mix untagged and tagged traffic on trunks unless you really need this for legacy systems and know what you're doing.

 

Now, in older firmware the bug was that untagged traffic arriving at an EAP was sent to VLAN-mapped SSIDs by accident (not only to SSID1, but also to SSID2, VLAN 7). The effect is that replies from SW controller and the switch will reach SSID1. With older firmwares a setup of switch port #1 as untagged member of VLAN 1 while mapping an SSID to the same VLAN 1 did indeed work, but only because the EAP's firmware was buggy.
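The two scenarios and the old firmware bug can be condensed into a small model of what the EAP does with a frame arriving from the switch. This is only a sketch of the behavior as described in this post, not of the actual firmware; the function `eap_receivers` and its parameters are hypothetical.

```python
from typing import Dict, Optional, Set


def eap_receivers(tag: Optional[int], ssid_vlans: Dict[str, Optional[int]],
                  mgmt_vlan: Optional[int], leaky: bool) -> Set[str]:
    """Return who behind the EAP gets a frame arriving from the switch.

    tag:        VLAN ID of the frame on the wire, None for untagged.
    ssid_vlans: SSID name -> mapped VLAN ID (None = not mapped to a VLAN).
    mgmt_vlan:  Management VLAN of the EAP itself (None = not enabled).
    leaky:      True models the old firmware bug described in the thread.
    """
    receivers = set()
    # The EAP itself accepts only frames matching its management setting.
    if tag == mgmt_vlan:
        receivers.add("EAP")
    for ssid, vlan in ssid_vlans.items():
        # Fixed firmware: strict match. Leaky firmware: untagged goes everywhere.
        if vlan == tag or (leaky and tag is None):
            receivers.add(ssid)
    return receivers


ssids = {"SSID1": 1, "SSID2": 7}

# Scenario #1: port #1 is a tagged member of VLAN 1, replies arrive tagged.
print(sorted(eap_receivers(1, ssids, None, leaky=False)))     # ['SSID1']
# Scenario #2: port #1 is an untagged member of VLAN 1, replies arrive untagged.
print(sorted(eap_receivers(None, ssids, None, leaky=False)))  # ['EAP']
# Scenario #2 with old firmware: untagged replies leak to both SSIDs.
print(sorted(eap_receivers(None, ssids, None, leaky=True)))   # ['EAP', 'SSID1', 'SSID2']
```

In this model no single frame type reaches both the EAP and SSID1 on fixed firmware while the Management VLAN is disabled and SSID1 is mapped to VLAN 1, which matches the conflict described above.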

 

Thus, VLAN setups which »worked in the past« did rely on a bug and therefore have been buggy, too.

 

 

Solution

 

The solution is very easy: Fix your VLAN setup.


Either assign both SSID1 and the Management VLAN to VLAN 1 in the controller if you want to send tagged frames to the EAPs, or don't assign SSID1 and the Management VLAN to any VLAN on the EAP (let the switch do the VLAN 1 assignment) if you want to send untagged frames to the EAPs.
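The rule above can be written as a small consistency check. This is only an illustrative sketch; `setup_is_consistent` and its parameters are hypothetical names and not part of the Omada controller.

```python
from typing import Optional


def setup_is_consistent(port_mode: str, ssid_vlan: Optional[int],
                        mgmt_vlan: Optional[int], vlan: int = 1) -> bool:
    """Check that the switch port, SSID and Management VLAN agree on tagging.

    port_mode: "tagged" or "untagged" membership of the switch port in `vlan`.
    ssid_vlan: VLAN the SSID is mapped to (None = not mapped).
    mgmt_vlan: Management VLAN of the EAP (None = not enabled).
    """
    if port_mode == "tagged":
        # Tagged trunk: both the SSID and the EAP itself must use the tag.
        return ssid_vlan == vlan and mgmt_vlan == vlan
    if port_mode == "untagged":
        # Untagged link: neither may be assigned; the switch's PVID does the job.
        return ssid_vlan is None and mgmt_vlan is None
    raise ValueError(f"unknown port mode: {port_mode!r}")


print(setup_is_consistent("tagged", 1, None))       # Scenario #1
print(setup_is_consistent("untagged", 1, None))     # Scenario #2
print(setup_is_consistent("tagged", 1, 1))          # all tagged
print(setup_is_consistent("untagged", None, None))  # all untagged
```

The first two calls correspond to the mixed setups from the scenarios and return `False`; the last two correspond to the two suggested fixes and return `True`.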

 

Bonus tip: When setting the Management VLAN to 1 in the Omada controller, make sure port #1 is an untagged member of VLAN 1. The EAPs will be configured and will then lose contact with the controller. Now set port #1 as a tagged member of VLAN 1. The EAPs will reconnect automatically.

 

 

If this solution does not work for you (because the switch probably behaves in a way that does not conform to 802.1Q), I can't help any further; I have to return to work and set up a VLAN-based EAP network for one of our customers today. As always, I will use the VLAN setup shown in my post #5, that is, no untagged traffic to the EAPs at all. This setup will always work with both old (buggy) and latest (fixed) EAP firmware.

 

༺ 0100 1101 0010 10ཏ1 0010 0110 1010 1110 ༻
#9