DNS Resolution issue with my domain name on new Omada environment

3 weeks ago - last edited 3 weeks ago
Model: ER8411  
Hardware Version: V1
Firmware Version: 1.3.6

I’m running into a very odd DNS resolution issue after migrating from a Ubiquiti (UBNT) environment to TP-Link Omada. Everything worked perfectly before, so this appears to be Omada-related. I’m hoping someone here has run into something similar.

 

Environment

  • Omada router: ER8411 v1.0 (1.3.6)
  • Controller: OC200 v2.0 (2.24.6)
  • Switch: SG2210P v5.20 (5.20.20)
  • Access points: EAP772 v2.0 (1.3.14)
  • Pi-hole (DNS): 10.0.5.20
  • Internal web server (Apache reverse proxy with ACLs): 10.1.5.3
  • Multiple VLANs (management, main, server, etc.)

 

DNS Configuration

  • DNS Proxy: disabled
  • Omada local DNS entry:
    mydomain[.]com → 10.1.5.3
  • Pi-hole also has:
    mydomain[.]com → 10.1.5.3
  • DHCP:
    • Some VLANs use Pi-hole (10.0.5.20)
    • Others use default gateway DNS
  • Behavior is the same regardless of which DNS server is assigned
  • Querying either the gateway or Pi-hole produces the same incorrect result
  • OpenVPN is configured on the Omada router, with DNS set to Pi-hole

 

Goal

I want split DNS behavior:

  • External:
    mydomain[.]com → public IP (working correctly)
  • Internal:
    mydomain[.]com → 10.1.5.3 (NOT working)

 

Current Behavior

  • Internal DNS queries for mydomain[.]com return the IP of the Omada device the client is connected through.

Examples:

  • Connected to AP1 (10.0.0.10) → DNS returns 10.0.0.10
  • Connected to AP2 (10.0.0.11) → DNS returns 10.0.0.11
  • Wired to switch (10.0.0.3) → DNS returns 10.0.0.3

Important notes:

  • The DNS server being queried is still Pi-hole or the gateway, depending on which VLAN the device is on
  • If I query Pi-hole locally from itself, it returns the correct IP (10.1.5.3), so Pi-hole appears to be functioning correctly

Additional observations:

  • OpenVPN clients resolve perfectly (must be connected from outside the network obviously)
  • External resolution works correctly (public IP returned)
  • All internal and external subdomains resolve correctly (e.g. homeassistant[.]mydomain[.]com)
  • Only the root domain (mydomain[.]com) is affected
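
The behavior described above can be turned into a small check that labels whatever address a lookup returns. This is only a sketch: the expected internal address (10.1.5.3) comes from this post, but the /24 prefix on the management network and the function and label names are assumptions made for illustration.

```python
import ipaddress

# Address the root domain should resolve to internally (from the post).
EXPECTED_INTERNAL = ipaddress.ip_address("10.1.5.3")
# Network holding the Omada devices (10.0.0.x); the /24 prefix is an assumption.
MGMT_NET = ipaddress.ip_network("10.0.0.0/24")


def classify_answer(addr: str) -> str:
    """Label a DNS answer for the root domain (hypothetical helper)."""
    ip = ipaddress.ip_address(addr)
    if ip == EXPECTED_INTERNAL:
        return "ok-internal"          # split DNS is working
    if ip in MGMT_NET:
        return "omada-device"         # the symptom: AP/switch/gateway IP returned
    if ip.is_private:
        return "unexpected-private"   # some other internal address
    return "public"                   # external view of the domain
```

Feeding it the answers from the examples (10.0.0.10, 10.0.0.11, 10.0.0.3) would label all three as "omada-device", which makes it easy to log results per client and per VLAN while testing.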

 

What I’ve Tried

  • Switching DHCP DNS between Pi-hole and gateway (default setting)
  • Adding and removing local DNS entries in Omada
  • Letting Pi-hole handle DNS entirely everywhere or not at all anywhere
  • Enabling and disabling DNS proxy override (proxying to the Pi-hole when on)
  • Removing all ACLs (currently none configured until this issue is sorted out)
  • Testing with no local DNS / DNS proxy / DHCP DNS settings at all

 

Current Theory

It seems like Omada (likely the APs or switch) is intercepting DNS queries and somehow rewriting responses for the root domain only. I cannot explain why it would behave this way or why only the base domain is affected.

 

Questions

  • Has anyone seen Omada return device IPs in DNS responses like this?
  • Is there any kind of DNS interception, captive portal, or “optimization” feature that could cause this?
  • Why would this affect only the root domain and not subdomains?
  • How can I resolve this issue? Our current workaround is super ugly (disconnect phone from wireless, turn on hotspot, connect laptop to hotspot, and VPN into the local network with OpenVPN).

 

Any guidance would be GREATLY appreciated…

1 Accepted Solution
Re:DNS Resolution issue with my domain name on new Omada environment -Solution
3 weeks ago - last edited 3 weeks ago

UPDATE: I believe this is actually an issue with the LAN DNS feature. A new controller version was released yesterday; I upgraded before I found the resolution below, and the upgrade did not fix the problem. I can still replicate it on the latest controller version. TP-Link responded to my ticket asking for controller access, but the invite has gone several days without being accepted, so I am pulling it rather than leaving it pending indefinitely and unmonitored.

 

I had an entry set up under LAN DNS, of type IP, that mapped my root domain to my internal web server IP (mydomain[.]example -> 10.1.5.3). When I discovered that it was not being looked up properly and I was seeing the issue from my original post, I disabled that rule but did not remove it. I turned off DNS Cache, DNS Proxy, and mDNS, and removed all of the DNS entries under my VLAN DHCP configurations so everything was back to the default configuration. I also disabled ALL of my ACLs in the name of debugging. I still saw the issue.

 

The fix *for now* was to delete all of the LAN DNS entries altogether. If I add one back, whether it is enabled or disabled, the bad behavior returns. Delete it and let my caches expire? The root domain resolves to my public IP. Add the DNS server back into the DHCP options and everything works as it should internally.

 

To recap the issue:

  • VPN into my network from outside: Everything worked great
  • Access public resources from outside my network: Everything worked great
  • Access public resources from inside my network: Failed to properly resolve DNS for my root domain ONLY (mydomain[.]example).

 

The failure presented itself as:

  • Resolving any subdomain of my root domain, such as sub.mydomain[.]example, would work fine.
  • Resolving the root domain would return the IP of the first-hop network device (connected to wireless, it would return the AP you were connected to; wired into a switch, it would return the IP of the switch; plugged directly into the router, it would return the IP of the gateway).
  • If that first-hop device was not an Omada-branded device, DNS resolution for the root domain would just time out.

 

Further debug information:

  • Capturing traffic at a client would show a totally normal DNS request and response, supposedly coming from the server that was expected to be replying (10.0.5.2 in this case).
  • Capturing traffic at the DNS server would show absolutely no traffic from the client for the root domain queries, while every other query would make it there. This definitely implied query interception was at play.
  • Checking the DNS server logs would show no logs related to the root domain lookup even though the client got a response (again suggests an interception issue).
  • Forcing the server IP using nslookup or dig would not change this behavior.
  • Forcing DNS over TCP at the client made everything work perfectly (again, this implies interception is at play, since intercepting TCP properly is much harder to implement).
  • Resolving the root domain on the DNS server itself was returning the expected results, so it was DEFINITELY not the DNS server.
  • Performing a DNS lookup test from the OC200 "Network Tools" toolkit would also show this behavior where APs would show the switch IP they were connected to, the switches would show the gateway IP, etc. (This implies the issue was at the gateway itself)
  • Deleting all of the LAN DNS entries would resolve the DNS interception issue. Adding it back would immediately cause the issue to return, even if the rule was disabled.
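
For anyone who wants to repeat the UDP versus TCP comparison from the list above without installing extra tools, here is a minimal sketch using only the Python standard library. It builds a plain A-record query, sends it once over UDP and once over TCP to the same server, and returns both answer lists. The function names are made up for illustration, and the parser only handles simple A-record answers, so treat it as a debugging aid rather than a full resolver.

```python
import socket
import struct


def build_query(name: str, qid: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def _skip_name(msg: bytes, pos: int) -> int:
    """Return the offset just past a (possibly compressed) name."""
    while True:
        length = msg[pos]
        if length == 0:
            return pos + 1
        if length & 0xC0 == 0xC0:  # compression pointer occupies two bytes
            return pos + 2
        pos += 1 + length


def parse_a_records(msg: bytes) -> list[str]:
    """Extract the IPv4 addresses from the answer section of a response."""
    qdcount, ancount = struct.unpack("!HH", msg[4:8])
    pos = 12
    for _ in range(qdcount):
        pos = _skip_name(msg, pos) + 4  # skip QTYPE and QCLASS
    addrs = []
    for _ in range(ancount):
        pos = _skip_name(msg, pos)
        rtype, rclass, _ttl, rdlen = struct.unpack("!HHIH", msg[pos:pos + 10])
        pos += 10
        if rtype == 1 and rclass == 1 and rdlen == 4:
            addrs.append(socket.inet_ntoa(msg[pos:pos + 4]))
        pos += rdlen
    return addrs


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a TCP socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed early")
        buf += chunk
    return buf


def query_udp(server: str, name: str, timeout: float = 3.0) -> list[str]:
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(build_query(name), (server, 53))
        data, _ = s.recvfrom(4096)
    return parse_a_records(data)


def query_tcp(server: str, name: str, timeout: float = 3.0) -> list[str]:
    query = build_query(name)
    with socket.create_connection((server, 53), timeout=timeout) as s:
        # DNS over TCP prefixes each message with a two-byte length.
        s.sendall(struct.pack("!H", len(query)) + query)
        length = struct.unpack("!H", _recv_exact(s, 2))[0]
        data = _recv_exact(s, length)
    return parse_a_records(data)


def compare(server: str, name: str) -> dict:
    """Query the same server over both transports and return both answers."""
    return {"udp": query_udp(server, name), "tcp": query_tcp(server, name)}
```

Calling `compare()` with the DNS server address and the root domain, then again with a subdomain, should show whether the two transports disagree for the root domain only, which is the pattern described in the list above.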
3 Reply
Re:DNS Resolution issue with my domain name on new Omada environment
3 weeks ago

Hi,  @hobbymaster001 
Thank you for sharing your issue on our business forum. 

We would like to guide you through the following troubleshooting steps to help locate this problem:

1. Please update the firmware of all your devices to the latest official stable version, including your OC200 controller, ER8411 router, switch and EAP access points.

2. After the update, please double-check your Omada controller settings to confirm that both mDNS and DNS Proxy are disabled.

3. To narrow down the origin of the issue, please connect a wired client directly to a LAN port on your ER8411, bypassing all intermediate switches and access points in your current setup, then test the DNS resolution for your domain. This step will help us confirm whether the problem originates from the ER8411 itself or downstream networking devices. If DNS resolution works correctly when connected directly to the ER8411, you can then add your switch and access points back to the connection one by one and retest to locate the problematic device.

4. For further in-depth analysis, we recommend a packet capture test on the client side:
Install Wireshark on a client that encounters the issue. Start the packet capture session, then run the nslookup mydomain[.]com command. Filter the captured traffic by the DNS protocol to check the details of the DNS response.

 

Re:DNS Resolution issue with my domain name on new Omada environment
3 weeks ago

  @Jeremy_12 Thank you for your reply!

 

1. This was one of the first things I did as a part of my troubleshooting. All of the devices are current and I just double checked, and no newer firmware is available.

 

2. I had mDNS rules in place and enabled. DNS proxy was already disabled. I have made sure both are disabled and have rebooted the router and my test client to ensure everything is fresh.

 

3. I have now connected my computer directly to the ER8411, and I am seeing the same strange behavior: if I query mydomain[.]com, I am given my router's IP instead of the one configured in the local DNS or on the Pi-hole. I tried the query from both a VLAN whose DHCP DNS option is set to automatic and a VLAN that has my Pi-hole specified. Both return the gateway address when directly connected to the gateway, yet the reply appears to come from the correct IP of the DNS server that was supposedly queried.

 

My IP address in the below example was 10.0.1.14, so this is from my main user vlan network. The 10.0.0.1 would be the default vlan gateway address, which seems odd. I would have expected to see it return the address of the gateway for the main vlan (10.0.1.1).

 

C:\Users\Me>nslookup mydomain[.]com
Server:  skynet-dns
Address:  10.0.5.2

Non-authoritative answer:
Name:    mydomain[.]com
Address:  10.0.0.1

C:\Users\Me>nslookup mydomain[.]com 10.0.1.1
Server:  UnKnown
Address:  10.0.1.1

Non-authoritative answer:
Name:    mydomain[.]com
Address:  10.0.0.1

4. Screenshot of the full DNS query above where I didn't specify an explicit server to query.

Thank you again for the assistance,

Braden
