Any Performance and Footprint Tuning options for Omada SDN Controller V5 on Linux

a week ago - last edited Monday
Model: OC200  
Hardware Version: V5
Firmware Version: 5.15.24.19


We have been running 72 APs, 2 PoE switches and an ER8411 router under an Omada Software Controller on custom-built, low-power 1U embedded Linux hardware. The board supports at most 4 GB of RAM, and that much is installed. The CPU is an Intel Atom D2500 (2C/2T, 1.866 GHz). Here is how the deployment looks under normal use:



 

You can see that 1.7G is used by Omada and 452 MB by MongoDB, which is used exclusively by Omada on this system. So the Omada footprint on Linux is 2.1 GB+. The other major services on the system are Pi-hole and Tailscale, but their contribution is smaller than that of the above two. Total memory usage (OS + other services + Omada) is 2.91 GB, of which 2.1 GB+ is Omada alone.
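For anyone who wants to reproduce these numbers without a screenshot, here is a minimal sketch that adds up resident memory from the text of /proc/<pid>/status. It assumes the figures quoted above are resident set sizes (VmRSS); the function names are made up for illustration.

```python
import re


def vmrss_kb(status_text: str) -> int:
    """Return the VmRSS value in kB from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        # Kernel threads, for example, have no VmRSS line.
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def combined_footprint_mb(status_texts) -> float:
    """Sum VmRSS over several processes and return the total in MB."""
    return sum(vmrss_kb(text) for text in status_texts) / 1024
```

To use it, read /proc/<pid>/status for the Omada java process and for mongod (the PIDs come from ps or top) and pass both texts to combined_footprint_mb.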

The general CPU usage of the device is like this:

 

It does not seem overloaded in normal operation, and the periodic spike in usage is probably Java GC or some other system GC kicking in, doing its job quickly and going away.
 

Our current configuration in omada.properties is as follows:


max.device=150

 

And the following has been added to control.sh:


-Xms1024m -Xmx1024m


The above changes did not impact the footprint at all.
 

Are these numbers reasonable and to be expected for such a system? Is there a way to reduce the memory footprint of Omada by tweaking some other startup parameter?

Also we observed that:

 

(1) Omada takes a long time to start. Once startup has completed, the Omada software's CPU usage stays at 100% for a good while (5+ minutes) and then falls to 5%. Is this behavior to be expected?

 

(2) Whenever a configuration change that affects all APs is made from the mobile app or from the omada.tplinkcloud.com web interface, the CPU momentarily shoots up to 100%, but only for a very short time. The problem is that afterwards a few APs start flapping between ADOPTING and DISCONNECTED. These APs have to be recovered either by Force Provisioning (sometimes even this does not work) or by rebooting them via "PoE recovery". This problem usually does not happen if the configuration is done directly from the web interface of the controller hardware (Controller-IP:8043).

So is there a safe and tested way to throttle the CPU usage of the Omada-related PIDs to, say, 60-70%, even if applying configuration and mobile app access become a bit slower?
 

1 Accepted Solution
Re:Any Performance and Footprint Tuning options for Omada SDN Controller V5 on Linux-Solution
Monday - last edited Monday

Hi  @APRC-P3-Tel 

 

Thanks for posting here.

 

Here are some optimization suggestions for your Omada controller setup:

1. Memory Optimization
- Tune JVM settings: Try reducing -Xms and -Xmx (e.g., -Xms768m -Xmx768m) to see if it stabilizes memory usage. High GC activity may be causing spikes.
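- Limit MongoDB memory: Set the WiredTiger cache size to 0.5 GB in mongod.conf (storage.wiredTiger.engineConfig.cacheSizeGB: 0.5; the command-line equivalent is --wiredTigerCacheSizeGB 0.5) to cap the MongoDB cache at 512 MB.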
- Disable non-essential services: If Pi-hole/Tailscale consume significant RAM, consider moving them to another device.

 

2. CPU Optimization
- Throttle CPU usage: Use cpulimit or systemd's CPUQuota=70% to prevent Omada from maxing out CPU during config pushes.
- Avoid cloud config when possible: Apply changes locally (IP:8043) to reduce CPU spikes from cloud sync.

 

3. Other Recommendations
- Hardware upgrade: The Atom D2500 (2C/2T) is underpowered for 150 devices. A 4C/8GB+ system would improve stability.
- Check logs: Review Omada logs (/opt/tplink/EAPController/logs) for GC errors or excessive queries.

 

Final note: Start with JVM/MongoDB tweaks and CPU limiting. If issues persist, a hardware upgrade may be necessary.

Re:Any Performance and Footprint Tuning options for Omada SDN Controller V5 on Linux
Tuesday

  @Vincent-TP 

Vincent-TP wrote

Hi  @APRC-P3-Tel 

 

Thanks for posting here.

 

Here are some optimization suggestions for your Omada controller setup:

1. Memory Optimization
- Tune JVM settings: Try reducing -Xms and -Xmx (e.g., -Xms768m -Xmx768m) to see if it stabilizes memory usage. High GC activity may be causing spikes.
 

Modified /opt/tplink/EAPController/bin/control.sh. The 768m setting suggested above did help cut the footprint by 100-200 MB. So I got a bit greedier and configured -Xms512m -Xmx512m, and that also seems to be working so far, with Omada's memory usage down to 1.3 GB from about 2.2 GB. My further questions:
 

(a) Is there a guideline or rule for setting these parameter values passed to Java? How do we know that what we are setting is good enough and will not create problems down the line?

 

(b) This control.sh file is part of the TP-Link Omada package. Will a new version overwrite this file? If so, will we have to remember to redo these changes manually, or can TP-Link provide a config file for this that is not overwritten by upgrades but only appended to or modified?

 


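- Limit MongoDB memory: Set the WiredTiger cache size to 0.5 GB in mongod.conf (storage.wiredTiger.engineConfig.cacheSizeGB: 0.5; the command-line equivalent is --wiredTigerCacheSizeGB 0.5) to cap the MongoDB cache at 512 MB.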
 

Modified /etc/mongod.conf and added this:

  engine: wiredTiger
  wiredTiger:
    engineConfig:
      cacheSizeGB: 0.5

It works, but I have observed no gains from this one so far. Maybe some swap usage is reduced. Again, I hope future upgrades of Omada will not overwrite this file.
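One possible reason for seeing no gain: the setting only caps the WiredTiger cache, and mongod was already at about 452 MB, below the new 0.5 GB cap. MongoDB documents its default cache size as the larger of 50% of (RAM - 1 GB) and 256 MB. The sketch below just evaluates that formula; the function name is made up, and it assumes this mongod instance actually reads /etc/mongod.conf.

```python
def default_wiredtiger_cache_gb(total_ram_gb: float) -> float:
    """Documented MongoDB default: max(50% of (RAM - 1 GB), 0.25 GB)."""
    return max(0.5 * (total_ram_gb - 1.0), 0.25)


# On the 4 GB board in this thread the default cap would be 1.5 GB,
# so cacheSizeGB: 0.5 only matters once the cache tries to grow past 0.5 GB.
print(default_wiredtiger_cache_gb(4))
```

The cache size is not the whole mongod process footprint (connections and other allocations come on top), so the resident size can still sit a little above the configured value.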

 


- Disable non-essential services: If Pi-hole/Tailscale consume significant RAM, consider moving them to another device.

They don't (Tailscale uses less than 100 MB of memory and Pi-hole less than 300 MB). I have one Linux server at the moment, and I am trying to run everything Linux-based on it as far as possible. If that is not feasible or causes operational issues, I will certainly move services. That would mean getting new hardware, installing Proxmox, running Omada and Pi-hole as separate VMs (if not separate containers), and removing Tailscale from the Omada hardware to run it on another machine as a subnet router.

The hardware I have at present is very old and primitive (Giada MI-NAS04 motherboard) and does not support virtualization. Its advantage, however, is very low power consumption (only 10 W). It mostly works very well for our workload, which is why we want to keep using it as long as we can. The idea of using low-power hardware came from the ARM-based specs of the OC200 v1.
 

 

2. CPU Optimization
- Throttle CPU usage: Use cpulimit or systemd's CPUQuota=70% to prevent Omada from maxing out CPU during config pushes.
- Avoid cloud config when possible: Apply changes locally (IP:8043) to reduce CPU spikes from cloud sync.

 

I am implementing the CPU throttling, but I saw the following status when I ran systemctl status tpeap:

 

Omada starts up and works normally on the system. Where could this be coming from?

 

3. Other Recommendations
- Hardware upgrade: The Atom D2500 (2C/2T) is underpowered for 150 devices. A 4C/8GB+ system would improve stability.
- Check logs: Review Omada logs (/opt/tplink/EAPController/logs) for GC errors or excessive queries.
 

Noted. We have an embedded industrial Mini-ITX board with a Celeron J3160 (Biostar J3160NH, 4C, 8 GB DDR3), which we may try next. It uses about 20 W at most. It would certainly be faster than this Atom CPU, and it supports virtualization.

For the logs, I will start a separate thread, as we have seen lots of Java exceptions in the startup.log and server.log files, both previously and now.
 

 

Final note: Start with JVM/MongoDB tweaks and CPU limiting. If issues persist, a hardware upgrade may be necessary.

 

Re:Any Performance and Footprint Tuning options for Omada SDN Controller V5 on Linux
3 hours ago - last edited 2 hours ago

The change to -Xms512m -Xmx512m made the controller slow to work with and unstable (it would go into some sort of dead loop at 100% CPU usage) when we used the mobile app heavily. So I increased it to -Xms768m -Xmx768m, which is very usable in my setup and perfectly stable under heavy mobile app use.
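A rough way to sanity-check a heap setting: -Xmx limits only the Java heap, while metaspace, thread stacks, the JIT code cache and direct buffers live outside it, so the process footprint will always be larger than -Xmx. The sketch below parses a JVM size value and shows how much of a measured resident size the heap limit does not explain. The helper names are made up, and the example figures assume the 1.7G reading earlier in the thread was taken with -Xmx1024m actually in effect.

```python
# Binary multipliers for the size suffixes accepted by HotSpot (case-insensitive).
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_jvm_size(value: str) -> int:
    """Convert a JVM size such as '768m' or '1g' to bytes (no suffix means bytes)."""
    value = value.strip().lower()
    if value and value[-1] in _UNITS:
        return int(value[:-1]) * _UNITS[value[-1]]
    return int(value)


def non_heap_mb(rss_mb: float, xmx: str) -> float:
    """Resident memory (MB) that the maximum heap size does not account for."""
    return rss_mb - parse_jvm_size(xmx) / 1024**2


# 1.7 GB resident with a 1024 MB heap limit leaves roughly 0.7 GB outside the heap.
print(round(non_heap_mb(1.7 * 1024, "1024m"), 1))
```

If the non-heap part stays roughly constant, each further cut to -Xmx comes straight out of the memory the controller has for its own objects, which would be consistent with 512m turning into a GC-bound loop while 768m stays stable.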

The 70% CPU limit and the conversion to the systemd init method also worked out. It does cap both runtime and startup CPU usage at 70% (my config). I do not see the mobile app become unresponsive or noticeably slower at this level, nor did startup take abnormally long compared to the previous default (100%). It is usable IMO, since in most cases we restart the controller (e.g., for OS updates/upgrades) in totally off-peak hours.
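One thing worth knowing when picking the number: systemd's CPUQuota= is expressed relative to a single CPU, so on a 2-core machine 70% means 0.7 of one core, i.e. 35% of the whole machine, and values above 100% are needed to allow more than one core. A small sketch of that arithmetic follows; the function name is made up, and the 100 ms period is systemd's default.

```python
def cpu_quota(percent: float, cpu_count: int, period_us: int = 100_000):
    """Translate a CPUQuota= percentage into (quota_us, share_of_machine).

    quota_us is the CPU time allowed per period, as written to cgroup v2 cpu.max.
    share_of_machine is the fraction of all cores the unit may use.
    """
    quota_us = round(percent * period_us / 100)
    return quota_us, percent / (100 * cpu_count)


# CPUQuota=70% on the dual-core Atom: 70 ms of CPU per 100 ms, 35% of the box.
print(cpu_quota(70, 2))
# Allowing 70% of both cores would need CPUQuota=140%.
print(cpu_quota(140, 2))
```

Tools that show per-process CPU with 100% meaning one full core (top in its default mode, for example) will display the capped service at about 70%, which matches what is described above.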

This is what the CPU usage normally looks like (without any Omada management tasks being triggered):



Omada management activities are sparse in time. So in normal operation, whatever Omada does (auth lookups, authentication, etc.) uses hardly any CPU (less than 5%, even on my dual-core Atom). This is the reason why I don't want to upgrade the hardware. I think TP-Link could consider making the Omada management tasks lighter on the CPU, even if that makes them slower, through a user-selectable option (for example: normal, with the same behavior as today; slow, which uses less CPU but takes more time to complete). It might just enhance usability on old and underpowered hardware.

So I could use all the suggestions offered in this thread by Vincent. Thanks for your support.
