ER605 v2 Firmware update 2.1.0 causes adopting loop

This thread has been locked for further replies. You can start a new thread to share your ideas or ask questions.

67 Reply
Re:ER605 v2 Firmware update 2.1.0 causes adopting loop
2023-01-15 06:54:22 - last edited 2023-01-15 08:04:49

My ER605 at my countryside location, which is managed by a remote controller, was auto-upgraded to 2.1.0 and went down with the same issue.

 

Now I am wondering whether 5.8.4 will solve this once it is released as a Linux .deb, or whether I should drive the 16 km out there to downgrade.

 

I did the 2.1.0 upgrade at home and it worked without issues.

 

EDIT:

My home router is down too. It seems to work for a couple of minutes and then goes down, which confused me initially.

#33
Re:ER605 v2 Firmware update 2.1.0 causes adopting loop
2023-01-15 08:04:57 - last edited 2023-01-15 08:05:38

The problem pattern is a bit confusing. Maybe both those who have the problem AND those who don't could provide the following information, so that all of us and TP-Link can find a solution together:

 

  • Did the problem occur?
  • Did you change the default IP range of the default/native network?
  • Do you use a separated management VLAN?
  • Is the controller inside the management VLAN?
  • Is the controller connected via switch or directly to the router?
  • Are ACL restrictions in place to separate the VLANs during update?

 

There might be other useful questions, but these are my first ones. They should also help others like me, who have not updated yet, understand the risk of installing this firmware.

 

Best Carsten

#34
Re:ER605 v2 Firmware update 2.1.0 causes adopting loop
2023-01-15 08:49:45

  @modebm 

Hi,

 

Here are some answers from my case after upgrading to 2.1.0:

 

Problem occurred

IP ranges of all networks (VLANs) were changed to other values at initial setup months ago. No issues with earlier updates

Yes, separated VLANs

Controller is in MGMT VLAN

Controller is connected to a switch

no ACL restrictions

 

Further, I saw that the WAN connection was shown as Link Down after upgrading to 2.1.0, and the internet did not work properly.

 

Had to downgrade to 2.0.1 again, back in business.

 

regards

Checko

#35
Re:ER605 v2 Firmware update 2.1.0 causes adopting loop
2023-01-15 09:06:27

  @modebm After upgrading, my network works fine, but the loop occurs. I do not use VLANs. The IP range is not the default range. The controller runs as a Docker container on my Synology NAS, version 5.7.4 (mbentley). Turning the firewall on or off doesn't make a difference. No specific ACL in place. The controller works fine with 2.0.1. 

 

I will post the Omada log with more details of one loop cycle.

#36
Re:ER605 v2 Firmware update 2.1.0 causes adopting loop
2023-01-15 09:15:15 - last edited 2023-01-15 09:28:34

  @SuperUserOne 

 

Upgraded my ER605v2 this morning and my internet connection died.

I had first upgraded a second ER605v2 that I have set up as a subnet off my main router. 

No problems with the upgrade on that one.

So I "forgot" the main router and moved the test router onto my internet connection - whammo, the loop issue started after adopting.

 

I then did another forget and a factory reset and readopted. No change, the router keeps rebooting.

I then put an ER605v1 on my internet connection to get everything working.

I now have two ER605v2s running two subnets with a VPN between them to do load testing - no issues on the new firmware.

 

So the same ER605v2 works OK in one configuration (the subnets) but not as the ISP router.


The only options are to downgrade the firmware or to put an ER605v1 back in.

This is a FAILED firmware upgrade, following in the tracks of two other problems: alerts spamming the logs during normal operation of Apple devices (the Large IP and no-Flag messages), and the weekly reboots needed because the routers drop off the network.

 

Goodbye my Sunday and lots of others for what should be a run of the mill upgrade.


Not a good look.  

 

There is also a firmware upgrade for the controllers coming *soon* to fix the Large IP and no-Flag messages. I received this message from TP-Link yesterday about that issue:
 

"We have double confirmed with our senior engineer, and the new version of Controller would add the feature to show the source IP that triggers the large ping.

Before that, if such notices did not affect your network then please ignore it and pay attention to our website for new firmware."

 

#37
Re:ER605 v2 Firmware update 2.1.0 causes adopting loop
2023-01-15 10:41:12

This is the Omada log for one cycle of the loop (Docker container of Omada 5.7.4, mbentley, on a Synology NAS):

 

(I replaced the MAC address of my ER605v2 with XX-XX-XX-XX-XX-XX and parts of the OmadacId with 'x'. I also added '*' in some places so that I could paste the log into the post. Without the * I can't post the log for some reason.)


INFO [Thread-6] [] c.t.e.c.c.CloudClient(): Connect service server automatically, ConnectionType is PERSISTENT_CONNECTION.
INFO [Thread-6] [] c.t.e.c.c.q(): set connection status: CONNECTING
INFO [Thread-6] [] c.t.e.c.c.q(): set connection status: CONNECTED
INFO [Thread-6] [] c.t.e.c.c.q(): The result of connection is true.
WARN [handle-hello-cloud-task-6] [] c.t.s.o.i.a.b.a.e(): omadacId=0cxxxx0x4b392fc28b006beb5exxxxxxx check device bind status error: -7119, Operation failed. Network connection error.
INFO [manage-work-group-7] [] c.t.s.o.m.d.p.t.c(): Device XX-XX-XX-XX-XX-XX OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) changed to status CONNECTED_ERROR, which don't need to handle.
INFO [discovery-work-group-1] [] c.t.s.o.m.d.d.m.b.a(): MANAGED_BY_OWN Device XX-XX-XX-XX-XX-XX on omadac 0cxxxx0x4b392fc28b006beb5exxxxxxx is discoveried.
INFO [adopt-work-group-2] [] c.t.s.o.m.d.d.m.d.b.c(): Gateway OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) SiteId(63a636839363bb7d480ba073) DeviceMac(XX-XX-XX-XX-XX-XX) adopt[auto=true] ok
INFO [adopt-work-group-2] [] c.t.s.o.m.d.d.m.a.c(): send mini setting to OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) DeviceMac(XX-XX-XX-XX-XX-XX)
INFO [manage-work-group-9] [] c.t.s.o.m.d.p.t.c(): Device XX-XX-XX-XX-XX-XX OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) changed to status CONNECTED, which don't need to handle.
INFO [server-comm-pool-7] [] c.t.s.o.m.d.d.m.i.a(): got first inform of OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) DeviceMac(XX-XX-XX-XX-XX-XX)
INFO [manage-work-group-8] [] c.t.s.o.m.d.d.m.i.e(): first inform send full config to OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) DeviceMac(XX-XX-XX-XX-XX-XX)
INFO [comm-pool-13] [] c.t.s.o.l.c.a.a(): Start pushing connected devices.
INFO [monitor-topology-pool-6] [] c.t.s.o.c.u.d.a(): list local interface macs: [XX-XX-XX-XX-XX-XX, XX-XX-XX-XX-XX-XX, XX-XX-XX-XX-XX-XX, XX-XX-XX-XX-XX-XX]
WARN [manage-work-group-11] [] c.t.s.o.m.d.p.t.h.a(): send set request to OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) DeviceMac(XX-XX-XX-XX-XX-XX) fail, com.tplink.smb.ecsp.common.TransResult@7fbf2f16[errCode=0,msg=ERR_SUCCESS,result=EcspMessage(header=com.tplink.smb.ecsp.protocol.packet.header.MessageHeader@54b0a2d2, body=com.tplink.smb.ecsp.protocol.packet.body.datagram.BaseConfigResponse@bbd4ad3[sequenceId=48,errcode=-1,configVersion=<null>,additionalProperties={}]),addressDTO=<null>]
INFO [manage-work-group-11] [] c.t.s.o.m.d.d.m.d.b.A(): syncFullConfiguration to OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) DeviceMac(XX-XX-XX-XX-XX-XX) rebuild[false], result:SendDeviceMsgResult(success=false, deviceResponse=BaseConfigRespBody(sequenceId=48, errcode=-1, configVersion=null, additionalProperties={}))
INFO [tcp-message-executor-17-5] [] c.t.s.e.s.c.c(): need not set server route expire for device XX-XX-XX-XX-XX-XX, as device reconnect: /192.168.2.200:29814 -> /192.168.2.254:44604
INFO [manage-work-group-10] [] c.t.s.o.m.d.p.t.c(): Device XX-XX-XX-XX-XX-XX OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) changed to status CONNECTED, which don't need to handle.
INFO [manage-work-group-12] [] c.t.s.o.m.d.d.m.r.d(): send v2 rebuild reply[reset=true] to omadacId OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) & mac DeviceMac(XX-XX-XX-XX-XX-XX)
INFO [Thread-116] [] c.t.e.c.c.q(): 'recvThread' exception:java*.*net*.*SocketException: Connection reset

INFO [Thread-116] [] c.t.e.c.c.q(): Thread 'recvThread' is stopped
INFO [Thread-116] [] c.t.e.c.c.q(): change conn status from connected to disconnected
INFO [Thread-116] [] c.t.e.c.c.q(): set connection status: DISCONNECTED_NORMAL

INFO [manage-work-group-14] [] c.t.s.o.m.d.p.t.c(): Device XX-XX-XX-XX-XX-XX OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) changed to status CONNECTED_ERROR, which don't need to handle.

WARN [Thread-118] [] c.t.e.c.c.q(): send heartbeat getting timeout response: {"error_code":-20002}

INFO [Thread-118] [] c.t.e.c.c.q(): java*.*net*.*SocketException: Socket is closed

java*.*net*.*SocketException: Socket is closed
        at sun*.security.ssl.SSLSocketImpl.getOutputStream(SSLSocketImpl.java:1233) ~[?:?]
        at com*.tplink.eap.cloudsdk.client.p.*a(SourceFile:111) ~[cloudsdk-1.0.13.jar:?]
        at com*.tplink.eap.cloudsdk.client.q.*a(SourceFile:485) ~[cloudsdk-1.0.13.jar:?]
        at com*.tplink.eap.cloudsdk.client.q.*a(SourceFile:425) ~[cloudsdk-1.0.13.jar:?]
        at com*.tplink.eap.cloudsdk.client.t.*run(SourceFile:892) ~[cloudsdk-1.0.13.jar:?]
        at java*.lang.Thread.*run(Thread.java:833) [?:?]

INFO [Thread-118] [] c.t.e.c.c.q(): Thread 'heartBeatThread' is stopped
INFO [Thread-6] [] c.t.e.c.c.CloudClient(): Close connection to service server.
INFO [Thread-6] [] c.t.e.c.c.q(): set connection status: DISCONNECTED_NORMAL
INFO [Thread-117] [] c.t.e.c.c.q(): expiredRequestCleanThread is interrupted.
INFO [Thread-117] [] c.t.e.c.c.q(): Thread 'expiredRequestCleanThread' is stopped
INFO [Thread-6] [] c.t.e.c.c.CloudClient(): Connect service server automatically, ConnectionType is PERSISTENT_CONNECTION.
INFO [Thread-6] [] c.t.e.c.c.q(): set connection status: CONNECTING
INFO [Thread-6] [] c.t.e.c.c.q(): set connection status: CONNECTED
INFO [Thread-6] [] c.t.e.c.c.q(): The result of connection is true.


 

 

The status changes from connected to disconnected, see the red lines. This cycle repeats over and over.
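For anyone who wants to check their own controller log for the same pattern, here is a minimal Python sketch. It assumes the log lines look like the "changed to status" lines above; the function names and the sample list are only for illustration and are not part of Omada.

```python
import re
from collections import Counter

# Matches lines such as:
#   "... Device XX-XX-XX-XX-XX-XX OmadacId(...) changed to status CONNECTED_ERROR, ..."
# The line format is assumed from the log excerpt in this post.
STATUS_RE = re.compile(r"Device (\S+) OmadacId\([^)]*\) changed to status (\w+)")

def device_status_sequence(log_lines):
    """Return (mac, status) pairs in the order they appear in the log."""
    seq = []
    for line in log_lines:
        m = STATUS_RE.search(line)
        if m:
            seq.append((m.group(1), m.group(2)))
    return seq

def count_flaps(seq):
    """Count CONNECTED -> CONNECTED_ERROR transitions per device."""
    last = {}
    flaps = Counter()
    for mac, status in seq:
        if last.get(mac) == "CONNECTED" and status == "CONNECTED_ERROR":
            flaps[mac] += 1
        last[mac] = status
    return flaps

# Four status lines taken from the log above (already redacted).
SAMPLE_LOG = [
    "INFO [manage-work-group-7] [] c.t.s.o.m.d.p.t.c(): Device XX-XX-XX-XX-XX-XX OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) changed to status CONNECTED_ERROR, which don't need to handle.",
    "INFO [manage-work-group-9] [] c.t.s.o.m.d.p.t.c(): Device XX-XX-XX-XX-XX-XX OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) changed to status CONNECTED, which don't need to handle.",
    "INFO [manage-work-group-10] [] c.t.s.o.m.d.p.t.c(): Device XX-XX-XX-XX-XX-XX OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) changed to status CONNECTED, which don't need to handle.",
    "INFO [manage-work-group-14] [] c.t.s.o.m.d.p.t.c(): Device XX-XX-XX-XX-XX-XX OmadacId(0cxxxx0x4b392fc28b006beb5exxxxxxx) changed to status CONNECTED_ERROR, which don't need to handle.",
]

if __name__ == "__main__":
    flaps = count_flaps(device_status_sequence(SAMPLE_LOG))
    for mac, n in flaps.items():
        print(f"{mac}: {n} CONNECTED -> CONNECTED_ERROR transition(s)")
```

Run over a full log file, a count that keeps growing for the gateway's MAC would suggest the same adopt/disconnect cycle; a single transition on its own proves little.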

 

#38
Re:ER605 v2 Firmware update 2.1.0 causes adopting loop
2023-01-15 10:53:01

  @SuperUserOne 

 

Interesting. Combined with what Checko and Lurk wrote, this does not seem to be an issue with the LAN setup but a crash loop of the router itself. Maybe the type of connection the router tries to set up with the ISP would help to dig further?

#39
Re:ER605 v2 Firmware update 2.1.0 causes adopting loop
2023-01-15 11:38:31 - last edited 2023-01-15 11:40:56

  @modebm 

 

  • Did the problem occur?

YES

 

  • Did you change the default IP range of the default/native network?

YES

 

  • Do you use a separated management VLAN?

YES

 

  • Is the controller inside the management VLAN?

NO (this is a non starter, and as far as I can see has never worked if you have a TP-Link router)

 

  • Is the controller connected via switch or directly to the router?

A switch

 

  • Are ACL restrictions in place to separate the VLANs during update?

NO

 

I have, as I mentioned before, now completely restructured my network as a consequence of this. My main LAN is now in the factory default range of 192.168.0.1/24 and only has Omada devices connected to it, so it is now effectively my Management LAN. All other devices are distributed across several VLANs. 

 

Since this change I have not attempted the router upgrade again. I am going to wait and see what (if anything) we hear from TP-Link, but they have already pulled the firmware update from the main support website (at least in the UK): https://www.tp-link.com/uk/support/download/er605/v2/#Firmware

 

I do have some sympathy for the developers/testers, as I imagine that out of hundreds of thousands of installations every one is unique, so it is impossible to test every possible configuration. What is clear is that if you are using a TP-Link router, then moving away from the factory default subnet for the control plane causes problems. A controller (at least in my experience) cannot adopt a router which is not on the same subnet as itself. My controller, which was at 192.168.10.10, could 'see' a factory reset router on 192.168.0.1 and listed it for adoption, but the adopt failed every time I tried until I changed the IP of the controller to be in the same subnet as the router.
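As a quick sanity check before adopting, the subnet comparison described above can be sketched in Python with the standard `ipaddress` module. This assumes a /24 prefix as in my default LAN; `same_subnet` is just an illustrative helper name, and it only compares addresses, it does not talk to the controller or the router.

```python
import ipaddress

def same_subnet(controller_ip, router_ip, prefix_len=24):
    """Return True if controller_ip falls inside the router's subnet.

    prefix_len=24 is an assumption matching the 192.168.0.1/24 default
    mentioned above; adjust it if your LAN uses a different mask.
    """
    # strict=False lets us pass a host address (192.168.0.1) instead of
    # the network address (192.168.0.0).
    router_net = ipaddress.ip_network(f"{router_ip}/{prefix_len}", strict=False)
    return ipaddress.ip_address(controller_ip) in router_net

if __name__ == "__main__":
    # The combination that failed to adopt for me.
    print(same_subnet("192.168.10.10", "192.168.0.1"))  # False
    # Controller moved into the router's factory default subnet.
    print(same_subnet("192.168.0.10", "192.168.0.1"))   # True
```

The address 192.168.0.10 in the second call is only an example of a controller IP inside the default range.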

 

Regardless of this particular issue, my takeaways are

1) Always use a static IP address on your controller. If the router fails, so will DHCP, and you will not be able to reach the controller.

2) Keep the factory default subnet range for the Omada control plane. Using a different subnet/Management VLAN _will_ cause problems if the router needs resetting / replacing.

 

#40
Re:ER605 v2 Firmware update 2.1.0 causes adopting loop
2023-01-15 13:14:55

 Hi  @Tescophil 


I add a 192.168.0.2 address to my PC and then change the router IP before adopting.

 

e.g. 
forget the router

log on to the router - note that the "managed by Omada controller" message has gone from the landing page

the router will keep its IP address but will let you factory reset it back to 192.168.0.1


then you have two options - leave both addresses on the PC and adopt 

or

log on to the router and change the LAN address to the address it will have after it is adopted, e.g. 10.16.1.1, and then adopt 

 

I found method two is reliable and lets me use a different subnet than the default 192.168.0.x

 

 

 

#41
Re:ER605 v2 Firmware update 2.1.0 causes adopting loop
2023-01-15 13:50:00

I can confirm that, with the default 192.168.0.1 or not, 5.7.4 SDN and 2.1.0 are just in a constant reconfigure loop.

 

Going to reset again and downgrade to 2.0.1 before adopting.

Obviously I will have to make a 2x18 km trip to downgrade the remote ER605.

I still have some minor hope that the issue is caused by the controller not being compatible with 2.1.0.

So if they release 5.8.4, some looping devices might start working again. However, I think they might stop all updates while they investigate this issue.

#42