ER706W freezes
Hi,
I have two sets of ER706W in two different locations.
One is connected to 4G cellular network (modem is in bridge mode) - let's call it Side Office,
Second is connected to fiber 1/1Gbps network (directly to ONT) - let's call it Main Office.
Devices are connected via IPsec Site-2-Site, where Main Office is in responder mode. I do use DynDNS since public IPs may change.
Both sites are managed by one OC200 controller, placed in Main Office, connected directly to router, powered separately. Cloud access is enabled.
For about a month I have been seeing weird "freezes" on both devices. They look totally random to me and can happen at night, when there is no traffic on the network at all. It happens to both routers, but not at the same time, every few days.
The freezes look the same on both routers: the device appears to be working (LEDs are blinking) but the router is not responding:
1. Can't access its web GUI (address not responding) - both
2. Can't connect to WiFi - every attempt ends with a failed connection - both
3. Side Office looks totally disconnected on the Controller's dashboard (no graph)
4. When the Main Office router is frozen, the Controller shows as Offline in cloud access
5. After the router's reboot (point 4), the Controller's dashboard shows Main Office as connected during the frozen period, but there was no traffic and the PING line was totally flat at a constant 1 ms (which is impossible; the minimum ping is 6 ms)
The only thing I can do to get the devices working again is to restart them manually (reconnect the power supply).
I do have some more configuration on the Main Office router (a few subnets, IPsec and OpenVPN server, DynDNS, some ACLs), but the Side Office router has almost no specific config (IPsec and DynDNS only).
No device errors are shown in the logs, and there is no heavy CPU/memory usage, no spikes, and no traffic before or during the freezes.
The only thing that comes to my mind is the last update (1.1.2), but I'm not 100% sure whether the problems started after that upgrade or some time before.
What should I check in this case? Are there any additional logs available somewhere, or only those I can see in the controller's log section? Has anyone faced the same issue?
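One way to keep logs that survive a freeze is to forward them to a separate machine. The sketch below is only an illustration and assumes the router or controller can send its logs to a remote syslog server over UDP; whether and where that option exists on this firmware is not confirmed in this thread. The port 5514, the file name `router-syslog.log` and the names `parse_pri`, `SyslogHandler` and `run_server` are all made up for the example.

```python
import re
import socketserver
from datetime import datetime

LOG_FILE = "router-syslog.log"  # hypothetical output file
LISTEN_PORT = 5514              # arbitrary unprivileged port; standard syslog uses UDP 514

PRI_RE = re.compile(r"^<(\d{1,3})>")


def parse_pri(message):
    """Return (facility, severity) from a syslog line, or None if it has no <PRI> header."""
    match = PRI_RE.match(message)
    if not match:
        return None
    # In syslog, PRI = facility * 8 + severity
    return divmod(int(match.group(1)), 8)


class SyslogHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair
        data, _sock = self.request
        text = data.decode("utf-8", errors="replace").strip()
        stamp = datetime.now().isoformat(timespec="seconds")
        with open(LOG_FILE, "a", encoding="utf-8") as fh:
            fh.write(f"{stamp} {self.client_address[0]} {parse_pri(text)} {text}\n")


def run_server():
    """Listen for syslog datagrams until interrupted (not started automatically)."""
    with socketserver.UDPServer(("0.0.0.0", LISTEN_PORT), SyslogHandler) as server:
        server.serve_forever()
```

Calling `run_server()` on a PC that stays powered on in the same LAN would append every received line with a local timestamp, so the last entries before a freeze remain available even after the router is power-cycled.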
Thanks for any tips.
Cheers!
Unfortunately, I would rather avoid upgrading those devices to a BETA version of the firmware. That's also why I wait a few weeks before applying a standard upgrade, to be sure no one has reported any issues with the stable version.
I want to avoid downgrading as well, since after the 1.1.2 upgrade I had to create new OVPN config files and hand them out to all users of the network because the old ones stopped working. I'm not sure what changed in 1.1.2 (compared to 1.1.1), but that's what happened. Therefore I'm afraid a downgrade will trigger the same "problem" of reconfiguration.
Cheers.
OK. Yes, I have an ER706W here, but only in a lab environment; when I am not testing anything, it is mostly just a Raspberry Pi connected to that network. I have never had any problems at all with that router, although I have seen a lot of problems reported on the forum. Since I have had so few problems myself, I unfortunately don't have many other tips to give. TP-Link is back at work tomorrow, so maybe you will get some tips from Clive_A or someone else.
Hi @RaRu
Thanks for posting in our business forum.
Based on what I read, I am more interested in this question:
Are you able to ping the gateway while it looks like "dead"?
What do you mean by the highlighted part?
You cannot even connect to the WiFi. When you wire to the router, do you get an IP address? It gives me the feeling that the router crashes. I think you need to verify both ends, since you describe two routers that behave the same way. They have little in common apart from the VPN between them, so I am curious whether the VPN is what causes the problem and crashes them at random.
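As a rough illustration of the "do you get an IP address" check: a wired client that receives no DHCP answer usually falls back to a self-assigned link-local address (169.254.x.x on IPv4). The sketch below classifies an address read from the PC's network settings. The subnet `192.168.0.0/24` and the function name `describe_client_address` are placeholders for the example, not values taken from this thread.

```python
import ipaddress


def describe_client_address(addr, lan_subnet="192.168.0.0/24"):
    """Interpret the address a wired PC ended up with (lan_subnet is a placeholder)."""
    ip = ipaddress.ip_address(addr)
    if ip.is_link_local:
        # 169.254.0.0/16 (or fe80::/10): the client assigned this to itself
        return "no DHCP lease: self-assigned link-local address"
    if ip in ipaddress.ip_network(lan_subnet):
        # Could also be a static address, so treat this as a hint only
        return "address in the expected LAN subnet: DHCP is probably answering"
    return "address outside the expected LAN subnet"


# Example: paste the address shown by `ipconfig` or `ip addr` on the wired PC
print(describe_client_address("169.254.10.20"))
```

A link-local result while the router looks "dead" would point towards the router itself no longer serving the LAN, whereas a normal lease suggests the LAN side is still alive.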
When Side Office was "dead" I tried to ping it without success, but keep in mind that I could only test it while connected via VPN to Main Office. I don't have access to either office on a daily basis :(
I can't check it over cable myself while it's "dead" (as mentioned above), but next time it happens I'll ask an end user to do it.
What I have seen is that in both cases the last logs are:
And after those there were no reconnection attempts.
But I'm not sure that's conclusive, especially since those logs also appeared over the last 2 days and nothing bad happened; the connection was restored as it should be:
Regarding the highlighted part:
When the Main Office is down, I can't access the controller at all. Cloud access is showing whole Controller as offline.
After the router's (manual) reboot, I can see that the controller was recording some logs/statistics, but the values are impossible (part 1 on the graph, followed by the manual reboot, then part 2).
Please note that the above graph IS NOT a real graph. I manually edited the picture to show you what I saw the last time I had this issue. Unfortunately, I haven't saved an actual screenshot.
For now I have set up a scheduled reboot on Side Office (every day at 3 AM) as well as Main Office (every week, at 3 AM).
I have also disabled IPsec ALG on the Side Office router.
Hi,
Scheduled restarts (at least once a week) were not enough to "fix" the issue. Today the router died again, during the workday. The last reboot was on Sunday night.
The last logs before it died are:
So nothing really informative. The WAN port was not responding. I don't know about LAN.
Unfortunately, the device is in a remote location and the users there were not able to test anything. I'll keep trying though.
A simple restart (reconnecting the power supply) "fixed" the case. The question is: for how long?
I don't want to set a daily reboot schedule. I don't believe that's a proper solution.
Hi @RaRu
Thanks for posting in our business forum.
I understand. What I want to learn is still not answered.
Is this a WAN issue, or does the router crash completely? That's what I want to learn from pinging the LAN IP address, which I asked you to do earlier in the conversation.
If you could clarify, I can decide if this goes to the dev to collect the debug log or if I can try something else.
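The distinction asked about here can be sketched as a small check run from a PC on the affected LAN: ping the router's LAN address and a public address, then compare. This is only an illustration; `192.168.0.1` is a placeholder for the router's LAN IP, `1.1.1.1` is used as an arbitrary public target, and the helper names are made up. It relies on the operating system's own `ping` command.

```python
import platform
import subprocess


def ping(host, timeout_s=2):
    """Send one echo request with the system ping command; True if it reports success."""
    system = platform.system()
    if system == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]  # -w in milliseconds
    elif system == "Darwin":
        cmd = ["ping", "-c", "1", "-W", str(timeout_s * 1000), host]  # -W in milliseconds
    else:
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]         # -W in seconds (Linux)
    try:
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL, timeout=timeout_s + 3)
    except (subprocess.TimeoutExpired, OSError):
        return False
    # Exit code 0 is treated as "reply received"; take it as a hint, not proof
    return result.returncode == 0


def classify(lan_ok, wan_ok):
    """Map the two ping results onto the two cases discussed in this thread."""
    if not lan_ok:
        return "router unresponsive on LAN (possible crash)"
    if not wan_ok:
        return "router alive, WAN path down"
    return "LAN and WAN both reachable"


def check_once(router_ip="192.168.0.1", external_ip="1.1.1.1"):
    """Run both pings once; both addresses are placeholders to adjust."""
    return classify(ping(router_ip), ping(external_ip))
```

Running `check_once()` during a freeze, for example by an on-site user, would give a one-line answer to whether the LAN side still responds.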
Hi,
Today I had the same issue. I managed to test a few things:
1. While the problem occurs, the WiFi is working (you can connect) but there is no internet; same on cable
2. The PING from the PC to the router works perfectly fine (1 ms)
3. It looks like WAN goes down. Unfortunately, I can't check what's happening with it, since the router is managed by a remote controller
4. The LAN LED on the Huawei does not light up, as if there's no connection between the Huawei and the TP-Link. Not sure about the TP-Link's LED - the user didn't provide a proper photo of that device :/
What I saw is that restarting the ISP's modem alone can restore internet access, without restarting the whole TP-Link setup.
The next thing I want to check is whether simply reconnecting the WAN cable will be enough to restore internet access.
Before you say it's the ISP's problem - I know it looks that way, but please keep in mind that this problem occurs in both locations that I'm supporting: Main Office with the fiber connection (media converter from ISP => TP-Link router) as well as Side Office (Huawei LTE modem => TP-Link router).
In both locations the router is an ER706W.
Both places get a public IP directly from the ISP on the TP-Link router (the Huawei modem is set to bridge mode).
Therefore I doubt it's the same issue coming from two different ISPs.
I'll keep you updated once the WAN port reconnection test is done.
Here are some logs from the time of the problem:
Hi @RaRu
Thanks for posting in our business forum.
LAN seems to be good. Does the WAN still get an IP address?
What's the WAN connection type?
If one site is down, the other site will not work, of course. Does this crash both sites when you test locally?
Will this crash the WAN connection of the other site?
Hi, thanks for your reply.
How should I check whether the WAN still gets an IP address?
I can't log into the router, it has no connection to the controller, and pings to the WAN side get no response.
Only one site crashes at a time. A crash on one site does not cause a crash on the other.