(Solved) Strange behavior shown in System log with the latest firmware on ER605
Several strange things have happened.
1. I no longer use two WAN ports on the ER605, as I have had only one ISP for a while. I therefore decided to disable the second WAN port (WAN/LAN1), since it was disconnected anyway and I had already disabled Load Balancing. So I unchecked that port in Network->WAN->WAN Mode. To my surprise, this affected the VLAN configuration: it copied the VLAN setup of Port4 to Port2. All the VLANs that had been set as tagged on Port4 appeared with the same setting on Port2, which is the WAN/LAN1 port. I had to remove each unwanted tag manually.
After that, the WAN/LAN1 port still appeared in System Status as Link Down, and a never-ending sequence started in the System Log:
2. When I tried changing Dynamic IP to Static, System Status oddly even showed the port as Online! That scared me, as I was not sure whether the router would send internet traffic to that physically disconnected port.
And it showed this in System Log:
What should I do? I don't want unnecessary activity or failures in the router.
The web interface is already buggy. Sometimes it loads pages with some elements missing. Look at this one, for example, which lacks the config tabs for the WAN ports:
Hi @Arion
Thanks for posting in our business forum.
1. Try the steps. I think this might be an issue with the cache.
2. If WAN/LAN1 is still stuck as WAN, I recommend you set it as WAN > save > reboot, and then repeat the steps to change it to LAN.
I tried the steps you had recommended. No success.
And when I try to save the WAN mode change, it shows a prompt that loops indefinitely, so I have to reload the page to be able to navigate again:
After the reload, it looks as if the change had been saved, but the problem remains. When I set it to have only one WAN port checked, it first assigns a random set (and a random number) of tagged VLANs to Port2 (WAN/LAN1), and even after I changed it to a single untagged VLAN, it still shows these:
It seems I should have kept both WAN ports enabled (with one of them physically disconnected); that way it might not register those failure logs.
I don't want to reset the router, because I don't think restoring the backup would help, and I wouldn't like to reconfigure the whole router with 50 VLANs and their related ACL rules just to find out that it's a bug in the new firmware(s), again...
Something is certainly going on with the cache, and I didn't really put a heavy load on the router: no VPNs, nothing complicated, other than those 50 VLANs isolated from each other. And at the times when I experience these cache issues, there is hardly any traffic on the local network. The web interface tends to log me out frequently when I go to another subpage.
Edit:
Note that the MAC address that failed to obtain an IP address for WAN/LAN1 ends with "61", while the MAC addresses shown on the System Status page for WAN/LAN1 and for the other WAN port end with "62" and "63".
Hi @Arion
Thanks for posting in our business forum.
Let me summarize your issue here. Let's not divert into other issues before we fix the original one.
Your original issue would be:
1. WAN/LAN1 is disabled, but it still looks like a WAN port and repeats the DHCP request in the log.
2. The GUI glitches out.
I don't think we need to move on to the VLAN thing. I will not move on to that until we fix 1 and 2.
First part:
You said that you tried the steps, yet the WAN tab still shows incorrectly? I wonder whether you really followed what I said. I am not seeing the same thing.
Isn't this the tab showing correctly?
So issue 2 is resolved.
About the "infinite loop": no. You don't seem to be familiar with the system. It reboots the router if you change the WAN mode, because a reboot is required to enable the logical interface.
This prompt is shown during the reboot process, and you will see it until the reboot finishes. It does not hurt anything if you reboot it after you have visually confirmed that the router has entered the reboot process.
Second part, about your misunderstandings and misconceptions of the whole system:
We now focus on WAN/LAN1, which you say does not work properly. You cannot change it back to LAN. Is that what you described?
#1:
Arion wrote
To my surprise it affected the VLAN configuration. It copied the VLAN setup for port4 to port2. All the VLANs that had been set to be tagged on Port4, appeared with the same on Port2 which is the WAN/LAN1 port. I had to remove each unwanted tags manually.
After that, still the WAN/LAN1 port appears in System Status as Link Down.
Conclusion: totally normal. Try creating a new VLAN and see how it is applied on every port. I won't reproduce it here, as it is a very simple test for you to do.
#2:
Arion wrote
Conclusion: totally normal, as long as you have a cable plugged in and it shows as UP like this.
It is not online. Online Detection will display "Online" or "Offline".
#3 I believe this log shows your WAN/LAN1 working as a WAN port. If it is in WAN mode and generates this, I would expect that to be normal as well.
Unless you are telling me it is LAN mode, I would never comprehend why a LAN port is trying to grab a DHCP. It would only be a client(which is in WAN mode) to request the server(upstream).
P.S.
If you believe it is a bug in the new firmware, try reverting it, and let's see if you get the same thing on the previous firmware. If you haven't changed anything on the new firmware, reverting does not hurt anything.
If your backup file is corrupted, it does not matter which firmware you use. I am beginning to suspect that it was already corrupted before the firmware upgrade. If it is corrupted, you should reset and start over. There is no way to fix a corrupted backup if you don't know what you changed that led to the bad file.
I don't think you should start a multi-thread conversation as you don't seem to understand the product well. That does not do anything good to you.
Oh, boy... where should I start. Your rude reply really does not help.
But let's start from the easy part:
"Unless you are telling me it is LAN mode, I would never comprehend why a LAN port is trying to grab a DHCP. It would only be a client(which is in WAN mode) to request the server(upstream)."
It was changed to a LAN port: I unchecked WAN/LAN1 in WAN Mode. Even though it went into an infinite "Processing..." loop, when I reloaded the page after a while it showed the modification as if it had been saved. The fact that I could reload the page contradicts your statement that it rebooted the router! It never rebooted the router after a WAN port change in the past, and if it had, I couldn't have reached the web page or accessed the internet and the local network. With my config, rebooting takes over 10 minutes.
Nevertheless, I rebooted the device: first from the web interface and then, after it had finished rebooting, physically, by disconnecting the power cable.
After this change, the WAN mode page shows this:
It should be in LAN mode, right? However, the system log entry "WAN/LAN1: DHCP client sending DHCP-DISCOVERY timeout." keeps appearing every 50 seconds.
Please, ask your colleagues if they also think it's normal.
That's why I started this thread.
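For anyone who wants to check the repetition interval of that log entry from an exported copy of the System Log, here is a minimal sketch. It assumes each exported line begins with a `YYYY-MM-DD HH:MM:SS` timestamp, which is only a guess about the export format; the names `discovery_intervals`, `TIMESTAMP_FORMAT` and `TIMESTAMP_LENGTH` are hypothetical and the sample lines are made up for illustration.

```python
from datetime import datetime

# Message text as quoted in this thread.
MESSAGE = "WAN/LAN1: DHCP client sending DHCP-DISCOVERY timeout."

# ASSUMPTION: each exported log line starts with "YYYY-MM-DD HH:MM:SS".
# The real export format may differ; adjust these two values to match it.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
TIMESTAMP_LENGTH = 19


def discovery_intervals(lines):
    """Return the seconds between consecutive lines that contain MESSAGE."""
    stamps = []
    for line in lines:
        if MESSAGE not in line:
            continue
        try:
            stamp = datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # skip lines whose prefix is not a timestamp
        stamps.append(stamp)
    stamps.sort()  # works for both oldest-first and newest-first exports
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    # Made-up sample lines, not taken from a real router log.
    sample = [
        "2024-01-01 10:00:00 " + MESSAGE,
        "2024-01-01 10:00:20 some unrelated entry",
        "2024-01-01 10:00:50 " + MESSAGE,
        "2024-01-01 10:01:40 " + MESSAGE,
    ]
    print(discovery_intervals(sample))  # prints [50.0, 50.0]
```

A steady interval would suggest a DHCP client retry timer still running on the port, but the script only measures the spacing; it says nothing about why the client is active.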
One thing I have not tried yet, though, is connecting a client to actually use that port as a LAN port. I simply configured it as LAN, chose a VLAN (instead of the randomly chosen ones; I know it's not an important issue, I just mentioned it to give you all the info about the odd behavior) and did not plug anything into it. I don't think that is required for the router to work properly.
#2:
I said that when I changed Dynamic IP to Static (while WAN/LAN1 was still configured as a WAN port), just as a test, it showed as Link Up. I incorrectly used the word "online"; I apologize. But the fact is that it showed Link Up for that unplugged port.
Please, ask your colleagues if they also think it's normal.
If my backup file is corrupted, which is possible, I will have to rebuild the configuration from scratch. I may have already done this after the debacle of these last firmware updates; I'm not sure. But I feel compelled to do so, just to prove to you that I did not start this thread without reason and that I may not be the weakest link in this discussion.
"I don't think you should start a multi-thread conversation as you don't seem to understand the product well. That does not do anything good to you."
Multi-thread? What are you talking about? Let's focus on the bug I just reported here, in a single thread!
In a few days, when I get to the location again to reset and reconfigure the router, I'll post the results here. If that doesn't cure the problem, I will have to go back to an earlier firmware, as you suggested, but that means I would have to reconfigure the router again for each previous firmware version, since using a backup from a newer firmware is not ideal for troubleshooting.
Hi @Arion
I don't see why it is rude; please explain what you expect from me. Am I supposed to stay tame, stop asking questions, stop explaining what the cause might be, and just report it?
You did not explain your situation clearly in the OP, which is a total mess if you look at it again (that is my opinion; I could not grasp your point). When you revisit your description, take yourself out of "god mode", in which you are the person who experienced the problem and already know the whole situation and what you mean to express. Read it as a third person who has no information other than the OP and #3, not from your own POV. You showed WAN/LAN1 and the DHCP requests at the same time, but did not include the additional information that I had to ask for later on.
(Fact: WAN/LAN1 sending DHCP discovery is normal if it is a WAN port in Dynamic IP connection mode.)
Don't say that I did not show your post to my colleagues. I actually showed it to someone else on the team yesterday, before I replied to your #5, and she answered with a question of her own, asking whether I was certain about the situation you described. Well, I think you might misunderstand the product and how it works. I think she could not grasp your point before your last reply. What do you think of your description before #4 (#1-3)?
I am just showing you that you need to run some other tests to make sure it is correctly configured; if the behavior still occurs under that condition, then it is a problem.
Any problem should be verified through a standard, logical, and dialectical procedure.
How am I supposed to do my work if I jump to throw every "bug" or "false alarm" at the dev and test teams? Should I stay out of this responsibility because it is not my case and is filed under someone else's name? Blindly doing things without a workflow or basic logic, and passing the case on without getting your description and the facts straight? That's not me, and it's not my nature. Would it be responsible of me as an employee not to verify everything before writing a bug report, when the cause is potentially a misconfiguration? Communication is there to make things clear. I have to verify your problems and make sure your symptoms are clear, which you did not clarify until just now.
If you think you explained it right, okay: accept my apology for not being more sensible than you, and for my failure to report your issue to the dev directly. In that case, I think the dev should handle phone, email, and forum directly, bypassing the technical support team and the forum team, as we seem to be useless as middlemen who do some fact-checking and reason things through to resolve issues. I admit that skills within the tech team vary, but I always try my best.
I will pass this to the dev today and will reply to you as soon as they respond to my email.
Hi @Arion
So, the test team did not reproduce this. It might be a rare case; there are no records of this behavior before.
I might need a remote session with you, if necessary, as we move forward. Just a heads-up.
Since the port is shown as LAN but is still sending DHCP discovery according to the log, does it actually work as a LAN port, as the GUI shows?
Have you reset it? Did you keep a backup?
What is the result of your test?
Edit: this had not solved the problem yet. Check my last post!
Luckily, I was able to fix the issue without resetting and reconfiguring everything!
(I don't want to re-enter the discussion above; I still think my description was good enough, and the details were exactly as I presented them.)
The problem was the slow behavior of the device, which has been happening since the last 2 or 3 firmware updates. Even though the admin page shows low CPU and moderate RAM usage, it struggles in various parts of the admin page, besides the random logouts when moving to other menu sections.
So, what happened is this: after I disabled the second WAN port (WAN/LAN1), the message "Processing... DO NOT operate the system." appeared, and I did not wait the approx. 5 minutes that I should have. I just reloaded the admin page after 20-30 seconds or a minute. That created the broken setup, which produced those system logs afterwards.
This week, at the location again, I waited more patiently, so now I know how long the process takes to finish. But to be clear, it was not rebooting at all. It just needed that long.
Interestingly though, before I could disable the second WAN port successfully, I first had to enable WAN/LAN1, since it was already disabled in the broken setup. This time it showed the "Processing..." message for a while, which then changed to something like "Rebooting...", and it did indeed reboot the device. It took about 12 minutes to be ready again (but I'm already familiar with this time frame from all previous reboots).
Now all seems good. There are none of those system logs.
Only one thing remains: the System Status page lists WAN/LAN1 with Dynamic IP, Link Down, and 0.0.0.0 for some reason.
Whether that's normal, only the devs may know... I don't want to reset the device just to find out whether it would be different after a factory reset.
From now on, I have to deal with the slow and unstable behavior of the admin page, and be more patient with this router.
Hi @Arion
This slowness is also new to me; you did not mention it before. We don't have any agreement on the problem description, so this is it. End of the story; no further discussion on that.
The ER605 V1 can be slow to boot, and the same happens to other models such as the ER7206 V1, if you have too many parameters configured. The ER7212PC also boots more slowly as the number of parameters grows.
I still don't know how many functions you have configured on it. If it takes 12 minutes to boot, I would assume there are quite a lot, since the test unit I use in the lab takes 2-3 minutes to boot up fully after being plugged in, and then works okay.
BTW, FYI, the system is Linux-based and the models are divided into different tiers, so performance and overall experience will no doubt differ. If this model is that slow, you might need to consider a newer version or a better model. Just letting you know that the slowness is caused by the config and will become frequent on your model. This is a heads-up based on what I know about the system.
Sorry, but I did not mean the long boot time when I mentioned the slow behavior that has been happening since the last 2 or 3 updates.
The boot time has always been that long. I have assumed it's due to the 49 VLANs (according to the logs, it takes about 4 minutes to set the IP parameters for the VLANs) and then some other processes (50 ACL rules); it also logs "DHCPS initializing" 3 times, etc.
This wasn't my complaint at all.
The slow behavior occurs on the admin page, when loading certain submenus that have numerous entries to list.
Even that would not seem strange to me.
What makes it suspicious and abnormal is the frequent logouts when I try to go to the WAN menu, the LAN menu, or some others.
It is not NORMAL behavior to log me out when I logged in just a few seconds earlier.
Sometimes it does not load the current page properly, missing the CPU usage chart etc. It must be a cache management issue, as I have said multiple times.
It hardly ever (or never) happened before v1.2.1 or so.
I'm just sayin'...
The problem returned after a while. The next time I was at the location, I wanted to check the admin page and saw the strange system logs again; after I had navigated the page for a few seconds, the device froze completely, not just the admin page. The whole local network and the internet connection stopped working.
After a reboot I managed to make it work again, but I couldn't risk entering the admin page again until another day, when I had the chance to RESET and REBUILD the whole system. One could say it was clearly a bug in the restored backup config. However, when I started to rebuild the config, it still showed some strange behavior.
I started to add the additional LANs, and after adding 4 or 5 I realized that the device had not saved the name change and the VLAN number change from "LAN" and "1" to the desired values. So I decided to delete what I had already done, but it became very sluggish when removing these LANs, and what I had already deleted appeared again after a page reload.
I had to push reset on the device again and, just in case, after it had finished, I disconnected and reconnected the device.
This forced me to rebuild the config in a strict order. It's unfortunate, but clear at this stage: with the device in this weak condition, you can't just change your mind and remove or re-add something later on. In the LAN tab especially, you can't define the list order (the IDs). If you modify a LAN, it will appear at the end of the list.
After about 30-35 LANs had been added, the device started to slow down significantly (at least in completing the processes, which took longer than the web UI indicates).
Either you do this really slowly, waiting after each LAN you add, or you experience the sluggish behavior. When you have finished adding everything you wanted, it still keeps processing for a long time, maxing out the CPU usage for many minutes. Patience is key now.
If you just navigate the admin page while the CPU usage is high and the device is busy, it tends to log you out instead of showing the subpage you wanted to load.
So, at least, I finally managed to remove the second WAN falsely showing on the System Status page, along with the related system logs.