TL-SX3008F V1 lockup

20 hours ago
Model: TL-SX3008F  
Hardware Version: V1
Firmware Version: SX3008Fv1_en_1.20.12_[20251031-rel75312]_up.bin

I installed the latest firmware (SX3008F(UN)_v1.20_1.20.12 Build 20251031.zip, SX3008Fv1_en_1.20.12_[20251031-rel75312]_up.bin) on a TL-SX3008F switch marked as "V1.0". According to https://community.tp-link.com/en/business/forum/topic/649878, V1.0 is equivalent to V1.20 in this case and can therefore use that firmware version; am I understanding this correctly? The switch has not been put into production yet and had only a single 10G connection over a DAC cable to a D-Link DGS-3130-54TS switch. (BTW, I had to explicitly specify “speed 10giga” on the D-Link side of the connection; otherwise the link on the TL-SX3008F kept flapping and passed no traffic.) The only traffic to the switch was MSTP from the upstream switch, NTP initiated by the switch itself, and some HTTPS and SSH management traffic.

While trying to set up SSH access to the switch, I got it into a state where it stopped responding over the network. First I noticed that it did not respond to SSH connection attempts; then it turned out that it did not respond to pings or even to ARP requests, although the 10G interface with the DAC cable through which I was communicating still seemed to be up. I happened to have the console cable connected, but the terminal emulator was not running at the time, so any console output printed when the problem occurred was not captured. When I started the terminal emulator and pressed Enter, I got the "Login invalid." message (this was expected, as the switch should have been sitting at the login prompt), but there was no further output (no new login prompt) and no response to any further input on the console.

I found that the console still responded to the Break signal (interpreted as SysRq by the Linux kernel that is apparently running on the switch) and tried to collect some debugging data. However, that data is rather large (228 KB of text), and I'm not sure about the best way to upload it here: the “File” button on the editor toolbar does not seem to work (I get a “Failed to upload” message).

Then I tried to reboot the switch by sending Break+b on the console, but got this error instead:

SysRq : Resetting
CPU1: stopping
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O 3.10.70 #2
[<c0015190>] (unwind_backtrace+0x0/0xfc) from [<c001171c>] (show_stack+0x10/0x18)
[<c001171c>] (show_stack+0x10/0x18) from [<c0013928>] (handle_IPI+0x100/0x134)
[<c0013928>] (handle_IPI+0x100/0x134) from [<c00085b4>] (armada_370_xp_handle_irq+0xa8/0xbc)
[<c00085b4>] (armada_370_xp_handle_irq+0xa8/0xbc) from [<c000dd00>] (__irq_svc+0x40/0x50)
Exception stack(0xcf863fa8 to 0xcf863ff0)
3fa0:                   ffffffed 0029b000 c0b8564c 0861e0ac cf862000 c0b844a4
3fc0: c03ade28 c0bb3a84 c0bb3a84 562f5842 00000001 00000000 01000000 cf863ff0
3fe0: c005126c c000f2ec 60000013 ffffffff
[<c000dd00>] (__irq_svc+0x40/0x50) from [<c000f2ec>] (arch_cpu_idle+0x2c/0x48)
[<c000f2ec>] (arch_cpu_idle+0x2c/0x48) from [<003a2704>] (0x3a2704)
------------[ cut here ]------------
Kernel BUG at c0095c64 [verbose debug info unavailable]
Internal error: Oops - BUG: 0 [#1] SMP ARM
Modules linked in: ethdriver(O) mvMbusDrv(O) mvIntDrv(O) mvDmaDrv(O)
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 3.10.70 #2
task: c0b881c0 ti: c0b7c000 task.ti: c0b7c000
PC is at __get_vm_area_node.isra.32+0x11c/0x128
LR is at get_vm_area_caller+0x3c/0x50
pc : [<c0095c64>]    lr : [<c00964a0>]    psr: 20000113
sp : c0b7ddb0  ip : c0b7c000  fp : 00000001
r10: fff00000  r9 : d0800000  r8 : c001a74c
r7 : 00000001  r6 : 000000d0  r5 : 00000001  r4 : c0b7ddb0
r3 : d0800000  r2 : 00000001  r1 : 00010000  r0 : 00001000
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c53c7d  Table: 0eeb806a  DAC: 00000015
Process swapper/0 (pid: 0, stack limit = 0xc0b7c238)
Stack: (0xc0b7ddb0 to 0xc0b7e000)
dda0:                                     000003ff c000efb8 c0162f68 000f1018
ddc0: 00000104 00001000 00000000 c0b897d8 c001a74c f1018000 00000001 c00964a0
dde0: fff00000 000000d0 c001a74c 00000104 c0b897b0 c0017978 00000004 00000000
de00: f1018107 00000001 00000009 c0bb3a94 00000001 c0017a68 c001a74c 00000001
de20: c0bb4044 c0b92150 0000000c c00176f8 c03ad648 c001a74c c001a698 000003e8
de40: c0b92150 c0b9ad08 20000193 c000f36c c01b2640 c0b8b944 00000062 c01b28a0
de60: c0b855e8 c0be3080 00000101 00000000 c0b7e0c0 00000061 00000000 c01c4c68
de80: c0b8d9c0 c0b8d9c0 c0e0d6f8 c006b8a4 c0be3080 000000cc 00000061 40000193
dea0: 00000000 c01c62a4 cef6bd50 c0be3080 000000cc 00000013 c0be315c c01ca178
dec0: c01ca154 cedd4c00 00000000 c01c3fac c01c3f70 cedd4bc0 cf88ce58 00000000
dee0: 00000000 00000013 cf88ce00 c00650dc c03a9f64 c0babb40 00000024 cf88ce00
df00: cf88ce58 00000000 c0be0340 000003ff c0be0340 c0b7df78 c0b84428 c0065268
df20: cf88ce00 00000013 00000000 c0067b6c 00000013 c0064ac0 c0b7ac5c c000efb4
df40: c000f2ec 60000013 00000001 c0008558 c005126c c000f2ec 60000013 ffffffff
df60: c0b7dfac c0bb3a84 562f5842 00000001 00000000 c000dd00 ffffffed 00293000
df80: c0b8564c 1fa52436 c0b7c000 c0b844a4 c03ade28 c0bb3a84 c0bb3a84 562f5842
dfa0: 00000001 00000000 01000000 c0b7dfc0 c005126c c000f2ec 60000013 ffffffff
dfc0: c0e0a3c0 c04f0a14 ffffffff ffffffff c04f0548 00000000 00000000 c0519020
dfe0: 00000000 10c53c7d c0b84424 c051901c c0b88f0c 00008074 00000000 00000000
[<c0095c64>] (__get_vm_area_node.isra.32+0x11c/0x128) from [<c00964a0>] (get_vm_area_caller+0x3c/0x50)
[<c00964a0>] (get_vm_area_caller+0x3c/0x50) from [<c0017978>] (__arm_ioremap_pfn_caller+0x124/0x1a4)
[<c0017978>] (__arm_ioremap_pfn_caller+0x124/0x1a4) from [<c0017a68>] (__arm_ioremap_caller+0x54/0x60)
[<c0017a68>] (__arm_ioremap_caller+0x54/0x60) from [<c00176f8>] (__arm_ioremap+0x14/0x20)
[<c00176f8>] (__arm_ioremap+0x14/0x20) from [<c001a74c>] (mvebu_restart+0xb4/0xec)
[<c001a74c>] (mvebu_restart+0xb4/0xec) from [<c000f36c>] (machine_restart+0x28/0x5c)
[<c000f36c>] (machine_restart+0x28/0x5c) from [<c01b28a0>] (__handle_sysrq+0x104/0x174)
[<c01b28a0>] (__handle_sysrq+0x104/0x174) from [<c01c4c68>] (serial8250_rx_chars+0xf8/0x208)
[<c01c4c68>] (serial8250_rx_chars+0xf8/0x208) from [<c01c62a4>] (serial8250_handle_irq+0xc0/0xc8)
[<c01c62a4>] (serial8250_handle_irq+0xc0/0xc8) from [<c01ca178>] (dw8250_handle_irq+0x24/0x5c)
[<c01ca178>] (dw8250_handle_irq+0x24/0x5c) from [<c01c3fac>] (serial8250_interrupt+0x3c/0xc0)
[<c01c3fac>] (serial8250_interrupt+0x3c/0xc0) from [<c00650dc>] (handle_irq_event_percpu+0x50/0x198)
[<c00650dc>] (handle_irq_event_percpu+0x50/0x198) from [<c0065268>] (handle_irq_event+0x44/0x68)
[<c0065268>] (handle_irq_event+0x44/0x68) from [<c0067b6c>] (handle_level_irq+0x90/0x100)
[<c0067b6c>] (handle_level_irq+0x90/0x100) from [<c0064ac0>] (generic_handle_irq+0x24/0x30)
[<c0064ac0>] (generic_handle_irq+0x24/0x30) from [<c000efb4>] (handle_IRQ+0x38/0x94)
[<c000efb4>] (handle_IRQ+0x38/0x94) from [<c0008558>] (armada_370_xp_handle_irq+0x4c/0xbc)
[<c0008558>] (armada_370_xp_handle_irq+0x4c/0xbc) from [<c000dd00>] (__irq_svc+0x40/0x50)
Exception stack(0xc0b7df78 to 0xc0b7dfc0)
df60:                                                       ffffffed 00293000
df80: c0b8564c 1fa52436 c0b7c000 c0b844a4 c03ade28 c0bb3a84 c0bb3a84 562f5842
dfa0: 00000001 00000000 01000000 c0b7dfc0 c005126c c000f2ec 60000013 ffffffff
[<c000dd00>] (__irq_svc+0x40/0x50) from [<c000f2ec>] (arch_cpu_idle+0x2c/0x48)
[<c000f2ec>] (arch_cpu_idle+0x2c/0x48) from [<c04f0a14>] (start_kernel+0x2a0/0x2f8)
Code: e1a00004 e3a04000 eb002370 eaffffeb (e7f001f2)
---[ end trace 86b55d378f1f2680 ]---
Kernel panic - not syncing: Fatal exception in interrupt
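
In case it helps anyone reading the trace, the call chain can be pulled out of the "[<addr>] (func+off/size) from ..." lines with a few lines of Python. This is only a rough sketch that assumes the exact line format shown above; the names `FRAME_RE`, `parse_backtrace` and `SAMPLE` are my own and not part of any tool, and the sample is just three frames copied from the CPU0 trace.

```python
import re

# Matches one ARM backtrace line of the form seen in the log above, e.g.
# [<c00176f8>] (__arm_ioremap+0x14/0x20) from [<c001a74c>] (mvebu_restart+0xb4/0xec)
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] \((?P<sym>[^)+]+)\+0x[0-9a-f]+/0x[0-9a-f]+\) from "
    r"\[<[0-9a-f]+>\] \((?P<caller>[^)]+)\)"
)


def parse_backtrace(text):
    """Return the function names of a single backtrace, innermost frame first."""
    chain = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # register dumps, stack words, separators, etc.
        if not chain:
            # The first matching line also names the innermost function.
            chain.append(match.group("sym"))
        # Strip the "+offset/size" suffix from the caller, if it has one.
        chain.append(match.group("caller").split("+")[0])
    return chain


# Three frames copied from the CPU0 trace above (hypothetical variable name).
SAMPLE = """\
[<c00176f8>] (__arm_ioremap+0x14/0x20) from [<c001a74c>] (mvebu_restart+0xb4/0xec)
[<c001a74c>] (mvebu_restart+0xb4/0xec) from [<c000f36c>] (machine_restart+0x28/0x5c)
[<c000f36c>] (machine_restart+0x28/0x5c) from [<c01b28a0>] (__handle_sysrq+0x104/0x174)
"""

print(" <- ".join(parse_backtrace(SAMPLE)))
```

Feeding it the CPU0 trace alone (the CPU1 and CPU0 traces should be parsed separately) suggests that the BUG was hit while mvebu_restart was calling ioremap from inside the serial interrupt handler that processed the SysRq.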

Now the switch no longer responds even to Break signals on the serial port, and it looks like the only way to recover it is to power-cycle it, which might not happen until Jan 12, so I can't investigate the problem any further at the moment. (It's really unfortunate that the switch does not seem to have any kind of hardware watchdog enabled.)

Is this a known bug in the SX3008Fv1_en_1.20.12_[20251031-rel75312]_up.bin firmware? Or should I use only the older firmware (SX3008Fv1_en_1.20.0_[20231011-rel42220]_up.bin), which is listed for the TL-SX3008F V1 model (not SX3008F V1.2) on the support site? That said, there is apparently a similar lockup report (https://community.tp-link.com/en/business/forum/topic/816848) even for that old firmware version.

1 Reply
Re:TL-SX3008F V1 lockup
20 hours ago

I attached the SysRq debug output which I collected from the console as a zip archive (attaching a text file directly did not work).

File: SX3008F-20260101-debug.zip