kernel oops on scp large file

Bug #1052735 reported by Ike Panhc
This bug affects 1 person
Affects                   Status         Importance   Assigned to   Milestone
Canonical HWE Eilt        Fix Released   Critical     Ike Panhc
linux-armadaxp (Ubuntu)   Fix Released   Critical     Ike Panhc
Quantal                   Fix Released   Critical     Ike Panhc

Bug Description

This occurs on the 3.5-based kernel at http://kernel.ubuntu.com/git?p=hwe/ubuntu-quantal-armadaxp.git;a=summary

When copying a large file to an armadaxp system with scp, the console shows a kernel oops.

[10395.086252] ------------[ cut here ]------------
[10395.090894] Kernel BUG at c00ede34 [verbose debug info unavailable]
[10395.097178] Internal error: Oops - BUG: 0 [#1] SMP ARM
[10395.102329] Modules linked in: dm_multipath dm_raid45 dm_mirror dm_region_hash dm_log
[10395.110256] CPU: 0 Not tainted (3.5.0-1600-armadaxp #2)
[10395.115850] PC is at kfree+0x58/0xac
[10395.119438] LR is at skb_free_head+0x4c/0x54
[10395.123722] pc : [<c00ede34>] lr : [<c05201a4>] psr: 40070093
[10395.123722] sp : eb225d30 ip : eb225d50 fp : eb225d4c
[10395.135230] r10: ebaf239c r9 : eb224000 r8 : 00000b50
[10395.140469] r7 : 00000770 r6 : eb0ab000 r5 : a0070013 r4 : eb0a3480
[10395.147014] r3 : c18d3560 r2 : 00000000 r1 : 0002b0ab r0 : eb0ab000
[10395.153559] Flags: nZcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
[10395.160801] Control: 10c53c7d Table: 2b12806a DAC: 00000015
[10395.166563] Process sshd (pid: 8662, stack limit = 0xeb2242f0)
[10395.172412] Stack: (0xeb225d30 to 0xeb226000)
[10395.176783] 5d20: 00000000 eb0a3480 00000001 eb0a3480
[10395.184986] 5d40: eb225d5c eb225d50 c05201a4 c00edde8 eb225d74 eb225d60 c0520398 c0520164
[10395.193189] 5d60: ebaf2040 00003890 eb225ddc eb225d78 c05732e0 c05201b8 ef3404f8 ef2a8040
[10395.201391] 5d80: 00000000 ef3404c0 00000000 ebaf2088 00000000 00000000 00000000 00000001
[10395.209594] 5da0: 00000000 eb225e40 eb225de4 00000000 ef2a8040 eb225e40 00004000 eb225ea0
[10395.217797] 5dc0: eec6c6c0 eb225e20 00000040 eb225e40 eb225e14 eb225de0 c0592030 c0572ad8
[10395.226000] 5de0: 00000040 00000000 eb225df4 eb225df8 c02f872c 00000000 eb225e14 00000000
[10395.234202] 5e00: 00004000 eb225ea0 eb225e8c eb225e18 c0519190 c0591fb0 00000040 00000000
[10395.242405] 5e20: ec76e680 00000001 00000040 00004000 eec6c6c0 ec4c1000 00000000 eb225e40
[10395.250607] 5e40: 00000000 00000000 eb225e98 00000001 00000000 00000000 00000040 eb225ea0
[10395.258810] 5e60: ef2a8078 00000000 00000000 eb225f78 ec71d6c0 00004000 eb224000 00000000
[10395.267012] 5e80: eb225f3c eb225e90 c00f7bb4 c0519070 00000000 00000000 bef62b64 00000770
[10395.275214] 5ea0: 00000000 00000000 00000000 00000001 ffffffff ec71d6c0 00000000 00000000
[10395.283416] 5ec0: 00000000 00000000 ef2a8040 eb225ed8 00000000 00000000 00000000 00000000
[10395.291619] 5ee0: eb225e20 00000000 00004000 00004000 00004000 eb225f00 c00f8140 c02c376c
[10395.299822] 5f00: ef2a8040 ef3404c0 eb224000 c08db9e8 eb225f34 eb225f20 ec71d6c0 bef5f2d4
[10395.308025] 5f20: ec71d6c0 bef5f2d4 00004000 eb225f78 eb225f6c eb225f40 c00f8670 c00f7b20
[10395.316228] 5f40: eb225f9c eb225f50 c06540c0 00000000 00000000 ec71d6c0 bef5f2d4 00004000
[10395.324430] 5f60: eb225fa4 eb225f70 c00f877c c00f85c4 000000d3 00000000 00000000 00000000
[10395.332633] 5f80: b6fb3f50 00000003 bef5f2d0 b6f54f10 00000003 c000db88 00000000 eb225fa8
[10395.340835] 5fa0: c000d9c0 c00f8748 00000003 bef5f2d0 00000003 bef5f2d4 00004000 bef5f2d0
[10395.349038] 5fc0: 00000003 bef5f2d0 b6f54f10 00000003 b6faf87c b6f965a8 000021c3 b8946bd0
[10395.357240] 5fe0: 00000000 bef5f2bc b6f78f37 b6c522fc 40070010 00000003 00000000 00000000
[10395.365437] Backtrace:
[10395.367908] [<c00edddc>] (kfree+0x0/0xac) from [<c05201a4>] (skb_free_head+0x4c/0x54)
[10395.375758] r6:eb0a3480 r5:00000001 r4:eb0a3480 r3:00000000
[10395.381486] [<c0520158>] (skb_free_head+0x0/0x54) from [<c0520398>] (__kfree_skb+0x1ec/0x2c4)
[10395.390039] [<c05201ac>] (__kfree_skb+0x0/0x2c4) from [<c05732e0>] (tcp_recvmsg+0x814/0xb24)
[10395.398497] r5:00003890 r4:ebaf2040
[10395.402113] [<c0572acc>] (tcp_recvmsg+0x0/0xb24) from [<c0592030>] (inet_recvmsg+0x8c/0xa4)
[10395.410495] [<c0591fa4>] (inet_recvmsg+0x0/0xa4) from [<c0519190>] (sock_aio_read+0x12c/0x154)
[10395.419127] r6:eb225ea0 r5:00004000 r4:00000000
[10395.423799] [<c0519064>] (sock_aio_read+0x0/0x154) from [<c00f7bb4>] (do_sync_read+0xa0/0xe0)
[10395.432350] [<c00f7b14>] (do_sync_read+0x0/0xe0) from [<c00f8670>] (vfs_read+0xb8/0x184)
[10395.440460] r7:eb225f78 r6:00004000 r5:bef5f2d4 r4:ec71d6c0
[10395.446188] [<c00f85b8>] (vfs_read+0x0/0x184) from [<c00f877c>] (sys_read+0x40/0x74)
[10395.453949] r8:00004000 r7:bef5f2d4 r6:ec71d6c0 r5:00000000 r4:00000000
[10395.460736] [<c00f873c>] (sys_read+0x0/0x74) from [<c000d9c0>] (ret_fast_syscall+0x0/0x30)
[10395.469021] r8:c000db88 r7:00000003 r6:b6f54f10 r5:bef5f2d0 r4:00000003
[10395.475804] Code: 1593301c e5932000 e3120080 1a000000 (e7f001f2)
[10395.482005] ---[ end trace f327461be6edf7de ]---
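
The "Backtrace:" section of the oops above can be reduced to a plain call chain for easier comparison with other reports. The following is a minimal sketch, not a general oops parser: the function name `call_chain` and the regular expression are illustrative assumptions, and the pattern only covers the `[<addr>] (func+0xoff/0xsize) from [<addr>] (func+0xoff/0xsize)` frame format seen in this log (symbol names containing dots, for example, are not handled).

```python
import re

# One backtrace frame as printed in the oops above, for example:
# [<c00edddc>] (kfree+0x0/0xac) from [<c05201a4>] (skb_free_head+0x4c/0x54)
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] \((?P<callee>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\) "
    r"from \[<[0-9a-f]+>\] "
    r"\((?P<caller>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\)"
)


def call_chain(oops_text):
    """Return function names from the innermost callee outward."""
    chain = []
    for match in FRAME_RE.finditer(oops_text):
        if not chain:
            # The first frame also names the function that faulted.
            chain.append(match.group("callee"))
        chain.append(match.group("caller"))
    return chain


# First three frames copied from the backtrace above.
SAMPLE = """\
[10395.367908] [<c00edddc>] (kfree+0x0/0xac) from [<c05201a4>] (skb_free_head+0x4c/0x54)
[10395.381486] [<c0520158>] (skb_free_head+0x0/0x54) from [<c0520398>] (__kfree_skb+0x1ec/0x2c4)
[10395.390039] [<c05201ac>] (__kfree_skb+0x0/0x2c4) from [<c05732e0>] (tcp_recvmsg+0x814/0xb24)
"""

if __name__ == "__main__":
    print(" <- ".join(call_chain(SAMPLE)))
```

For the full log, the chain runs from `kfree` up through `tcp_recvmsg` to `sys_read`, which matches the scp receive path described in this report.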

This bug only affects the 3.5-based kernel.

Ike Panhc (ikepanhc)
Changed in eilt:
status: New → Confirmed
Changed in hwe-eilt:
status: New → Confirmed
importance: Undecided → Critical
Changed in eilt:
importance: Undecided → Critical
assignee: nobody → Ike Panhc (ikepanhc)
Changed in hwe-eilt:
assignee: nobody → Ike Panhc (ikepanhc)
Ike Panhc (ikepanhc)
tags: added: ike-radar
Changed in linux-armadaxp (Ubuntu Quantal):
status: Confirmed → In Progress
Changed in hwe-eilt:
status: Confirmed → In Progress
Changed in eilt:
status: Confirmed → In Progress
Rob Herring (r-herring) wrote :

I think this may also affect highbank and be present on 3.6 as well. The backtrace is similar.

Using pktgen will hit the problem quickly.
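
The comment does not say which pktgen parameters were used. As a rough sketch of what a single-thread pktgen run looks like, the snippet below builds the commands that would be written to the pktgen files under /proc/net/pktgen. The interface name, destination IP, destination MAC, packet count, and packet size are all placeholder assumptions, and the helper names `pktgen_commands` and `apply_commands` are hypothetical.

```python
def pktgen_commands(ifname, dst_ip, dst_mac, count=100000, pkt_size=60):
    """Build (proc file, command) pairs for a single-thread pktgen run."""
    thread = "/proc/net/pktgen/kpktgend_0"   # pktgen kernel thread for CPU 0
    device = "/proc/net/pktgen/" + ifname    # appears after add_device
    control = "/proc/net/pktgen/pgctrl"
    return [
        (thread, "rem_device_all"),
        (thread, "add_device " + ifname),
        (device, "count %d" % count),
        (device, "pkt_size %d" % pkt_size),
        (device, "dst " + dst_ip),
        (device, "dst_mac " + dst_mac),
        (control, "start"),
    ]


def apply_commands(commands):
    """Write each command to its proc file (needs root and a loaded pktgen module)."""
    for path, command in commands:
        with open(path, "w") as proc_file:
            proc_file.write(command + "\n")


if __name__ == "__main__":
    # Dry run only: print what would be written. All values are placeholders.
    for path, command in pktgen_commands("eth0", "192.0.2.1", "00:00:00:00:00:01"):
        print(path, "<-", command)
```

Calling `apply_commands()` on the result starts the actual transmission; the dry run is kept as the default because pktgen can saturate a link.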

kernel BUG at /build/buildd/linux-3.5.0/mm/slub.c:3474! [45/192]
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
Modules linked in: pktgen ipmi_devintf ipmi_si ipmi_msghandler rtc_pl031
CPU: 0 Not tainted (3.5.0-10-highbank #10-Ubuntu)
PC is at kfree+0x144/0x154
LR is at __kfree_skb+0x14/0xcc
pc : [<c00d8548>] lr : [<c035077c>] psr: 400f0113
sp : c05cfd58 ip : c05f8440 fp : 00000000
r10: ecead056 r9 : c035077c r8 : c05f8440
r7 : ed982800 r6 : ecead04e r5 : ed31cd80 r4 : ed31cd80
r3 : 00000000 r2 : ecead000 r1 : c180f054 r0 : c11be000
Flags: nZcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: 10c5387d Table: 2d22c04a DAC: 00000015
Process swapper/0 (pid: 0, stack limit = 0xc05ce2f0)
Stack: (0xc05cfd58 to 0xc05d0000)
fd40: c05f8440 c05cfd70
fd60: ee1a5640 ed31cd80 ed31cd80 ecead04e ed982800 c05f8440 00000000 ecead056
fd80: 00000000 c035077c ed861800 c03af410 ed861800 00000001 00000000 00000800
fda0: 0100140a c103140a 00000000 00000000 00000000 00000000 00000000 00000000
fdc0: 00000000 00000000 c05d7c28 00000608 ed861800 00000000 c05d7c48 ed31cd80
fde0: c05d7c40 c035943c ffffffff c05d7c10 00000003 00000000 c180f054 ed31cd80
fe00: 00000000 c05d7c48 ee85a000 ed31cd80 00000000 00000001 ee85a000 00000000
fe20: 0000003c 00010040 ed31cd80 c035a1c4 0000003c 00010040 606180ad 12b5507a
fe40: c001be34 07735940 f1d3fce0 ed861c80 ed861c80 ed861c80 00000000 c02d444c
fe60: 00000000 00670bc2 00670bb9 00670bc2 00000002 c0620f00 00000040 ed861ccc
fe80: ed861ccc 0000001f ffffffff 7fffffff 0226932a ed861ccc 00000001 00000040
fea0: c05d00c0 c35cf980 0000012c 00000000 c05ce000 c035be90 c35cf988 00670bbb
fec0: 00000070 00000001 0000000c c05d0090 c05ce000 c05d0080 00000003 c0620cc0
fee0: 00000102 c0029514 00000070 00000000 0000000a 00000000 ee00c594 c05ce000
ff00: c05cd108 c05ce000 00000070 00000000 413fc090 c0429e24 00000000 c0029a24
ff20: c05d76c8 c000f15c fee00100 c05d65c4 c05cff50 c05cff84 c05ce000 c00084d0
ff40: c000f430 600f0013 ffffffff c000de40 0000001f c05d6b68 00000000 00000000
ff60: c05ce000 c05ce000 c05ff548 c05db850 c05ce000 413fc090 c0429e24 00000000
ff80: 0000001f c05cff98 c000f42c c000f430 600f0013 ffffffff c000f408 c000f630
ffa0: c042a240 c05d74a8 00000000 c05ff480 c35cb880 ffffffff 3fffffff c059e7fc
ffc0: ffffffff ffffffff c059e2f0 00000000 00000000 c05c5ba4 00000000 10c5387d
ffe0: c05d659c c05c5ba0 c05db844 0000406a 00000000 00008044 00000000 00000000
[<c00d8548>] (kfree+0x144/0x154) from [<c035077c>] (__kfree_skb+0x14/0xcc)
[<c035077c>] (__kfree_skb+0x14/0xcc) from [<c03af410>] (arp_process+0x60/0x5d0)
[<c03af410>] (arp_process+0x60/0x5d0) from [<c035943c>] (__netif_receive_skb+0x558/0x600)
[<c035943c>] (__netif_receive_skb+0x558/0x600) from [<c035a1c4>] (netif_receive_skb+0x1c/0xa8)
[<c035a1c4>] (netif_receive_skb+0x1c/0xa8) from [<c02d444c>] (xgmac_poll+0x3ac/0x51c)
[<c02d444c>] (xgmac_poll+0x3ac/0x51c) from [<c035be90>] (net_rx_action+0x128/0x1dc)
[<c035be90>] (net_...

Rob Herring (r-herring) wrote :

After more testing, this seems to be only a pktgen-related issue on highbank.

Ike Panhc (ikepanhc) wrote :

@Rob,

Thanks for the info. It looks like a different issue to me as well.

(gdb) list *kfree+0x58
0xc00ede34 is in kfree (/home/ikepanhc/linux-armadaxp-3.2.0/mm/slab.c:505).
500     }
501
502     static inline struct kmem_cache *page_get_cache(struct page *page)
503     {
504             page = compound_head(page);
505             BUG_ON(!PageSlab(page));
506             return (struct kmem_cache *)page->lru.next;
507     }
508
509     static inline void page_set_slab(struct page *page, struct slab *slab)
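
The address gdb resolves for `kfree+0x58` can be cross-checked against the oops in the bug description, which gives the start of `kfree` in the backtrace and the faulting PC in the register dump. A small worked check, assuming only the values printed in that log (the helper name `parse_symbol` is illustrative):

```python
def parse_symbol(text):
    """Split 'name+0xOFF/0xSIZE' into (name, offset, size)."""
    name, rest = text.split("+")
    offset, size = (int(part, 16) for part in rest.split("/"))
    return name, offset, size


# Values copied from the oops in the bug description.
KFREE_START = 0xc00edddc   # "[<c00edddc>] (kfree+0x0/0xac)"
PC_REPORTED = 0xc00ede34   # "pc : [<c00ede34>]"

name, offset, size = parse_symbol("kfree+0x58/0xac")
assert offset < size                        # the PC lies inside kfree
assert KFREE_START + offset == PC_REPORTED  # same address gdb printed
print(name, hex(KFREE_START + offset))
```

Both sources agree on 0xc00ede34, so the `BUG_ON(!PageSlab(page))` line in the listing is the check that fired in the reported oops.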

Li Li (lli5) wrote :

@Ike, I'm curious how you trigger this issue with scp ("When scp a large file into armadaxp system, console shows kernel oops."). I tried to scp files up to 1 GB to AXP but didn't observe any kernel oops.

Ike Panhc (ikepanhc) wrote :

Which kernel are you using? Please use the 3.5-based kernel.

Ike Panhc (ikepanhc) wrote :

It looks like this is the root cause:

2486,2487c2487
< CONFIG_NET_SKB_RECYCLE=y
< CONFIG_NET_SKB_RECYCLE_DEF=1
---
> # CONFIG_NET_SKB_RECYCLE is not set
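
The diff shows that the two kernel configs differ in `CONFIG_NET_SKB_RECYCLE`; the comment does not say which file is which. To check how a given config sets the option, something like the following sketch could be used. The function name `config_state` is hypothetical, and reading the config from `/boot/config-<release>` is an assumption about where the file lives on the target system.

```python
def config_state(config_text, option):
    """Return the value of a kernel config option, or None if it is not set."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == "# %s is not set" % option:
            return None
        # The trailing "=" keeps CONFIG_FOO from matching CONFIG_FOO_BAR.
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


# The two sides of the diff above.
LEFT = "CONFIG_NET_SKB_RECYCLE=y\nCONFIG_NET_SKB_RECYCLE_DEF=1\n"
RIGHT = "# CONFIG_NET_SKB_RECYCLE is not set\n"

if __name__ == "__main__":
    for label, text in (("left", LEFT), ("right", RIGHT)):
        print(label, config_state(text, "CONFIG_NET_SKB_RECYCLE"))
```

To inspect a real system, pass the contents of its config file as `config_text` instead of the samples.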

Ike Panhc (ikepanhc) wrote :
Changed in hwe-eilt:
status: In Progress → Fix Committed
Changed in linux-armadaxp (Ubuntu Quantal):
status: In Progress → Fix Committed
Changed in eilt:
status: In Progress → Fix Committed
tags: added: patch
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-armadaxp - 3.5.0-1602.3
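
The oops in the bug description was logged on `3.5.0-1600-armadaxp`, while the fix is in package version 3.5.0-1602.3. A rough way to tell whether a kernel release string is at or past the fixed version is sketched below. It assumes the number after the first dash in the release string corresponds to the 1602 in the package version, handles only 3.5.0-based armadaxp releases, and uses hypothetical helper names.

```python
import re

FIXED_ABI = 1602  # from package version 3.5.0-1602.3

RELEASE_RE = re.compile(r"^3\.5\.0-(\d+)-armadaxp$")


def abi_number(release):
    """Extract the ABI number from a release string such as '3.5.0-1600-armadaxp'."""
    match = RELEASE_RE.match(release)
    if match is None:
        raise ValueError("not a 3.5.0 armadaxp release string: %r" % release)
    return int(match.group(1))


def has_fix(release):
    """True if the release is at or after the fixed package version."""
    return abi_number(release) >= FIXED_ABI


if __name__ == "__main__":
    for release in ("3.5.0-1600-armadaxp", "3.5.0-1602-armadaxp"):
        print(release, has_fix(release))
```

On a running system the release string would typically come from `uname -r`.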

---------------
linux-armadaxp (3.5.0-1602.3) quantal; urgency=low

  [ Jani Monoses ]

  * [Config] fdr updateconfigs after rebase.

  [ Upstream Kernel Changes ]

  * Rebase on Ubuntu-3.5.0-16.25

  [ Ubuntu: 3.5.0-16.25 ]

  * SAUCE: input: Cypress PS/2 Trackpad fix multi-source, double-click
    - LP: #1055788
  * [Config] revert '[Config] enable CONFIG_X86_X32=y'
    - LP: #1041883
  * vmwgfx: corruption in vmw_event_fence_action_create()
  * drm/nvd0/disp: hopefully fix selection of 6/8bpc mode on DP outputs
    - LP: #1058088
  * drm/nv50-/gpio: initialise to vbios defaults during init
    - LP: #1058088
  * igb: A fix to VF TX rate limit
    - LP: #1058188
  * igb: Add switch case for supported hardware to igb_ptp_remove.
    - LP: #1058188
  * igb: Support the get_ts_info ethtool method.
    - LP: #1058188
  * igb: Streamline RSS queue and queue pairing assignment logic.
    - LP: #1058188
  * igb: Update firmware info output
    - LP: #1058188
  * igb: Version bump
    - LP: #1058188
  * igb: reset PHY in the link_up process to recover PHY setting after
    power down.
    - LP: #1058188
  * igb: Fix for failure to init on some 82576 devices.
    - LP: #1058188
  * igb: correct hardware type (i210/i211) check in igb_loopback_test()
    - LP: #1058188
  * igb: don't break user visible strings over multiple lines in
    igb_ethtool.c
    - LP: #1058188
  * igb: add delay to allow igb loopback test to succeed on 8086:10c9
    - LP: #1058188
  * igb: fix panic while dumping packets on Tx hang with IOMMU
    - LP: #1058188
  * igb: Fix register defines for all non-82575 hardware
    - LP: #1058188
  * e1000e: use more informative logging macros when netdev not yet
    registered
    - LP: #1058219
  * e1000e: Cleanup code logic in e1000_check_for_serdes_link_82571()
    - LP: #1058219
  * e1000e: Program the correct register for ITR when using MSI-X.
    - LP: #1058219
  * e1000e: advertise transmit time stamping
    - LP: #1058219
  * e1000e: 82571 Tx Data Corruption during Tx hang recovery
    - LP: #1058219
  * e1000e: fix panic while dumping packets on Tx hang with IOMMU
    - LP: #1058219
  * e1000: Combining Bitwise OR in one expression.
    - LP: #1058221
  * e1000: advertise transmit time stamping
    - LP: #1058221
  * e1000: Small packets may get corrupted during padding by HW
    - LP: #1058221
  * sched: Fix migration thread runtime bogosity
    - LP: #1057593
  * ACER: Add support for accelerometer sensor
    - LP: #1055433
  * ACER: Fix Smatch double-free issue
    - LP: #1055433
  * SAUCE: HID: ntrig: change default value of logical/physical
    width/height to 1
    - LP: #1044248

  [ Ubuntu: 3.5.0-16.24 ]

  * SAUCE: ata_piix: add a disable_driver option
    - LP: #994870
  * (pre-stable) drm/radeon: make 64bit fences more robust v3 (3.5 stable)
    - LP: #1029582
  * SAUCE: ALSA: hda - use both input paths on Conexant auto parser
    - LP: #1037642
  * SAUCE: ALSA: hda - fix control names for multiple speaker out on
    IDT/STAC
    - LP: #1046734
  * SAUCE: ALSA: hda/via - don't report presence on HPs with no presence
    support
    - LP: #1052499
  * S...

Changed in linux-armadaxp (Ubuntu Quantal):
status: Fix Committed → Fix Released
Ike Panhc (ikepanhc)
Changed in eilt:
status: Fix Committed → Fix Released
Changed in hwe-eilt:
status: Fix Committed → Fix Released