instabilities with highmem activated

Bug #633227 reported by Sebastien JAN
This bug affects 6 people
Affects                  Status        Importance  Assigned to               Milestone
linux-ti-omap4 (Ubuntu)  Fix Released  High        Canonical ARM Developers
Maverick                 Fix Released  High        Unassigned
Natty                    Fix Released  High        Canonical ARM Developers

Bug Description

Seen on Maverick: 2.6.35-903.9
HW: pandaboard ES2.0

Using the following kernel memory allocation (in bootargs):
mem=460M@0x80000000 mem=512M@0xA0000000
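
For reference, the two mem= tokens above describe two physical ranges with a hole between them. A minimal Python sketch (the helper name parse_mem_args is made up for illustration, and it only handles the SIZE[KMG]@ADDR form used in this report) works out the ranges and the total:

```python
import re

def parse_mem_args(cmdline):
    """Return (start, end) physical ranges for each mem=SIZE@ADDR token."""
    units = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
    ranges = []
    pattern = r"\bmem=(\d+)([KMG])@(0x[0-9a-fA-F]+)"
    for size, unit, addr in re.findall(pattern, cmdline):
        start = int(addr, 16)
        ranges.append((start, start + int(size) * units[unit]))
    return ranges

ranges = parse_mem_args("mem=460M@0x80000000 mem=512M@0xA0000000")
for start, end in ranges:
    # Print each range with its size in MiB.
    print(f"{start:#010x}-{end:#010x} ({(end - start) >> 20} MiB)")

total_mib = sum(end - start for start, end in ranges) >> 20
print("total:", total_mib, "MiB")
```

This gives 0x80000000-0x9cc00000 and 0xa0000000-0xc0000000, 972 MiB in total, with a 52 MiB hole just below 0xA0000000.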

Instabilities have been observed in 2 different ways:

1) The following memtester test:
sudo memtester -p 0xb0000000 120
Fails within a few seconds with an "illegal instruction" error.
Then various behaviors can be seen: the UI can freeze, and shell commands can become unavailable. The system works well again after a reboot.

2) By doing a native build of a kernel package (with file-system on SD card, and kernel sources on an NFS mount):
After 15 minutes to 1h30, a "compiler error: bus error" is triggered and the build stops (and the platform hangs).

This issue cannot be reproduced if using mem=460M@0x80000000 mem=256M@0xA0000000.
This issue cannot be reproduced with highmem deactivated in the kernel config.
This issue can be reproduced with 'nosmp' in kernel command line.

Tags: armel omap4 panda
Ricardo Salveti (rsalveti) wrote :

Upstream thread about this issue: http://lkml.org/lkml/2010/9/8/425

Brian is currently testing it to see if it actually fixes our problem. Will also test it later on.

Sebastien JAN (sebjan) wrote :

Patch tested:

1) I was not able to reproduce the memtester problem with it, so we should take this patch.

2) But the bus errors when building the kernel package are still there.
So there is still another issue pending.

Ricardo Salveti (rsalveti) wrote :

With the patch the kernel seems to be better, but it still has memory issues.

This happened when trying to run memtester while compiling the kernel:
ubuntu@panda-maverick-usb:~$ sudo memtester -p 0xb0000000 120
memtester version 4.1.3 (32-bit)
Copyright (C) 2010 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 120MB (125829120 bytes)
Loop 1:
  Stuck Address : setting 0Connection to panda-maverick-usb.local closed.

<----- Disconnected from my ssh connection

Kernel Build:
/home/ubuntu/kernel/ubuntu-maverick/kernel/wait.c: In function ‘remove_wait_queue’:
/home/ubuntu/kernel/ubuntu-maverick/kernel/wait.c:289: internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.4/README.Bugs> for instructions.
make[3]: *** [kernel/wait.o] Error 1
make[2]: *** [kernel] Error 2
make[2]: *** Waiting for unfinished jobs....
/home/ubuntu/kernel/ubuntu-maverick/mm/vmalloc.c: In function ‘T.534’:
/home/ubuntu/kernel/ubuntu-maverick/mm/vmalloc.c:902: internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.4/README.Bugs> for instructions.
  CC mm/pagewalk.o
gcc: Internal error: Segmentation fault (program cc1)
Please submit a full bug report.
See <file:///usr/share/doc/gcc-4.4/README.Bugs> for instructions.
make[3]: *** [mm/vmalloc.o] Segmentation fault
make[3]: *** Waiting for unfinished jobs....
gcc: Internal error: Segmentation fault (program cc1)
Please submit a full bug report.
See <file:///usr/share/doc/gcc-4.4/README.Bugs> for instructions.
make[3]: *** [mm/pagewalk.o] Segmentation fault
make[2]: *** [mm] Error 2
make[1]: *** [sub-make] Error 2
make[1]: Leaving directory `/home/ubuntu/kernel/ubuntu-maverick'
make: *** [/home/ubuntu/kernel/ubuntu-maverick/debian/stamps/stamp-build-omap4] Error 2

Then the system went into a very unstable state (failing the next memtester run).

Bryan Wu (cooloney) wrote :

I reproduced this issue on my omap4 board,

---
[ 1446.044982] Unhandled fault: imprecise external abort (0x1c06) at 0x00518a0c
[ 1446.052429] Internal error: : 1c06 [#1] PREEMPT SMP
[ 1446.057556] last sysfs file: /sys/kernel/uevent_seqnum
[ 1446.062957] Modules linked in: twl4030_pwrbutton
[ 1446.067810] CPU: 0 Not tainted (2.6.35-903-omap4 #11+bounce)
[ 1446.074127] PC is at __dabt_usr+0x2c/0x40
[ 1446.078338] LR is at 0x334a6c
[ 1446.081420] pc : [<c0544eac>] lr : [<00334a6c>] psr: 40000093
[ 1446.081451] sp : ef2bdfb0 ip : 0000000c fp : 00710bb4
[ 1446.093414] r10: bea272dc r9 : bea272a8 r8 : 41b2d4e0
[ 1446.098876] r7 : 00000001 r6 : 00000000 r5 : 00000028 r4 : ffffffff
[ 1446.105712] r3 : 40000030 r2 : 00057e14 r1 : bea272dc r0 : 10c53c7d
[ 1446.112518] Flags: nZcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
[ 1446.120056] Control: 10c53c7d Table: af3e404a DAC: 00000015
[ 1446.126037] Process cc1 (pid: 8806, stack limit = 0xef2bc2f8)
[ 1446.132049] Stack: (0xef2bdfb0 to 0xef2be000)
[ 1446.136596] dfa0: bea272dc 00000001 00000002 01279f40
[ 1446.145141] dfc0: bea272dc 00000028 00000000 00000001 41b2d4e0 bea272a8 bea272dc 00710bb4
[ 1446.153686] dfe0: 0000000c bea27158 00466209 00057e14 40000030 ffffffff 00000000 00000000
[ 1446.162231] Code: e9406000 e51f0048 e5900000 ee010f10 (ebec154b)
----
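
The status value printed with the abort (0x1c06 here) can be split into fields. The following is a rough Python sketch, assuming the ARMv7 short-descriptor DFSR layout (FS[3:0] in bits 3:0, FS[4] in bit 10, WnR in bit 11, ExT in bit 12); the function name decode_fsr is only illustrative:

```python
def decode_fsr(fsr):
    """Split an ARMv7 short-descriptor fault status value into fields."""
    # The 5-bit fault status is split: FS[3:0] in bits 3:0, FS[4] in bit 10.
    fs = (fsr & 0xF) | (((fsr >> 10) & 1) << 4)
    return {
        "fs": fs,
        "write": bool(fsr & (1 << 11)),  # WnR bit
        "ext": bool(fsr & (1 << 12)),    # ExT bit
    }

fields = decode_fsr(0x1C06)
print(f"fs={fields['fs']:#07b} write={fields['write']} ext={fields['ext']}")
```

Under that assumption 0x1c06 decodes to FS = 0b10110, the encoding for an asynchronous (imprecise) external abort, which matches the text of the kernel message.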

As discussed with Ricardo, we plan to try upstream mainline kernel on our ES2.0 for this issue.

Bryan Wu (cooloney) wrote :

After some discussion with Nicolas Pitre, I ran the following tests:

1. Just build the kernel on the MMC SD card.
Same result: it fails as we reported.

2. Add mem=512M vmalloc=1G to the kernel boot command line.
The kernel doesn't boot on my board at all.

3. Set CONFIG_VMSPLIT_2G=y
The kernel build can run longer, but eventually it fails with the same error.

4. Add flush_cache_all() at the beginning of the dma_cache_maint_page() function
---
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 592f05d..f7f083a 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -448,6 +448,9 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset,
         * optimized out.
         */
        size_t left = size;
+
+       flush_cache_all();
+
        do {
                size_t len = left;
                void *vaddr;
---

The result is similar to test 3: the kernel build can run longer, but eventually it fails with the same error.

Error message:
---
  CC [M] net/ipv4/netfilter/nf_nat_standalone.o
/tmp/ccUqqYy4.s: Assembler messages:
/tmp/ccUqqYy4.s:59470: Error: can't resolve value for symbol `.LASF2946'
make[5]: *** [drivers/staging/pohmelfs/net.o] Error 1
make[4]: *** [drivers/staging/pohmelfs] Error 2
make[3]: *** [drivers/staging] Error 2
make[2]: *** [drivers] Error 2
make[2]: *** Waiting for unfinished jobs....
  CC [M] net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.o
  CC [M] net/ipv4/netfilter/nf_conntrack_proto_icmp.o
  CC [M] net/ipv4/netfilter/nf_conntrack_l3proto_ipv4_compat.o
  CC [M] net/ipv4/netfilter/nf_nat_core.o
  CC [M] net/ipv4/netfilter/nf_nat_helper.o
  CC [M] net/ipv4/netfilter/nf_nat_proto_unknown.o
  CC [M] net/ipv4/netfilter/nf_nat_proto_common.o
  CC [M] net/ipv4/netfilter/nf_nat_proto_tcp.o
  CC [M] net/ipv4/netfilter/nf_nat_proto_udp.o
  CC [M] net/ipv4/netfilter/nf_nat_proto_icmp.o
  CC [M] net/ipv6/ah6.o
  CC [M] net/ipv4/netfilter/nf_defrag_ipv4.o
  CC [M] net/ipv6/esp6.o
  CC [M] net/ipv4/netfilter/nf_nat_amanda.o
  CC [M] net/ipv4/netfilter/nf_nat_ftp.o
  CC [M] net/ipv6/ipcomp6.o
  CC [M] net/ipv4/netfilter/nf_nat_h323.o
  CC [M] net/ipv6/xfrm6_tunnel.o
[ 5364.381378] Unhandled fault: imprecise external abort (0x1406) at 0x42a3e000
In file included from /home/ubuntu/ubuntu-maverick/net/ipv6/xfrm6_tunnel.c:29:
/home/ubuntu/ubuntu-maverick/include/net/xfrm.h:550: internal compiler error: Bus error
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.4/README.Bugs> for instructions.
make[4]: *** [net/ipv6/xfrm6_tunnel.o] Error 1
make[3]: *** [net/ipv6] Error 2
make[3]: *** Waiting for unfinished jobs....
  CC [M] net/ipv4/netfilter/nf_nat_irc.o
  CC [M] net/ipv4/netfilter/nf_nat_pptp.o
----

Thanks a lot,
-Bryan

Oliver Grawert (ogra)
Changed in linux-ti-omap4 (Ubuntu Maverick):
milestone: none → ubuntu-10.10
importance: Undecided → High
Ricardo Salveti (rsalveti) wrote :

With kernel Ubuntu-2.6.35-903.13 we have a workaround to use 1GB and avoid using highmem. While the highmem issue is not fixed, at least this can be the way to go for Maverick.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-ti-omap4 - 2.6.35-903.13

---------------
linux-ti-omap4 (2.6.35-903.13) maverick; urgency=low

  [ Bryan Wu ]

  * SAUCE: Fix a buidling error when turn off CONFIG_SMP
    - LP: #647890
  * [Config] Enable CONFIG_VMSPLIT_2G=y for OMAP4
    - LP: #633227

  [ Ricardo Salveti de Araujo ]

  * [Config] OMAP: Enable needed Errata for OMAP4 to work with DMA based
    device drivers

  [ Upstream Kernel Changes ]

  * ARM: do not define VMALLOC_END relative to PAGE_OFFSET
 -- Tim Gardner <email address hidden> Thu, 30 Sep 2010 17:04:40 +0100

Changed in linux-ti-omap4 (Ubuntu Maverick):
status: New → Fix Released
Oliver Grawert (ogra) wrote :

We still can't use the full 1G. The patch improves things and 768M is now usable with special boot options, but I will reopen so we can track the issue until it is completely fixed.

(the bootoptions we use in the current images: mem=460M@0x80000000 mem=256M@0xA0000000)

Changed in linux-ti-omap4 (Ubuntu):
status: Fix Released → Confirmed
milestone: ubuntu-10.10 → ubuntu-11.04
Changed in linux-ti-omap4 (Ubuntu Natty):
assignee: nobody → Canonical ARM Developers (canonical-arm-dev)
Bryan Wu (cooloney) wrote :

I built the latest 2.6.37-based Linaro kernel for the omap4 panda, which has HIGHMEM disabled. But I can still see this bug when building the kernel natively on the omap4 board.

Since Linaro kernel is very close to upstream mainline, I believe mainline kernel also has this issue.

-Bryan

Bryan Wu (cooloney) wrote :

I tried building the 2.6.37 mainline kernel, which was just released, with the Ubuntu Natty ti-omap4 kernel config. The 2.6.37 kernel boots fine with our Ubuntu Natty minimal root file system. But I can still reproduce this issue.

I will post this issue to the upstream mailing list for more attention.

==
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 2.6.37+ (roc@tangerine) (gcc version 4.4.5 20100824 (prerelease) (Ubuntu/Linaro 4.4.4-9ubuntu2) ) 1
[ 0.000000] CPU: ARMv7 Processor [411fc092] revision 2 (ARMv7), cr=10c53c7f
[ 0.000000] CPU: VIPT nonaliasing data cache, VIPT aliasing instruction cache
[ 0.000000] Machine: OMAP4 Panda board
[ 0.000000] Memory policy: ECC disabled, Data cache writealloc
[ 0.000000] OMAP4430 ES2.0
[ 0.000000] SRAM: Mapped pa 0x40300000 to va 0xfe400000 size: 0xe000
[ 0.000000] FIXME: omap44xx_sram_init not implemented
[ 0.000000] On node 0 totalpages: 262144
[ 0.000000] free_area_init_node: node 0, pgdat c06f13c0, node_mem_map c0772000
[ 0.000000] Normal zone: 1536 pages used for memmap
[ 0.000000] Normal zone: 0 pages reserved
[ 0.000000] Normal zone: 195072 pages, LIFO batch:31
[ 0.000000] HighMem zone: 512 pages used for memmap
[ 0.000000] HighMem zone: 65024 pages, LIFO batch:15
[ 0.000000] PERCPU: Embedded 7 pages/cpu @c0f7a000 s7104 r8192 d13376 u32768
[ 0.000000] pcpu-alloc: s7104 r8192 d13376 u32768 alloc=8*4096
[ 0.000000] pcpu-alloc: [0] 0 [0] 1
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 260096
[ 0.000000] Kernel command line: console=ttyS2,115200n8 console=ttyO2,115200n8 noinitrd root=/dev/mmcblk0p2 rootdelay=1 ip=dG
[ 0.000000] PID hash table entries: 4096 (order: 2, 16384 bytes)
[ 0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[ 0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 0.000000] allocated 5242880 bytes of page_cgroup
[ 0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[ 0.000000] Memory: 1024MB = 1024MB total
[ 0.000000] Memory: 1026756k/1026756k available, 21820k reserved, 262144K highmem
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] vector : 0xffff0000 - 0xffff1000 ( 4 kB)
[ 0.000000] fixmap : 0xfff00000 - 0xfffe0000 ( 896 kB)
[ 0.000000] DMA : 0xffc00000 - 0xffe00000 ( 2 MB)
[ 0.000000] vmalloc : 0xf0800000 - 0xf8000000 ( 120 MB)
[ 0.000000] lowmem : 0xc0000000 - 0xf0000000 ( 768 MB)
[ 0.000000] pkmap : 0xbfe00000 - 0xc0000000 ( 2 MB)
[ 0.000000] modules : 0xbf000000 - 0xbfe00000 ( 14 MB)
[ 0.000000] .init : 0xc0008000 - 0xc003d000 ( 212 kB)
[ 0.000000] .text : 0xc003d000 - 0xc06a9bec (6579 kB)
[ 0.000000] .data : 0xc06aa000 - 0xc06f6ec0 ( 308 kB)
[ 0.000000] SLUB: Genslabs=13, HWalign=32, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[ 0.000000] Preemptable hierarchical RCU implementation.
[ 0.000000] RCU-based detection of stalled CPUs is disabled.
[ 0.000000] Verbose stalled-CPUs detectio...
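
The zone numbers in this boot log can be cross-checked against the "768 MB lowmem" and "262144K highmem" lines. A small Python sketch of the arithmetic, assuming 4 KiB pages and that each zone spans its usable pages plus its memmap pages (the variable names are just for illustration):

```python
PAGE_SIZE = 4096  # "pagesize is 4096" on this platform

# Page counts taken from the "Normal zone" and "HighMem zone" log lines.
normal_pages = 195072 + 1536   # usable pages + pages used for memmap
highmem_pages = 65024 + 512    # usable pages + pages used for memmap

normal_mib = (normal_pages * PAGE_SIZE) >> 20
highmem_mib = (highmem_pages * PAGE_SIZE) >> 20

print("lowmem :", normal_mib, "MiB")
print("highmem:", highmem_mib, "MiB")
print("total  :", normal_mib + highmem_mib, "MiB")
```

The result is 768 MiB of lowmem plus 256 MiB of highmem, which adds up to the 262144 total pages (1024 MiB) reported by the kernel.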

Bryan Wu (cooloney) wrote :

This config file is based on Ubuntu Natty TI OMAP4 config file

tags: added: omap4 panda
Bryan Wu (cooloney) wrote :

I posted this bug to the upstream linux-omap mailing list [1], and tested 2 patches from Santosh; these 2 patches don't fix the issue.

Sebjan helped test a kernel with CONFIG_SMP turned off; the issue still reproduces.

[1]: http://www.spinics.net/lists/linux-omap/msg43778.html

Jamie Bennett (jamiebennett) wrote :

According to http://www.spinics.net/lists/linux-omap/msg44055.html it seems disabling CONFIG_HIGHMEM at least hides this problem.

Sebastien JAN (sebjan) wrote :

Yes, I confirm: I have been running native kernel builds for around 40 hours without reproducing the issue with CONFIG_HIGHMEM deactivated (this was the only change I made compared to the kernel config provided by Bryan, above in #11).

Sebastien JAN (sebjan) wrote :

I did re-test with mainline 2.6.38-rc2 kernel (plus few patches from Linux-Omap tree / omap-testing branch required for booting the pandaboard on top of .38-rc2).

I was able to reproduce the issue with highmem activated (after 20 minutes or 2hours of native kernel build).

I was not able to reproduce the issue with highmem deactivated and running native kernel builds for more than 60hours.

Attached is the kernel config I was using (with highmem deactivated) to boot a Ubuntu file-system with my 2.6.38-rc2+ kernel.

Bryan Wu (cooloney) wrote :

I tried the latest 2.6.38-rc4 kernel with an Angstrom file system. The system is very unstable when I build a kernel on it. Sometimes a BUG() oops fires, sometimes the system freezes completely. Please find my logs here:
http://pastebin.ubuntu.com/566990/
http://pastebin.ubuntu.com/566987/

A friend of mine, Ming Lei, who helps maintain the musb driver for OMAP, tried building a kernel on his mass-production Panda board. He said he never hit this instability on it, and he recalled that he might have hit this issue before on an EA Panda board.

So Sebastien, do you have the latest mass-production Panda board for testing? I suspect this issue is hardware related.

Thanks,
-Bryan

warmcat (andy-warmcat) wrote :

@Bryan.... is there any chance you can try specifically your SD Card that fails in Ming Lei's new board? There are a lot of possible variables like x-loader, U-Boot and kernel version, even maybe rootfs actions that can be eliminated as the sole cause of the problem if that same card works on his device.

Marcin Juszkiewicz (hrw) wrote :

Bryan: I have this problem on EA1, and had it also on A1 (which was not mine, so I can't test it anymore).

Ming Lei (tom-leiming) wrote :

The A1 board seems to be more stable than the EA1 version.

I had an EA1 pandaboard before, but it was destroyed, so I bought an A1 board.

Attached are the kernel build log (build_log.tar.gz) and the config file (20110214-panda.log.tar.gz)
used to build the kernel on my A1 pandaboard.

From the log, you can see that I built the kernel successfully on the Pandaboard twice
and no kernel error was observed, with both highmem and the 3G/1G option enabled.

In fact, I also tried the following to reproduce the issue by building the kernel on
the Pandaboard A1, but did not succeed in triggering it on my board:
         - building the kernel on the SD card
         - using gcc-4.5.1
         - using the same kernel parameters as Brywu
         - building the kernel after consuming 512M of memory with test code I wrote

So does the issue only exist on the old EA1 hardware? Maybe we can ask other people
to try reproducing the issue on their A1 boards to see if they get the same result.

Marcin Juszkiewicz (hrw) wrote :

Ming: can you attach kernel+modules?

Ming Lei (tom-leiming) wrote :

kernel : 2.6.38-rc4 + patch-v2.6.38-rc4-next-20110208.bz2 + fix_timer_lockdep_warning.patch(this attachment file)
config : see config.tar.gz of last attachment log.tar.gz

Ming Lei (tom-leiming) wrote :

kernel and modules binary files:

see attachment

Marcin Juszkiewicz (hrw) wrote :

Ming: with your kernel my pandaboard booted and shut down 20s later. The next boots shut down during kernel boot. Just like in bug 708883.

Ming Lei (tom-leiming) wrote :

Hi Marcin,

I suppose you use the Ubuntu rootfs. USB (ehci-omap) does not work on my pandaboard with
this kernel and modules on Ubuntu (nor with the Ubuntu 10 and 11 releases for omap4), but it is OK if I run
the same kernel and modules on Angstrom.

I guess some Ubuntu applications do something special which may cause a USB exception
on my pandaboard.

Maybe your issue is related to it.

If you'd like to, you may test the kernel and modules on Angstrom rootfs, which can be
built with the link below.

http://narcissus.angstrom-distribution.org/

Marcin Juszkiewicz (hrw) wrote :

Ming: I will take my 8GB sd which has Angstrom on it and will test.

Long time ago I was one of Ångström developers.

Marcin Juszkiewicz (hrw) wrote :

I am sorry but could not make it today. Next week I am at emdebian sprint without pandaboard.

Ming Lei (tom-leiming) wrote :

Anyone who has a Panda A1 board could try to reproduce the issue.

If at least two people see no such issue on A1, we can
confirm it is a hardware-dependent issue.

We can also run a memory test in U-Boot or Linux (using memtester) to
see if any hardware-related DDR issue can be found on EA1.
Bryan has an EA1 board, so maybe he can try it.

warmcat (andy-warmcat) wrote :

FWIW, I tried to reproduce this with CONFIG_HIGHMEM and 460M + 512M, confirmed in /proc/meminfo that there is highmem and set it compiling a kernel repeatedly. After 5 hours it didn't fail. This is with Seb Jan's 2.6.38-rc2 tree.

I'll test it further later.
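
For anyone repeating the /proc/meminfo check mentioned above, here is a minimal Python sketch that pulls out the HighTotal field. The helper name highmem_kib is made up, and the sample text is illustrative rather than copied from a real board; on a target you would pass in the contents of /proc/meminfo instead.

```python
def highmem_kib(meminfo_text):
    """Return the HighTotal value in kB, or None if the field is absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("HighTotal:"):
            # Line format: "HighTotal:   <number> kB"
            return int(line.split()[1])
    return None

# Illustrative sample only; real values depend on the board and kernel.
sample = """MemTotal:        1026756 kB
HighTotal:        262144 kB
LowTotal:         764612 kB
"""

value = highmem_kib(sample)
print("highmem present" if value else "no highmem", value)
```

A missing or zero HighTotal would suggest the kernel was built without CONFIG_HIGHMEM or that all RAM fits in lowmem.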

warmcat (andy-warmcat) wrote :

(#28 above is on an A1 board with ES2.1 silicon)

Jamie Bennett (jamiebennett) wrote :

Subscribing Nicolas for commentary.

Paolo Pisati (p-pisati) wrote :

flag@flag-desktop:~$ cat /proc/cmdline
console=ttyO2,115200n8 ro elevator=noop vram=32M mem=460M@0x80000000 mem=512M@0xA0000000 root=UUID=12bb2147-1997-42a8-b7e0-8e44ed0649ef fixrtc

flag@flag-desktop:~$ sudo ./devmem2 0x4A002204
/dev/mem opened.
Memory mapped at address 0x40119000.
Value at address 0x4A002204 (0x40119204): 0x3B95C02F
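
The value read above can be broken into fields. This is only a sketch: it assumes that 0x4A002204 is the OMAP4 ID code register and that it follows the usual JTAG IDCODE layout (version in bits 31:28, part number in bits 27:12, manufacturer in bits 11:1, bit 0 set). The function name decode_idcode is illustrative, and no mapping from these fields to a silicon revision is claimed here.

```python
def decode_idcode(value):
    """Split a 32-bit JTAG-style IDCODE value into its fields."""
    return {
        "version": (value >> 28) & 0xF,       # bits 31:28
        "part": (value >> 12) & 0xFFFF,       # bits 27:12
        "manufacturer": (value >> 1) & 0x7FF, # bits 11:1
        "marker": value & 1,                  # bit 0, expected to be 1
    }

fields = decode_idcode(0x3B95C02F)
print({name: hex(v) for name, v in fields.items()})
```

With that layout the value splits into version 0x3, part 0xb95c and manufacturer 0x17.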

natty/ti-omap4:

flag@flag-desktop:~$ uname -a
Linux flag-desktop 2.6.38-1207-omap4 #10 SMP PREEMPT Thu Mar 31 18:12:49 CEST 2011 armv7l GNU/Linux

-8+ hours of continuous kernel compilation (rootfs on sd, linux source on nfs) with no ill effects
-strangely memtester doesn't even start:

flag@flag-desktop:~$ sudo memtester -p 0xb0000000 120
memtester version 4.1.3 (32-bit)
Copyright (C) 2010 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 120MB (125829120 bytes)
failed to mmap /dev/mem for physical memory: Operation not permitted

modifying boot.scr from "mem=512M@0xA0000000" to "mem=256M@0xA0000000" makes memtest work for hours with no problem.

on the other hand, with a maverick/ti-omap4 kernel:

flag@flag-desktop:~$ uname -a
Linux flag-desktop 2.6.35-903-omap4 #22 SMP PREEMPT Thu Mar 31 17:34:59 CEST 2011 armv7l GNU/Linux

flag@flag-desktop:~$ sudo memtester -p 0xb0000000 120
memtester version 4.1.3 (32-bit)
Copyright (C) 2010 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffff000
want 120MB (125829120 bytes)
Loop 1:
  Stuck Address : setting 0
Connection to panda closed.

on the serial console:

[ 65.264556] Kernel panic - not syncing: Attempted to kill init!
[ 65.270751] [<80046788>] (unwind_backtrace+0x0/0xe4) from [<80546ff4>] (panic+0x58/0xe4)
[ 65.279235] [<80546ff4>] (panic+0x58/0xe4) from [<8007e9d8>] (find_new_reaper+0x6c/0x98)
[ 65.287719] [<8007e9d8>] (find_new_reaper+0x6c/0x98) from [<80080004>] (forget_original_parent+0x28/0x128)
[ 65.297821] [<80080004>] (forget_original_parent+0x28/0x128) from [<80080114>] (exit_notify+0x10/0x134)
[ 65.307617] [<80080114>] (exit_notify+0x10/0x134) from [<800804fc>] (do_exit+0x2c4/0x34c)
[ 65.316192] [<800804fc>] (do_exit+0x2c4/0x34c) from [<80080614>] (do_group_exit+0x90/0xc0)
[ 65.324829] [<80080614>] (do_group_exit+0x90/0xc0) from [<8008f524>] (get_signal_to_deliver+0x2bc/0x2f4)
[ 65.334808] [<8008f524>] (get_signal_to_deliver+0x2bc/0x2f4) from [<80042a3c>] (do_signal+0x50/0x1d4)
[ 65.344512] [<80042a3c>] (do_signal+0x50/0x1d4) from [<80042bd8>] (do_notify_resume+0x18/0x50)
[ 65.353546] [<80042bd8>] (do_notify_resume+0x18/0x50) from [<8003faf0>] (work_pending+0x1c/0x20)
[ 65.362792] CPU0: stopping
[ 65.362823] [<80046788>] (unwind_backtrace+0x0/0xe4) from [<800450bc>] (ipi_cpu_stop+0x30/0x5c)
[ 65.362854] [<800450bc>] (ipi_cpu_stop+0x30/0x5c) from [<8003f268>] (do_IPI+0xc0/0x104)
[ 65.362884] [<8003f268>] (do_IPI+0xc0/0x104) from [<8054a208>] (__irq_svc+0x48/0xe0)
[ 65.362884] Exception stack(0xbecd9ea0 to 0xbecd9ee8)
[ 65.362884] 9ea0: 00000200 80fe4620 00000000 00000aa3 becd9fb0 800000...


Sebastien JAN (sebjan) wrote :

@paolo: I don't think the memtester -p test is relevant: when using the -p option with the whole memory address-able by the kernel you can expect some kernel memory corruption (which you did verify actually):
Here is the memtester man entry for the -p option:
      -p PHYSADDR
              tells memtester to test a specific region of memory starting at physical address PHYSADDR (given in hex), by
              mmap(2)ing /dev/mem. This is mostly of use to hardware developers, for testing memory-mapped I/O devices and simi‐
              lar. Note that the memory region will be overwritten during testing, so it is not safe to specify memory which is
              allocated for the system or for other applications; doing so will cause them to crash. If you absolutely must test a
              particular region of actual physical memory, arrange to have that memory allocated by your test software, and hold it
              in this allocated state, then run memtester on it with this option.
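
To make the point about -p concrete, here is a small Python sketch checking whether the tested physical window (120 MB starting at 0xb0000000) falls inside RAM handed to the kernel by the two mem= configurations discussed in this bug. The helper name overlaps and the variable names are illustrative.

```python
MIB = 1 << 20

def overlaps(region, ranges):
    """True if the half-open region (start, end) intersects any range."""
    start, end = region
    return any(start < r_end and r_start < end for r_start, r_end in ranges)

# memtester -p 0xb0000000 120
test_region = (0xB0000000, 0xB0000000 + 120 * MIB)

# mem=460M@0x80000000 mem=512M@0xA0000000
cfg_512 = [(0x80000000, 0x80000000 + 460 * MIB),
           (0xA0000000, 0xA0000000 + 512 * MIB)]
# mem=460M@0x80000000 mem=256M@0xA0000000
cfg_256 = [(0x80000000, 0x80000000 + 460 * MIB),
           (0xA0000000, 0xA0000000 + 256 * MIB)]

print("512M config overlaps:", overlaps(test_region, cfg_512))
print("256M config overlaps:", overlaps(test_region, cfg_256))
```

With the 512M setting the window lies inside kernel-managed RAM, while with the 256M setting it starts exactly where that RAM ends. That is consistent with memtester -p only causing trouble in the 512M case.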

Ricardo Salveti (rsalveti) wrote :

Using latest kernel available (2.6.38-1207-omap4) and I'm not able to reproduce this issue anymore.

My panda is up for more than 2 days, doing more than 20 kernel builds and still no kernel message and no error. Will let it building more but it seems that I'm finally unable to reproduce this bug.
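
For anyone who wants to repeat this kind of soak test, a rough Python sketch of a loop that reruns a command until it fails or a limit is reached. The function name soak is hypothetical, and the demo runs a trivial Python command; on a board the command would be whatever native kernel build invocation you use, which this thread does not spell out.

```python
import subprocess
import sys
import time

def soak(cmd, max_runs, max_seconds):
    """Run cmd repeatedly; return (completed_runs, failing_returncode or None)."""
    deadline = time.monotonic() + max_seconds
    runs = 0
    while runs < max_runs and time.monotonic() < deadline:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Stop at the first failure so the state can be inspected.
            return runs, result.returncode
        runs += 1
    return runs, None

# Demo with a command that always succeeds (stand-in for a real build).
completed, failure = soak([sys.executable, "-c", "pass"], max_runs=3, max_seconds=60)
print("completed runs:", completed, "failure:", failure)
```

Stopping at the first non-zero exit keeps the kernel log and build output close to the point of failure.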

Eshwar Andhavarapu (eshwar.andhavarapu) wrote :

is this kernel in the latest ubuntu natty omap4 build?

Radu Cristescu (radu.c) wrote :

@Eshwar: yes, daily build 20110409 has kernel 2.6.38-1207-omap4.

Ricardo Salveti (rsalveti) wrote : Re: [Bug 633227] Re: instabilities with highmem activated

On Sun, Apr 10, 2011 at 5:40 AM, Eshwar Andhavarapu
<email address hidden> wrote:
> is this kernel in the latest ubuntu natty omap4 build?

Yes, package linux-image-2.6.38-1207-omap4 - 2.6.38-1207.10.

Ricardo Salveti (rsalveti) wrote :

I have been building kernels for more than 3 days now, with more than 30 builds and still no error. I believe we can finally change our boot args to use "mem=460M@0x80000000 mem=512M@0xA0000000".

Sebastien JAN (sebjan) wrote :

That's good news!

Note however that the mem arguments should be "mem=456M@0x80000000 mem=512M@0xA0000000": the size must be a multiple of 8MB (unless Russell has relaxed this constraint recently?).
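
The 8MB constraint is easy to check by hand, but as a small Python sketch (the helper name round_down_to_multiple is illustrative, and the 8 MB granule is taken from the comment above, not verified against the kernel source):

```python
def round_down_to_multiple(size_mib, granule_mib=8):
    """Round a size in MiB down to the nearest multiple of granule_mib."""
    return size_mib - size_mib % granule_mib

for size in (460, 512):
    aligned = round_down_to_multiple(size)
    print(size, "MiB ->", aligned, "MiB", "(unchanged)" if aligned == size else "(rounded down)")
```

460 leaves a remainder of 4 when divided by 8, so it rounds down to 456, while 512 is already aligned.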

Oliver Grawert (ogra) wrote :

The default args have been changed to the above in the image builder now, so new installs should use the full 1G. Can we close this bug now (and reopen it if it recurs)?

Ricardo Salveti (rsalveti) wrote :

Closing for now, please reopen in case the bug still happens.

Changed in linux-ti-omap4 (Ubuntu Natty):
status: Confirmed → Fix Released
Gregoire Gentil (gregoire-gentil) wrote :

I'm not sure I understand. This bug had nothing to do with the kernel. It was a bug in x-load.

Nicolas Dechesne (ndec) wrote :

@gregoire: why do you believe it's a xloader issue? i don't think we have any evidence of this at this point.
