omapdss DISPC error: GFX_FIFO_UNDERFLOW

Bug #732912 reported by Tom Gall
This bug affects 2 people
Affects                      Status         Importance   Assigned to                              Milestone
Linaro Linux                 Fix Released   High         Unassigned
linux-linaro-omap (Ubuntu)   Fix Released   High         Linaro Power Management Working Group

Bug Description

Observed using the omap3 hwpack dated 03102011 with the linaro-developer image dated 03102011.

The following message is found via dmesg shortly after boot

[ 76.143890] omapdss DISPC error: GFX_FIFO_UNDERFLOW, disabling GFX

No other messages are seen related to this message. The system had completed booting, network was up and bash was just sitting at the prompt

Further it seems out of the blue. The system had coml

Revision history for this message
Tom Gall (tom-gall) wrote :

This just showed up when running the ALIP image dated 03102011 on a beagle C4 using the omap3-x11 hwpack dated 03102011.

The moment the "omapdss DISPC error: GFX_FIFO_UNDERFLOW, disabling GFX" message appeared, the graphical session unsurprisingly disappeared from ALIP.

Here's a dmesg dump

[ 15.639648] usbcore: registered new interface driver asix
[ 25.179840] type=1400 audit(1299805192.173:2): apparmor="STATUS" operation="profile_load" name="/sbin/dhclient" pid=480 comm="apparmor_parse"
[ 25.212463] type=1400 audit(1299805192.212:3): apparmor="STATUS" operation="profile_load" name="/usr/lib/NetworkManager/nm-dhcp-client.actio"
[ 25.213256] type=1400 audit(1299805192.212:4): apparmor="STATUS" operation="profile_load" name="/usr/lib/connman/scripts/dhclient-script" pi"
[ 25.389617] type=1400 audit(1299805192.384:5): apparmor="STATUS" operation="profile_replace" name="/sbin/dhclient" pid=491 comm="apparmor_pa"
[ 25.390686] type=1400 audit(1299805192.384:6): apparmor="STATUS" operation="profile_replace" name="/usr/lib/NetworkManager/nm-dhcp-client.ac"
[ 25.423156] type=1400 audit(1299805192.423:7): apparmor="STATUS" operation="profile_replace" name="/usr/lib/connman/scripts/dhclient-script""
[ 25.651153] type=1400 audit(1299805192.650:8): apparmor="STATUS" operation="profile_replace" name="/sbin/dhclient" pid=495 comm="apparmor_pa"
[ 25.652252] type=1400 audit(1299805192.650:9): apparmor="STATUS" operation="profile_replace" name="/usr/lib/NetworkManager/nm-dhcp-client.ac"
[ 25.653015] type=1400 audit(1299805192.650:10): apparmor="STATUS" operation="profile_replace" name="/usr/lib/connman/scripts/dhclient-script"
[ 27.140533] eth0: link up, 100Mbps, full-duplex, lpa 0xCDE1
[ 27.247039] type=1400 audit(1299805194.244:11): apparmor="STATUS" operation="profile_replace" name="/sbin/dhclient" pid=503 comm="apparmor_p"
[ 27.295471] eth0: link up, 100Mbps, full-duplex, lpa 0xCDE1
[ 30.844970] audit_printk_skb: 15 callbacks suppressed
[ 30.845001] type=1400 audit(1299805197.845:17): apparmor="STATUS" operation="profile_load" name="/usr/sbin/tcpdump" pid=680 comm="apparmor_p"
[ 37.500396] eth0: no IPv6 routers present
[ 95.997070] omapdss DISPC error: GFX_FIFO_UNDERFLOW, disabling GFX

Revision history for this message
Tom Gall (tom-gall) wrote :

After a warm restart I see the following :

[ 5.712554] registered taskstats version 1
[ 5.717163] fbcvt: 1280x720@60: CVT Name - .921M9-R
[ 5.722747] usb 1-2.1.2: new low speed USB device using ehci-omap and address 6
[ 5.747558] Console: switching to colour frame buffer device 160x45
[ 6.760253] omap_i2c omap_i2c.1: controller timed out
[ 6.774566] twl: i2c_read failed to transfer all messages
[ 6.780364] _regulator_enable: VPLL2: is_enabled() failed: -110
[ 6.786956] omapfb omapfb: Failed to enable display 'dvi'
[ 6.793426] omapfb omapfb: failed to setup omapfb
[ 6.798400] Division by zero in kernel.
[ 6.798461] [<c006d618>] (unwind_backtrace+0x0/0xf8) from [<c0337094>] (Ldiv0+0x8/0x10)
[ 6.798522] Division by zero in kernel.
[ 6.798553] [<c006d618>] (unwind_backtrace+0x0/0xf8) from [<c0337094>] (Ldiv0+0x8/0x10)
[ 6.798706] Division by zero in kernel.
[ 6.798736] [<c006d618>] (unwind_backtrace+0x0/0xf8) from [<c0337094>] (Ldiv0+0x8/0x10)
[ 6.802764] Division by zero in kernel.
[ 6.802795] [<c006d618>] (unwind_backtrace+0x0/0xf8) from [<c0337094>] (Ldiv0+0x8/0x10)
[ 6.802856] Division by zero in kernel.
[ 6.802856] [<c006d618>] (unwind_backtrace+0x0/0xf8) from [<c0337094>] (Ldiv0+0x8/0x10)

Revision history for this message
Tom Gall (tom-gall) wrote :

A further cold restart: same error message after 96 seconds.

[ 96.624816] omapdss DISPC error: GFX_FIFO_UNDERFLOW, disabling GFX

Here's the full dmesg:

root@linaro:~# dmesg
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 2.6.38-1000-linaro-omap (buildd@crabapple) (gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-3ubuntu3) ) #1-Ubuntu SMP Thu F)
[ 0.000000] CPU: ARMv7 Processor [411fc083] revision 3 (ARMv7), cr=10c53c7f
[ 0.000000] CPU: VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
[ 0.000000] Machine: OMAP3 Beagle Board
[ 0.000000] bootconsole [earlycon0] enabled
[ 0.000000] Reserving 12582912 bytes SDRAM for VRAM
[ 0.000000] Memory policy: ECC disabled, Data cache writeback
[ 0.000000] OMAP3430/3530 ES3.1 (l2cache iva sgx neon isp )
[ 0.000000] SRAM: Mapped pa 0x40200000 to va 0xfe400000 size: 0x10000
[ 0.000000] On node 0 totalpages: 62464
[ 0.000000] free_area_init_node: node 0, pgdat c07ff740, node_mem_map c08c8000
[ 0.000000] Normal zone: 512 pages used for memmap
[ 0.000000] Normal zone: 0 pages reserved
[ 0.000000] Normal zone: 61952 pages, LIFO batch:15
[ 0.000000] PERCPU: Embedded 8 pages/cpu @c0acb000 s9760 r8192 d14816 u32768
[ 0.000000] pcpu-alloc: s9760 r8192 d14816 u32768 alloc=8*4096
[ 0.000000] pcpu-alloc: [0] 0
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 61952
[ 0.000000] Kernel command line: console=tty0 console=ttyO2,115200n8 root=UUID=b8d95c91-422a-48a1-90d5-54fa509c9fd2 rootwait ro earlyprintk0
[ 0.000000] PID hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[ 0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
[ 0.000000] allocated 1310720 bytes of page_cgroup
[ 0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[ 0.000000] Memory: 244MB = 244MB total
[ 0.000000] Memory: 232936k/232936k available, 29208k reserved, 0K highmem
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] vector : 0xffff0000 - 0xffff1000 ( 4 kB)
[ 0.000000] fixmap : 0xfff00000 - 0xfffe0000 ( 896 kB)
[ 0.000000] DMA : 0xffc00000 - 0xffe00000 ( 2 MB)
[ 0.000000] vmalloc : 0xd0800000 - 0xf8000000 ( 632 MB)
[ 0.000000] lowmem : 0xc0000000 - 0xd0000000 ( 256 MB)
[ 0.000000] modules : 0xbf000000 - 0xc0000000 ( 16 MB)
[ 0.000000] .init : 0xc0008000 - 0xc005b000 ( 332 kB)
[ 0.000000] .text : 0xc005b000 - 0xc079bf58 (7428 kB)
[ 0.000000] .data : 0xc079c000 - 0xc0801190 ( 405 kB)
[ 0.000000] SLUB: Genslabs=13, HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] RCU-based detection of stalled CPUs is disabled.
[ 0.000000] NR_IRQS:402
[ 0.000000] Clocking rate (Crystal/Core/MPU): 26.0/332/500 MHz
[ 0.000000] Reprogramming SDRC clock to 332000000 Hz
[ 0.000000] GPMC revision 5.0
[ 0.000000] IRQ: Found an INTC at 0xfa200000 (revision 4.0) with 96...

Revision history for this message
Paul McKenney (paulmck) wrote :

Does the following help?

Try allocating some more VRAM for fb0 and fb1 and change 'rootdelay=1' to
'rootwait' and remove 'nohz=off'

(From http://<email address hidden>/msg19936.html.)
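
The two command-line changes suggested above can be sketched as a small filter. This is only an illustration: fix_cmdline is a hypothetical helper name, and the VRAM sizing is left out because no values are given here.

```shell
#!/bin/sh
# Hypothetical helper: rewrite a kernel command line as suggested above.
#   - replace rootdelay=N with rootwait
#   - drop nohz=off
#   - tidy up the whitespace left behind
fix_cmdline() {
    printf '%s\n' "$1" |
        sed -e 's/rootdelay=[0-9][0-9]*/rootwait/' \
            -e 's/nohz=off//' \
            -e 's/  */ /g' \
            -e 's/^ //' -e 's/ $//'
}

# Example with a made-up command line:
fix_cmdline "console=ttyO2,115200n8 rootdelay=1 nohz=off ro"
```

The result would then have to be placed wherever the boot loader on the board takes its boot arguments from, which depends on the image in use.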

Revision history for this message
Tom Gall (tom-gall) wrote :

The current 0317 omap3 hwpack fails across the board, no matter which user-space image (nano, ALIP, etc.) is used.

Revision history for this message
Paul McKenney (paulmck) wrote :

This fails even with the boot parameters called out in #4 above?

John Rigby (jcrigby)
Changed in linux-linaro-omap (Ubuntu):
assignee: nobody → John Rigby (jcrigby)
John Rigby (jcrigby)
Changed in linux-linaro-omap (Ubuntu):
status: New → Confirmed
Revision history for this message
John Rigby (jcrigby) wrote :

This needs to be assigned to someone else. The similar bugs found doing a web search were all about having more than one plane enabled and doing rotations. In those cases it makes sense that the memory bandwidth was being exhausted. That should not be the case here where all we have is a console.

Changed in linux-linaro-omap (Ubuntu):
assignee: John Rigby (jcrigby) → nobody
Revision history for this message
John Rigby (jcrigby) wrote :

After a conversation via email (thanks, Andy Green), it looks like this might have something to do with power management. It turns out that ondemand scaling is turned on about 60 seconds after boot. From /etc/init.d/ondemand:

#! /bin/sh
### BEGIN INIT INFO
# Provides: ondemand
# Required-Start: $remote_fs $all
# Required-Stop:
# Default-Start: 2 3 4 5
# Default-Stop:
# Short-Description: Set the CPU Frequency Scaling governor to "ondemand"
### END INIT INFO

PATH=/sbin:/usr/sbin:/bin:/usr/bin

. /lib/init/vars.sh
. /lib/lsb/init-functions

case "$1" in
    start)
        start-stop-daemon --start --background --exec /etc/init.d/ondemand -- background
        ;;
    background)
        sleep 60 # probably enough time for desktop login

        for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
        do
                [ -f $CPUFREQ ] || continue
                echo -n ondemand > $CPUFREQ
        done
        ;;
    restart|reload|force-reload)
        echo "Error: argument '$1' not supported" >&2
        exit 3
        ;;
    stop)
        ;;
    *)
        echo "Usage: $0 start|stop" >&2
        exit 3
        ;;
esac
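
The timing fits the failures reported earlier in this bug, which show up in dmesg at roughly 76 to 97 seconds. A minimal sketch for pulling the underflow timestamps out of a dmesg capture and comparing them with the 60 second sleep; the helper names are hypothetical. Kernel timestamps count from kernel start while the sleep counts from when the init script runs, so the governor switch lands somewhat later than the 60 second mark.

```shell
#!/bin/sh
# Print the kernel timestamp (whole seconds) of every GFX_FIFO_UNDERFLOW
# line read from stdin. Hypothetical helper, not part of any package.
underflow_seconds() {
    sed -n 's/^\[ *\([0-9][0-9]*\)\.[0-9]*] omapdss DISPC error: GFX_FIFO_UNDERFLOW.*/\1/p'
}

# Succeeds if the given second count is at or past the script's "sleep 60".
after_governor_switch() {
    [ "$1" -ge 60 ]
}

# Example using one of the lines reported in this bug:
printf '%s\n' '[ 96.624816] omapdss DISPC error: GFX_FIFO_UNDERFLOW, disabling GFX' |
    underflow_seconds |
    while read -r secs; do
        if after_governor_switch "$secs"; then
            echo "underflow at ${secs}s: after the 60 s sleep"
        else
            echo "underflow at ${secs}s: before the 60 s sleep"
        fi
    done
```

On a board the same filter can be fed from "dmesg | underflow_seconds".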

Revision history for this message
Jamie Bennett (jamiebennett) wrote :

John, can you see anything in the power management code that could starve the video buffer and never allow it to come back to life?

Revision history for this message
Jamie Bennett (jamiebennett) wrote :

Subscribing Amitk who may have more insight here.

Revision history for this message
John Rigby (jcrigby) wrote :

In response to Jamie's question (Amitk will know better): the display driver hardware sucks from RAM and blows to an LCD or other graphics output. It has to keep up a certain speed to keep the output side happy. To deal with occasional starving on the input side there is usually a FIFO. A deeper FIFO can help smooth out times when the bus on the input side is overtaxed, such as when some other higher-priority block is using lots of memory bandwidth. When the FIFO goes completely empty and cannot provide data fast enough for the output, that is an underflow, which is what is happening here.
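
As a rough illustration of that behaviour, here is a toy model of a FIFO that is filled from memory and drained by the output at a fixed rate. All numbers are made up for the example and are not OMAP3 DISPC values; simulate_fifo is a hypothetical name.

```shell
#!/bin/sh
# Toy FIFO model: each tick the memory side adds `fill` words and the
# output side must remove `drain` words. Prints the tick at which the
# FIFO cannot supply a full drain (underflow), or "ok" if it survives.
# Usage: simulate_fifo DEPTH FILL DRAIN TICKS
simulate_fifo() {
    depth=$1 fill=$2 drain=$3 ticks=$4
    level=$depth                      # start with a full FIFO
    t=1
    while [ "$t" -le "$ticks" ]; do
        level=$((level + fill))
        if [ "$level" -gt "$depth" ]; then
            level=$depth              # cannot hold more than its depth
        fi
        if [ "$level" -lt "$drain" ]; then
            echo "underflow at tick $t"
            return 0
        fi
        level=$((level - drain))
        t=$((t + 1))
    done
    echo "ok"
}

simulate_fifo 32 4 4 100   # input keeps pace with the output
simulate_fifo 32 2 4 100   # input starved, shallow FIFO
simulate_fifo 64 2 4 100   # same starvation, deeper FIFO lasts longer
```

A deeper FIFO only delays the underflow when the input stays starved; it does not prevent it.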

So the interesting new bit of news is that this apparently happens when the CPU_FREQ governor gets set to ondemand. My understanding is that this should only be adjusting the CPU clock and not a peripheral clock like the display controller. So this "shouldn't happen".

Revision history for this message
Amit Kucheria (amitk) wrote :

Vishwa added the DVFS code for OMAP3. He must've seen this problem. Adding him.

Revision history for this message
Andy Doan (doanac) wrote :

I just moved to the latest code from:
  git://git.linaro.org/kernel/linux-linaro-2.6.38.git

and the problem seems to have gone away for me.

Revision history for this message
Steve Langasek (vorlon) wrote :

Assigning to the PM WG. Amit, Vishwa, any progress on fixing this? Is there some power management patch we could back out as a workaround?

Changed in linux-linaro-omap (Ubuntu):
assignee: nobody → Linaro Power Management Working Group (linaro-pm-wg)
importance: Undecided → High
Revision history for this message
vishwanath sripathy (vishwanath-bs) wrote :

I saw a comment saying the issue is no longer seen in the 2.6.38 Linaro kernel. So does it mean this issue is no longer valid?
If someone is blocked because of this, I would suggest using the performance governor instead of ondemand to avoid the cpufreq impact.
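
A minimal sketch of that workaround, using the same sysfs files the ondemand init script writes to. set_governor is a hypothetical helper; the sysfs root is a parameter only so the function can be tried against a scratch directory first.

```shell
#!/bin/sh
# Hypothetical helper: write the given governor name to every CPU that
# exposes a cpufreq scaling_governor file.
# Usage: set_governor GOVERNOR [CPU_SYSFS_ROOT]
set_governor() {
    governor=$1
    cpu_root=${2:-/sys/devices/system/cpu}
    for f in "$cpu_root"/cpu*/cpufreq/scaling_governor; do
        [ -f "$f" ] || continue
        printf '%s' "$governor" > "$f"
    done
}

# On the board, as root:
#   set_governor performance
#   cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
```

Note that /etc/init.d/ondemand switches the governor to ondemand 60 seconds after it starts, so the workaround has to be applied after that point, or that script has to be kept from running.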

Revision history for this message
Tom Gall (tom-gall) wrote :

Thanks for the workaround suggestion.

Until the update of the kernel in the linaro hwpack for omap3, this issue will affect those running on beagle C hardware.

Revision history for this message
Steve Langasek (vorlon) wrote :

Right, Andy did say it's fixed for him in the current linaro tree - I think I misread that, sorry.

John has also confirmed that it works for him on the Beagle XM and is now testing on the Beagle C4. If that checks out, we should see kernel packages including the fix soon.

Revision history for this message
Andy Doan (doanac) wrote : Re: [Bug 732912] Re: omapdss DISPC error: GFX_FIFO_UNDERFLOW

On 04/12/2011 09:38 AM, vishwanath sripathy wrote:
> I saw a comment saying issue is no longer seen in 2.6.38 Linaro kernel. So does it mean this issue is no longer valid?

It looks like it must have either been temporarily fixed and is now
broken, or my testing was incorrect last time.

I'm on commit:
 4447ee554813bb9d7eef8963722c6c72b554750c

from git://git.linaro.org/kernel/linux-linaro-2.6.38.git and it's broken.

Revision history for this message
Tom Gall (tom-gall) wrote :

Wait. This issue never showed up on the Xm. We should test this on a Beagle Cx if it hasn't been.

Revision history for this message
Mounir Bsaibes (mounir-bsaibes) wrote :

Saw the same error on IGEP with the 2011-04-08 build. See https://bugs.launchpad.net/linaro-graphics-wg/+bug/729374
which may be a duplicate of this bug.

Revision history for this message
John Rigby (jcrigby) wrote :

Tom is right, we never saw this on XM. In addition, newer upstream does not work at all for me on any omap3. Not sure if it is my tree, my config or upstream. Chasing this now.

Revision history for this message
John Rigby (jcrigby) wrote :

Confirming Andy's report. This is still broken in git://git.linaro.org/kernel/linux-linaro-2.6.38.git. Again, to recap: the problem occurs when the governor is switched to ondemand.

Revision history for this message
John Rigby (jcrigby) wrote :

More info, commenting out the lowest two operating points for 34xx makes the problem go away:

diff --git a/arch/arm/mach-omap2/opp3xxx_data.c b/arch/arm/mach-omap2/opp3xxx_data.c
index e71c5ba..06b6297 100644
--- a/arch/arm/mach-omap2/opp3xxx_data.c
+++ b/arch/arm/mach-omap2/opp3xxx_data.c
@@ -105,11 +105,13 @@ struct omap_volt_data omap36xx_vddcore_volt_data[] = {
 /* OPP data */

 static struct omap_opp_def __initdata omap34xx_opp_def_list[] = {
+#if 0
        /* MPU OPP1 */
        OPP_INITIALIZER("mpu", true, 125000000, OMAP3430_VDD_MPU_OPP1_UV),
        /* MPU OPP2 */
        OPP_INITIALIZER("mpu", true, 250000000, OMAP3430_VDD_MPU_OPP2_UV),
        /* MPU OPP3 */
+#endif
        OPP_INITIALIZER("mpu", true, 500000000, OMAP3430_VDD_MPU_OPP3_UV),
        /* MPU OPP4 */
        OPP_INITIALIZER("mpu", true, 550000000, OMAP3430_VDD_MPU_OPP4_UV),

Revision history for this message
John Rigby (jcrigby) wrote :

So for the packaged kernel we have an ugly but working work-around. The upstream issue remains.

Revision history for this message
Andy Doan (doanac) wrote :

On 04/13/2011 10:21 AM, John Rigby wrote:
> More info, commenting out the lowest two operating points for 34xx makes
> the problem go away:
>
> diff --git a/arch/arm/mach-omap2/opp3xxx_data.c b/arch/arm/mach-omap2/opp3xxx_data.c
> index e71c5ba..06b6297 100644
> --- a/arch/arm/mach-omap2/opp3xxx_data.c
> +++ b/arch/arm/mach-omap2/opp3xxx_data.c
> @@ -105,11 +105,13 @@ struct omap_volt_data omap36xx_vddcore_volt_data[] = {
> /* OPP data */
>
> static struct omap_opp_def __initdata omap34xx_opp_def_list[] = {
> +#if 0
> /* MPU OPP1 */
> OPP_INITIALIZER("mpu", true, 125000000, OMAP3430_VDD_MPU_OPP1_UV),
> /* MPU OPP2 */
> OPP_INITIALIZER("mpu", true, 250000000, OMAP3430_VDD_MPU_OPP2_UV),
> /* MPU OPP3 */
> +#endif
> OPP_INITIALIZER("mpu", true, 500000000, OMAP3430_VDD_MPU_OPP3_UV),
> /* MPU OPP4 */
> OPP_INITIALIZER("mpu", true, 550000000, OMAP3430_VDD_MPU_OPP4_UV),
>

Rather than unconditionally disabling this, could we
change "#if 0" to something like "#ifndef CONFIG_OMAP2_DSS"? That way
people could still enable these modes when not using a graphical display.
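
One subtlety with a guard like "#ifndef CONFIG_OMAP2_DSS": the kernel build defines CONFIG_OMAP2_DSS only when the option is built in (=y). A modular build (=m) defines CONFIG_OMAP2_DSS_MODULE instead, so with that guard the low operating points would stay compiled in. A minimal sketch for checking which case a given kernel config falls into; the helper name and the config file location are assumptions, not part of the proposed patch.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the config file has
# CONFIG_OMAP2_DSS built in (=y), which is the only case in which
# "#ifndef CONFIG_OMAP2_DSS" would compile the guarded lines out.
# Usage: low_opps_compiled_out CONFIG_FILE
low_opps_compiled_out() {
    grep -q '^CONFIG_OMAP2_DSS=y$' "$1"
}

# Example against a scratch config with DSS built as a module:
cfg=$(mktemp)
printf 'CONFIG_OMAP2_DSS=m\n' > "$cfg"
if low_opps_compiled_out "$cfg"; then
    echo "low OPPs compiled out"
else
    echo "low OPPs still present"
fi
rm -f "$cfg"
```

On an installed system the file to check is typically something like /boot/config-$(uname -r), assuming the image ships it.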

Revision history for this message
John Rigby (jcrigby) wrote :

Will do.

Changed in linux-linaro:
importance: Undecided → High
John Rigby (jcrigby)
Changed in linux-linaro-omap (Ubuntu):
status: Confirmed → Fix Committed
Nicolas Pitre (npitre)
Changed in linux-linaro:
status: New → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-linaro-omap - 2.6.38-1002.3

---------------
linux-linaro-omap (2.6.38-1002.3) natty; urgency=low

  [ Upstream Fixes ]

  * MUSB: shutdown: Make sure block is awake before doing shutdown
    - LP: #745737
  * Fixed gpio polarity of gpio USB-phy reset.
    - LP: #747639

  [ Andy Green ]

  * LINARO: SAUCE: disable CONFIG_OMAP_RESET_CLOCKS
    - LP: #752900

  [ John Rigby ]

  * Rebase to new upstreams:
    Linux v2.6.38.1
    linaro-linux-2.6.38-upstream-29Mar2011
    Ubuntu-2.6.38-7.35
  * SAUCE: OMAP4: clock: wait for module to become accessible on
    a clk enable
    - LP: #745737
  * Rebase to new upstreams:
    Linux v2.6.38.2
    linaro-linux-2.6.38-upstream-5Apr2011
    Ubuntu-2.6.38-8.41
    - LP: #732842
  * Update configs for device tree, dvfs and lttng
  * LINARO: add building of dtb's
  * LINARO: SAUCE: Disable lowest operating freqs on omap34xx
    - LP: #732912
 -- John Rigby <email address hidden> Thu, 14 Apr 2011 12:16:06 -0600

Changed in linux-linaro-omap (Ubuntu):
status: Fix Committed → Fix Released
Revision history for this message
DaMiEn667 (damien667) wrote :

http://e2e.ti.com/cfs-filesystemfile.ashx/__key/CommunityServer-Discussions-Components-Files/447/7220.0001_2D00_OMAP3EVM_2D00_Set_2D00_minimum_2D00_throughput_2D00_requirement_2D00_for_2D00_DSS.txt

The TI guys fixed it on the EVM long ago... I applied their patch to the relevant files for the beagleboard and it fixed the issue on my beagleboard C4.

Revision history for this message
Andy Doan (doanac) wrote :

The fix DaMiEn667 mentioned in comment #28 never seems to have been mainlined.

I just did a quick test with the latest omap tree:
 commit: c7769fdf6a9c87c6c3ddca0103566eabc5e4db01

The underflow no longer occurs, so I think the problem has already been fixed upstream.

Revision history for this message
Nicolas Pitre (npitre) wrote :

On Tue, 24 May 2011, Andy Doan wrote:

> The fix DaMiEn667 mentioned in comment #28 never seems to have been
> mainlined.
>
> I just did a quick test with the latest omap tree:
> commit: c7769fdf6a9c87c6c3ddca0103566eabc5e4db01
>
> The underflow no longer occurs, so I think the problem has already
> been fixed upstream.

Could you bisect the upstream tree to find the fix? Note that you
should invert the logic i.e. mark the working commits as bad and the
non-working ones as good.
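
A sketch of what that inverted bisect could look like. The verdict helper is hypothetical and only inspects a saved dmesg capture; building, booting and capturing the log for each step is left to the tester, and the commit ids are placeholders.

```shell
#!/bin/sh
# Hypothetical verdict helper for hunting the commit that FIXED the bug.
# Logic is inverted on purpose:
#   exit 0 ("good") = underflow message present (old, broken behaviour)
#   exit 1 ("bad")  = underflow message absent  (new, fixed behaviour)
# Usage: bisect_verdict DMESG_CAPTURE_FILE
bisect_verdict() {
    if grep -q 'GFX_FIFO_UNDERFLOW' "$1"; then
        return 0
    else
        return 1
    fi
}

# Manual session outline (placeholders, not real commit ids):
#   git bisect start
#   git bisect bad  <commit-where-underflow-is-gone>
#   git bisect good <commit-where-underflow-still-occurs>
#   ... build, boot, wait past the governor switch, save dmesg ...
#   bisect_verdict dmesg.txt && git bisect good || git bisect bad
```

A kernel that fails to boot for an unrelated reason also produces no underflow message, so such steps are better handled with "git bisect skip" than judged by the helper. Newer git releases can avoid the mental inversion with "git bisect start --term-old=broken --term-new=fixed".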

Revision history for this message
Tom Gall (tom-gall) wrote :

closing out / clean up.

Changed in linux-linaro:
status: Fix Committed → Fix Released