gcc 4.5 breaks overo in latest x-loader (1.5.1)

Bug #813018 reported by Andy Doan
This bug affects 2 people
Affects            Status        Importance  Assigned to      Milestone
Linaro GCC         Won't Fix     Undecided   Unassigned
Linaro Ubuntu      Fix Released  Medium      Ricardo Salveti
x-loader (Ubuntu)  Confirmed     Medium      Unassigned

Bug Description

It sounds like all of OMAP3 might be broken, but I know Overo is. This is all I get (it never gets to u-boot):
@ @ @D��` @ @

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

The package basically tracks upstream, so I guess we can bisect it to see what broke Overo.

If you check https://gitorious.org/x-loader/x-loader/commits/master, you'll see a series of patches that try to consolidate OMAP3 support by removing duplicated code, so I believe that is what broke Overo support.

Since you have the board on hand, can you try to bisect it?

Revision history for this message
Andy Doan (doanac) wrote :

I've bisected the problem. The offending commit is:

  5383891: OMAP3: Move get_sys_clkin_sel() function to not duplicate code

Revision history for this message
Andy Doan (doanac) wrote :

This is starting to look like some type of problem related to linking. I'll try to describe what I've found:

= Method 1 (WORKS)
I took the previous get_sys_clkin_sel function from overo.c and added it back where it previously was (i.e. just before the prcm_init function). This would of course cause a symbol collision, so I renamed the function to get_sys_clkin_sela.

This boots and works, unless I remove the logic from the function.

= Method 2 (FAILS)
I took the same get_sys_clkin_sel function above and appended it to the end of the file and again named it get_sys_clkin_sela. This would not boot.

I'm using Natty with:
 arm-linux-gnueabi-gcc (Ubuntu/Linaro 4.5.2-8ubuntu3) 4.5.2

Revision history for this message
Andy Doan (doanac) wrote :

I've narrowed the problem down more. I changed my compiler to be:

  arm-linux-gnueabi-gcc (Ubuntu/Linaro 4.4.5-15ubuntu1) 4.4.5

and everything works

Revision history for this message
Andy Doan (doanac) wrote :

Linaro GCC 4.6 appears to work as well:
  arm-linux-gnueabi-gcc (Ubuntu/Linaro 4.6.0-14ubuntu1~ppa5) 4.6.1

summary: - Overo broken in: Version 1.5.1+git20110715+fca7cd2-1ubuntu1
+ gcc 4.5 breaks overo in latest x-loader (1.5.1)
Revision history for this message
Michael Hope (michaelh1) wrote :

I had a quick look at the disassembly of board/overo/overo.c both at fca7cd29b6821df3e7d8c4369522f2a3d01a5d7b and with Andy's work-around of duplicating get_sys_clkin_sel(), calling it get_sys_clkin_sela(), and putting it directly ahead of prcm_init(). I saw no unexplained difference between the two versions of prcm_init().

BTW, I noticed that you're compiling with -marm. Switching to -mthumb shrinks the .bin file from 23468 bytes to 17668 bytes. The binary also includes the fairly expensive, optimised-for-speed versions of __aeabi_uidiv, __aeabi_uidivmod, __aeabi_idiv, __aeabi_idivmod, __aeabi_ldiv0, and __aeabi_idiv0.
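
For context on the size remark: the __aeabi_*div* routines are runtime helpers that GCC calls for integer division on ARM cores without a hardware divide instruction. The sketch below shows what a compact, size-oriented shift-and-subtract divider can look like. It is only an illustration of the size/speed trade-off, not the libgcc implementation, and small_udiv is a hypothetical name that does not exist in x-loader.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of x-loader or libgcc):
 * restoring shift-and-subtract unsigned division.
 * Returns num / den and stores num % den in *rem when rem is non-NULL.
 */
uint32_t small_udiv(uint32_t num, uint32_t den, uint32_t *rem)
{
    uint32_t quot = 0;
    uint64_t r = 0;   /* 64-bit so that (r << 1) cannot overflow */
    int bit;

    assert(den != 0); /* division by zero is a caller error in this sketch */

    for (bit = 31; bit >= 0; bit--) {
        /* Bring down the next bit of the dividend. */
        r = (r << 1) | ((num >> bit) & 1u);
        if (r >= den) {
            r -= den;
            quot |= (uint32_t)1 << bit;
        }
    }

    if (rem)
        *rem = (uint32_t)r;
    return quot;
}
```

A loop like this is much slower per call than an unrolled, speed-tuned routine, which is the usual trade-off when binary size matters more than division throughput.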

Also the -ffixed-r8 seems redundant. u-boot uses this for some type of platform pointer, but it's not used at all in x-load.

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Andy, for Ubuntu LEB I could just force gcc 4.4 to be used for Natty and gcc 4.6 for Oneiric, but it would be good if you could do some more testing to see if we can identify what is breaking your x-loader. Can you also disassemble your binaries to check if there are any relevant changes?

Changed in linaro-ubuntu:
milestone: none → 11.07
status: New → Confirmed
Changed in x-loader (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
Changed in linaro-ubuntu:
importance: Undecided → Medium
Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Michael, -marm was enabled because we had issues with Thumb in the past. The commit that changed it:

commit bcd83f847520a6e92fc3e8d2934f7e1c75380406
Author: Loïc Minier <email address hidden>
Date: Mon Mar 14 13:01:36 2011 +0530

    Add -marm -fno-stack-protector to CFLAGS on ARM

    The Linaro-based arm-linux-gnueabi cross-compiler in Ubuntu defaults to
    Thumb 2 and enables the stack protector by default, both of which can
    break x-loader at runtime (the former breaks the build).

Maybe we could experiment with Thumb again, but it would be good to test it on all boards.

Revision history for this message
Andy Doan (doanac) wrote :

I did a little more experimenting and have simplified the reproduction. The following change makes the problem go away:
##########################################
diff --git a/cpu/omap3/sys_info.c b/cpu/omap3/sys_info.c
index 0d592e9..4a172a8 100644
--- a/cpu/omap3/sys_info.c
+++ b/cpu/omap3/sys_info.c
@@ -246,6 +246,8 @@ u32 get_sysboot_value(void)
  */
 void get_sys_clkin_sel(u32 osc_clk, u32 *sys_clkin_sel)
 {
+ static int andy_var;
+ andy_var++;
        if (osc_clk == S38_4M)
                *sys_clkin_sel = 4;
        else if (osc_clk == S26M)
##########################################
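
What makes this diff interesting is that the added counter cannot change what the function computes; it can only change the generated code and its layout. A minimal host-side sketch of that claim follows. The two function names are hypothetical, the numeric values of S38_4M and S26M are assumptions (the diff does not show them), and only the two branches visible in the diff are reproduced.

```c
#include <stdint.h>

typedef uint32_t u32;

/* Assumed values in Hz; the real definitions are not shown in the diff. */
#define S26M   26000000u
#define S38_4M 38400000u

/* Function as it appears before the diff (visible branches only). */
void clkin_sel_plain(u32 osc_clk, u32 *sys_clkin_sel)
{
    if (osc_clk == S38_4M)
        *sys_clkin_sel = 4;
    else if (osc_clk == S26M)
        *sys_clkin_sel = 3;
    /* remaining branches are not shown in the diff and are omitted here */
}

/* Same function with the counter from the diff added. */
void clkin_sel_with_counter(u32 osc_clk, u32 *sys_clkin_sel)
{
    static int andy_var; /* side effect only; never read by the logic */
    andy_var++;

    if (osc_clk == S38_4M)
        *sys_clkin_sel = 4;
    else if (osc_clk == S26M)
        *sys_clkin_sel = 3;
    /* remaining branches are not shown in the diff and are omitted here */
}
```

Both variants write the same value for every input, so a boot that succeeds only with the counter present points at code generation or placement rather than at the C logic.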

I don't really have any experience reading disassemblies, but nothing seems obviously wrong to me.

I've attached the two disassemblies for reference.

Revision history for this message
Andy Doan (doanac) wrote :

I don't do assembly, but I hope this helps narrow down the issue:

I did some build hacks:
 1) straight 4.5 compiled version
 2) 4.5 for everything except a special rule to build sys_info.c with 4.6

Attempt #1 fails and attempt #2 works. So now I have two sys_info.o files that I can interchange to toggle the issue. I compared their disassemblies and noticed this difference:

= GOOD =
00000000 <sr32>:
   0: e590c000 ldr ip, [r0]
   4: e92d4010 push {r4, lr}
   8: e3e04000 mvn r4, #0
   c: e1e04214 mvn r4, r4, lsl r2
= BAD =
   0: e92d4010 push {r4, lr}
   4: e3e04000 mvn r4, #0
   8: e590c000 ldr ip, [r0]
   c: e1e04214 mvn r4, r4, lsl r2

There were also small differences in "secure_unlock" and "try_unlock_memory", but I thought they looked close enough. I ran a hex editor and adjusted my bad .o to match the snippet above from my good .o. The system now boots with this hacked file.
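
For readers who do not read ARM assembly: in both listings the pair of mvn instructions builds a mask of r2 low bits as ~(~0 << r2), and ldr ip, [r0] reads the current register value. The two sequences differ only in when that load is issued. A host-side C sketch of the same read-modify-write is given below. It assumes sr32 takes (address, start bit, number of bits, value), which matches the register usage in the listing but is not confirmed by the thread; low_mask and sr32_sketch are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Mask of num_bits low bits, built the way the listing does: ~(~0 << n). */
uint32_t low_mask(uint32_t num_bits)
{
    assert(num_bits < 32);
    return ~(~(uint32_t)0 << num_bits);
}

/*
 * Host-side stand-in for sr32(): replace a bit field inside *reg.
 * Signature and behaviour are assumptions for illustration only.
 */
void sr32_sketch(volatile uint32_t *reg, uint32_t start_bit,
                 uint32_t num_bits, uint32_t value)
{
    uint32_t mask = low_mask(num_bits);
    uint32_t tmp;

    assert(start_bit + num_bits <= 32);

    tmp = *reg;                          /* corresponds to: ldr ip, [r0] */
    tmp &= ~(mask << start_bit);         /* clear the target field */
    tmp |= (value & mask) << start_bit;  /* insert the new field value */
    *reg = tmp;
}
```

On paper the load can be issued before or after the mask is built without changing the result, which is consistent with the thread's later suspicion that the failure depends on code layout rather than on the instruction semantics.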

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Andy, for Linaro Ubuntu 11.07 (natty based) I pushed a new x-loader package forcing it to build with 4.4, can you check if it works for you?

Deb: https://launchpad.net/~linaro-maintainers/+archive/overlay/+files/x-loader-omap3-overo_1.5.1%2Bgit20110715%2Bfca7cd2-1ubuntu2~natty2_armel.deb

At least this way we can release it for now.

Changed in linaro-ubuntu:
assignee: nobody → Ricardo Salveti (rsalveti)
status: Confirmed → Fix Committed
Revision history for this message
Andy Doan (doanac) wrote : Re: [Bug 813018] Re: gcc 4.5 breaks overo in latest x-loader (1.5.1)

On 07/25/2011 11:49 PM, Ricardo Salveti wrote:
> Andy, for Linaro Ubuntu 11.07 (natty based) I pushed a new x-loader
> package forcing it to build with 4.4, can you check if it works for you?

It works. Thanks.

Changed in linaro-ubuntu:
status: Fix Committed → Fix Released
Revision history for this message
Michael Hope (michaelh1) wrote :

Hey, could we get some help on the toolchain side in fixing this? It smells like a linker problem as the code isn't changing but the offsets are.

What's the best way to reproduce this and debug it?

Revision history for this message
Andy Doan (doanac) wrote :

On 08/10/2011 09:41 PM, Michael Hope wrote:
> What's the best way to reproduce this and debug it?

I use the x-loader from gitorious:

 $ git clone git://gitorious.org/x-loader/x-loader.git

I then build it with:

 $ make O=out ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- overo_config
 $ make O=out ARCH=arm CROSS_COMPILE=arm-linux-gnueabi-

The resulting file, out/MLO, can then be copied to your SD card to test.
It either boots or it doesn't.

However - I just re-tested to verify my steps and discovered it now
works. I noticed somewhere along the way my 4.5 compiler was updated to:
  arm-linux-gnueabi-gcc (Ubuntu/Linaro 4.5.3-1ubuntu2~ppa1)

According to the bug notes, it was originally discovered broken in
4.5.2-8ubuntu3. So it appears a fix slipped in between these two builds.

Revision history for this message
Michael Hope (michaelh1) wrote :

I suspect it's a linker problem that only occurs when the generated code lines up just right. A small change in compiler gives a small change in code which hides the linker problem.

I'll see about reproducing that particular compiler. I think I've got a BeagleBoard here somewhere which should boot an Overo MLO.

Revision history for this message
Michael Hope (michaelh1) wrote :

Marking as Won't Fix as this bug is old and has a workaround. Please reopen if the fault appears with later versions of GCC.

Changed in gcc-linaro:
status: New → Won't Fix