Wrong code generated for 179.art with -mfpu=neon -O3 -marm

Bug #709453 reported by Kevin Jaget
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Linaro GCC
Fix Released
High
Andrew Stubbs
Linaro GCC Tracking
Won't Fix
Undecided
Unassigned
gcc-4.5 (Ubuntu)
Fix Released
Undecided
Unassigned

Bug Description

179.art from SPECfp 2000 runs incorrectly when built using "-marm -mcpu=cortex-a8 -mfloat-abi=softfp -mfpu=neon -O3". It works correctly replacing -mfpu=vfp in place of neon. It's also not a problem with thumb-2 code.

It looks like the bad code is in weightadj()

Specifically, the VFP code is :

VLDRGT d3,[r1,#8]
VLDRLE d3,[pc,#228] ; [0x8d90] = 0
VMUL.F64 d5,d5,d3

Basically, load d3 with either a value from memory or 0 depending on the results of a previous compare.

The corresponding code in the NEON binary :

VLDRGT d19,[r1,#8]
VMOV.I32 d19,#0
VMUL.F64 d18,d18,d19

The VMOV.I32 is the NEON equivalent - it's good since it eliminates a memory access. Unfortunately, it should be a conditional VMOV to match the condition VLDR in the vfp code.

This is a problem in the September, October and December releases. I haven't yet caught up to the January release so it may already be fixed?

Changed in gcc-linaro:
status: New → Confirmed
Revision history for this message
Ramana Radhakrishnan (ramana) wrote :

This is still a bug in the January release.

Looking at the dumps it appears as though this is because :

@(insn:TI 202 77 84 /home/ramrad01/benchmarks/cygdrive/d/benchspec/CFP2000/179.art/src/scanner.c:76 (cond_exec (unle (reg:CCFPE 24 cc)
@ (const_int 0 [0x0]))
@ (set (reg/v:DF 101 d19 [orig:244 result ] [244])
@ (const_double:DF 0.0 [0x0.0p+0]))) 3026 {*p *movdf_vfp} (expr_list:REG_DEAD (reg:CCFPE 24 cc)
@ (nil)))
vmov.i32 d19, #0

And not printing the condition code correctly.

Revision history for this message
Ramana Radhakrishnan (ramana) wrote :

There is no encoding for a conditional Neon instruction in
ARM state. Advanced SIMD only instructions can be used conditionally
only in IT blocks in Thumb state and IIRC even that is deprecated.

Thus the compiler shouldn't be using a vmov.i32 instruction in a pattern
that's being made conditional. The same problem is latent with Thumb2 ,
it's just that the compiler hasn't yet put a conditional vmov.i32
in an IT block.

Thus the patch that introduces this must be reverted and the fix
for improving loads of small const doubles which is a separate fix.
More on that in the other thread.

cheers
Ramana

Changed in gcc-linaro:
milestone: none → 4.5-2011.02-0
Michael Hope (michaelh1)
Changed in gcc-linaro:
status: Confirmed → Triaged
importance: Undecided → High
Revision history for this message
Andrew Stubbs (ams-codesourcery) wrote :

Just to confirm ...

Ramana, do you think it is this commit that causes the trouble?
 http://bazaar.launchpad.net/~linaro-toolchain-dev/gcc-linaro/4.5/revision/99350

If so, I can get that reverted right away. It would be better to fix it though, no?

Michael Hope (michaelh1)
Changed in gcc-linaro:
milestone: 4.5-2011.02-0 → 4.5-2011.03-0
Revision history for this message
Ramana Radhakrishnan (ramana) wrote :

Andrew ,

Missed your last comment. Yes that is the commit that needs reverting.

I would probably like to fix it up in a different manner and figure out why case 2 for *movdf_vfp doesn't get triggered and I *think* it's because of a different splitter in the backend that ends up splitting these into smaller moves through integer registers.

Ramana

Revision history for this message
Chung-Lin Tang (cltang) wrote :

I think Jie has a patch for this.
IIUC, the vmov.i32 alternative added to *movdf_vfp in that revision is not correct, it should be a separate pattern with "predicable" as "no".

Revision history for this message
Ramana Radhakrishnan (ramana) wrote :

Hi Chung-Lin,

Yes that is the reason for the original bug but I'm asking about the motivation of the original patch. Why is #0 being special cased when there is code in the ARM backend to generate fconstd's (which is the older syntax for the vmov.f64 instruction). As I said please investigate why that's not getting triggered ? I suspect it is because of a splitter in the backend.

cheers
Ramana

Revision history for this message
Chung-Lin Tang (cltang) wrote :

Hi Ramana,
I asked Julian, who did the VFPv3 support.
The constants supported by fconstd/vmov.f64 does not include 0.0; there's an inverted bit in the immediate expansion that makes a full 64-bit zero impossible to encode. It probably has to be done by the NEON integer vmov insns.

Revision history for this message
Michael Hope (michaelh1) wrote :

Assigned to Andrew to revert. Chung-Lin will open a new ticket if required when doing the actual fix.

See the notes at:
 https://wiki.linaro.org/WorkingGroups/ToolChain/Meetings/2011-02-28

especially that this patch is not a win on the A9.

Changed in gcc-linaro:
assignee: nobody → Andrew Stubbs (ams-codesourcery)
Revision history for this message
Andrew Stubbs (ams-codesourcery) wrote :

I'm testing this now.

Changed in gcc-linaro:
status: Triaged → In Progress
Revision history for this message
Andrew Stubbs (ams-codesourcery) wrote :

Patch now reverted.

Changed in gcc-linaro:
status: In Progress → Fix Committed
Revision history for this message
Andrew Stubbs (ams-codesourcery) wrote :
Changed in gcc-linaro-tracking:
milestone: none → rejected
status: New → Won't Fix
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gcc-4.5 - 4.5.2-5ubuntu1

---------------
gcc-4.5 (4.5.2-5ubuntu1) natty; urgency=low

  * For the natty release, do not pass --as-needed by default to the linker.
    The --as-needed flag will be default again for the o-series.

gcc-4.5 (4.5.2-5) unstable; urgency=low

  * Update to SVN 20110305 (r170696) from the gcc-4_5-branch.
    - Fix PR target/43810, PR fortran/47886, PR tree-optimization/47615,
      PR middle-end/47639, PR tree-optimization/47890, PR libfortran/47830,
      PR tree-optimization/46723, PR target/45261, PR target/45808,
      PR c++/46159, PR c++/47904, PR fortran/47886, PR libstdc++/47433,
      PR target/42240, PR fortran/47878, PR libfortran/47694.
  * Update the Linaro support to the 4.5-2011.03-0 release.
    - Fix LP: #705689, LP: #695302, LP: #710652, LP: #710623, LP: #721021,
      LP: #721021, LP: #709453.

gcc-4.5 (4.5.2-4) unstable; urgency=low

  * Update to SVN 20110222 (r170382) from the gcc-4_5-branch.
    - Fix PR target/43653, PR fortran/47775, PR target/47840,
      PR libfortran/47830.

  [ Matthias Klose ]
  * Don't apply a patch twice.
  * Build libgcc_s with -fno-stack-protector, when not building from the
    Linaro branch.
  * Backport proposed fix for PR tree-optimization/46723 from the trunk.

  [ Steve Langasek ]
  * debian/control.m4: add missing Multi-Arch: same for libgcc4; make sure
    Multi-Arch: same doesn't get set for libmudflap when building an
    Architecture: all cross-compiler package.
  * debian/rules2: use $libdir for libiberty.a.
  * debian/patches/gcc-multiarch-*.diff: make sure we're using the same
    set_multiarch_path definition for all variants.

  [ Sebastian Andrzej Siewior ]
  * PR target/44364
  * Remove -many on powerpcspe (__SPE__)
  * Remove classic FPU opcodes from libgcc if target has no support for them
    (powerpcspe)
 -- Matthias Klose <email address hidden> Sun, 06 Mar 2011 12:22:47 +0100

Changed in gcc-4.5 (Ubuntu):
status: New → Fix Released
Michael Hope (michaelh1)
Changed in gcc-linaro:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.