Opened 3 years ago

Closed 9 months ago

#7830 closed bug (fixed)

Error: operand out of range

Reported by: erikd Owned by:
Priority: highest Milestone: 7.10.3
Component: Compiler Version: 7.8.1-rc1
Keywords: Cc: pho@…, ptrommler@…
Operating System: Linux Architecture: powerpc
Type of failure: Installing GHC failed Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D1296
Wiki Page:

Description

Compiling on linux-powerpc and linux-powerpc64 I get:

"inplace/bin/ghc-stage1" -hisuf hi -osuf  o -hcsuf hc -static  -H32m -O -Werror -Wall \
   -H64m -O0    -package-name time-1.4.0.2 -hide-all-packages -i -ilibraries/time/.  \
   -ilibraries/time/dist-install/build -ilibraries/time/dist-install/build/autogen \
   -Ilibraries/time/dist-install/build -Ilibraries/time/dist-install/build/autogen \
   -Ilibraries/time/include   -optP-DLANGUAGE_Rank2Types -optP \
   DLANGUAGE_DeriveDataTypeable -optP-DLANGUAGE_StandaloneDeriving -optP-include \
   -optPlibraries/time/dist-install/build/autogen/cabal_macros.h -package base-4.7.0.0 \
   -package deepseq-1.3.0.2 -package old-locale-1.0.0.5 -Wall -XHaskell2010 -XRank2Types \
   -XDeriveDataTypeable -XStandaloneDeriving -XCPP -O2 -O -dcore-lint \
   -fno-warndeprecated-flags  -no-user-package-db -rtsopts -fno-warn-unused-do-bind \
   -fno-warn-deprecations -fno-warn-unused-imports -fno-warn-identities     -odir \
   libraries/time/dist-install/build -hidir libraries/time/dist-install/build -stubdir \ 
   libraries/time/dist-install/build  -dynamic-too -c libraries/time/./Data/Time/Format.hs \
   -o libraries/time/dist-install/build/Data/Time/Format.o -dyno \
   libraries/time/dist-install/build/Data/Time/Format.dyn_o
/tmp/ghc2806_0/ghc2806_1.s: Assembler messages:

/tmp/ghc2806_0/ghc2806_1.s:51766:0:
     Error: operand out of range (0x000000000000adf8 is not between 0xffffffffffff8000 and 0x0000000000007fff)

/tmp/ghc2806_0/ghc2806_1.s:51798:0:
     Error: operand out of range (0x000000000000ad90 is not between 0xffffffffffff8000 and 0x0000000000007fff)

/tmp/ghc2806_0/ghc2806_1.s:51830:0:
     Error: operand out of range (0x000000000000ad28 is not between 0xffffffffffff8000 and 0x0000000000007fff)

/tmp/ghc2806_0/ghc2806_1.s:51908:0:
     Error: operand out of range (0x000000000000ac1c is not between 0xffffffffffff8000 and 0x0000000000007fff)

Attachments (3)

0001-PPC-Fix-loads-of-PIC-data-for-offsets-32-k.patch (2.3 KB) - added by erikd 3 years ago.
Quick and dirty hack to fix 32 bit offset limitation.
0001-nativeGen-PPC-fix-16-bit-offsets-in-stack-handling.patch (9.5 KB) - added by erikd 10 months ago.
Patch for 7.10 branch
0001-Fix-merge-error.patch (681 bytes) - added by trommler 10 months ago.

Download all attachments as: .zip

Change History (51)

comment:1 Changed 3 years ago by igloo

  • difficulty set to Unknown
  • Milestone set to 7.8.1
  • Priority changed from normal to high

See also #7598

comment:2 Changed 3 years ago by erikd

Every instance of this "operand out of range" error involved the exact instruction:

  lwz     30, .Lnwa8-(1b)(31)

The target, .Lnwa8, is at the end of the file on line 65056. The instructions that are generating errors are in the range 51766 to 54954. There are no instances of that instruction before line 51766 and many in the line number range 55747 to 65045.

Obviously this instruction only works for small offsets. Need to read up on lwz and similar instructions.

comment:3 Changed 3 years ago by PHO

  • Cc pho@… added

comment:4 follow-up: Changed 3 years ago by PHO

Duplicate of #7814?

comment:5 in reply to: ↑ 4 Changed 3 years ago by heisenbug

Replying to PHO:

Duplicate of #7814?

Surely not a duplicate, but I have seen this too:

http://www.haskell.org/pipermail/ghc-devs/2013-April/000993.html

comment:6 Changed 3 years ago by erikd

The lwz instrunction only allows signed 16 bit offsets. Need to find something else.

comment:7 Changed 3 years ago by erikd

By hacking in a bunch of Show instances I've manged to find a pattern match in the PPC aseembler pretty printer for the code which pretty prints the problematic LWZ instruction:

LD II32

(RegReal (RealRegSingle 30)) (AddrRegImm (RegReal (RealRegSingle 31))

(ImmConstantDiff (ImmCLbl (AsmTempLabel _)) (ImmCLbl PicBaseLabel)))

I'm now going to try and make a temporary hack to the pprInst function to handle this special case differently and when that works, try to move the fix back into the CodeGen where it belongs.

comment:8 Changed 3 years ago by erikd

I have a rough hack work around for this but its not suitable for commiting to git yet. Basically in the pretty-printer (it really belongs in the codeGen and will be moved there) I rewrite:

    lwz     30, .label-(1b)(31)

to

    addis   30, 31, (.label-(1b))@ha
    lwz     30, (.label-(1b))@l(30)

This compiles but I then get an illegal instruction error in function cr_str. The call stack looks like this:

#0  0x0f3f3e24 in cr_str () from /home/erikd/Git/ghc-upstream/rts/dist/build/libHSrts-ghc7.7.20130623.so
#1  0x0f3df490 in stg_catchzh () from /home/erikd/Git/ghc-upstream/rts/dist/build/libHSrts-ghc7.7.20130623.so
#2  0x0f3cd01c in scheduleWaitThread () from /home/erikd/Git/ghc-upstream/rts/dist/build/libHSrts-ghc7.7.20130623.so
#3  0x0f3c73dc in rts_evalLazyIO () from /home/erikd/Git/ghc-upstream/rts/dist/build/libHSrts-ghc7.7.20130623.so
#4  0x0f3c9300 in hs_main () from /home/erikd/Git/ghc-upstream/rts/dist/build/libHSrts-ghc7.7.20130623.so
#5  0x10006ae4 in main ()

and the disassembled code looks like:

   0x0f3f3dec:  rlwimi  r1,r2,10,0,16
   0x0f3f3df0:  xoris   r2,r27,27237
   0x0f3f3df4:  ori     r20,r27,8293
   0x0f3f3df8:  xoris   r20,r19,25970
   0x0f3f3dfc:  oris    r4,r11,8448
   0x0f3f3e00:  .long 0xfffeae2c
   0x0f3f3e04:  .long 0xfffeae2c
   0x0f3f3e08:  .long 0xfffeae2c
   0x0f3f3e0c:  .long 0xfffeae2c
   0x0f3f3e10:  .long 0xfffeae2c
   0x0f3f3e14:  .long 0xfffeae2c
   0x0f3f3e18:  .long 0xfffeae2c
   0x0f3f3e1c:  .long 0xfffeae2c
   0x0f3f3e20:  .long 0xfffeae2c
=> 0x0f3f3e24:  .long 0xfffeae20
   0x0f3f3e28:  .long 0xfffeae20

with the illegal instruction indicated.

comment:9 Changed 3 years ago by erikd

I've broken my modification to the generated code into a standalone assembler program and stepped through it with gdb. I am now convinced that replacing this:

    bcl     20,31,1f
1:  mflr    31
    lwz     30, .mylabel-(1b)(31)

with this:

    bcl     20,31,1f
1:  mflr    31
    addis   30, 31, (.mylabel-(1b))@h
    lwz	    30, (.mylabel-(1b))@l(30)

is the correct thing to do which means the "illegal instruction error in function cr_str" problem in my last update is another different problem.

comment:10 Changed 3 years ago by erikd

I have applied my quick and dirty hack to allow 32 bit offsets for these load instructions to the ghc-7.6.3 release tarball and resulting compiler works.

Will attach a quick-and-dirty hack solution while I work on the illegal instruction issue.

Changed 3 years ago by erikd

Quick and dirty hack to fix 32 bit offset limitation.

comment:11 Changed 3 years ago by erikd

This patch only does anything useful when mk/build.mk has the following enabled:

GhcLibWays += dyn

comment:12 Changed 2 years ago by trommler

  • Cc ptrommler@… added

comment:13 Changed 2 years ago by erikd

I'd forgotten all about this ticket.

GHC now builds for me so I need to get back onto it.

comment:14 Changed 2 years ago by Erik de Castro Lopo <erikd@…>

In 67029f200c5512f8ba5b9b7c25a5d1131422ef8e/ghc:

PPC: Fix loads of PIC data with > 16 bit offsets (#7830).

Loads should now handle up to 32 bit offsets.

comment:15 Changed 2 years ago by erikd

  • Status changed from new to merge

comment:16 Changed 2 years ago by hvr

  • Priority changed from high to highest

comment:17 Changed 2 years ago by thoughtpolice

  • Version changed from 7.7 to 7.8.1-rc1

comment:18 Changed 2 years ago by thoughtpolice

  • Resolution set to fixed
  • Status changed from merge to closed

Merged in 7.8 for RC2.

comment:19 Changed 11 months ago by nomeata

  • Resolution fixed deleted
  • Status changed from closed to new

It seems that has resurfaced in 7.10, when compiling uuagc on powerpc:

/tmp/ghcf6f1_0/ghc_153.s: Assembler messages:

/tmp/ghcf6f1_0/ghc_153.s:81276:0:
     Error: operand out of range (0x0000000000008000 is not between 0xffffffffffff8000 and 0x0000000000007fff)

/tmp/ghcf6f1_0/ghc_153.s:81278:0:
     Error: operand out of range (0x0000000000008008 is not between 0xffffffffffff8000 and 0x0000000000007fff)

Erik, would you mind having a look?

comment:20 Changed 11 months ago by erikd

That's interesting. Git HEAD started displaying this build failure in my Jenkin instance about a week ago but I haven't had a chance to look at it yet. Weird thing is that the Git 7.10 branch build in my Jenkin instance is not failing.

Unfortunately, I may not get any time to look at this unless after ICFP.

Last edited 11 months ago by erikd (previous) (diff)

comment:21 Changed 11 months ago by thoughtpolice

  • Milestone changed from 7.8.1 to 8.0.1

Moving to 8.0.1 (since presumably we won't fix this in the 7.8 branch :)

comment:22 Changed 11 months ago by erikd

Looked through the git log to see if I could find possible breaking commits. The most obvious one was this:

commit d3c1dda60d0ec07fc7f593bfd83ec9457dfa7984
Author: Peter Trommler <ptrommler@acm.org>
Date:   Fri Jul 3 19:09:06 2015 +0200

Implement PowerPC 64-bit native code backend for Linux

but this commit builds fine. Currently running a git bisect.

comment:23 Changed 11 months ago by nomeata

Currently running a git bisect.

Great, keep us up-to-date.

comment:24 Changed 11 months ago by erikd

According to git bisect, the commit that broke PowerPC was:

commit 5d57087e314bd484dbe14958f9b422be3ac6641a
Author: Thomas Miedema <thomasmiedema@gmail.com>
Date:   Wed Aug 5 11:31:21 2015 +0200

Pretty: fix a broken invariant (#10735)

but since that doesn't touch any of the PowerPC specific code, the PowerPC code gen problem is simply a by-product of this patch.

comment:25 Changed 11 months ago by erikd

The generated code that causes a problem looks like:

	stw	31, 36328(1)
	lwz	31, 67(14)
	stw	29, 36336(1)
	lwz	29, 71(14)
	stw	28, 36344(1)
	lwz	28, 75(14)
	stw	26, 36352(1)

The problem is, that according to https://www-01.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.alangref/idalangref_stw_st_instrs.htm the offset is a signed 16 bit value and 36336 can'nt be represented in a signed 16 bit value.

comment:26 follow-up: Changed 11 months ago by trommler

This looks like code spilling registers onto the stack. But why are the offsets so large? Could you attach the entire assembler file?

I tried to reproduce on powerpc64 HEAD but compile failed to typecheck in package uualib. I'll try to reproduce with the commit you mentioned in comment:24.

comment:27 Changed 11 months ago by erikd

The file is too big to attach (even Bzip2 compressed its over 300k where as the limited is 256k) so I made it available here: http://www.mega-nerd.com/erikd/PprC.s.bz2

comment:28 Changed 10 months ago by erikd

Compiling git HEAD (currently c8d438fb02 but anything after commit 5d57087e31 is basically the same) fails on PowerPC due to the issue in this ticket (comment 19 and later). When it fails, the generated assembler for compiler/cmm/PprC.hs is about 300000 lines of code, but if I compile the tree at commit bcfae08c0b (the one before 5d57087e31) the generated assemble for this file is only 135297 lines of code.

Ignoring the stage1 compiler and looking at compiling compiler/cmm/PprC.hs with ghc-7.8.4 I see:

revision             wc -l PprC.s         lwz_count        stw_count
---------------------------------------------------------------------
commit bcfae08c0b       135297               27751           30921
HEAD (c8d438fb02)       279333               70407           77666

where

   lwz_count = grep lwz PprC.s | wc -l
   stw_count = grep stw PprC.s | wc -l

I think that huge increase in the number of lwz and stw instructions may be due to register spilling as suggested by @trommler.

Just out of curiosity I decided to do the same test on x86_64/linux and found:

revision             wc -l PprC.s
---------------------------------
commit bcfae08c0b       119246
HEAD (c8d438fb02)       246768

a near doubling of the code size.

Is it possible we are inlining something we shouldn't?

Last edited 10 months ago by erikd (previous) (diff)

comment:29 follow-ups: Changed 10 months ago by erikd

I noticed that since the above numbers are using the bootstrap compiler with -O0.

As a test, on PowerPC, changed -O0 to -O2 which cause the generated assembly file to grow to 302144 lines of code.

At @rwbarton's suggestion added NOINLINE pragmas to above_ and beside_ functions in compiler/utils/Pretty.hs and now the assembler file for compiler/cmm/PprC.hs is 158390 lines.

comment:30 Changed 10 months ago by erikd

Since the doubling of generated code size for compiler/cmm/PprC.hs is not PowerPC specific, opened ticket #10878.

comment:31 in reply to: ↑ 29 ; follow-up: Changed 10 months ago by erikd

Replying to erikd:

At @rwbarton's suggestion added NOINLINE pragmas to above_ and beside_ functions in compiler/utils/Pretty.hs and now the assembler file for compiler/cmm/PprC.hs is 158390 lines.

That worked when building the stage1 compiler using the bootstrap compiler, but it fails again when build the stage2 compiler with the stage1 compiler. The assembler file for compiler/cmm/PprC.hs compiled with the stage1 compiler is 220293 lines.

comment:32 in reply to: ↑ 29 Changed 10 months ago by heisenbug

Replying to erikd:

I noticed that since the above numbers are using the bootstrap compiler with -O0.

As a test, on PowerPC, changed -O0 to -O2 which cause the generated assembly file to grow to 302144 lines of code.

At @rwbarton's suggestion added NOINLINE pragmas to above_ and beside_ functions in compiler/utils/Pretty.hs and now the assembler file for compiler/cmm/PprC.hs is 158390 lines.

#8901 is a case where adding NOINLINE pragmas to the time library caused shrinkage in generated .o files by a factor ~5. Originally it also busted the 16-bit offsets on PPC32. Not sure whether the time maintainer followed my suggestions to add the pragmas or if everybody is just waiting for saner GHC inlining... :-(

comment:33 in reply to: ↑ 26 Changed 10 months ago by trommler

Replying to trommler:

This looks like code spilling registers onto the stack. But why are the offsets so large? Could you attach the entire assembler file?

This looks suspicious:

.Lc138b:
	addi	1, 1, -28336

I wonder why GHC allocates ~3500 spill slots (of size 8 bytes each) on top of the roughly 2000 that are allocated in the RTS before we enter Haskell land in StgCRun.

I tried to reproduce on powerpc64 HEAD but compile failed to typecheck in package uualib. I'll try to reproduce with the commit you mentioned in comment:24.

I cannot reproduce the issue on powerpc64 even though it has the same 16-bit displacement restriction.

I did a perf build, perhaps that makes a difference?! Did you use a build.mk file?

comment:34 in reply to: ↑ 31 ; follow-up: Changed 10 months ago by trommler

Replying to erikd:

Replying to erikd:

At @rwbarton's suggestion added NOINLINE pragmas to above_ and beside_ functions in compiler/utils/Pretty.hs and now the assembler file for compiler/cmm/PprC.hs is 158390 lines.

That worked when building the stage1 compiler using the bootstrap compiler, but it fails again when build the stage2 compiler with the stage1 compiler. The assembler file for compiler/cmm/PprC.hs compiled with the stage1 compiler is 220293 lines.

It looks like we have to support 32-bit loads from/stores to the stack too. Fortunately, we are looking at constant offset from the C stack pointer (r1) and the offsets are known at compilation time, so performance of the object code produced by GHC is not going to suffer much if at all.

I could prepare a patch next week and verify on powerpc64. @erikd could you help out building the patch on your machine?

comment:35 in reply to: ↑ 34 Changed 10 months ago by erikd

Replying to trommler:

I could prepare a patch next week and verify on powerpc64. @erikd could you help out building the patch on your machine?

Certainly!

comment:36 Changed 10 months ago by trommler

  • Owner set to trommler

comment:37 Changed 10 months ago by trommler

  • Differential Rev(s) set to Phab:D1296
  • Status changed from new to patch

I have uploaded a first attempt into Phabricator.

It validates on Linux x86_64. On powerpc64 (big endian) I see no more errors than before in validate.

comment:38 Changed 10 months ago by Ben Gamari <ben@…>

In b29f20e/ghc:

nativeGen PPC: fix > 16 bit offsets in stack handling

Implement access to spill slots at offsets larger than 16 bits.
Also allocation and deallocation of spill slots was restricted to
16 bit offsets. Now 32 bit offsets are supported on all PowerPC
platforms.

The implementation of 32 bit offsets requires more than one instruction
but the native code generator wants one instruction. So we implement
pseudo-instructions that are pretty printed into multiple assembly
instructions.

With pseudo-instructions for spill slot allocation and deallocation
we can also implement handling of the back chain pointer according
to the ELF ABIs.

Test Plan: validate (especially on powerpc (32 bit))

Reviewers: bgamari, austin, erikd

Reviewed By: erikd

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D1296

GHC Trac Issues: #7830

comment:39 Changed 10 months ago by bgamari

  • Resolution set to fixed
  • Status changed from patch to closed

Merged.

comment:40 Changed 10 months ago by erikd

  • Milestone changed from 8.0.1 to 7.10.3

Building git HEAD (with @trommler's patch applied) using ghc-7.10.2 still fails because the stage-0 compiler (ghc-7.10.2) doesn't have this patch. However, building git HEAD with ghc-7.8.4 works fine.

Therefore requesting a back port of this patch to the 7.10 branch. Unfortunately @trommler's patch doesn't apply to the 7.10 branch so I'll get something working and attach it here.

Changed 10 months ago by erikd

Patch for 7.10 branch

comment:41 Changed 10 months ago by thomie

  • Status changed from closed to merge

comment:42 Changed 10 months ago by erikd

I've attached a version of @trommler's patch for the 7.10 branch. WIth this patch applied to ghc-7.10.2, that compiler can build git HEAD again.

comment:43 Changed 10 months ago by bgamari

  • Status changed from merge to closed

Thanks erikd! Applied to ghc-7.10.

Last edited 9 months ago by bgamari (previous) (diff)

Changed 10 months ago by trommler

comment:44 Changed 10 months ago by trommler

  • Owner trommler deleted
  • Resolution fixed deleted
  • Status changed from closed to new

Added patch for merge error in commit:e22d7dc.

Please merge onto ghc-7.10 branch.

comment:45 Changed 10 months ago by trommler

  • Status changed from new to patch

comment:46 Changed 10 months ago by trommler

  • Status changed from patch to merge

comment:47 Changed 10 months ago by trommler

Sorry, I hope this is right now. The attached patch 0001-Fix-merge-error.patch needs to be commited onto branch ghc-7.10.

Last edited 10 months ago by trommler (previous) (diff)

comment:48 Changed 9 months ago by bgamari

  • Resolution set to fixed
  • Status changed from merge to closed
Note: See TracTickets for help on using tickets.