Opened 22 months ago

Closed 21 months ago

Last modified 21 months ago

#7885 closed bug (fixed)

LLVM build broken

Reported by: gmainland Owned by:
Priority: normal Milestone:
Component: Compiler (LLVM) Version: 7.7
Keywords: Cc: carter.schonwald@…, dterei
Operating System: Linux Architecture: x86_64 (amd64)
Type of failure: Building GHC failed Test Case:
Blocked By: Blocking:
Related Tickets: #7694, #7874 Differential Revisions:

Description

The LLVM build has been broken for a number of weeks. Unfortunately this happened before I started running nightly LLVM builds. Here is the error I see (taken from the nightly build log):

inplace/bin/dll-split compiler/stage2/build/.depend-v-dyn-p-dyn.haskell "DynFlags" "Annotations Avail Bag BasicTypes Binary Bitmap BlockId BreakArray BufWrite ByteCodeAsm ByteCodeInstr ByteCodeItbls ByteCodeLink CLabel Class CmdLineParser Cmm CmmCallConv CmmExpr CmmInfo CmmMachOp CmmNode CmmType CmmUtils CoAxiom CodeGen.Platform CodeGen.Platform.ARM CodeGen.Platform.NoRegs CodeGen.Platform.PPC CodeGen.Platform.PPC_Darwin CodeGen.Platform.SPARC CodeGen.Platform.X86 CodeGen.Platform.X86_64 Coercion Config Constants CoreArity CoreFVs CoreSubst CoreSyn CoreTidy CoreUnfold CoreUtils CostCentre DataCon Demand Digraph DriverPhases DynFlags Encoding ErrUtils Exception FamInstEnv FastBool FastFunctions FastMutInt FastString FastTypes Fingerprint FiniteMap ForeignCall Hoopl Hoopl.Dataflow HsBinds HsDecls HsDoc HsExpr HsImpExp HsLit HsPat HsSyn HsTypes HsUtils HscTypes Id IdInfo IfaceSyn IfaceType InstEnv InteractiveEvalTypes Kind ListSetOps Literal Maybes MkCore MkGraph MkId Module MonadUtils 
 Name NameEnv NameSet ObjLink OccName OccurAnal OptCoercion OrdList Outputable PackageConfig Packages Pair Panic Platform PlatformConstants PprCmm PprCmmDecl PprCmmExpr PprCore PrelNames PrelRules Pretty PrimOp RdrName Reg RegClass Rules SMRep Serialized SrcLoc StaticFlags StgCmmArgRep StgCmmClosure StgCmmEnv StgCmmLayout StgCmmMonad StgCmmProf StgCmmTicky StgCmmUtils StgSyn Stream StringBuffer TcEvidence TcType TyCon Type TypeRep TysPrim TysWiredIn Unify UniqFM UniqSet UniqSupply Unique Util Var VarEnv VarSet"
dll-split: internal error: scavenge_one: strange object -385875968
    (GHC version 7.7.20130502 for x86_64_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
Aborted (core dumped)
make[1]: *** [compiler/stage2/dll-split.stamp] Error 134
make: *** [all] Error 2
Command exited with non-zero status 2
4506.92user 192.03system 1:19:51elapsed 98%CPU (0avgtext+0avgdata 4147968maxresident)k
1400inputs+13715568outputs (3major+33440223minor)pagefaults 0swaps
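
For readers decoding the log above: the `Error 134` from make is consistent with the `Aborted (core dumped)` line, assuming the usual POSIX shell convention that a process killed by a signal is reported as 128 plus the signal number. A minimal sketch of that arithmetic (the helper name `describe_exit_status` is made up for illustration):

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status.

    Assumption: POSIX shells report death-by-signal as 128 + signal number.
    """
    if status > 128:
        try:
            return f"killed by {signal.Signals(status - 128).name}"
        except ValueError:
            # No signal with that number on this platform.
            return f"exit status {status} (no matching signal)"
    return f"exited with status {status}"


# 134 - 128 = 6, which is SIGABRT on Linux.
print(describe_exit_status(134))
```

So the build stopped because dll-split aborted itself after the RTS internal error, not because of an ordinary non-zero return code.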

Change History (20)

comment:1 Changed 22 months ago by carter

I reported what is possibly the same problem a few days ago.

other ticket

Apparently part of the problem is LLVM 3.2? (at least according to thoughtpolice)

comment:2 Changed 22 months ago by gmainland

#7694 is the ticket for 3.2, which I filed. At the time LLVM 3.1 and 3.3svn worked to bootstrap GHC, but 3.2 didn't.

The problem is now more serious, as GHC no longer bootstraps with LLVM 3.1 (the nightly builds use 3.1).

comment:3 Changed 22 months ago by carter

Do we know at which commit these problems started?

comment:4 Changed 22 months ago by gmainland

No.

comment:5 Changed 22 months ago by carter

  • Cc carter.schonwald@… added

comment:6 Changed 22 months ago by thoughtpolice

comment:7 Changed 22 months ago by carter

I'll start trying to git bisect to the commit that started it all. Step 1: I'm tracking down a commit state that I can build via LLVM using LLVM 3.2, and then binary searching forward from there.
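
The binary search described above can be sketched as follows. This is only an illustration of the idea behind bisection, not of how `git bisect` is implemented; the commit names and the `is_good` predicate are hypothetical stand-ins for "check out this state and see whether it bootstraps".

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the first commit for which is_good() is False.

    Assumptions: commits are in chronological order, commits[0] is good,
    commits[-1] is bad, and once the build breaks it stays broken.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


# Hypothetical history in which the build breaks at "c5".
history = [f"c{i}" for i in range(8)]
culprit = first_bad_commit(history, lambda c: history.index(c) < 5)
print(culprit)
```

Each probe is a full build, so the point of halving the range is that roughly log2(n) builds are needed rather than n.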

comment:8 Changed 22 months ago by gmainland

As mentioned above, 3.2 never worked to bootstrap GHC, and then at some point 3.1 stopped working too. You're welcome to try and find a commit state that bootstraps with 3.2, but I suspect that you will have more luck finding one that bootstraps with 3.1.

comment:9 Changed 22 months ago by carter

OK, so if you check out the GHC code at hash be66c4ef548a06a3eecc8e1eee0957884ce6ec98 (committed by Ian Lynagh on April 21st at 12:30pm), GHC builds via LLVM (yes, LLVM 3.2). I'll run the test suite on that version to make sure everything is kosher.

The make settings I used are below (this flavour isn't in the standard sample make file). They are chosen mostly to speed up the build process.

I'll check that current HEAD fails with those settings, then start the git bisection incrementally over the next day or so.

BuildFlavour = quickest-llvm

# -------- A Fast build With LLVM------------------------------------------------------

ifeq "$(BuildFlavour)" "quickest-llvm"

SRC_HC_OPTS        = -H64m -O0 -fllvm
GhcStage1HcOpts    = -O -fllvm
GhcStage2HcOpts    = -O0 -fllvm
GhcLibHcOpts       = -O0 -fllvm
SplitObjs          = NO
HADDOCK_DOCS       = NO
BUILD_DOCBOOK_HTML = NO
BUILD_DOCBOOK_PS   = NO
BUILD_DOCBOOK_PDF  = NO

endif

comment:10 Changed 22 months ago by carter

Here are my test suite results for make WAYS=optllvm:

Unexpected results from:
TEST="T5313 conc058 conc068 conc065 conc066 conc020 conc035 conc015 conc017 qq008 qq007 space_leak_001"

OVERALL SUMMARY for test run started at Sun May  5 20:03:06 EDT 2013
    3628 total tests, which gave rise to
   14341 test cases, of which
   12756 were skipped

      23 had missing libraries
    1536 expected passes
      14 expected failures

       0 caused framework failures
       1 unexpected passes
      11 unexpected failures

Unexpected passes:
   driver  T5313 (optllvm)

Unexpected failures:
   concurrent/should_run  conc015 [exit code non-0] (optllvm)
   concurrent/should_run  conc017 [exit code non-0] (optllvm)
   concurrent/should_run  conc020 [exit code non-0] (optllvm)
   concurrent/should_run  conc035 [exit code non-0] (optllvm)
   concurrent/should_run  conc058 [exit code non-0] (optllvm)
   concurrent/should_run  conc065 [exit code non-0] (optllvm)
   concurrent/should_run  conc066 [exit code non-0] (optllvm)
   concurrent/should_run  conc068 [exit code non-0] (optllvm)
   perf/space_leaks       space_leak_001 [stat too good] (optllvm)
   quasiquotation/qq007   qq007 [exit code non-0] (optllvm)
   quasiquotation/qq008   qq008 [exit code non-0] (optllvm)
 

comment:11 Changed 22 months ago by carter

I had no LLVM build failures with master HEAD commit c041b6205936ce32dba0c7a41332650ee6d2daab.
Here are the test results for make WAY=optllvm:

Unexpected results from:
TEST="T5313 conc058 conc068 conc065 conc066 conc020 conc035 conc015 conc017 qq008 qq007 space_leak_001"

OVERALL SUMMARY for test run started at Sun May  5 22:21:05 EDT 2013
    3628 total tests, which gave rise to
   14341 test cases, of which
   12756 were skipped

      23 had missing libraries
    1536 expected passes
      14 expected failures

       0 caused framework failures
       1 unexpected passes
      11 unexpected failures

Unexpected passes:
   driver  T5313 (optllvm)

Unexpected failures:
   concurrent/should_run  conc015 [exit code non-0] (optllvm)
   concurrent/should_run  conc017 [exit code non-0] (optllvm)
   concurrent/should_run  conc020 [exit code non-0] (optllvm)
   concurrent/should_run  conc035 [exit code non-0] (optllvm)
   concurrent/should_run  conc058 [exit code non-0] (optllvm)
   concurrent/should_run  conc065 [exit code non-0] (optllvm)
   concurrent/should_run  conc066 [exit code non-0] (optllvm)
   concurrent/should_run  conc068 [exit code non-0] (optllvm)
   perf/space_leaks       space_leak_001 [stat too good] (optllvm)
   quasiquotation/qq007   qq007 [exit code non-0] (optllvm)
   quasiquotation/qq008   qq008 [exit code non-0] (optllvm)
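
The two summaries above list the same unexpected results. A quick way to confirm that mechanically is to compare the `TEST="..."` lines as sets; the sketch below does that, with `unexpected_tests` being a made-up helper and the two strings copied from the summaries in comments 10 and 11.

```python
import re
from typing import Set


def unexpected_tests(summary: str) -> Set[str]:
    """Extract test names from the TEST="..." line of a testsuite summary."""
    match = re.search(r'^TEST="([^"]*)"', summary, flags=re.MULTILINE)
    return set(match.group(1).split()) if match else set()


run_a = '''Unexpected results from:
TEST="T5313 conc058 conc068 conc065 conc066 conc020 conc035 conc015 conc017 qq008 qq007 space_leak_001"
'''
run_b = '''Unexpected results from:
TEST="T5313 conc058 conc068 conc065 conc066 conc020 conc035 conc015 conc017 qq008 qq007 space_leak_001"
'''

# Tests that are unexpected in one run but not in the other.
only_in_one = unexpected_tests(run_a) ^ unexpected_tests(run_b)
print(sorted(only_in_one))
```

An empty difference means the two builds behave identically as far as the optllvm way is concerned.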

Next I'll try a clean build of HEAD with quick-llvm and see whether that build fails (either a commit in the past day or so has fixed things, or the bug comes from the combination of LLVM + optimization + the dynamic linking bits?).

As mentioned before, this is all using the LLVM 3.2 release on OS X, installed via Homebrew.

comment:12 Changed 22 months ago by gmainland

Unfortunately, a git hash of the ghc repository is not a complete repository state: one also needs the hashes of all the library repositories (see utils/fingerprint.py, http://hackage.haskell.org/trac/ghc/wiki/Building/GettingTheSources#Trackingthefullrepositorystate).
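
To illustrate why a single hash is not enough, here is a minimal sketch that treats a full state as a mapping from repository path to commit hash and reports which repositories differ between two states. The mapping layout, the helper name, and the placeholder hashes are all assumptions made for illustration; this is not the format that utils/fingerprint.py reads or writes.

```python
from typing import Dict, List


def differing_repos(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """Repos whose commit differs between two states, or that exist in only one."""
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))


# Placeholder hashes, not real commits: same ghc commit, different library commit.
state_a = {".": "aaa1", "libraries/base": "bbb1"}
state_b = {".": "aaa1", "libraries/base": "bbb2"}

changed = differing_repos(state_a, state_b)
print(changed)
```

Two checkouts can agree on the top-level ghc hash and still differ in a library, which is why a bisection over the ghc hash alone may not reproduce an old build.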

The LLVM 3.2 bootstrap bug prevents ghc from even building. Is that not the failure you are seeing? It looks like you are just seeing some tests fail. The reported bootstrap failure occurred on Linux x64, so you may not see it at all.

For reference, here is the log of the failed perf-llvm build from April 21:

http://www.haskell.org/pipermail/ghc-builds/2013-April/000808.html

It contains a fingerprint, so you can completely reproduce the checkout state and see if that version of GHC+libraries bootstraps with LLVM 3.2 for you on Mac OS X.

comment:13 Changed 22 months ago by carter

It looks like HEAD builds / bootstraps via LLVM fine as long as stage 2 and the libs aren't optimized.

When I did the "quick llvm" build against HEAD and the libs as of yesterday, I had the segfault build failure for stage 2 that I reported in the other ticket, http://hackage.haskell.org/trac/ghc/ticket/7874. The only difference between that and the successful build was that the libraries were built with -O rather than -O0. So it might be a linker-related issue that only happens when the LLVM build is doing optimization?

I'll do a test build at that fingerprint state in the next day or so, after I get some other work done.

comment:14 Changed 21 months ago by dterei

  • Cc dterei added

comment:15 Changed 21 months ago by carter

Has anyone dug into this more recently, or tried doing the build now that LLVM 3.3 is out?

I'll try to do a test build with that soon if that's still needed.

comment:16 Changed 21 months ago by dterei

I'm going to try to look into this soon, but I struggle to find time these days to fix problems like this, so don't hold your breath :).

LLVM 3.3 isn't out yet; it comes out in about a week.

comment:17 Changed 21 months ago by gmainland

I believe the culprit was the switch to a dynamic GHC/GHCi. The LLVM back end does not yet work when building dynamically. I've changed mk/build.mk.sample to include the following bits for the perf-llvm and quick-llvm build flavours.

DYNAMIC_BY_DEFAULT   = NO
DYNAMIC_GHC_PROGRAMS = NO

With this change, both llvm build flavours now work for me on Linux x86_64. We'll see how the builders do tonight.
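
A small sketch for checking that a local build.mk carries both settings. It assumes only plain `NAME = value` lines and ignores make conditionals, `:=`, and `+=`; the helper name `parse_assignments` is made up for illustration.

```python
from typing import Dict


def parse_assignments(text: str) -> Dict[str, str]:
    """Collect simple NAME = value assignments, skipping comments.

    Assumption: no ifeq/endif handling; later assignments overwrite earlier ones.
    """
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" in line:
            name, value = line.split("=", 1)
            settings[name.strip()] = value.strip()
    return settings


fragment = """
DYNAMIC_BY_DEFAULT   = NO
DYNAMIC_GHC_PROGRAMS = NO
"""

settings = parse_assignments(fragment)
static_build = all(
    settings.get(name) == "NO"
    for name in ("DYNAMIC_BY_DEFAULT", "DYNAMIC_GHC_PROGRAMS")
)
print(static_build)
```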

comment:18 Changed 21 months ago by gmainland

  • Resolution set to fixed
  • Status changed from new to closed

comment:19 Changed 21 months ago by carter

Building quick-llvm crashes hard and early with LLVM HEAD (3.4svn); see http://hackage.haskell.org/trac/ghc/ticket/7996

I'm testing a quick-llvm build with llvm-3.2 right now to see how that works out, and will report back once it's done.

comment:20 Changed 21 months ago by gmainland

It's important to distinguish between LLVM-generated code crashing and GHC generating LLVM code that LLVM doesn't like. #7996 is an example of the latter case. LLVM 3.2, on the other hand, generates bad binary code, which is a very different thing.

You need to update HEAD. See my response to #7996.
