Opened 5 years ago

Closed 5 years ago

#7580 closed bug (fixed)

Building PrimOps.cmm on OS X with LLVM 3.2 fails

Reported by: thoughtpolice Owned by: thoughtpolice
Priority: normal Milestone:
Component: Compiler (LLVM) Version: 7.7
Keywords: Cc: dterei, simonmar,…, bgamari@…
Operating System: MacOS X Architecture: Unknown/Multiple
Type of failure: Building GHC failed Test Case:
Blocked By: Blocking: #7588, #7590
Related Tickets: #7571, #7575 Differential Rev(s):
Wiki Page:


Using LLVM 3.2 (release,) or LLVM 3.3svn, on my Mac OS X 10.8 machine, the stage1 compiler fails to compile PrimOps.cmm with this error:

$ make
===--- building phase 0
make -r --no-print-directory -f phase=0 phase_0_builds
make[1]: Nothing to be done for `phase_0_builds'.
===--- building phase 1
make -r --no-print-directory -f phase=1 phase_1_builds
make[1]: Nothing to be done for `phase_1_builds'.
===--- building final phase
make -r --no-print-directory -f phase=final all
"inplace/bin/ghc-stage1" -static -optc-DDEBUG  -H64m -O0 -fllvm -Iincludes -Iincludes/dist -Iincludes/dist-derivedconstants/header -Iincludes/dist-ghcconstants/header -Irts -Irts/dist/build -DCOMPILING_RTS -package-name rts  -dcmm-lint  -DDTRACE     -i -irts -irts/dist/build -irts/dist/build/autogen -Irts/dist/build -Irts/dist/build/autogen            -O0    -c rts/PrimOps.cmm -o rts/dist/build/PrimOps.debug_o
/usr/local/bin/opt: /var/folders/f6/rjtvxfp92j3ffvm3zs7hv7vh0000gn/T/ghc99887_0/ghc99887_0.ll:6800:17: error: invalid cast opcode for cast from 'i64' to 'i64'
  %ln1N5 = zext i64 1 to i64

This seems to occur inside stg_retryzh, which looks like:

==================== LLVM Code ====================
define  cc 10 void @stg_retryzh(i64* noalias nocapture %Base_Arg, i64* noalias nocapture %Sp_Arg, i64* noalias nocapture %Hp_Arg, i64 %R1_Arg, i64 %R2_Arg, i64 %R3_Arg, i64 %R4_Arg, i64 %R5_Arg, i64 %R6_Arg, i64 %SpLim_Arg) align 8 nounwind
  %ln1MS = call ccc i8* (i8*,i8*)* @stmStartTransaction( i8* %ln1MP, i8* %ln1MR ) nounwind
  %ln1MT = ptrtoint i8* %ln1MS to i64
  store i64 %ln1MT, i64* %lccV
  %ln1MU = load i64** %Base_Var
  %ln1MV = getelementptr inbounds i64* %ln1MU, i32 25
  %ln1MW = bitcast i64* %ln1MV to i64*
  %ln1MX = load i64* %ln1MW, !tbaa !4
  %ln1MY = add i64 %ln1MX, 8
  %ln1MZ = add i64 %ln1MY, 72
  %ln1N0 = load i64* %lccV
  %ln1N1 = inttoptr i64 %ln1MZ to i64*
  store i64 %ln1N0, i64* %ln1N1, !tbaa !5
  %ln1N2 = load i64* %lccU
  %ln1N3 = add i64 %ln1N2, 8
  %ln1N4 = add i64 %ln1N3, 0
  %ln1N5 = zext i64 1 to i64                <------ Invalid
  %ln1N6 = inttoptr i64 %ln1N4 to i64*
  store i64 %ln1N5, i64* %ln1N6, !tbaa !5
  %ln1N7 = load i64* %lccU
  %ln1N8 = add i64 %ln1N7, 8
  %ln1N9 = add i64 %ln1N8, 16
  %ln1Na = inttoptr i64 %ln1N9 to i64*
  %ln1Nb = load i64* %ln1Na, !tbaa !5
  store i64 %ln1Nb, i64* %R1_Var
  %ln1Nc = load i64** %Base_Var
  %ln1Nd = load i64** %Sp_Var
  %ln1Ne = load i64** %Hp_Var
  %ln1Nf = load i64* %R1_Var
  %ln1Ng = load i64* %SpLim_Var
  tail call cc 10 void (i64*,i64*,i64*,i64,i64,i64,i64,i64,i64,i64)* @stg_ap_v_fast( i64* %ln1Nc, i64* %ln1Nd, i64* %ln1Ne, i64 %ln1Nf, i64 undef, i
  ret void

Which would seem to correspond to this in PrimOps.cmm:

stg_retryzh /* no arg list: explicit stack layout */
  W_ frame_type;
  W_ frame;
  W_ trec;
  W_ outer;
  W_ r;
  (r) = ccall stmWait(MyCapability() "ptr", CurrentTSO "ptr", trec "ptr");
  if (r != 0) {
  } else {
    // Transaction was not valid: retry immediately
    ("ptr" trec) = ccall stmStartTransaction(MyCapability() "ptr", outer "ptr");
    StgTSO_trec(CurrentTSO) = trec;
    Sp = frame;
    R1 = StgAtomicallyFrame_code(frame);
    jump stg_ap_v_fast [R1];

I'm still investigating. I think this is related to #7571 and #7575.

Attachments (1)

0001-Fix-invalid-MachOp-coercions-when-extending-truncati.patch (3.1 KB) - added by thoughtpolice 5 years ago.
Add reference to trac # in commit msg

Download all attachments as: .zip

Change History (19)

comment:1 Changed 5 years ago by thoughtpolice

The CMM produced for PrimOps.cmm in this case looks like this:

stg_retryzh() //  []
         { info_tbl: []
           stack_info: arg_space: 0 updfr_space: Nothing
           (_ccl::I64) = call "ccall" ... stmStartTransaction(BaseReg - 24, _ccm::I64);
           I64[CurrentTSO + 8 + 72] = _ccl::I64;
           Sp = _cck::I64;
           R1 = I64[_cck::I64 + 8 + 0];
           call stg_ap_v_fast(R1) args: 8, res: 0, upd: 8;

So it looks like the store to Sp is introducing the zero-extended one & store, but I'm still looking as to why.

comment:2 Changed 5 years ago by thoughtpolice

There's logic to handle this in LlvmCodeGen.CodeGen but I'm not sure why it's not kicking in:

        sameConv from ty reduce expand = do
            x'@(env', vx, stmts, top) <- exprToVar env x
            let sameConv' op = do
                (v1, s1) <- doExpr ty $ Cast op vx ty
                return (env', v1, stmts `snocOL` s1, top)
            let toWidth = llvmWidthInBits dflags ty
            -- LLVM doesn't like trying to convert to same width, so
            -- need to check for that as we do get Cmm code doing it.
            case widthInBits from  of
                 w | w < toWidth -> sameConv' expand
                 w | w > toWidth -> sameConv' reduce
                 _w              -> return x'

comment:3 Changed 5 years ago by simonmar

difficulty: Unknown
Owner: set to dterei

I don't see where the cast is coming from. The LLVM code looks like it has a lot of loads and stores in it too - where do they come from? This is probably David T's area.

comment:4 Changed 5 years ago by thoughtpolice

Owner: changed from dterei to thoughtpolice

I have fixed this bug. But I cannot quite answer your question.

The bugfix: In a nutshell, if you look at that snipped I posted, the secret is the definition of exprToVar:

exprToVar :: LlvmEnv -> CmmExpr -> UniqSM ExprData
exprToVar env = exprToVarOpt env (wordOption (getDflags env))

The exprToVarOpt call essentially forces the return value of the expression. By default it is the word size of the build (64, here.) In this case, we detected that we needed to convert a i32 to an i64 (the w | w < toWidth branch.) So we execute sameConv', which invokes exprToVar, which turns vx into an LlvmVar with a width of the *native* word size. Which in this case, is i64. So the check passes, but the variable we substitute for the coercion is of the wrong type. Annoying.

The fix looks like this:

sameConv from ty reduce expand = do
            x'@(env', vx, stmts, top) <- exprToVarOpt env (EOption $ Just $ widthToLlvmInt from) x

I will create a patch and try to narrow down a test case. The patch I cooked up depends on #7571, which has a prerequisite patch and explanation of this same issue. I'll post it shortly.

As for the cast, I am not precisely sure where it comes from. It comes from the MO_UU_Conv MachOp, which just does an unsigned -> unsigned conversion (optionally extending/truncating.) I don't know why it seems to be inserted for the store to Sp.

As for the loads/stores - isn't that natural? It looks fine to me. Before we pass code to opt, the backend generates tons of loads and stores for all variable modification as opposed to using phi nodes for control flow. This is so that we can use the mem2reg pass in LLVM to essentially do the phi conversion for us, since we're essentially in CPS form at that point. The code generated by -ddump-llvm is of course before opt is run.

Looking at the code, it seems pretty in line with the snippet of Cmm from the -ddump-cmm bit I posted, no?

comment:5 Changed 5 years ago by thoughtpolice

Cc:… added
Status: newpatch

The patch is attached. No testcase yet, but being able to bootstrap the compiler is a pretty good one itself. :)

I apologize these raft of tickets (the last 3-4) are so intertwined. I've ended up doing more than I expected this weekend and I realize we have minimal manpower for reviews, so please tell me if I can expedite the review process at all.

Changed 5 years ago by thoughtpolice

Add reference to trac # in commit msg

comment:6 Changed 5 years ago by thoughtpolice

Blocking: 7588 added

comment:7 Changed 5 years ago by dterei

Blocking: 7589 added

comment:8 Changed 5 years ago by dterei

Blocked By: 7590 added

comment:9 Changed 5 years ago by bgamari

Cc: bgamari@… added

comment:10 Changed 5 years ago by dterei

Argh. I didn't realize all these patches are problems due to cross compilation.

Why are you using LLVM 3.3? It is unreleased and so not supported. Sure, we should support it, but you are throwing in two new problems now.

1) LLVM 3.2 and 3.3 are unsupported. 2) Cross compilation has bugs.

Is it possible to use LLVM 3.1 or at least 3.2? Or do they not work well enough with ARM? I'm trying to review your patches now but this fact will slow me down a little.

comment:11 Changed 5 years ago by thoughtpolice

I'm a bit confused by your response. But it is my fault for not communicating better as I should have.

For one, my OS X machine is running both 3.2 and 3.3svn, isolated. 3.2 is my primary installation. I mostly run both to narrow down bugs and out of curiosity, in this case it was identical output so there was otherwise no relevance. I should probably just not mentioned 3.3. I only have 3.3 for one personal thing I work on anyway. It's out of $PATH and out of mind. I just use 3.2 for everything else (including GHC.)

Two, as for ARM, this ticket isn't about ARM? I imagine you meant #7575 where I do have a bug I found on ARM? And I'm not cross compiling in any of these tickets. I'm compiling on the ARM device itself with an existing 7.4.1 build provided by Ubuntu - so this would (I expect) work the same as any self-hosted 32bit build, modulo whatever ARM bugs I find. My ARM build of LLVM 3.3 is significantly older than what is in trunk now (because it takes a while to build,) and is much closer to the 3.2 branch. I can just put 3.2 on it anyway - I have a script to manually automate LLVM builds (and by default it just built trunk, not a released version.

Likewise I am compiling here on OS X using LLVM 3.2. The host compiler is 64bit GHC 7.6.1. I am producing a 64bit compiler. The fix and patch for this issue were tested on that combo with that bootstrap compiler as well.

For reference, here's the low down on my machines:

Machine 1:

64-bit Quad core Mac OS X 10.8.2. xcodebuild -version reveals xcode version 4.5.2, build version 4G2008a. Host compiler is GHC 7.6.1, installed from a binary distribution. LLVM version is 3.2. This ticket is relevant to that.

Machine 2:

32-bit Quad core ARM Linux. Ubuntu 12.10 derivative. GCC version is 4.6.3. LLVM version is 3.3svn as of about 3 weeks ago, but I'm going to replace that with 3.2 stable now. Bootstrap compiler is GHC 7.4.1 from Ubuntu repositories. Tickets #7571 and #7575 were filed against this machine.

There is no cross compilation happening anywhere. All builds are standard with recent copies of HEAD. Karel Gardas has also reported success using my patch in #7571 for his ARM machine using LLVM 3.0, I believe. I reproduced #7571 with a hand-written Cmm file on OS X as well, which you can see in the ticket. (Please try that test without this patch applied using the LLVM backend, and see if it fails.)

I figured #7571 was probably due to the new code generator taking in a higher-level Cmm representation when parsing (because the bug came from the RTS, where a lot of Cmm was rewritten when the code generator went in.)

I don't know what else to say - most of these are bugs long before the version of LLVM even matters as much. #7571 #7575 and this ticket are all outright compiler panics before code generation ever happen for me, so the actual version of LLVM doesn't seem as relevant.

I unfortunately can't give you direct access to my machine OS X machine for various reasons. You are welcome to use my ARM machines if you'd like once I get them properly DMZ'd. Are there any other tests or examples you can think of? Could you try the testsuite test in #7571 with and without the patch?

comment:12 Changed 5 years ago by dterei

OK so now I'm not sure what is causing these bugs.

My own machine is very similar to yours. Latest generation 13" Macbook. So OS X 10.8.2, 64 bit. I can bootstrap GHC with LLVM 3.1 fine. As you said, for a lot of these tickets the LLVM version shouldn't matter, so I'm just not sure why you are getting these panics and I'm not. Especially as these panics are things I thought I had already dealt with 2 years ago... (as you've noted, there is code already in the backend to handle these cases, but it doesn't seem to trigger correctly on your machine but does fine on mine).

If our machines were vastly different or you were cross compiling... argh!

I'll look through your patches again then and try to see if I can figure anything out from them.

Are you using homebrew on your machine?

Hmmm... I don't have my OS X machine right now but I believe I may be using GHC 7.4.2. I'll check that and then try 7.6.1 if I'm indeed using 7.4.2.

comment:13 Changed 5 years ago by thoughtpolice

I do use homebrew, but I do not ever manage my GHC installation with it. I just installed the 7.6.1 binary distribution from here:

I will get LLVM 3.1 working on my machine and see what happens when I bootstrap with it instead. I can also get 7.4.2 working and see if we can narrow a specific configuration down.

Also, did you give a shot to the test I wrote in #7571 ? It should just be as easy as ghc-stage2 -c -fllvm T7571.cmm - This test should really be the ultimate clear case of whether one of these bugs does in fact manifest or not. It's worth noting that #7571 will only manifest during a bootstrap when the RTS is built in profiling mode, due to #ifdefery in the Cmm files. I don't think you can avoid a profiling version of the RTS, but it's worth keeping in mind.

I believe I also have an extra 'debug' patch for this ticket in my personal repository, that output a lot of helpful debugging info in tracking this down. I might dust it off, attach it, and step through what I see.

Also if you'd like to contact me directly via email or through more instantaneous means (like an instant messaging system or IRC or something) to help sort it out quicker, you're more than free to (I'm CC'd on this ticket, so you can see my email there.)

comment:14 Changed 5 years ago by dterei

Hmm... OK. So the profiling mode is probably a difference. Do the other tickets depend on this? The other is GHC 7.6.1, I am indeed using 7.4.2.

I'll run the test cases you supplied and see. However right now I get a GHC build that is built with -fasm, passing the testsuite for llvmopt 100% and can bootstrap GHC fine with LLVM. This is using the 'prof' build of GHC haven't tested others.

I'll try a little more to replicate and if I can't then sure, may be worth chatting more directly.

comment:15 Changed 5 years ago by thoughtpolice

I need the patch in #7571 in order to get to this point on my OS X machine. #7575 is ARM (or 32bit) only it seems, so while related I think we can rule it out for the moment.

comment:16 Changed 5 years ago by dterei

Blocked By: 7590 removed
Blocking: 7590 added

comment:17 Changed 5 years ago by dterei

Blocking: 7589 removed

comment:18 Changed 5 years ago by dterei

Resolution: fixed
Status: patchclosed

Closing as this bug only manifested due to the fix for #7571 not being correct. A new fix for that and #7575 means that the existing logic for detecting this should fire correctly.

Note: See TracTickets for help on using tickets.