Opened 2 years ago

Closed 2 years ago

Last modified 7 months ago

#11649 closed bug (fixed)

LLVM code generator produces mal-formed LLVM blocks

Reported by: bgamari Owned by: erikd
Priority: highest Milestone: 8.0.1
Component: Compiler (CodeGen) Version: 8.0.1-rc2
Keywords: Cc: erikd, kavon, angerman
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Compile-time crash Test Case:
Blocked By: Blocking:
Related Tickets: #9043 Differential Rev(s): Phab:D1996
Wiki Page:

Description (last modified by bgamari)

It appears that the recent addition (673efccb3b348e9daf23d9e65460691bbea8586e) of some instances for types in GHC.Generics has uncovered a bug in the LLVM code generator (at least on ARM). libraries/base/GHC/Generics.hs fails to compile with,

Entry block to function must not have predecessors!
label %cjGw
/home/ben/ghc/roots/llvm-3.7/bin/opt: libraries/base/GHC/Generics.ll: error: input module is broken!

The problematic block in question appears to be,

define internal ghccc void @rhSw_info$def(i32* noalias nocapture %Base_Arg, i32* noalias nocapture %Sp_Arg, i32* noal
ias nocapture %Hp_Arg, i32 %R1_Arg, i32 %R2_Arg, i32 %R3_Arg, i32 %R4_Arg, i32 %SpLim_Arg) align 4 nounwind prefix <{
i32, i32, i32}><{i32 65541, i32 0, i32 15}>
{
clpH:
  br label %clpH
}

Which arose from this Cmm,

==================== Post control-flow optimisations ====================
2016-02-26 10:39:28.656744 UTC


[section ""data" . lvl_rhSw_closure" {
     lvl_rhSw_closure:
         const lvl_rhSw_info;
 },
 lvl_rhSw_entry() //  []
         { info_tbl: [(clpH,
                       label: lvl_rhSw_info
                       rep:HeapRep static { Fun {arity: 1 fun_type: ArgSpec 5} })]
           stack_info: arg_space: 0 updfr_space: Nothing
         }
     {offset
       clpH:
           goto clpH;   // CmmBranch
     }
 }]

Which arose from this input Cmm,

==================== Cmm produced by new codegen ====================
2016-02-26 10:39:28.611641 UTC

[section ""data" . lvl_rhSw_closure" {
     lvl_rhSw_closure:
         const lvl_rhSw_info;
 },
 lvl_rhSw_entry() //  [R2]
         { info_tbl: [(clpD,
                       label: lvl_rhSw_info
                       rep:HeapRep static { Fun {arity: 1 fun_type: ArgSpec 5} })]
           stack_info: arg_space: 4 updfr_space: Just 4
         }
     {offset
       clpD:
           _shWa::P32 = R2;   // CmmAssign
           goto clpz;   // CmmBranch
       clpz:
           if ((old + 0) - <highSp> < SpLim) goto clpE; else goto clpF;   // CmmCondBranch
       clpE:
           R2 = _shWa::P32;   // CmmAssign
           R1 = lvl_rhSw_closure;   // CmmAssign
           call (stg_gc_fun)(R2, R1) args: 4, res: 0, upd: 4;   // CmmCall
       clpF:
           goto clpy;   // CmmBranch
       clpy:
           goto shWb;   // CmmBranch
       shWb:
           goto clpH;   // CmmBranch
       clpH:
           goto shWb;   // CmmBranch
     }
 }]

Which arose from this STG,

lvl_rhSw
  :: forall a_a99i.
     GHC.Generics.U1 a_a99i -> GHC.Generics.U1 [a_a99i]
[GblId,
 Arity=1,
 Caf=NoCafRefs,
 Str=DmdType <L,U>x,
 Unf=OtherCon []] =
    sat-only \r [eta_shWa]
        let-no-escape {
          some_v_shWb [Occ=LoopBreaker] :: GHC.Generics.U1 [a12_a99i]
          [LclId, Str=DmdType b] =
              NO_CCS[] \u [] some_v_shWb;
        } in  some_v_shWb;

Which arose from this Core (from -ddump-simpl),

lvl_rhSw :: forall a_a99i. U1 a_a99i -> U1 [a_a99i]
[GblId, Arity=1, Caf=NoCafRefs, Str=DmdType <L,U>x]
lvl_rhSw =
  \ (@ a12_a99i) _ [Occ=Dead] ->
    letrec {
      some_v_a99l [Occ=LoopBreaker] :: U1 [a12_a99i]
      [LclId, Str=DmdType b]
      some_v_a99l = some_v_a99l; } in
    some_v_a99l

Which is apparently the body of the some implementation in the Alternative instance for GHC.Generics.U1.

Change History (17)

comment:1 Changed 2 years ago by bgamari

Description: modified (diff)

The second-to-last pass output by verbose-core2core has this analogue to lvl_rhSw,

lvl_sfQZ :: forall a_a99i. U1 a_a99i -> U1 [a_a99i]
[LclId, Arity=1, Str=DmdType <L,U>x]
lvl_sfQZ =
  \ (@ a_a99i) (eta_a99j :: U1 a_a99i) ->
    letrec {
      some_v_a99l [Occ=LoopBreaker] :: U1 [a_a99i]
      [LclId, Str=DmdType b]
      some_v_a99l =
        case eta_a99j of wild_XkW [Dmd=<B,U>] { U1 -> some_v_a99l }; } in
    some_v_a99l

It seems that this arose out of,

$csome_a87d [Occ=LoopBreaker]
  :: forall a_a3vu. U1 a_a3vu -> U1 [a_a3vu]
[LclId, Arity=1, Str=DmdType]
$csome_a87d = GHC.Base.$dmsome @ U1 GHC.Generics.$fAlternativeU1

comment:2 Changed 2 years ago by bgamari

I was strangely unable to reproduce this on x86_64 with LLVM 3.7 and 116528c8429257a0ae855251fd266547bb23d01d.

Also, if I compile the module with -O0 things the issue is avoided.

Last edited 2 years ago by bgamari (previous) (diff)

comment:3 Changed 2 years ago by bgamari

This simpler testcase reproduces the issue on ARM and x86_64,

{-# LANGUAGE NoImplicitPrelude #-}
module Test where
import GHC.Base

data U1 p = U1

instance Functor U1 where
    fmap f U1 = U1

instance Applicative U1 where
    pure _ = U1
    U1 <*> U1 = U1

instance Alternative U1 where
    empty = U1
    U1 <|> U1 = U1
$ inplace/bin/ghc-stage1 Test.hs -O -fforce-recomp
[1 of 1] Compiling Test             ( Test.hs, Test.o )
Entry block to function must not have predecessors!
label %cPR
/home/ben/ghc/roots/llvm-3.7/bin/opt: /tmp/ghc26431_0/ghc_2.ll: error: input module is broken!
`opt' failed in phase `LLVM Optimiser'. (Exit code: 1)
Last edited 2 years ago by bgamari (previous) (diff)

comment:4 Changed 2 years ago by erikd

Not just a problem on Arm. Happens on x86_64 as well.

comment:5 Changed 2 years ago by erikd

Cc: erikd added

comment:6 Changed 2 years ago by bgamari

Explicitly defining many _ = U1; some _ = U1 works around the issue.

comment:7 Changed 2 years ago by bgamari

Indeed if you compile with the native code generator you'll find that compilation succeeds as expected, but the resulting executable loops when you attempt to evaluate some U1. I suspect overriding some and many is the right thing to do in this particular case.

That being said, apart from the Generics issue, there is clearly an LLVM code generator issue as well: we shouldn't be passing code to LLVM that it can't compile.

comment:8 Changed 2 years ago by bgamari

comment:9 Changed 2 years ago by bgamari

Owner: set to bgamari

comment:10 Changed 2 years ago by erikd

From IRC:

<bgamari> erikd, I think the best bet is to just examine the Cmm, look for procs whose entry blocks are self-referential and rewrite them, inserting an auxiliary block

<bgamari> it's a bit of a hack, but I suspect it is a fix

<bgamari> erikd, https://github.com/bgamari/ghc/tree/wip/llvm-block-loop

<bgamari> erikd, that was the beginning of my attempt

Last edited 2 years ago by erikd (previous) (diff)

comment:11 Changed 2 years ago by erikd

Owner: changed from bgamari to erikd

I have something that compiles but doesn't work yet. I'll take this over,

comment:12 Changed 2 years ago by erikd

Differential Rev(s): Phab:D1996
Status: newpatch

comment:13 Changed 2 years ago by Ben Gamari <ben@…>

In 92821ec9/ghc:

LlvmCodeGen: Fix generation of malformed LLVM blocks

Commit 673efccb3b uncovered a bug in LLVM code generation that produced
LLVM code that the LLVM compiler refused to compile:

    {
    clpH:
      br label %clpH
    }

This may well be a bug in LLVM itself. The solution is to keep the
existing entry label and rewrite the function as:

    {
    clpH:
      br label %nPV
    nPV:
      br label %nPV
    }

Thanks to Ben Gamari for pointing me in the right direction on this
one.

Test Plan: Build GHC with BuildFlavour=quick-llvm

Reviewers: hvr, austin, bgamari

Reviewed By: bgamari

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D1996

GHC Trac Issues: #11649

comment:14 Changed 2 years ago by bgamari

Resolution: fixed
Status: patchclosed

Merged to ghc-8.0.

comment:15 Changed 7 months ago by angerman

Cc: kavon angerman added

Yikes. This still seems to be the case with LLVM5. We should probably upstream a fix for this.

comment:16 Changed 7 months ago by kavon

The entry block restriction isn't a bug in LLVM, it's just part of their IR design. I believe that not having any predecessors of the entry block simplifies some things related to computing dominators (plus, the way phi nodes work in LLVM makes it weird to write one for an entry block to merge incoming function arguments).

A better fix for this issue would also include a way to avoid making a pass over the Cmm procedure to determine which global registers need allocas (which can only be declared in the entry block).

Here's my proposal to solve both problems: For each CmmProc, after turning its Cmm blocks into LLVM blocks, we should generate a brand new entry basic block. This new entry block would include the allocas that we need for the function and then branch to the translated version of the original entry block.

comment:17 Changed 7 months ago by angerman

That's a fantastic idea!

I guess we could even name the initial block entry, and just allocate all the registers on the stack and just fall through into the first block. I'll give this a try with the llvmng. If this works, I might backport it to the llvm backend. It would also relinquish the need for the unique supply in the LlvmM I believe.

Note: See TracTickets for help on using tickets.