Opened 3 years ago

Closed 2 years ago

#9142 closed bug (fixed)

LLVM 3.5.0 rejects aliases used by LLVM codegen

Reported by: bgamari Owned by:
Priority: high Milestone: 7.10.1
Component: Compiler Version: 7.8.2
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking: #4213
Related Tickets: Differential Rev(s): Phab:D155
Wiki Page:

Description

LLVM HEAD has changed the behavior of aliases to reject references to things other than definitions. This unfortunately means that the handling of externs breaks,

"inplace/bin/ghc-stage1" -static  -H64m -O0 -fllvm -Iincludes -Iincludes/dist -Iincludes/dist-derivedconstants/header -Iincludes/dist-ghcconstants/header -Irts -Irts/dist/build -DCOMPILING_RTS -package-name rts -dcmm-lint      -i -irts -irts/dist/build -irts/dist/build/autogen -Irts/dist/build -Irts/dist/build/autogen           -O2    -c rts/Apply.cmm -o rts/dist/build/Apply.o
You are using a new version of LLVM that hasn't been tested yet!
We will try though...
Alias must point to a definition
i8* @"barf$alias"
Alias must point to a definition
i8* @"stg_gc_unpt_r1$alias"
Alias must point to a definition
i8* @"stg_apply_interp_info$alias"
Alias must point to a definition
i8* @"stg_yield_to_interpreter$alias"
Alias must point to a definition
i8* @"stg_ap_stack_entries$alias"
Alias must point to a definition
i8* @"__stg_gc_enter_1$alias"
Alias must point to a definition
i8* @"stg_upd_frame_info$alias"
LLVM ERROR: Broken module found, compilation aborted!

In reference to the following aliases,.

@barf = external global i8
@barf$alias = alias private i8* @barf
@stg_gc_unpt_r1 = external global i8
@stg_gc_unpt_r1$alias = alias private i8* @stg_gc_unpt_r1
@stg_apply_interp_info = external global i8
@stg_apply_interp_info$alias = alias private i8* @stg_apply_interp_info
@stg_yield_to_interpreter = external global i8
@stg_yield_to_interpreter$alias = alias private i8* @stg_yield_to_interpreter
@stg_ap_stack_entries = external global i8
@stg_ap_stack_entries$alias = alias private i8* @stg_ap_stack_entries
@__stg_gc_enter_1 = external global i8
@__stg_gc_enter_1$alias = alias private i8* @__stg_gc_enter_1
@stg_upd_frame_info = external global i8
@stg_upd_frame_info$alias = alias private i8* @stg_upd_frame_info

Attachments (2)

fix-externs-llvm-3.5.patch (2.7 KB) - added by bgamari 3 years ago.
Patch rewriting external references in a way that LLVM is more comfortable with
llvm-accept-bitcasts.patch (859 bytes) - added by bgamari 3 years ago.
Patch against LLVM HEAD to accept BitCast in Aliasee position

Download all attachments as: .zip

Change History (32)

comment:1 Changed 3 years ago by bgamari

There are two cases to worry about here, as shown in generateAliases,

  1. The case of an externally defined symbol where we don't know the type (where m_ty is Nothing)
  2. The case of a symbol for which we have a definition and therefore a type

Case 1 can be handled with something like the attached patch.

Frustratingly, case 2 cannot be as LLVM already knows the type and refuses to accept that we know what we are doing when we attempt to cast to a i8* (e.g. with a bitcast or inttoptr/ptrtoint). It seems that this will require an upstream fix.

Changed 3 years ago by bgamari

Attachment: fix-externs-llvm-3.5.patch added

Patch rewriting external references in a way that LLVM is more comfortable with

Changed 3 years ago by bgamari

Attachment: llvm-accept-bitcasts.patch added

Patch against LLVM HEAD to accept BitCast in Aliasee position

comment:2 Changed 3 years ago by bgamari

According to folks in #llvm, aliases are currently in a sad state and need to be rewritten. Attached above is a hack to allow bitcasts in aliasee position. I'll need to talk to more LLVM folks to see how to proceed from here.

Version 0, edited 3 years ago by bgamari (next)

comment:3 Changed 3 years ago by bgamari

Unfortunately this isn't quite enough,

/home/ben/trees/root-llvm-head/bin/opt: utils/hpc/dist-install/build/HpcParser.ll:44714:37: error: Alias must point to function or variable
@c3rB_str$alias = alias private i8* bitcast (%c3rB_str_struct* @c3rB_str to i8*)
                                    ^

comment:4 Changed 3 years ago by scpmw

Not quite sure I understand how your patch works - isn't this basically allocating a bunch of static null-initialized i8 fields to point to? That's hardly what we want, even if it compiles.

It would be really interesting to get a statement from LLVM people on this. If they think that we are mis-using aliases, that's reason enough to do something else. One possible solution could again be type aliases - but instead of writing the definitions at the end of the session we'd close the file and /prepend/ them. That could be a cheap and easy way to get the LLVM parser to like us again.

Last edited 3 years ago by scpmw (previous) (diff)

comment:5 in reply to:  4 Changed 3 years ago by bgamari

Replying to scpmw:

Not quite sure I understand how your patch works - isn't this basically allocating a bunch of static null-initialized i8 fields to point to? That's hardly what we want, even if it compiles.

Frankly I'm not too clear on this either. I modelled the change after similar changes made in the commit in question to testcases. However, you are right, it would appear that the local definitions will be used in lieu of the external symbols the references are supposed to use. All I know at the moment is that the code compiles. It likely does the wrong thing.

It would be really interesting to get a statement from LLVM people on this.

Indeed it would. At this point my next goal is to chase down the relevant LLVM people.

If they think that we are mis-using aliases, that's reason enough to do something else. One possible solution could again be type aliases - but instead of writing the definitions at the end of the session we'd close the file and /prepend/ them. That could be a cheap and easy way to get the LLVM parser to like us again.

comment:6 Changed 3 years ago by bgamari

There has been some discussion around this issue on llvmdev.

comment:7 Changed 3 years ago by bgamari

Blocking: 4213 added

comment:8 in reply to:  1 Changed 3 years ago by altaic

As I understand it from the thread on LLVMDev, the stumbling block is that we don't have a function definition's type until we see a call to it. LLVM doesn't care about the ordering of definitions and calls, so at least that's one thing we don't have to worry about. Several options came to mind in no particular order:

  1. Store untyped definitions and calls in some sort of data structure until we've got a matching pair, at which point we know the type of the definition and we can queue it to be emitted as soon as the current object is finished being emitted. Unfortunately, I think this would mean that non-function data would be stored until the entire stream has been read. Additionally, functions being stored is dependent on the distance between a definition and a call-- if they are maximally unsorted, we could be storing the whole stream in memory.
  1. Iterate through the cmm twice, collecting type information for definitions on the first pass, and emitting code on the second pass. The drawback is the extra time from processing the stream twice, and a bit of extra memory usage, dependent on the number of definitions.
  1. Generate invalid LLVM IR and mangle definitions with the type info before handing it off to LLVM. This has all of the drawbacks of 2, with the added drawback of more cringeworthy text mangling.
  1. Generate an auxiliary data structure when we're processing the STG which contains the type information to pass to the LLVM backend. I don't know how complicated this would be to implement, nor if the extra cruft in other stages of the compilation pipeline would be acceptable.
  1. Generate LLVM from STG rather than cmm. As I understand it, David considered this when designing the LLVM backend, but decided against it due to code duplication. I imagine this would be a lot more work than the other options, though it may not have the drawbacks the other options have.
  1. Convince the LLVM folks to reverse the change they made with aliases, or otherwise add a new feature to LLVM. Not sure if we were abusing aliases, or if our use case fits into some sort of feature they'd like to offer, but it might take awhile for an upstream fix.

comment:9 Changed 3 years ago by bgamari

Altaic, thanks for writing these down! I'd just like to mention one further complication (and please someone correct me if I'm wrong here):

It's not sufficient to just match up definitions and calls as in some cases (e.g. symbols defined outside of the current compilation unit) we will never see the definition we are looking for. In the case of functions, we can sometimes infer the type with the information in a CmmCall node, but as far as I can tell there's no guarantee that we will ever see one of these (e.g. we may only see the symbol in a CmmLabel node within an expression).

My thoughts on these options are,

  1. In light of the external symbol problem, I'm not convinced it's worth paying this much to back out types for LLVM. The compiler has already demonstrated that the program is well-typed so the exercise doesn't gain us any safety (other than rejecting the occasional code generator bug).
  2. See above
  3. Eww.
  4. I don't know how much work this work be either. Unless there's some user for this sort of information in downstream consumers other than the LLVM codegen I'm not convinced we want to go this route.
  5. An interesting idea that I hadn't considered. Perhaps if the C-- lowering code were refactored a bit the code duplication could be reduced. Perhaps (4) is better in this case? Either way, this sounds like a lot of work for relative little benefit (could LLVM better optimize the STG representation? I suspect not; C-- is about the right level of abstraction for LLVM's IR).
  6. This is probably the most likely option. I've been doing a pretty poor job in conveying our needs the LLVM folks but we'll see what they say. Our usecase doesn't seem too outlandish so there should be a path forward one way or another.

comment:10 Changed 3 years ago by scpmw

Function definitions aren't really the problem. Once we know the live global registers, we'd know their type (see llvmFunTy). We could /theoretically/ find these from a call in Cmm.

However, I feel like this is the wrong approach anyway. Even if we could enumerate all offending Cmm constructs at this point, there is no telling whether we will have more problematic Cmm in the future. We don't want to get into a game of whack-a-mole just for a rather cosmetic feature like streaming. Bringing Stg into the fold would most likely to make that problem even worse.

In my mind, we should wait for the LLVM situation to stabilize, then we'll implement either

  1. What we have right now, with possibly minor changes
  2. A step prepending aliases, if this is what LLVM likes better (unlikely, but I'll test it)
  3. A search-and-replace step where we first emit placeholders, then later fill in the types (your option 3)
  4. Remove streaming entirely (pretty much equivalent to your options 1 and 2)

comment:11 Changed 3 years ago by altaic

I'm looking in to option 5, mostly because I'd like to familiarize myself with the STG machine and Hoopl. I expect the other approaches discussed here will be more fruitful, however my unfamiliarity with GHC internals limit how much I can currently contribute.

comment:12 Changed 3 years ago by thoughtpolice

Milestone: 7.10.1

Moving to 7.10.1.

comment:14 Changed 3 years ago by scpmw

Haven't followed the discussion so much - so LLVM HEAD can alias arbitrary pointers? That's even more flexibility than we had before. So now we could just close this ticket and take another good look at #4213, right?

comment:15 Changed 3 years ago by bgamari

This morning I approached this issue with a fresh perspective and realized that the solutions is quite simple (unless I've missed something). The issue with new LLVMs is that aliases can't point to things that aren't definitions.

As it turns out, the only case where we emit aliases pointing to non-definitions is for external symbols where we emit an extern declaration and a corresponding alias. Unless I'm mistaken, there is no reason why both of these are necessary: we only use the alias. So, instead of emitting,

@stg_BCO_info = external global i8
@stg_BCO_info$alias = alias private i8* @stg_BCO_info

we should be emitting,

@stg_BCO_info$alias = external global i8

I've implemented this change in my [llvm-3.5-new branch https://github.com/bgamari/ghc/compare/llvm-3.5-new and it appears to work (or rather, the GHC build hasn't failed yet; I also have yet to verify that LLVM 3.4 still works).

Of course, there are still nice opportunities for cleaning up TNTC (as discussed in #4213), but at least this fixes the immediate problem.

comment:16 Changed 3 years ago by bgamari

Unfortunately it seems that this approach suffers from undefined references. I haven't had a chance to think about why but I'll have a look tomorrow.

comment:17 Changed 3 years ago by bgamari

Never mind, of course that won't work: the extern symbols under this scheme would be pointing to non-existent $alias symbols. If we instead change the names of the definitions themselves and then allow the aliases to take the actual names this plan might have a chance of working. Haven't yet thought about what this would require.

Last edited 3 years ago by bgamari (previous) (diff)

comment:18 in reply to:  14 Changed 3 years ago by bgamari

Replying to scpmw:

Haven't followed the discussion so much - so LLVM HEAD can alias arbitrary pointers? That's even more flexibility than we had before. So now we could just close this ticket and take another good look at #4213, right?

Unfortunately this ticket is still an active issue: while aliases can alias expressions, these expressions can still only refer to definitions. To restore the LLVM codegen to a functioning state we would need the LLVM people to again allow aliases to declarations as suggested by Reid http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-June/073464.html. I've pinged the thread as Rafael had some reservations to this idea that were apparently never addressed.

Last edited 3 years ago by bgamari (previous) (diff)

comment:19 Changed 3 years ago by bgamari

First the good news: I have figured out what LLVM code we need to generate to satisfy LLVM's type system. An annotated example can be found here https://github.com/bgamari/ghc-llvm/blob/master/test.ll .

Unfortunately, this seems to be a bit of a departure from our current approach. Quite a bit of code needs to be touched and I haven't yet been able to find a clean way to factor out the details of the alias generation.

comment:20 Changed 3 years ago by bgamari

I finally have a brutally hacked implementation working with both LLVM 3.4 and HEAD (r213478). Unfortunately, it's nowhere near merge-worthy. In fact, it's terrible. That being said, it (https://github.com/bgamari/ghc/compare/llvm-3.5-new) works and may serve as a model if someone has some time to do some refactoring. Otherwise I'm afraid my weekend is now over so I'll have to put this down for a while.

comment:21 Changed 3 years ago by bgamari

To describe the approach a bit more, I've inverted the roles of the aliases and symbol definitions. That is, symbol definitions take names of the form @symbol$def whereas @symbol is defined as an alias. The @symbol aliases have external linkage when appropriate, ensuring they are available for external references which can simply use an normal @symbol = external global declaration. This in effect moves the aliases from the point of use to the point of definition, satisfying LLVM's prohibition of aliases of declarations.

As I've currently implemented it this logic is scattered through the backend (although the fact that data and functions are so distinct in LLVM doesn't help things). Hopefully a bit more reflection will produce a more coherent solution. Ultimately I may hold off on this until after we know how we will fix TNTC as this will be another major refactoring.

comment:22 Changed 3 years ago by bgamari

Unfortunately the branch fails to build GHC itself due to overlapping symbol names when compiling raw cmm source. I suspect the problem is excessive visibility, although I'm not sure where.

Edit: Never mind, the excessive visibility fix in https://github.com/bgamari/ghc/commit/11cfb89e470081ed55088ea8bb50dcc9bc2d66c6 fixed this afterall but old build artifacts masked this fact. The branch appears to be fully functional.

Also, I've expanded the note in Base.hs explaining the behavior of the alias scheme so hopefully it will be somewhat clear to potential refactor-ers https://github.com/bgamari/ghc/compare/llvm-3.5-new#diff-5081021271d44e6dd3202f0840529a00R492.

Last edited 3 years ago by bgamari (previous) (diff)

comment:23 Changed 3 years ago by rwbarton

Differential Rev(s): Phab:D155

What's the status of this, waiting on review?

comment:24 Changed 3 years ago by bgamari

Yep; my initial concern was that a deeper refactoring was necessary to fix this cleanly but looking back over the patch it's not nearly as invasive as I recall. It could probably be merged as-is after review.

comment:25 Changed 3 years ago by jrp

To give this a bump, one possible test for this would be cabal install diagrams

It would be good to get this patch in for testing, now that brew for os x is providing llvm 3.5

Last edited 2 years ago by jrp (previous) (diff)

comment:26 Changed 2 years ago by sjy

Summary: LLVM HEAD rejects aliases used by LLVM codegenLLVM 3.5.0 rejects aliases used by LLVM codegen

comment:27 Changed 2 years ago by bgamari

jrp, I would love to try (and IIRC have successfully done builds of similar scale with this patch in the past) unfortunately the various AMP-related breakage throughout the ecosystem makes doing this quite tricky at the moment. Hopefully in a few weeks this will have cleared up.

comment:28 Changed 2 years ago by Herbert Valerio Riedel <hvr@…>

In e16a342d70b92fc8480793d3ec911853f0c31e44/ghc:

llvmGen: Compatibility with LLVM 3.5 (re #9142)

Due to changes in LLVM 3.5 aliases now may only refer to definitions.
Previously to handle symbols defined outside of the current commpilation
unit GHC would emit both an `external` declaration, as well as an alias
pointing to it, e.g.,

    @stg_BCO_info = external global i8
    @stg_BCO_info$alias = alias private i8* @stg_BCO_info

Where references to `stg_BCO_info` will use the alias
`stg_BCO_info$alias`. This is not permitted under the new alias
behavior, resulting in errors resembling,

    Alias must point to a definition
    i8* @"stg_BCO_info$alias"

To fix this, we invert the naming relationship between aliases and
definitions. That is, now the symbol definition takes the name
`@stg_BCO_info$def` and references use the actual name, `@stg_BCO_info`.
This means the external symbols can be handled by simply emitting an
`external` declaration,

    @stg_BCO_info = external global i8

Whereas in the case of a forward declaration we emit,

    @stg_BCO_info = alias private i8* @stg_BCO_info$def

Reviewed By: austin

Differential Revision: https://phabricator.haskell.org/D155

comment:29 Changed 2 years ago by nomeata

Is this something that will be merged into 7.8? (I’m not sure if Debian has llvm-3.4 around for as long as we will have 7.8 around.)

comment:30 in reply to:  29 Changed 2 years ago by thomie

Resolution: fixed
Status: newclosed

Replying to nomeata:

Is this something that will be merged into 7.8? (I’m not sure if Debian has llvm-3.4 around for as long as we will have 7.8 around.)

The commit didn't make it into the 7.8.4 release candidate, so I guess not.

From skimming Phab:D155, I get the impression this issue is fixed and tested.

Note: See TracTickets for help on using tickets.