Opened 6 years ago

Last modified 11 months ago

#5567 new task

LLVM: Improve alias analysis / performance

Reported by: dterei Owned by: dterei
Priority: normal Milestone:
Component: Compiler (LLVM) Version:
Keywords: Cc: brooks.brian@…, rwbarton, erikd, michalt, angerman
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:


  • LLVM doesn't generate as good as code as we feel it should in many situations
  • Why?
    • We've often felt its a alias anlysis issue.
    • I'm a little more doubtful of that than others (I feel its part of the bigger problem, not the whole thing).
    • I think there may be some register allocation / instruction selection / live range splitting issue going on.
  • We could also do with looking at what optimisation passes we should run and in what order...

Here is some work Max did on the alias issue, his results for nofib weren't good:

So this ticket is just a high level ticket about figuring out and improving the performance of LLVM backend.

Change History (23)

comment:3 Changed 6 years ago by igloo

Milestone: 7.6.1

comment:4 Changed 6 years ago by dterei

Johan and I have started some work on this, wiki page about it is here:

comment:5 Changed 6 years ago by dterei

Pushed some patches for doing TBAA with LLVM:

0f15f8a76d334becf992a83870d0b327cc3c40b6 71e5ee7d1656444ad23d0610ddaf9fc99a58b190

No improvement to nofib, haven't tried other benchmarks to see.

comment:6 Changed 6 years ago by dterei

OK, tried this benchmark:

module Main(main) where

import Data.Array.Base
import Data.Array.IO
import Data.Array.MArray

main :: IO ()
main = do
    arr <- newArray_ (0, 200)
    go arr 2 0 100

go :: IOUArray Int Int -> Int -> Int -> Int -> IO ()
go arr stride x y | x < y     = do unsafeWrite arr (x * stride) 1337
                                   go arr stride (x + 1) y
                             | otherwise = return ()

And not working as I seem to be adding the TBAA info wrong. I think we need an unknown type (for pointers loaded from RX registers) that doesn't alias Sp.

comment:7 Changed 6 years ago by dterei

Pushed an improvement: e10589a505b44f4f0394500c6a0d2db5baa7f3f4

This gets the good code generated for the above benchmark improving performance by around 20%! Also added control to enable or disable if TBAA is used with: ba52053b95ccb417ca7ce08e85a45e49b5f49b0a

comment:8 Changed 5 years ago by igloo


comment:9 Changed 4 years ago by carter

heres the commits as linked into github

commit one

commit two

commit three that improves stuff

adding -llvm-tbaa flag

has anyone looked into this topic more recently? Is there any low hanging fruit in this area?

comment:10 Changed 4 years ago by dterei

No, no one has. Depends on how 'low' you want the fruit. I think we have a number of ideas that are worth trying. They are discussed on the wiki page and the tickets it links to:

Those ideas aren't too hard to implement and while we don't know, we do suspect they can be a big win in some situations. Finding someone with the time, knowledge and motivation to tackle them is hard though. It is a solid week or two of work probably to try them out properly.

comment:11 Changed 4 years ago by carter

ok, I'd be interested in spending some time on this ticket sometime in the next month or so. (or more realistically some part of the early summer, given how busy i am with work right now.)

I may try and dig into some simpler related tasks first, as a warm up/ improve my familiarity with some of the relevant bits, such as the task I just proposed in

comment:12 Changed 4 years ago by jstolarek

Cc: jan.stolarek@… added

comment:13 Changed 3 years ago by thoughtpolice


Moving to 7.10.1.

comment:14 Changed 3 years ago by brbr

Cc: brooks.brian@… added
difficulty: Unknown

comment:15 Changed 3 years ago by rwbarton

Cc: rwbarton added

comment:16 Changed 3 years ago by thoughtpolice


Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:17 Changed 3 years ago by erikd

Cc: erikd added

comment:18 Changed 2 years ago by michalt

Cc: michalt added

comment:19 Changed 2 years ago by thoughtpolice


Milestone renamed

comment:20 Changed 2 years ago by jstolarek

Cc: jan.stolarek@… removed

comment:21 Changed 21 months ago by thomie

Milestone: 8.0.1

comment:22 Changed 15 months ago by angerman

Cc: angerman added

comment:23 Changed 11 months ago by dobenour

Have the LLVM devs been asked about this?

I think that GHC's use of CPS (and its non-use of the C stack) might be one of the problems. LLVM is designed for C-style code and its optimizations might not fire on the IR that GHC generates.

I can think of a solution, but it would involve significant changes, both to the compiler and the RTS. The idea is to do a CPS => SSA transform and then lower to LLVM IR from the SSA form. The Haskell stack would become the C stack, with stack switching handled by the assembler code in StgRun. LLVM's GC support would be used to generate stack maps to be read by the RTS for GC purposes.

Fortunately, I believe that these tasks are relatively independent. I suspect (from what others have said) that an equivalent of LLVM's mem2reg for the STG stack is what is needed.

Note: See TracTickets for help on using tickets.