Opened 19 months ago

Last modified 15 months ago

#7297 new bug

LLVM incorrectly hoisting loads

Reported by: dterei
Owned by: dterei
Priority: normal
Milestone: 7.8.3
Component: Compiler (LLVM)
Version: 7.7
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime
Difficulty: Unknown
Test Case: 367_letnoescape
Blocked By:
Blocking:
Related Tickets:

Description (last modified by dterei)

The test 367_letnoescape fails under the LLVM backend because the load of the HpLim register is hoisted out of the loop, so yielding is never done.

What I am not sure about right now is the best way to fix this. Loads in LLVM can be annotated in a few different ways to prevent the hoisting, and I am not sure which one is the most 'correct'.

All of the following work (see the IR sketch after this list):

  • mark the load as volatile (this seems to give the nicest code as well).
  • mark the load as atomic with either monotonic or seq_cst ordering.
  • mark the load as both volatile and atomic.
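
For concreteness, the annotated loads look roughly like this at the IR level (a sketch with invented names, in LLVM 3.x syntax; newer LLVM also spells out the loaded type, as in 'load volatile i64, i64* %hpLimPtr'):

	%hpLim1 = load volatile i64* %hpLimPtr                          ; volatile
	%hpLim2 = load atomic i64* %hpLimPtr monotonic, align 8         ; atomic, monotonic
	%hpLim3 = load atomic i64* %hpLimPtr seq_cst, align 8           ; atomic, seq_cst
	%hpLim4 = load atomic volatile i64* %hpLimPtr seq_cst, align 8  ; atomic and volatile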

While this bug only affects a single test case, it seems very serious and potentially indicative of a larger problem: how well are we communicating the threaded load/store semantics to LLVM?

And what semantics do we need to communicate? I think we are fine other than the STG registers...

So I am filing a bug for now, as I don't yet know the best way to proceed without dedicating some time to reading the LLVM docs and probably talking to the LLVM developers, since the documentation on the memory model is fairly confusing.
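
To make the failure mode concrete, here is a minimal IR sketch of the shape of the loop involved (illustrative names and types, not GHC's actual output; LLVM 3.x syntax):

define void @loop(i64* %hpLimPtr) {
entry:
  br label %tailrecurse

tailrecurse:
  ; A plain load like this can legally be hoisted into %entry by LICM,
  ; so the loop never observes the RTS zeroing HpLim to request a yield.
  %hpLim = load i64* %hpLimPtr
  %yield = icmp eq i64 %hpLim, 0
  br i1 %yield, label %exit, label %tailrecurse

exit:
  ret void
}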

The generated x86-64 code in question:

Bad version (.LBB0_1 loops forever because the load has been hoisted out):

r1Uf_info:                              # @r1Uf_info
# BB#0:                                 # %c1Vy
	movq	144(%r13), %rax
	decq	%r14
	.align	16, 0x90
.LBB0_1:                                # %tailrecurse
                                        # =>This Inner Loop Header: Depth=1
	incq	%r14
	testq	%rax, %rax
	jne	.LBB0_1
# BB#2:                                 # %c1VD
	movq	-8(%r13), %rax
	movl	$r1Uf_closure, %ebx
	jmpq	*%rax  # TAILCALL

Code when the load is marked atomic (either monotonic or seq_cst), or both atomic and volatile:

r1Uf_info:                              # @r1Uf_info
# BB#0:                                 # %c1Vy
	decq	%r14
	.align	16, 0x90
.LBB0_1:                                # %tailrecurse
                                        # =>This Inner Loop Header: Depth=1
	incq	%r14
	movq	144(%r13), %rax
	testq	%rax, %rax
	jne	.LBB0_1
# BB#2:                                 # %c1VD
	movq	-8(%r13), %rax
	movl	$r1Uf_closure, %ebx
	jmpq	*%rax  # TAILCALL

Code when the load is marked volatile:

r1Uf_info:                              # @r1Uf_info
# BB#0:                                 # %c1Vy
	decq	%r14
	.align	16, 0x90
.LBB0_1:                                # %tailrecurse
                                        # =>This Inner Loop Header: Depth=1
	incq	%r14
	cmpq	$0, 144(%r13)
	jne	.LBB0_1
# BB#2:                                 # %c1VD
	movq	-8(%r13), %rax
	movl	$r1Uf_closure, %ebx
	jmpq	*%rax  # TAILCALL

Change History (7)

comment:1 Changed 19 months ago by dterei

  • Description modified (diff)

comment:2 Changed 19 months ago by dterei

  • Description modified (diff)

comment:3 Changed 19 months ago by dterei

  • Description modified (diff)

comment:4 Changed 19 months ago by dterei

  • Description modified (diff)

comment:5 Changed 18 months ago by igloo

  • Difficulty set to Unknown
  • Milestone set to 7.8.1

comment:6 Changed 15 months ago by dterei

Copying some info from an email thread ("Memory model of Cmm?") with Simon Marlow so I can finally archive it:

The best reference for this is the code I wrote to identify conflicts in
the Cmm sinking pass, see the function 'conflicts':

http://www.haskell.org/ghc/dist/current/docs/html/libraries/ghc-7.7.20120902/src/CmmSink.html

But that won't tell you anything about whether reading a memory
location can be cached or not.

How does LLVM handle this?  I'm thinking that maybe we should have
an explicit "volatile load" operation, that would behave like a CmmLoad
but could not be hoisted.  It should be a CallishMachOp, because you
presumably want to specify ordering for a volatile load with respect to
other side-effecting operations.

comment:7 Changed 15 months ago by dterei

The LLVM docs on volatile (http://llvm.org/docs/LangRef.html#i_load):

Certain memory accesses, such as loads, stores, and llvm.memcpys may
be marked volatile. The optimizers must not change the number of
volatile operations or change their order of execution relative to
other volatile operations. The optimizers may change the order of
volatile operations relative to non-volatile operations. This is not
Java's "volatile" and has no cross-thread synchronization behavior.
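
In other words, volatile keeps the load inside the loop but by itself guarantees nothing about cross-thread synchronization. A small illustration of what these rules permit (invented names, LLVM 3.x syntax):

	%a = load volatile i64* %flag   ; must be kept, and kept before %b
	store i64 1, i64* %scratch      ; non-volatile: may be reordered across either load
	%b = load volatile i64* %flag   ; must be kept; no synchronization with other threads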