Opened 9 months ago

Closed 2 months ago

#15207 closed bug (fixed)

bad dwarf frame in StgCRun.c when compiled with gcc on mac and assembled by as/gcc/clang (aka apple clang assembler)

Reported by: carter Owned by:
Priority: normal Milestone: 8.8.1
Component: Runtime System Version: 8.4.3
Keywords: Cc: niteria
Operating System: MacOS X Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D4781
Wiki Page:

Description (last modified by carter)

for current master, if i build on a mac / OSX high sierra environment with CC set to any flavor of recent GCC rather than apple clang, i get a "no open frame" failure when as (aka apple clang / apple llvm acting as the assembler) is run on the .s file generated from rts/StgCRun.c

this error message seems to come from https://github.com/llvm-mirror/llvm/blob/da4a2839d80ac52958be0129b871beedfe90136e/lib/MC/MCStreamer.cpp#L221

https://gist.github.com/cartazio/8cbfb3305e1daa4f7ffc3f6bb90a2891 has the gcc and clang style assembly for

/usr/local/bin/gcc-8 -fno-stack-protector -DTABLES_NEXT_TO_CODE -Iincludes -Iincludes/dist -Iincludes/dist-derivedconstants/header -Iincludes/dist-ghcconstants/header -Irts -Irts/dist/build -Irts/dist/build -Irts/dist/build/./autogen -fno-common -UPIC -DPIC -x assembler -c /var/folders/py/wgp_hj9d2rl3cx48yym_ynj00000gn/T/ghc22029_0/ghc_1.s -o rts/dist/build/StgCRun.debug_o

this is example asm for the debug way

Attachments (6)

nofib-report-two.txt (211.9 KB) - added by carter 4 months ago.
comparison of clang and gcc built GHC at commit 578012be13eb1548050d51c0a23bd1a98423f03e on mac
cool-nofib-threaded-n4-report.txt (212.5 KB) - added by carter 4 months ago.
threaded with time to cool nofib, N4
dirty-cool-nofib-threaded-n4-report.txt (154.4 KB) - added by carter 4 months ago.
even more time to cool threaded n4 nofib comparison
no-ht-dirty-cool-nofib-threaded-n4-report.txt (155.3 KB) - added by carter 4 months ago.
hyper threading disabled quad core threaded rts nofib
fustratedSTGCRun.s (127.2 KB) - added by carter 3 months ago.
gcc's assembly for the offending file
clangSTGCRUN.s (8.7 KB) - added by carter 3 months ago.
happy stgCRun darwin asm


Change History (49)

comment:1 Changed 9 months ago by carter

i'm not sure if the issue is apple llvm/as being conservative in verifying the embedded dwarf or not, nor if more recent clang doesn't have that issue

comment:2 Changed 9 months ago by bgamari

Cc: niteria added
Description: modified (diff)

comment:3 Changed 9 months ago by carter

Cc: niteria removed
Description: modified (diff)

or if asm with dwarf as emitted by GCC doesn't play nice with (apple?)clang's assembler (which is the system assembler on OSX high sierra)

comment:4 Changed 9 months ago by carter

Cc: niteria added

comment:5 Changed 9 months ago by carter

filed radar 40690578 on the apple side in case this is unique to apple's flavor of clang / llvm / as

comment:6 Changed 9 months ago by bgamari

Do you think you could paste the assembler produced by compiling the following program with clang -g -O0?

#include <stdio.h>

int hello(int a, double b) {
    printf("hello %d %d", 42, 12);
    return a*b;
}

comment:7 Changed 9 months ago by carter

here you go

	.section	__TEXT,__text,regular,pure_instructions
	.macosx_version_min 10, 13
	.globl	_hello                  ## -- Begin function hello
	.p2align	4, 0x90
_hello:                                 ## @hello
Lfunc_begin0:
	.file	1 "test.c"
	.loc	1 3 0                   ## test.c:3:0
	.cfi_startproc
## BB#0:
	pushq	%rbp
Lcfi0:
	.cfi_def_cfa_offset 16
Lcfi1:
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
Lcfi2:
	.cfi_def_cfa_register %rbp
	subq	$32, %rsp
	leaq	L_.str(%rip), %rax
	movl	$42, %esi
	movl	$12, %edx
	movl	%edi, -4(%rbp)
	movsd	%xmm0, -16(%rbp)
Ltmp0:
	.loc	1 4 5 prologue_end      ## test.c:4:5
	movq	%rax, %rdi
	movb	$0, %al
	callq	_printf
	.loc	1 5 12                  ## test.c:5:12
	cvtsi2sdl	-4(%rbp), %xmm0
	.loc	1 5 13 is_stmt 0        ## test.c:5:13
	mulsd	-16(%rbp), %xmm0
	.loc	1 5 12                  ## test.c:5:12
	cvttsd2si	%xmm0, %edx
	.loc	1 5 5                   ## test.c:5:5
	movl	%eax, -20(%rbp)         ## 4-byte Spill
	movl	%edx, %eax
	addq	$32, %rsp
	popq	%rbp
	retq
Ltmp1:
Lfunc_end0:
	.cfi_endproc
                                        ## -- End function
	.section	__TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
	.asciz	"hello %d %d"

	.section	__DWARF,__debug_str,regular,debug
Linfo_string:
	.asciz	"Apple LLVM version 9.1.0 (clang-902.0.39.1)" ## string offset=0
	.asciz	"test.c"                ## string offset=44
	.asciz	"/Users/carter/WorkSpace/projects/active/ghc-head-may2018-clang-sad" ## string offset=51
	.asciz	"hello"                 ## string offset=118
	.asciz	"int"                   ## string offset=124
	.asciz	"a"                     ## string offset=128
	.asciz	"b"                     ## string offset=130
	.asciz	"double"                ## string offset=132
	.section	__DWARF,__debug_abbrev,regular,debug
Lsection_abbrev:
	.byte	1                       ## Abbreviation Code
	.byte	17                      ## DW_TAG_compile_unit
	.byte	1                       ## DW_CHILDREN_yes
	.byte	37                      ## DW_AT_producer
	.byte	14                      ## DW_FORM_strp
	.byte	19                      ## DW_AT_language
	.byte	5                       ## DW_FORM_data2
	.byte	3                       ## DW_AT_name
	.byte	14                      ## DW_FORM_strp
	.byte	16                      ## DW_AT_stmt_list
	.byte	23                      ## DW_FORM_sec_offset
	.byte	27                      ## DW_AT_comp_dir
	.byte	14                      ## DW_FORM_strp
	.byte	17                      ## DW_AT_low_pc
	.byte	1                       ## DW_FORM_addr
	.byte	18                      ## DW_AT_high_pc
	.byte	6                       ## DW_FORM_data4
	.byte	0                       ## EOM(1)
	.byte	0                       ## EOM(2)
	.byte	2                       ## Abbreviation Code
	.byte	46                      ## DW_TAG_subprogram
	.byte	1                       ## DW_CHILDREN_yes
	.byte	17                      ## DW_AT_low_pc
	.byte	1                       ## DW_FORM_addr
	.byte	18                      ## DW_AT_high_pc
	.byte	6                       ## DW_FORM_data4
	.byte	64                      ## DW_AT_frame_base
	.byte	24                      ## DW_FORM_exprloc
	.byte	3                       ## DW_AT_name
	.byte	14                      ## DW_FORM_strp
	.byte	58                      ## DW_AT_decl_file
	.byte	11                      ## DW_FORM_data1
	.byte	59                      ## DW_AT_decl_line
	.byte	11                      ## DW_FORM_data1
	.byte	39                      ## DW_AT_prototyped
	.byte	25                      ## DW_FORM_flag_present
	.byte	73                      ## DW_AT_type
	.byte	19                      ## DW_FORM_ref4
	.byte	63                      ## DW_AT_external
	.byte	25                      ## DW_FORM_flag_present
	.byte	0                       ## EOM(1)
	.byte	0                       ## EOM(2)
	.byte	3                       ## Abbreviation Code
	.byte	5                       ## DW_TAG_formal_parameter
	.byte	0                       ## DW_CHILDREN_no
	.byte	2                       ## DW_AT_location
	.byte	24                      ## DW_FORM_exprloc
	.byte	3                       ## DW_AT_name
	.byte	14                      ## DW_FORM_strp
	.byte	58                      ## DW_AT_decl_file
	.byte	11                      ## DW_FORM_data1
	.byte	59                      ## DW_AT_decl_line
	.byte	11                      ## DW_FORM_data1
	.byte	73                      ## DW_AT_type
	.byte	19                      ## DW_FORM_ref4
	.byte	0                       ## EOM(1)
	.byte	0                       ## EOM(2)
	.byte	4                       ## Abbreviation Code
	.byte	36                      ## DW_TAG_base_type
	.byte	0                       ## DW_CHILDREN_no
	.byte	3                       ## DW_AT_name
	.byte	14                      ## DW_FORM_strp
	.byte	62                      ## DW_AT_encoding
	.byte	11                      ## DW_FORM_data1
	.byte	11                      ## DW_AT_byte_size
	.byte	11                      ## DW_FORM_data1
	.byte	0                       ## EOM(1)
	.byte	0                       ## EOM(2)
	.byte	0                       ## EOM(3)
	.section	__DWARF,__debug_info,regular,debug
Lsection_info:
Lcu_begin0:
	.long	107                     ## Length of Unit
	.short	4                       ## DWARF version number
Lset0 = Lsection_abbrev-Lsection_abbrev ## Offset Into Abbrev. Section
	.long	Lset0
	.byte	8                       ## Address Size (in bytes)
	.byte	1                       ## Abbrev [1] 0xb:0x64 DW_TAG_compile_unit
	.long	0                       ## DW_AT_producer
	.short	12                      ## DW_AT_language
	.long	44                      ## DW_AT_name
Lset1 = Lline_table_start0-Lsection_line ## DW_AT_stmt_list
	.long	Lset1
	.long	51                      ## DW_AT_comp_dir
	.quad	Lfunc_begin0            ## DW_AT_low_pc
Lset2 = Lfunc_end0-Lfunc_begin0         ## DW_AT_high_pc
	.long	Lset2
	.byte	2                       ## Abbrev [2] 0x2a:0x36 DW_TAG_subprogram
	.quad	Lfunc_begin0            ## DW_AT_low_pc
Lset3 = Lfunc_end0-Lfunc_begin0         ## DW_AT_high_pc
	.long	Lset3
	.byte	1                       ## DW_AT_frame_base
	.byte	86
	.long	118                     ## DW_AT_name
	.byte	1                       ## DW_AT_decl_file
	.byte	3                       ## DW_AT_decl_line
                                        ## DW_AT_prototyped
	.long	96                      ## DW_AT_type
                                        ## DW_AT_external
	.byte	3                       ## Abbrev [3] 0x43:0xe DW_TAG_formal_parameter
	.byte	2                       ## DW_AT_location
	.byte	145
	.byte	124
	.long	128                     ## DW_AT_name
	.byte	1                       ## DW_AT_decl_file
	.byte	3                       ## DW_AT_decl_line
	.long	96                      ## DW_AT_type
	.byte	3                       ## Abbrev [3] 0x51:0xe DW_TAG_formal_parameter
	.byte	2                       ## DW_AT_location
	.byte	145
	.byte	112
	.long	130                     ## DW_AT_name
	.byte	1                       ## DW_AT_decl_file
	.byte	3                       ## DW_AT_decl_line
	.long	103                     ## DW_AT_type
	.byte	0                       ## End Of Children Mark
	.byte	4                       ## Abbrev [4] 0x60:0x7 DW_TAG_base_type
	.long	124                     ## DW_AT_name
	.byte	5                       ## DW_AT_encoding
	.byte	4                       ## DW_AT_byte_size
	.byte	4                       ## Abbrev [4] 0x67:0x7 DW_TAG_base_type
	.long	132                     ## DW_AT_name
	.byte	4                       ## DW_AT_encoding
	.byte	8                       ## DW_AT_byte_size
	.byte	0                       ## End Of Children Mark
	.section	__DWARF,__debug_ranges,regular,debug
Ldebug_range:
	.section	__DWARF,__debug_macinfo,regular,debug
Ldebug_macinfo:
Lcu_macro_begin0:
	.byte	0                       ## End Of Macro List Mark
	.section	__DWARF,__apple_names,regular,debug
Lnames_begin:
	.long	1212240712              ## Header Magic
	.short	1                       ## Header Version
	.short	0                       ## Header Hash Function
	.long	1                       ## Header Bucket Count
	.long	1                       ## Header Hash Count
	.long	12                      ## Header Data Length
	.long	0                       ## HeaderData Die Offset Base
	.long	1                       ## HeaderData Atom Count
	.short	1                       ## DW_ATOM_die_offset
	.short	6                       ## DW_FORM_data4
	.long	0                       ## Bucket 0
	.long	261238937               ## Hash in Bucket 0
	.long	LNames0-Lnames_begin    ## Offset in Bucket 0
LNames0:
	.long	118                     ## hello
	.long	1                       ## Num DIEs
	.long	42
	.long	0
	.section	__DWARF,__apple_objc,regular,debug
Lobjc_begin:
	.long	1212240712              ## Header Magic
	.short	1                       ## Header Version
	.short	0                       ## Header Hash Function
	.long	1                       ## Header Bucket Count
	.long	0                       ## Header Hash Count
	.long	12                      ## Header Data Length
	.long	0                       ## HeaderData Die Offset Base
	.long	1                       ## HeaderData Atom Count
	.short	1                       ## DW_ATOM_die_offset
	.short	6                       ## DW_FORM_data4
	.long	-1                      ## Bucket 0
	.section	__DWARF,__apple_namespac,regular,debug
Lnamespac_begin:
	.long	1212240712              ## Header Magic
	.short	1                       ## Header Version
	.short	0                       ## Header Hash Function
	.long	1                       ## Header Bucket Count
	.long	0                       ## Header Hash Count
	.long	12                      ## Header Data Length
	.long	0                       ## HeaderData Die Offset Base
	.long	1                       ## HeaderData Atom Count
	.short	1                       ## DW_ATOM_die_offset
	.short	6                       ## DW_FORM_data4
	.long	-1                      ## Bucket 0
	.section	__DWARF,__apple_types,regular,debug
Ltypes_begin:
	.long	1212240712              ## Header Magic
	.short	1                       ## Header Version
	.short	0                       ## Header Hash Function
	.long	2                       ## Header Bucket Count
	.long	2                       ## Header Hash Count
	.long	20                      ## Header Data Length
	.long	0                       ## HeaderData Die Offset Base
	.long	3                       ## HeaderData Atom Count
	.short	1                       ## DW_ATOM_die_offset
	.short	6                       ## DW_FORM_data4
	.short	3                       ## DW_ATOM_die_tag
	.short	5                       ## DW_FORM_data2
	.short	4                       ## DW_ATOM_type_flags
	.short	11                      ## DW_FORM_data1
	.long	0                       ## Bucket 0
	.long	-1                      ## Bucket 1
	.long	193495088               ## Hash in Bucket 0
	.long	-113419488              ## Hash in Bucket 0
	.long	Ltypes0-Ltypes_begin    ## Offset in Bucket 0
	.long	Ltypes1-Ltypes_begin    ## Offset in Bucket 0
Ltypes0:
	.long	124                     ## int
	.long	1                       ## Num DIEs
	.long	96
	.short	36
	.byte	0
	.long	0
Ltypes1:
	.long	132                     ## double
	.long	1                       ## Num DIEs
	.long	103
	.short	36
	.byte	0
	.long	0

.subsections_via_symbols
	.section	__DWARF,__debug_line,regular,debug
Lsection_line:
Lline_table_start0:

comment:8 Changed 9 months ago by carter

running the build with clang-6 from llvm tells us something more useful

though it's still a slightly tricky error message because there are no line numbers

clang-6.0  -fno-stack-protector -DTABLES_NEXT_TO_CODE -Iincludes -Iincludes/dist -Iincludes/dist-derivedconstants/header -Iincludes/dist-ghcconstants/header -Irts -Irts/dist/build -Irts/dist/build -Irts/dist/build/./autogen -fno-common -U__PIC__ -D__PIC__ -x assembler -c /var/folders/py/wgp_hj9d2rl3cx48yym_ynj00000gn/T/ghc34265_0/ghc_1.s -o rts/dist/build/StgCRun.thr_o
clang-6.0: warning: argument unused during compilation: '-fno-stack-protector' [-Wunused-command-line-argument]
clang-6.0: warning: argument unused during compilation: '-D TABLES_NEXT_TO_CODE' [-Wunused-command-line-argument]
clang-6.0: warning: argument unused during compilation: '-fno-common' [-Wunused-command-line-argument]
clang-6.0: warning: argument unused during compilation: '-U __PIC__' [-Wunused-command-line-argument]
clang-6.0: warning: argument unused during compilation: '-D __PIC__' [-Wunused-command-line-argument]
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
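
For reference, the diagnostic above is what LLVM's assembler prints when a CFI directive such as .cfi_def_cfa_offset is seen with no enclosing .cfi_startproc. The following is a minimal sketch of a correctly bracketed hand-written routine, not the RTS code itself. The symbol demo_ret42 is hypothetical, and the asm path assumes x86-64 Linux (ELF); on any other target the sketch falls back to plain C.

```c
#include <assert.h>

/* Hypothetical demo symbol; not part of the GHC RTS. */
int demo_ret42(void);

#if defined(__x86_64__) && defined(__linux__)
/* File-scope asm sits outside any compiler-generated function, so the
 * compiler emits no CFI bracket around it. Every .cfi_* directive below
 * must therefore be enclosed by an explicit .cfi_startproc/.cfi_endproc. */
__asm__(
    ".text\n"
    ".globl demo_ret42\n"
    ".type demo_ret42, @function\n"
    "demo_ret42:\n"
    "\t.cfi_startproc\n"
    "\tpushq %rbp\n"
    "\t.cfi_def_cfa_offset 16\n"   /* CFA is now 16 bytes above %rsp */
    "\t.cfi_offset %rbp, -16\n"    /* saved %rbp lives at CFA-16 */
    "\tmovl $42, %eax\n"
    "\tpopq %rbp\n"
    "\t.cfi_def_cfa_offset 8\n"    /* back to the state at function entry */
    "\tret\n"
    "\t.cfi_endproc\n"
    ".size demo_ret42, .-demo_ret42\n");
#else
/* Fallback for targets the asm above was not written for. */
int demo_ret42(void) { return 42; }
#endif
```

Removing the .cfi_startproc line from the asm string should reproduce the same kind of error with an LLVM-based assembler, since the remaining directives would then have no open frame to attach to.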

comment:9 Changed 9 months ago by carter

this still doesn't answer why the assembly clang generates for the same file is accepted, but the one generated by gcc isn't

comment:10 Changed 9 months ago by carter

ahah! clang-6.0 doesn't accept its own assembly!

 $
clang-6.0 -fno-stack-protector -DTABLES_NEXT_TO_CODE -fno-stack-protector -Wall -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Winline -Waggregate-return -Wpointer-arith -Wmissing-noreturn -Wnested-externs -Wredundant-decls -Wundef -Iincludes -Iincludes/dist -Iincludes/dist-derivedconstants/header -Iincludes/dist-ghcconstants/header -Irts -Irts/dist/build -DCOMPILING_RTS '-DFS_NAMESPACE=rts' -fno-strict-aliasing -fno-common -DDTRACE -Irts/dist/build/./autogen '-Werror=unused-but-set-variable' '-Wno-error=inline' -O2 -fomit-frame-pointer -g '-DRtsWay="rts_thr"' -w -DTHREADED_RTS -x c rts/StgCRun.c -o /var/folders/py/wgp_hj9d2rl3cx48yym_ynj00000gn/T/ghc34756_0/ghc_1.s -fno-common -U__PIC__ -D__PIC__ -Wimplicit -S -O2 -include /Users/carter/WorkSpace/projects/active/ghc-head-may2018-clang-sad/includes/ghcversion.h -Iincludes -Iincludes/dist -Iincludes/dist-derivedconstants/header -Iincludes/dist-ghcconstants/header -Irts -Irts/dist/build -Irts/dist/build -Irts/dist/build/./autogen -I/Users/carter/WorkSpace/projects/active/ghc-head-may2018-clang-sad/libraries/base/include -I/Users/carter/WorkSpace/projects/active/ghc-head-may2018-clang-sad/libraries/base/dist-install/build/include -I/Users/carter/WorkSpace/projects/active/ghc-head-may2018-clang-sad/libraries/integer-gmp/include -I/Users/carter/WorkSpace/projects/active/ghc-head-may2018-clang-sad/libraries/integer-gmp/dist-install/build/include -I/Users/carter/WorkSpace/projects/active/ghc-head-may2018-clang-sad/rts/dist/build -I/Users/carter/WorkSpace/projects/active/ghc-head-may2018-clang-sad/includes -I/Users/carter/WorkSpace/projects/active/ghc-head-may2018-clang-sad/includes/dist-derivedconstants/header
17:36:22 ~/W/p/a/ghc-head-may2018-clang-sad (master|…) $
clang-6.0  -fno-stack-protector -DTABLES_NEXT_TO_CODE -Iincludes -Iincludes/dist -Iincludes/dist-derivedconstants/header -Iincludes/dist-ghcconstants/header -Irts -Irts/dist/build -Irts/dist/build -Irts/dist/build/./autogen -fno-common -U__PIC__ -D__PIC__ -x assembler -c /var/folders/py/wgp_hj9d2rl3cx48yym_ynj00000gn/T/ghc34265_0/ghc_1.s -o rts/dist/build/StgCRun.thr_o
clang-6.0: warning: argument unused during compilation: '-fno-stack-protector' [-Wunused-command-line-argument]
clang-6.0: warning: argument unused during compilation: '-D TABLES_NEXT_TO_CODE' [-Wunused-command-line-argument]
clang-6.0: warning: argument unused during compilation: '-fno-common' [-Wunused-command-line-argument]
clang-6.0: warning: argument unused during compilation: '-U __PIC__' [-Wunused-command-line-argument]
clang-6.0: warning: argument unused during compilation: '-D __PIC__' [-Wunused-command-line-argument]
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives
<unknown>:0: error: this directive must appear between .cfi_startproc and .cfi_endproc directives

comment:11 Changed 9 months ago by carter

also apple clang (some v5 llvm variant) rejects the assembly emitted by clang v6!

 /usr/bin/clang -fno-stack-protector -DTABLES_NEXT_TO_CODE -Iincludes -Iincludes/dist -Iincludes/dist-derivedconstants/header -Iincludes/dist-ghcconstants/header -Irts -Irts/dist/build -Irts/dist/build -Irts/dist/build/./autogen -fno-common -U__PIC__ -D__PIC__ -x assembler -c /var/folders/py/wgp_hj9d2rl3cx48yym_ynj00000gn/T/ghc34265_0/ghc_1.s -o rts/dist/build/StgCRun.thr_o
clang: warning: argument unused during compilation: '-fno-stack-protector' [-Wunused-command-line-argument]
clang: warning: argument unused during compilation: '-D TABLES_NEXT_TO_CODE' [-Wunused-command-line-argument]
clang: warning: argument unused during compilation: '-fno-common' [-Wunused-command-line-argument]
clang: warning: argument unused during compilation: '-U __PIC__' [-Wunused-command-line-argument]
clang: warning: argument unused during compilation: '-D __PIC__' [-Wunused-command-line-argument]
clang -cc1as: fatal error: error in backend: No open frame

this definitely seems like a bug in llvm

EDIT: i can't reproduce this same issue 10 minutes later

comment:12 Changed 9 months ago by carter

I have a patch that seems to fix it:

comment:13 Changed 9 months ago by carter

Differential Rev(s): D4780

https://phabricator.haskell.org/D4780 for a candidate patch. Not perfect, and may have other issues, but fixes the mac build without impacting other platforms.

I'm not 100% sure I'm fixing it perfectly, but it definitely has zero impact on non-darwin platforms, and my understanding is that this particular dwarf data only works in the presence of the RTS support for stack walking, which is exclusive to ELF platforms?

Or is there a fundamental error in this fix?

comment:14 Changed 9 months ago by carter

i tested ben's alternative fix, D4781, and it seems to work (and be more useful as it doesn't remove the dwarf data, it just makes it more precise)

comment:15 Changed 9 months ago by bgamari

Unfortunately my patch appears to break on Linux.

comment:16 Changed 9 months ago by bgamari

Differential Rev(s): D4780 → Phab:D4781
Status: new → patch

comment:17 Changed 8 months ago by Ben Gamari <ben@…>

In 86210b23/ghc:

rts: Use .cfi_{start|end}proc directives

Test Plan: Validate using LLVM assembler

Reviewers: carter, erikd, simonmar

Reviewed By: simonmar

Subscribers: rwbarton, thomie

GHC Trac Issues: #15207

Differential Revision: https://phabricator.haskell.org/D4781

comment:18 Changed 8 months ago by bgamari

Status: patch → infoneeded

I believe this should now be fixed. Carter, do you think you could test the final patch?

comment:19 Changed 8 months ago by bgamari

Sigh; this is all just very unfortunate. It seems that there is something funny about Apple's clang (surprise surprise!) The patch as-merged breaks under standard Clang, which apparently inserts the necessary CFI directives into inline assembler, just as GCC does. I'm going to revert until we have a better understanding of what is going on here.

Carter, can you look into precisely when this happens? Specifically which Clang version are you seeing break? Which Clang versions work?

comment:20 Changed 8 months ago by carter

the issue is when using GCC on OSX with the system assembler, "as", which is a wrapper around llvm's assembler.

aka GCC for C code, llvm's AS(sembler) for assembling

comment:21 Changed 8 months ago by awson

I don't know anything about Apple's clang, but Windows msvc‑abi‑targeted clang behaves exactly the same way: it inserts nothing.

Generally, clang is able to mimic both gcc behaviour *and* native platform toolset behavior, and tries to be gcc compatible when targeting gnu triplet and native toolset compatible when targeting native triplet.

Thus, I believe changing #if defined(__clang__) to #if defined(__clang__) && !defined (__GNUC__) should suffice.
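
To make the distinction behind the proposed guard concrete, here is a small sketch that reports which branch a given compiler takes. It relies on the fact that Clang defines __GNUC__ when running in GCC-compatible mode; the assumption that MSVC-targeting Clang leaves it undefined follows the comment above. The function name compiler_family and the label strings are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Returns a label for the compiler family as seen by the preprocessor.
 * The order matters: Clang in GCC-compatible mode defines both __clang__
 * and __GNUC__, so __clang__ has to be tested first. */
const char *compiler_family(void) {
#if defined(__clang__) && !defined(__GNUC__)
    return "clang without GNU compatibility macros";
#elif defined(__clang__)
    return "clang in GNU-compatible mode";
#elif defined(__GNUC__)
    return "gcc";
#else
    return "other";
#endif
}
```

Printing the result of compiler_family() from a build with each toolchain in question shows which compilers the suggested guard would select.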

comment:22 Changed 8 months ago by bgamari

Milestone: 8.6.1 → 8.8.1

These won't be addressed for GHC 8.6.

comment:23 Changed 8 months ago by bgamari

At the moment I'm leaning towards discontinuing use of GCC and accepting the performance hit. I have had a number of Darwin users complain about various toolchain issues which arise from our strange mixture of toolchains. This issue makes me think that the pain that our use of GCC brings far outweighs the cost of Clang's poor TLS performance. It would be nice if someone could quantify the effect, however. Carter, perhaps you could build a GHC using GCC and benchmark this against 8.6.1-alpha1, which was built with Clang?

comment:24 Changed 5 months ago by carter

@Ben, i dropped the ball on that comparison, i'll add it to my queue :)

comment:25 Changed 5 months ago by carter

and or ask for some help from other mac users

comment:26 Changed 4 months ago by carter

i'll see about doing the nofib runs this weekend

Changed 4 months ago by carter

Attachment: nofib-report-two.txt added

comparison of clang and gcc built GHC at commit 578012be13eb1548050d51c0a23bd1a98423f03e on mac

comment:27 Changed 4 months ago by carter

my GCC build is using my D4780 patch for stgcrun.c

notable difference: compile times are like 5% better on average with the GCC built ghc,

there also seems to be an average 2 percent advantage in GC times ... though i could be reading this wrong

comment:28 Changed 4 months ago by carter

seems there's still a performance gap in gcc vs clang, as per historical ticket https://ghc.haskell.org/trac/ghc/ticket/7602

comment:29 Changed 4 months ago by carter

disregard this nofib report, i didn't do threaded executions

comment:30 Changed 4 months ago by carter

i did two different threaded runs; there's a clear HUGE difference on clang vs gcc, favoring gcc in the GC timings. the one program where this is strongly not the case is k-nucleotide.

I'll attach these two sets of nofib runs here.

the nofib report i named "dirty" merely means "i didn't clean and rebuild right before running nofib", so it's likely to have cleaner measurements if thermal throttling were an issue (i used the intel power gadget during the first run, and I didn't see a difference in peak average frequency between the two, at least when eyeballing it)

Changed 4 months ago by carter

threaded with time to cool nofib, N4

Changed 4 months ago by carter

even more time to cool threaded n4 nofib comparison

Changed 4 months ago by carter

hyper threading disabled quad core threaded rts nofib

comment:31 Changed 3 months ago by carter

@Ben I just tried applying the diff from https://phabricator.haskell.org/D4781 on my mac... and it doesn't seem to work ..

comment:32 Changed 3 months ago by carter

this was on top of the current 8.6 branch as of today

comment:33 Changed 3 months ago by carter

in the middle of validating / checking a possibly slightly different patch that works

Kavon on IRC asked if the issues between platforms and clang vs gcc etc could be resolved by moving the inline assembly into its own file. I think that might work and remove dependency on platform specific CPP about ASM encodings

comment:34 Changed 3 months ago by carter

my build gets stuck at the following

"inplace/bin/ghc-stage1" -this-unit-id rts -shared -dynamic -dynload deploy -no-auto-link-packages -Lrts/dist/build -lffi -optl-Wl,-rpath -optl-Wl,@loader_path `cat rts/dist/libs.depend` rts/dist/build/STM.dyn_o rts/dist/build/Timer.dyn_o rts/dist/build/ThreadPaused.dyn_o rts/dist/build/Ticky.dyn_o rts/dist/build/ThreadLabels.dyn_o rts/dist/build/RtsMain.dyn_o rts/dist/build/ProfilerReport.dyn_o rts/dist/build/PathUtils.dyn_o rts/dist/build/LibdwPool.dyn_o rts/dist/build/Proftimer.dyn_o rts/dist/build/CheckUnload.dyn_o rts/dist/build/fs.dyn_o rts/dist/build/OldARMAtomic.dyn_o rts/dist/build/HsFFI.dyn_o rts/dist/build/RtsDllMain.dyn_o rts/dist/build/RtsSymbolInfo.dyn_o rts/dist/build/RaiseAsync.dyn_o rts/dist/build/ProfHeap.dyn_o rts/dist/build/Schedule.dyn_o rts/dist/build/Hash.dyn_o rts/dist/build/Trace.dyn_o rts/dist/build/Threads.dyn_o rts/dist/build/Hpc.dyn_o rts/dist/build/Weak.dyn_o rts/dist/build/ProfilerReportJson.dyn_o rts/dist/build/Task.dyn_o rts/dist/build/StgCRun.dyn_o rts/dist/build/ClosureFlags.dyn_o rts/dist/build/RetainerProfile.dyn_o rts/dist/build/Libdw.dyn_o rts/dist/build/Stats.dyn_o rts/dist/build/Interpreter.dyn_o rts/dist/build/Messages.dyn_o rts/dist/build/RtsUtils.dyn_o rts/dist/build/RtsSymbols.dyn_o rts/dist/build/xxhash.dyn_o rts/dist/build/LdvProfile.dyn_o rts/dist/build/Capability.dyn_o rts/dist/build/Printer.dyn_o rts/dist/build/Globals.dyn_o rts/dist/build/Adjustor.dyn_o rts/dist/build/RtsAPI.dyn_o rts/dist/build/Inlines.dyn_o rts/dist/build/TopHandler.dyn_o rts/dist/build/Linker.dyn_o rts/dist/build/Disassembler.dyn_o rts/dist/build/WSDeque.dyn_o rts/dist/build/StaticPtrTable.dyn_o rts/dist/build/Pool.dyn_o rts/dist/build/StgPrimFloat.dyn_o rts/dist/build/FileLock.dyn_o rts/dist/build/RtsStartup.dyn_o rts/dist/build/RetainerSet.dyn_o rts/dist/build/Profiling.dyn_o rts/dist/build/Sparks.dyn_o rts/dist/build/RtsFlags.dyn_o rts/dist/build/RtsMessages.dyn_o rts/dist/build/Stable.dyn_o rts/dist/build/Arena.dyn_o 
rts/dist/build/Heap.dyn_o rts/dist/build/hooks/StackOverflow.dyn_o rts/dist/build/hooks/LongGCSync.dyn_o rts/dist/build/hooks/OnExit.dyn_o rts/dist/build/hooks/FlagDefaults.dyn_o rts/dist/build/hooks/MallocFail.dyn_o rts/dist/build/hooks/OutOfHeap.dyn_o rts/dist/build/sm/Scav_thr.dyn_o rts/dist/build/sm/Storage.dyn_o rts/dist/build/sm/Evac.dyn_o rts/dist/build/sm/Sanity.dyn_o rts/dist/build/sm/GC.dyn_o rts/dist/build/sm/MarkWeak.dyn_o rts/dist/build/sm/Evac_thr.dyn_o rts/dist/build/sm/BlockAlloc.dyn_o rts/dist/build/sm/GCUtils.dyn_o rts/dist/build/sm/GCAux.dyn_o rts/dist/build/sm/Sweep.dyn_o rts/dist/build/sm/CNF.dyn_o rts/dist/build/sm/Compact.dyn_o rts/dist/build/sm/Scav.dyn_o rts/dist/build/sm/MBlock.dyn_o rts/dist/build/eventlog/EventLog.dyn_o rts/dist/build/eventlog/EventLogWriter.dyn_o rts/dist/build/linker/elf_util.dyn_o rts/dist/build/linker/elf_reloc.dyn_o rts/dist/build/linker/LoadArchive.dyn_o rts/dist/build/linker/elf_plt_arm.dyn_o rts/dist/build/linker/SymbolExtras.dyn_o rts/dist/build/linker/CacheFlush.dyn_o rts/dist/build/linker/Elf.dyn_o rts/dist/build/linker/elf_got.dyn_o rts/dist/build/linker/M32Alloc.dyn_o rts/dist/build/linker/elf_plt_aarch64.dyn_o rts/dist/build/linker/elf_plt.dyn_o rts/dist/build/linker/PEi386.dyn_o rts/dist/build/linker/MachO.dyn_o rts/dist/build/linker/elf_reloc_aarch64.dyn_o rts/dist/build/posix/GetEnv.dyn_o rts/dist/build/posix/Select.dyn_o rts/dist/build/posix/Signals.dyn_o rts/dist/build/posix/TTY.dyn_o rts/dist/build/posix/Itimer.dyn_o rts/dist/build/posix/OSThreads.dyn_o rts/dist/build/posix/GetTime.dyn_o rts/dist/build/posix/OSMem.dyn_o   rts/dist/build/Updates.dyn_o rts/dist/build/HeapStackCheck.dyn_o rts/dist/build/StgStdThunks.dyn_o rts/dist/build/Exception.dyn_o rts/dist/build/Apply.dyn_o rts/dist/build/StgMiscClosures.dyn_o rts/dist/build/Compact.dyn_o rts/dist/build/PrimOps.dyn_o rts/dist/build/StgStartup.dyn_o rts/dist/build/AutoApply.dyn_o  -fPIC -dynamic  -H32m -O -Wall   -Iincludes -Iincludes/dist 
-Iincludes/dist-derivedconstants/header -Iincludes/dist-ghcconstants/header -Irts -Irts/dist/build -DCOMPILING_RTS -DFS_NAMESPACE=rts -this-unit-id rts -dcmm-lint  -DDTRACE     -i -irts -irts/dist/build -Irts/dist/build -irts/dist/build/./autogen -Irts/dist/build/./autogen            -O2 -Wcpp-undef    -Wnoncanonical-monad-instances  -fno-use-rpaths   -o rts/dist/build/libHSrts-ghc8.6.2.dylib
0  0x10c2f3898  __assert_rtn + 129
1  0x10c30c983  mach_o::relocatable::Parser<x86_64>::parse(mach_o::relocatable::ParserOptions const&) + 3153
2  0x10c30274c  mach_o::relocatable::Parser<x86_64>::parse(unsigned char const*, unsigned long long, char const*, long, ld::File::Ordinal, mach_o::relocatable::ParserOptions const&) + 282
3  0x10c35a12c  ld::tool::InputFiles::makeFile(Options::FileInfo const&, bool) + 1014
4  0x10c35d4f7  ld::tool::InputFiles::parseWorkerThread() + 533
5  0x7fff59b6d661  _pthread_body + 340
6  0x7fff59b6d50d  _pthread_body + 0
A linker snapshot was created at:
	/tmp/libHSrts-ghc8.6.2.dylib-2018-10-12-212259.ld-snapshot
ld: Assertion failed: (cfiStartsArray[i] != cfiStartsArray[i-1]), function parse, file /Library/Caches/com.apple.xbs/Sources/ld64_Fall2018/ld64-409.12/src/ld/parsers/macho_relocatable_file.cpp, line 1939.
collect2: error: ld returned 1 exit status
`gcc-8' failed in phase `Linker'. (Exit code: 1)
rts/ghc.mk:315: recipe for target 'rts/dist/build/libHSrts-ghc8.6.2.dylib' failed
make[1]: *** [rts/dist/build/libHSrts-ghc8.6.2.dylib] Error 1
Makefile:122: recipe for target 'all' failed
make: *** [all] Error 2

i tried out using LLVM 7's ld.lld tool, and i had the same problem/error there.

i've 1-2 ideas left that i'm going to try

comment:35 Changed 3 months ago by carter

So a few things:

ben's patch on phabricator is incorrect, it needs to be changed as follows

-#if defined(__clang__)
+#if !defined(__clang__) && defined(darwin_HOST_OS)
#define NEED_EXPLICIT_CFI_START_END
#endif

and the comment / explanation there is wrong: the issue isn't building the file with clang. it's building the inline asm with GCC, but using the OSX assembler.
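A minimal sketch of how the corrected guard above behaves, offered as an illustration rather than the actual contents of rts/StgCRun.c. It assumes `darwin_HOST_OS` is a macro supplied by the GHC build system when targeting macOS; the helper `need_explicit_cfi` is hypothetical and exists only to make the result of the preprocessor check observable.

```c
/*
 * Assumption: darwin_HOST_OS is defined by the GHC build system on macOS.
 * It is not defined anywhere in this sketch.
 */
#if !defined(__clang__) && defined(darwin_HOST_OS)
#define NEED_EXPLICIT_CFI_START_END
#endif

/*
 * Hypothetical helper: returns 1 when the guard above fired, i.e. when
 * the compiler is not clang and the host OS is Darwin, and 0 otherwise.
 */
int need_explicit_cfi(void)
{
#if defined(NEED_EXPLICIT_CFI_START_END)
    return 1;
#else
    return 0;
#endif
}
```

With this condition the macro is only defined for a non-clang compiler (such as gcc) on Darwin; clang on Darwin and any compiler on other platforms leave it undefined.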

even with the above tweak (which isolates the change to OSX and using gcc there), linking the RTS including one of the StgCRun object files triggers an assert in the system linker; both the xcode 10 cli tools ld and the LLVM 7 ld (ld.lld) have this validation assert

Changed 3 months ago by carter

Attachment: fustratedSTGCRun.s added

gcc's assembly for the offending file

comment:36 Changed 3 months ago by carter

so i've some ideas:

1) are we placing end proc at the right spot?

2) i think it may be profitable / simplest to move the darwin code path to asm, using clang output as a reference, which is going to be my next attachment :)

Changed 3 months ago by carter

Attachment: clangSTGCRUN.s added

happy stgCRun darwin asm

comment:37 Changed 3 months ago by carter

i put a patch up on https://phabricator.haskell.org/D5340

i'm still failing to get it to build though :(

comment:38 Changed 3 months ago by carter

(though it may be because i didn't reboot the build after editing the cabal.in file for the RTS)

comment:39 Changed 3 months ago by carter

the patch works, we just can't build GHC boot/base libs with any dwarf setting higher than -g0 on darwin because of some linker / object code format limitation

comment:40 Changed 3 months ago by bgamari

Unfortunately this severely limits the usability of DWARF on the whole. In general one can only reliably unwind the stack if unwinding tables are available for all code in the running image. If you find a single frame for which there is no unwind information (or a frame pointer) available you must stop.
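The stopping rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Frame` struct and `unwindable_depth` function are hypothetical and do not correspond to any GHC RTS or debugger API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one stack frame, for illustration only. */
typedef struct {
    const char *name;
    bool has_unwind_info;   /* an unwind table entry covers this frame */
    bool has_frame_pointer; /* a frame pointer is available as a fallback */
} Frame;

/*
 * Walks frames from the innermost (index 0) outward and returns how many
 * frames can be stepped through before the walk has to stop. The walk stops
 * at the first frame that has neither unwind information nor a frame pointer.
 */
size_t unwindable_depth(const Frame *frames, size_t count)
{
    size_t depth = 0;
    while (depth < count &&
           (frames[depth].has_unwind_info || frames[depth].has_frame_pointer)) {
        depth++;
    }
    return depth;
}
```

In this model a single frame without unwind information hides every frame beneath it, even if those outer frames are fully described.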

comment:41 Changed 3 months ago by carter

Ben: you mean the base being too big for macho object format?

I agree it’s a problem.

I’m confused though: do you mean unwinding the stack using the stack walker, which only supports ELF formats at the moment? Or do you mean gdb / lldb being informative for debugging in general? OS X doesn’t get the stack walker at the moment anyway, so that’s only a forward-looking issue afaict ;)

comment:42 Changed 2 months ago by Ben Gamari <ben@…>

In 1ebe8438/ghc:

StgCRun: Disable unwinding on Darwin

See #15207.

comment:43 Changed 2 months ago by bgamari

Resolution: fixed
Status: infoneeded → closed