Opened 3 years ago

Closed 3 years ago

#9673 closed bug (fixed)

aarch64 7.8.4, 7.10, 7.11: lib/ghc/bin/ghc-pkg --version does not output from subprocess

Reported by: juhpetersen
Owned by:
Priority: high
Milestone: 7.10.2
Component: Compiler
Version: 7.11
Keywords:
Cc: bgamari
Operating System: Linux
Architecture: aarch64
Type of failure: Installing GHC failed
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description

(I first mentioned this recently in ticket:7942#comment:42, but I thought it better to open a new ticket for this.)

I tried to build ghc-7.8.3 on aarch64 Linux (Fedora 21 development) with the patch from #7942, and ran into:

"inplace/bin/ghc-cabal" copy libraries/haskell2010 dist-install "strip" '/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64' '/usr' '/usr/lib64/ghc-7.8.3' '/usr/share/doc/ghc/html/libraries' 'v dyn '
Installing library in
/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64/usr/lib64/ghc-7.8.3/haskell2010-1.1.2.0
"/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64/usr/lib64/ghc-7.8.3/bin/ghc-pkg" --force --global-package-db "/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64/usr/lib64/ghc-7.8.3/package.conf.d" update rts/dist/package.conf.install
Reading package info from "rts/dist/package.conf.install" ... done.
rts-1.0: Warning: library-dirs: /usr/lib64/ghc-7.8.3/rts-1.0 doesn't exist or isn't a directory
rts-1.0: Warning: include-dirs: /usr/lib64/ghc-7.8.3/include doesn't exist or isn't a directory
rts-1.0: cannot find any of ["libHSrts.a","libHSrts.p_a","libHSrts-ghc7.8.3.so","libHSrts-ghc7.8.3.dylib","HSrts-ghc7.8.3.dll"] on library path (ignoring)
"inplace/bin/ghc-cabal" register libraries/ghc-prim dist-install "/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64/usr/lib64/ghc-7.8.3/bin/ghc" "/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64/usr/lib64/ghc-7.8.3/bin/ghc-pkg" "/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64/usr/lib64/ghc-7.8.3" '/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64' '/usr' '/usr/lib64/ghc-7.8.3' '/usr/share/doc/ghc/html/libraries' NO  
Warning: cannot determine version of
/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64/usr/lib64/ghc-7.8.3/bin/ghc-pkg
:
""
Registering ghc-prim-0.3.1.0...
"inplace/bin/ghc-cabal" register libraries/integer-gmp dist-install "/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64/usr/lib64/ghc-7.8.3/bin/ghc" "/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64/usr/lib64/ghc-7.8.3/bin/ghc-pkg" "/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64/usr/lib64/ghc-7.8.3" '/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64' '/usr' '/usr/lib64/ghc-7.8.3' '/usr/share/doc/ghc/html/libraries' NO  
Warning: cannot determine version of
/builddir/build/BUILDROOT/ghc-7.8.3-38.fc22.aarch64/usr/lib64/ghc-7.8.3/bin/ghc-pkg
:
""
ghc-cabal: Installed package ID not registered: "ghc-prim-0.3.1.0-inplace"
ghc.mk:901: recipe for target 'install_packages' failed
Makefile:64: recipe for target 'install' failed
make[1]: *** [install_packages] Error 1
make: *** [install] Error 2

The problem seems to be that the installed (dynamically linked) ghc-pkg does not give any output! (E.g. "ghc-pkg --version" returns "", and the same goes for --help.) When I try the built binaries I find that ghc-7.8.3/utils/ghc-pkg/dist/build/tmp/ghc-pkg (which is statically linked) works normally (i.e. it prints output for --help and --version), whereas $DESTDIR/$libdir/ghc-7.8.3/bin/ghc-pkg gives no output on stdout!

A workaround is to build with DYNAMIC_GHC_PROGRAMS=NO (ticket:7942#comment:44)

Attachments (1)

0001-mk-config.mk.in-Enable-SMP-and-GHCi-support-for-Aarc.patch (1.6 KB) - added by erikd 3 years ago.


Change History (31)

comment:1 Changed 3 years ago by juhpetersen

Same for 7.8.4 btw ;)

Downstream bug is https://bugzilla.redhat.com/show_bug.cgi?id=1195231

comment:2 Changed 3 years ago by juhpetersen

This also happens with 7.10 RC2 (with Erik's patch https://github.com/ghc/ghc/commit/b9063703301f0d902b4bb2eb28ac27e9bc050ea0 ).

comment:3 Changed 3 years ago by juhpetersen

Architecture: arm → aarch64
Summary: ghc-7.8.3 fails to build on aarch64 → ghc 7.8.4 and 7.10 fail to build on aarch64
Version: 7.8.3 → 7.10.1-rc1

comment:4 Changed 3 years ago by juhpetersen

Priority: normal → high
Summary: ghc 7.8.4 and 7.10 fail to build on aarch64 → aarch64 7.8.4, 7.10, 7.11: lib/ghc/bin/ghc-pkg --version does not output from subprocess
Type of failure: None/Unknown → Installing GHC failed

Also happens with ghc-7.11.20150316 (yesterday's head of git master).

Note:

$ rpmbuild/BUILDROOT/ghc-7.11.20150316-0.1.fc21.aarch64/usr/lib64/ghc-7.11.20150316/bin/ghc-pkg --version
GHC package manager version 7.11.20150316
$ 

vs

$ echo $(rpmbuild/BUILDROOT/ghc-7.11.20150316-0.1.fc21.aarch64/usr/lib64/ghc-7.11.20150316/bin/ghc-pkg --version)

$ 

!!

So somehow subprocess IO seems to be affected.

comment:5 Changed 3 years ago by erikd

Closed #10264 as a duplicate of this. Tracked it down to somewhere in the process library. Debugging continues.

comment:6 Changed 3 years ago by erikd

Have a tiny test program (trimmed down version of Distribution.Simple.Utils.rawSystemStdInOut running the problem arguments) as follows:

import System.IO
import System.Exit      ( ExitCode(..) )
import Control.Concurrent
import Control.Monad
import Control.Exception (evaluate, finally)
import System.Environment

import System.Process

main :: IO ()
main = do
  [path] <- getArgs
  runInteractiveProcess path ["--version"] Nothing Nothing >>=
    \(inh,outh,errh,pid) -> do

      err <- hGetContents errh
      out <- hGetContents outh

      mv <- newEmptyMVar
      let force str = (evaluate (length str) >> return ())
            `finally` putMVar mv ()
          --TODO: handle exceptions like text decoding.
      _ <- forkIO $ force out
      _ <- forkIO $ force err

      -- wait for both to finish, in either order
      takeMVar mv
      takeMVar mv

      -- wait for the program to terminate
      exitcode <- waitForProcess pid
      unless (exitcode == ExitSuccess) $
        print $ path ++ " returned " ++ show exitcode
                       ++ if null err then "" else
                          " with error message:\n" ++ err

      print (out, err, exitcode)

Compiling this and running it against two different ghc-pkg binaries gives some interesting results:

$ /usr/bin/ghc-7.6.3 --make test.hs -o prog && ./prog  /usr/bin/ghc-pkg
Linking prog ...
("GHC package manager version 7.6.3\n","",ExitSuccess)
$ /usr/bin/ghc-7.6.3 --make test.hs -o prog && \
                         ./prog /home/erikd/GHC/7.11/bin/ghc-pkg
("","",ExitSuccess)

This may be an output encoding issue, as the TODO comment in the code warns.
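For anyone who wants a shorter reproduction, the same check can probably be done with readProcessWithExitCode from the process library, which also connects the child's stdout and stderr to pipes. This is only a sketch under that assumption, not what Cabal itself runs; describeCapture is a helper name made up for the example.

```haskell
import System.Environment (getArgs)
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- | Classify what a child process wrote to its stdout pipe.
-- (Hypothetical helper.) An empty capture together with ExitSuccess
-- is the symptom described in this ticket.
describeCapture :: ExitCode -> String -> String
describeCapture ExitSuccess "" = "exited successfully but wrote nothing to the pipe"
describeCapture ExitSuccess out = "ok: " ++ show out
describeCapture (ExitFailure n) _ = "failed with exit code " ++ show n

main :: IO ()
main = do
  args <- getArgs
  case args of
    [path] -> do
      -- Runs `path --version` with stdout and stderr connected to pipes,
      -- so the child never sees a terminal on those descriptors.
      (code, out, _err) <- readProcessWithExitCode path ["--version"] ""
      putStrLn (path ++ ": " ++ describeCapture code out)
    _ -> putStrLn "usage: prog PATH-TO-GHC-PKG"
```

Build the runner with a compiler that is known to work (the system 7.6.3 here) and pass it the path of the ghc-pkg under test.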


comment:7 Changed 3 years ago by erikd

Even adding a print immediately after the hGetContents calls like this:

      err <- hGetContents errh
      out <- hGetContents outh
      print (out, err)

results in a pair of empty strings: ("",""). Now diving into hGetContents.

comment:8 Changed 3 years ago by juhpetersen

This also affects building packages like haskell-platform and gtk2hs-buildtools, which have versioned build-tools dependencies (e.g. on alex and happy) in their .cabal files, when those build tools are dynamically linked Haskell programs.

(A workaround is to drop the version/bound from the .cabal file.)

comment:9 Changed 3 years ago by erikd

Pretty certain that the problem is caused by a lack of SMP support for AArch64 in mk/config.mk.

With the attached patch I was able to compile and install from git HEAD. Still need to do a little more testing to be sure.

comment:10 Changed 3 years ago by juhpetersen

Thanks Erik, I kicked off a test build with this patch:

http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=2957099

comment:11 in reply to:  10 Changed 3 years ago by juhpetersen

Replying to juhpetersen:

I kicked off a test build with this patch: http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=2957099

Unfortunately at least the 7.8.4 fedora build still seems to fail the same way for me unless I am missing something:

http://arm.koji.fedoraproject.org//work/tasks/7099/2957099/build.log [7MB]


comment:12 Changed 3 years ago by rwbarton

The behavior in comment:4 is really weird, I would start the investigation there. Maybe try echo $(strace .../ghc-pkg --version) and try to figure out why it does not write anything to stdout? Could it be because stdout is not connected to a terminal somehow? (That's about the only explanation I can think of...)

comment:13 Changed 3 years ago by juhpetersen

Version: 7.10.1-rc1 → 7.11

I diffed the straces (with addresses normalized/sanitized to 0x3ffyyyyyyyy). Below is the final significant chunk. The straces are for "./rpmbuild/BUILDROOT/ghc-7.10.1-1.fc23.aarch64/usr/lib64/ghc-7.10.1/bin/ghc-pkg --version".

--- process.strace	2015-04-15 03:11:21.351274400 -0400
+++ subprocess.strace	2015-04-15 03:13:42.167058743 -0400
 mprotect(0x3ffyyyyyyyy, 65536, PROT_READ) = 0
 mprotect(0x3ffyyyyyyyy, 65536, PROT_READ) = 0
 mprotect(0x3ffyyyyyyyy, 65536, PROT_READ) = 0
 munmap(0x3ffyyyyyyyy, 48881)            = 0
-set_tid_address(0x3ffyyyyyyyy)          = 13259
+set_tid_address(0x3ffyyyyyyyy)          = 13239
 set_robust_list(0x3ffyyyyyyyy, 24)      = 0
 rt_sigaction(SIGRTMIN, {0x3ffyyyyyyyy, [], SA_SIGINFO}, NULL, 8) = 0
 rt_sigaction(SIGRT_1, {0x3ffyyyyyyyy, [], SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
 rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
 getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
-brk(0)                                  = 0x13010000
-brk(0x13040000)                         = 0x13040000
+brk(0)                                  = 0x6950000
+brk(0x6980000)                          = 0x6980000
 openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
 fstat(3, {st_mode=S_IFREG|0644, st_size=106374736, ...}) = 0
 mmap(NULL, 106374736, PROT_READ, MAP_PRIVATE, 3, 0) = 0x3ffyyyyyyyy
 close(3)                                = 0
 clock_getres(CLOCK_PROCESS_CPUTIME_ID, {0, 1}) = 0
-clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 37059433}) = 0
+clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 36939847}) = 0
 openat(AT_FDCWD, "/proc/meminfo", O_RDONLY|O_CLOEXEC) = 3
 fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
 mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3ffyyyyyyyy
 read(3, "MemTotal:       16690880 kB\nMemF"..., 1024) = 1024
 close(3)                                = 0
 munmap(0x3ffyyyyyyyy, 65536)            = 0
 mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x3ffyyyyyyyy
-munmap(0x3ffyyyyyyyy, 655360)           = 0
 munmap(0x3ffyyyyyyyy, 393216)           = 0
+munmap(0x3ffyyyyyyyy, 655360)           = 0
 timer_create(CLOCK_MONOTONIC, {(nil), SIGVTALRM, SIGEV_SIGNAL, {...}}, {0}) = 0
 rt_sigaction(SIGVTALRM, {0x3ffyyyyyyyy, [], SA_RESTART}, NULL, 8) = 0
 timer_settime(0, 0, {it_interval={0, 10000000}, it_value={0, 10000000}}, NULL) = 0
 rt_sigaction(SIGINT, {0x3ffyyyyyyyy, [], 0}, {SIG_DFL, [], 0}, 8) = 0
 rt_sigaction(SIGINT, NULL, {0x3ffyyyyyyyy, [], 0}, 8) = 0
 rt_sigaction(SIGINT, {0x3ffyyyyyyyy, [], 0}, NULL, 8) = 0
 rt_sigaction(SIGPIPE, {0x3ffyyyyyyyy, [], 0}, {SIG_DFL, [], 0}, 8) = 0
 rt_sigaction(SIGTSTP, {0x3ffyyyyyyyy, [], 0}, NULL, 8) = 0
-clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 37535065}) = 0
+clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 37421445}) = 0
 rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
 rt_sigaction(SIGINT, {0x3ffyyyyyyyy, [], SA_RESETHAND|SA_SIGINFO}, NULL, 8) = 0
 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
-ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
-pselect6(2, [], [1], NULL, {0, 0}, 0)   = 1 (out [1], left {0, 0})
-write(1, "GHC package manager version 7.10"..., 35) = 35
-ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
-clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 38040864}) = 0
+ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x3ffyyyyyyyy) = -1 ENOTTY (Inappropriate ioctl for device)
+ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x3ffyyyyyyyy) = -1 ENOTTY (Inappropriate ioctl for device)
+clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 37847540}) = 0
 rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
-clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 38076223}) = 0
+clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 37882073}) = 0
 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
 timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
 rt_sigaction(SIGVTALRM, {SIG_IGN, [], SA_INTERRUPT|SA_NODEFER|SA_RESETHAND}, {0x3ffyyyyyyyy, [], SA_RESTART}, 8) = 0
 timer_delete(0)                         = 0
 rt_sigprocmask(SIG_BLOCK, [TTOU], [], 8) = 0
 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
 rt_sigaction(SIGINT, {SIG_DFL, [], 0}, NULL, 8) = 0
 rt_sigaction(SIGPIPE, {SIG_DFL, [], 0}, NULL, 8) = 0
 rt_sigaction(SIGTSTP, {SIG_DFL, [], 0}, NULL, 8) = 0
-clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 38428740}) = 0
+clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 38226330}) = 0
 exit_group(0)                           = ?
 +++ exited with 0 +++

comment:14 Changed 3 years ago by juhpetersen

and the subprocess straces diff between x86_64 and aarch64 looks like this (for 7.8.4):

--- subprocess-x86_64.strace	2015-04-15 17:18:07.000000000 +0900
+++ subprocess-arm64.strace	2015-04-15 16:50:06.000000000 +0900
-rt_sigaction(SIGVTALRM, {0x7f149254f650, [], SA_RESTORER|SA_RESTART, 0x7f1496add430}, NULL, 8) = 0
+rt_sigaction(SIGVTALRM, {0x3ff9a226910, [], SA_RESTART}, NULL, 8) = 0
 timer_settime(0, 0, {it_interval={0, 10000000}, it_value={0, 10000000}}, NULL) = 0
-rt_sigaction(SIGINT, {0x7f149255b750, [], SA_RESTORER, 0x7f1496add430}, {SIG_DFL, [], 0}, 8) = 0
-rt_sigaction(SIGINT, NULL, {0x7f149255b750, [], SA_RESTORER, 0x7f1496add430}, 8) = 0
-rt_sigaction(SIGINT, {0x7f149255b750, [], SA_RESTORER, 0x7f149219ab20}, NULL, 8) = 0
-rt_sigaction(SIGPIPE, {0x7f149255b6e0, [], SA_RESTORER, 0x7f1496add430}, {SIG_DFL, [], 0}, 8) = 0
-rt_sigaction(SIGTSTP, {0x7f149255b770, [], SA_RESTORER, 0x7f1496add430}, NULL, 8) = 0
-clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 8032580}) = 0
-clock_gettime(CLOCK_MONOTONIC, {1555, 94648060}) = 0
+rt_sigaction(SIGINT, {0x3ff9a237a10, [], 0}, {SIG_DFL, [], 0}, 8) = 0
+rt_sigaction(SIGINT, NULL, {0x3ff9a237a10, [], 0}, 8) = 0
+rt_sigaction(SIGINT, {0x3ff9a237a10, [], 0}, NULL, 8) = 0
+rt_sigaction(SIGPIPE, {0x3ff9a2379ac, [], 0}, {SIG_DFL, [], 0}, 8) = 0
+rt_sigaction(SIGTSTP, {0x3ff9a237a38, [], 0}, NULL, 8) = 0
+clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 34154801}) = 0
 rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
-rt_sigaction(SIGINT, {0x7f149255b7f0, [], SA_RESTORER|SA_RESETHAND|SA_SIGINFO, 0x7f1496add430}, NULL, 8) = 0
+rt_sigaction(SIGINT, {0x3ff9a237ac4, [], SA_RESETHAND|SA_SIGINFO}, NULL, 8) = 0
 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
-ioctl(1, TCGETS, 0x7ffe2a9b0e70)        = -1 ENOTTY (Inappropriate ioctl for device)
-select(2, [], [1], NULL, {0, 0})        = 1 (out [1], left {0, 0})
-write(1, "GHC package manager version 7.8."..., 34) = 34
-clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 8190648}) = 0
-clock_gettime(CLOCK_MONOTONIC, {1555, 94885797}) = 0
+ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x3fff519c9e8) = -1 ENOTTY (Inappropriate ioctl for device)
+ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x3fff519c9e8) = -1 ENOTTY (Inappropriate ioctl for device)
+clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 34575324}) = 0
 rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
-clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 8203047}) = 0
-clock_gettime(CLOCK_MONOTONIC, {1555, 94922279}) = 0
-clock_gettime(CLOCK_THREAD_CPUTIME_ID, {0, 8206804}) = 0
+clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 34608930}) = 0
+clock_getres(CLOCK_THREAD_CPUTIME_ID, {0, 1}) = 0
+clock_gettime(CLOCK_THREAD_CPUTIME_ID, {0, 34623282}) = 0
 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
 timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 0}}, NULL) = 0
-rt_sigaction(SIGVTALRM, {SIG_IGN, [], SA_RESTORER|SA_INTERRUPT|SA_NODEFER|SA_RESETHAND, 0x7f149219ab20}, {0x7f149254f650, [], SA_RESTORER|SA_RESTART, 0x7f1496add430}, 8) = 0
+rt_sigaction(SIGVTALRM, {SIG_IGN, [], SA_INTERRUPT|SA_NODEFER|SA_RESETHAND}, {0x3ff9a226910, [], SA_RESTART}, 8) = 0
 timer_delete(0)                         = 0
 rt_sigprocmask(SIG_BLOCK, [TTOU], [], 8) = 0
 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
-rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f1496add430}, NULL, 8) = 0
-rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTORER, 0x7f1496add430}, NULL, 8) = 0
-rt_sigaction(SIGTSTP, {SIG_DFL, [], SA_RESTORER, 0x7f1496add430}, NULL, 8) = 0
-clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 8337035}) = 0
-clock_gettime(CLOCK_MONOTONIC, {1555, 95152954}) = 0
+rt_sigaction(SIGINT, {SIG_DFL, [], 0}, NULL, 8) = 0
+rt_sigaction(SIGPIPE, {SIG_DFL, [], 0}, NULL, 8) = 0
+rt_sigaction(SIGTSTP, {SIG_DFL, [], 0}, NULL, 8) = 0
+clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {0, 34949910}) = 0
 exit_group(0)                           = ?
 +++ exited with 0 +++

comment:15 Changed 3 years ago by rwbarton

Interesting results. This seems to confirm that the issue is whether stdout is a tty. When stdout is redirected to a file (and thus not a tty), the stdout Handle will have BlockBuffering (see GHC.IO.Handle.Internals.getCharBuffer). Perhaps the output is being written to that buffer, but never flushed. (Whereas when stdout is a tty, the stdout Handle has LineBuffering so the output is flushed immediately.)
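To see which case a given run falls into, a small probe along the following lines might help. It is only a sketch: hIsTerminalDevice and hGetBuffering are the standard System.IO calls, while describeStdout is a name invented for the example. The report goes to stderr so that it stays visible even if stdout output is lost.

```haskell
import System.IO

-- | Human-readable summary of the two facts that matter here.
-- (describeStdout is a name made up for this sketch.)
describeStdout :: Bool -> BufferMode -> String
describeStdout isTty mode =
  "stdout is " ++ (if isTty then "a tty" else "not a tty")
    ++ ", buffering = " ++ show mode

main :: IO ()
main = do
  isTty <- hIsTerminalDevice stdout
  mode <- hGetBuffering stdout
  -- Report on stderr so the answer is visible even if stdout output is lost.
  hPutStrLn stderr (describeStdout isTty mode)
  -- This line relies on the flush at program exit when stdout is not a tty.
  putStrLn "hello via stdout"
```

Running it once directly and once with stdout redirected to a file or pipe should show the two buffering modes described above; on an affected build the final line would presumably go missing in the redirected run.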

ghc-pkg --version calls exitWith ExitSuccess, so I think stdout is supposed to be flushed by GHC.TopHandler.flushStdHandles. I'm very suspicious of the two ioctl syscalls, where there is only one on x86_64. It looks as though the stdout Handle CAF (which is an unsafePerformIO $ do { ... ; mkHandle ... }) has been copied by the dynamic loader, and the version referred to by the executable has a separate identity from the version referred to internally by libHSbase in GHC.TopHandler.flushStdHandles.
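A toy model of that hypothesis, for illustration only: ToyHandle below is not GHC's Handle type, and all the names are invented for this sketch. It just shows why output vanishes if the writer and the exit-time flush end up holding two different buffered objects.

```haskell
import Data.IORef

-- | Toy stand-in for a buffered Handle: pending text that reaches the
-- "device" only when flushed. Not GHC's real Handle type.
newtype ToyHandle = ToyHandle (IORef String)

newToyHandle :: IO ToyHandle
newToyHandle = ToyHandle <$> newIORef ""

-- | Append to the buffer without writing anything out (block buffering).
toyPut :: ToyHandle -> String -> IO ()
toyPut (ToyHandle buf) s = modifyIORef buf (++ s)

-- | Drain the buffer and return what would have been written.
toyFlush :: ToyHandle -> IO String
toyFlush (ToyHandle buf) = atomicModifyIORef' buf (\s -> ("", s))

-- | Write through one handle, flush through another, report what came out.
writeThenFlush :: ToyHandle -> ToyHandle -> IO String
writeThenFlush writer flusher = toyPut writer "hi\n" >> toyFlush flusher

main :: IO ()
main = do
  -- One shared object: the flush sees what was written.
  shared <- newToyHandle
  sameIdentity <- writeThenFlush shared shared
  -- Two copies (executable vs. library): the flush drains an empty buffer.
  exeCopy <- newToyHandle
  libCopy <- newToyHandle
  splitIdentity <- writeThenFlush exeCopy libCopy
  print (sameIdentity, splitIdentity) -- prints ("hi\n","")
```

In the split case the text stays in the writer's buffer and never reaches the device, which would match a process that exits successfully with no write(1, ...) in its strace.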

On Linux x86_64 I see these references to the stdout Handle in ghc-pkg:

rwbarton@morphism:/tmp$ objdump -DR ~/lib/ghc-7.8.4/bin/ghc-pkg | grep stdout
			6407a0: R_X86_64_GLOB_DAT	base_GHCziIOziHandleziFD_stdout_closure
			6493a0: R_X86_64_64	base_GHCziIOziHandleziFD_stdout_closure
			6495b8: R_X86_64_64	base_GHCziIOziHandleziFD_stdout_closure
			649968: R_X86_64_64	base_GHCziIOziHandleziFD_stdout_closure
			649ce0: R_X86_64_64	base_GHCziIOziHandleziFD_stdout_closure
			649e30: R_X86_64_64	base_GHCziIOziHandleziFD_stdout_closure

What is the corresponding output on your ARM system?
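For readers unfamiliar with the symbol name: it is GHC's Z-encoding of the stdout closure in GHC.IO.Handle.FD from the base package. A minimal decoder sketch follows; it handles only the four escapes listed in the code, and zDecode is a name invented here rather than a GHC API.

```haskell
-- | Decode a small subset of GHC's Z-encoding.
-- Only four escapes are handled; real symbol names can use more.
zDecode :: String -> String
zDecode ('z' : 'i' : rest) = '.' : zDecode rest
zDecode ('z' : 'm' : rest) = '-' : zDecode rest
zDecode ('z' : 'u' : rest) = '_' : zDecode rest
zDecode ('z' : 'z' : rest) = 'z' : zDecode rest
zDecode (c : rest) = c : zDecode rest
zDecode [] = []

main :: IO ()
main = putStrLn (zDecode "base_GHCziIOziHandleziFD_stdout_closure")
-- prints: base_GHC.IO.Handle.FD_stdout_closure
```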

comment:16 Changed 3 years ago by erikd

On my AArch64 system, for ghc-pkg from the Debian version of ghc-7.6.3 I get:

$ objdump -DR /usr/lib/ghc/lib/ghc-pkg | grep stdout
0000000001035d20 <stdout>:
                        1035d20: R_AARCH64_COPY stdout

whereas for the newly compiled but broken 7.10 version I get:

$ objdump -DR /home/erikd/GHC/7.10/lib/ghc-7.10.1.20150414/bin/ghc-pkg | grep stdout
                        47bd60: R_AARCH64_GLOB_DAT      base_GHCziIOziHandleziFD_stdout_closure
000000000049a2e8 <base_GHCziIOziHandleziFD_stdout_closure>:
                        49a2e8: R_AARCH64_COPY  base_GHCziIOziHandleziFD_stdout_closure

The output differs between GHC 7.6.3 and 7.10, and also between Amd64 and AArch64.

It could well be a linker bug. The big difference between 7.6 and 7.10 is that DYNAMIC_GHC_PROGRAMS is YES in 7.10 and, as @juhp points out, this works when DYNAMIC_GHC_PROGRAMS is switched off.
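To scan other binaries for the same pattern, a filter over the objdump -DR output could look roughly like this. It is a sketch that assumes relocation lines have the three-column "address: TYPE symbol" shape shown above; isClosureCopyReloc is a made-up name, and a hit is only a lead to investigate, not proof of a linker bug.

```haskell
import Data.List (isSuffixOf)

-- | True for an `objdump -DR` relocation line that applies a COPY
-- relocation to a symbol whose name ends in "_closure".
-- Assumes the three-column "address: TYPE symbol" layout.
isClosureCopyReloc :: String -> Bool
isClosureCopyReloc line =
  case words line of
    [_addr, reloc, sym] ->
      "_COPY" `isSuffixOf` reloc && "_closure" `isSuffixOf` sym
    _ -> False

-- Reads objdump output on stdin and echoes only the matching lines.
main :: IO ()
main = interact (unlines . filter isClosureCopyReloc . lines)
```

Fed the two outputs above, it would keep only the 49a2e8 line from the 7.10 binary and drop the plain libc stdout COPY relocation from the 7.6 one.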

comment:17 in reply to:  16 Changed 3 years ago by rwbarton

Cc: bgamari added

Replying to erikd:

000000000049a2e8 <base_GHCziIOziHandleziFD_stdout_closure>:
                        49a2e8: R_AARCH64_COPY  base_GHCziIOziHandleziFD_stdout_closure

I'm not very familiar with dynamic linking on ARM but isn't this wrong? base_GHCziIOziHandleziFD_stdout_closure needs to point to the same address in the executable and in the shared library since it is a closure that will get overwritten with a new Handle object the first time it is entered.

bgamari, you know about these ARM relocation types right?

comment:18 Changed 3 years ago by erikd

Using the test program from comment 6 and a basic "Hello world" program, I can reproduce this problem by compiling the "Hello world" program with -O2 -dynamic and running it with the above test program runner.

comment:19 Changed 3 years ago by erikd

Simple test program compiled with the stage2 compiler (with -O2 -dynamic):

import System.Environment
import System.IO
main :: IO ()
main = do
    args <- getArgs
    putStrLn "GHC package manager version 1.2.3"
    if null args
        then return ()
        else hFlush stdout

Running as ./testprog > a.txt ; cat a.txt produces the expected and correct output whereas ./testprog X > a.txt ; cat a.txt produces nothing.

comment:20 Changed 3 years ago by rwbarton

It seems that the R_AARCH64_COPY relocation is not necessarily wrong; either the executable can go through a GOT to access a symbol with fixed offset from the dynamic library, or the dynamic library can go through a GOT to access a symbol with fixed offset from the executable which has been copied from the dynamic library with a copy relocation. Definitely worth checking that the linker took care of this properly though.

comment:21 Changed 3 years ago by bgamari

Erikd, are you using gold? There is a bug in the ARM bfd ld wherein the linker will use a COPY relocation where we wouldn't expect it. As you suggest, chaos ensues, as the object's info table is not preserved by the linker when the symbol definition is copied. This is one of the reasons why we require gold on ARM. I'll try to put together a more complete reply in the morning.

comment:22 in reply to:  15 Changed 3 years ago by juhpetersen

Replying to rwbarton:

Thanks for the analysis and comments.

On Linux x86_64 I see these references to the stdout Handle in ghc-pkg:

:

What is the corresponding output on your ARM system?

For the static 7.8.4 build (ie with DYNAMIC_GHC_PROGRAMS=NO) I get:

$ objdump -DR /usr/lib64/ghc-7.8.4/bin/ghc-pkg | grep stdout
00000000013290b0 <stdout>:
                        13290b0: R_AARCH64_COPY stdout

Whereas the dynamic builds for 7.8.4 and 7.10.1 give:

$ objdump -DR /home/petersen/rpmbuild/BUILDROOT/ghc-7.8.4-42.2.fc22.aarch64/usr/lib64/ghc-7.8.4/bin/ghc-pkg | grep stdout
                        470af8: R_AARCH64_GLOB_DAT      base_GHCziIOziHandleziFD_stdout_closure
000000000048b908 <base_GHCziIOziHandleziFD_stdout_closure>:
                        48b908: R_AARCH64_COPY  base_GHCziIOziHandleziFD_stdout_closure

I think this 7.8.4 build was with ld.gold but it didn't seem to help. It would be good if someone else could also confirm that.

If it should help I can regenerate the diffs with real addresses.

comment:23 in reply to:  18 Changed 3 years ago by juhpetersen

Replying to erikd:

Using the test program from comment 6 and a basic "Hello world" program, I can reproduce this problem by compiling the "Hello world" program with -O2 -dynamic and running it with the above test program runner.

I can reproduce with dynamic helloworld too with -O1 and above:

$ cat > test.hs
main = putStrLn "hi"
$ rm -f test.o test.hi; ghc -dynamic -O1 test.hs ; echo $(./test)
[1 of 1] Compiling Main             ( test.hs, test.o )
Linking test ...

$ 

It doesn't happen with -O0.

comment:24 Changed 3 years ago by juhpetersen

(Naively statically linking Setup.hs does not seem to help though (cf comment:8).)


comment:25 Changed 3 years ago by erikd

@bgamari This is not Arm, this is AArch64/Arm64. I seem to remember that @juhp tried the gold linker on AArch64, but ran into other problems.

comment:26 in reply to:  25 Changed 3 years ago by juhpetersen

Replying to erikd:

I seem to remember that @juhp tried the gold linker on AArch64, but ran into other problems.

Well I tried to build 7.8.4 with ld.gold but it didn't seem to make a difference for me.


comment:27 Changed 3 years ago by erikd

I just tested with the gold linker and that seems to fix it. Patch coming.

comment:28 Changed 3 years ago by Erik de Castro Lopo <erikd@…>

In 0bbc2ac6dae9ce2838f23a75a6a989826c06f3f5/ghc:

Use the gold linker for aarch64/linux (#9673)

Like 32 bit Arm, Aarch64 requires use of the gold linker.

Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>

Test Plan: 'make install' on aarch64, validate elsewhere

Reviewers: rwbarton, bgamari, austin

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D858

GHC Trac Issues: #9673

comment:29 Changed 3 years ago by erikd

Milestone: 7.10.2
Status: new → merge

comment:30 Changed 3 years ago by thoughtpolice

Resolution: fixed
Status: merge → closed

Merged to ghc-7.10.
