|Version 2 (modified by simonmar, 7 years ago) (diff)|
Troubleshooting the GHC build
Here we keep track of failures that can occur when building GHC, with solutions.
Using autoconf by mistake
If you used autoconf instead of sh boot, you'll get an error when you run ./configure:
...lots of stuff... creating mk/config.h mk/config.h is unchanged configuring in ghc running /bin/sh ./configure --cache-file=.././config.cache --srcdir=. ./configure: ./configure: No such file or directory configure: error: ./configure failed for ghc
Cannot create configure
autoreconf (which gets run by sh boot) seems to create the file configure read-only. So if you need to run sh boot again (which I sometimes do for safety's sake), you get
/usr/bin/autoconf: cannot create configure: permission denied
Solution: delete configure first.
Configure can't find darcs version
When you run your configure script, it falls over with
sh-2.04$ ./configure --with-gcc=c:/mingw/bin/gcc --with-ld=c:/mingw/bin/ld.exe --host=i386-unknown-mingw32 configure: WARNING: If you wanted to set the --build type, don't use --host. If a cross compiler is detected then cross compile mode will be used. checking for GHC version date... -nThe system cannot find the file specified. configure: error: failed to detect version date: check that darcs is in your path
This error is nothing to do with darcs! The darcs-version test in configure uses sort, and it is picking up the Windows sort (in c:\windows\system32) instead of the MSYS or Cygwin sort.
Solution: either hack the configure script by hand, or (better) make sure that MSYS/Cygwin are in your PATH before Windows. Since c:\windows\system32 is, by default, in the System Environment Variable called PATH, and System Variables come first when searching for paths, you'll have to put MSYS/Cygwin bin directory in the System PATH, before c:\windows\system32.
(Incidentally, find is another program that Windows has too, with different functionality to Unix.)
Argument list too long
You may find this towards the end of compiling the base library:
c:\ghc\ghc-6.6.1\bin\ar.exe: creating libHSbase.a xargs: c:/ghc/ghc-6.6.1/bin/ar: Argument list too long make: *** [libHSbase.a] Error 126 make: *** Deleting file `libHSbase.a' Failed making all in base: 1 make: *** [all] Error 1 make: Leaving directory `/cygdrive/c/GHC6.6.1/ghc-6.6.1/libraries' make: *** [stage1] Error 2
Sadly the argument list has a limited length in Windows. This may be fixable somehow (Windows expertise welcomed here), but what we do is to set
SplitObjs = NO
in build.mk. That stops the splitting-up of object files, and dramatically reduces the number of object files involved. Link times are also improved. (Binary size increases though.)
Also, you can arrange for the (huge) list of files to be processed iteratively, rather all at once, and that would probably be a principal solution. xargs feeds the file names to the appropriate command (e.g. ar). In $(GHC_TOP)/mk/target.mk find the place where it is called and add this switch
xargs -n NNN
where NNN is the number of arguments processed at a time. It should be small enough to be less than the limit and large enough for the whole thing not to be too slow.
Note, that it's not good to edit target.mk in general.
Space in TMPDIR
One difficulty that comes up from time to time is running out of space in TMPDIR. (It is impossible for the configuration stuff to compensate for the vagaries of different sysadmin approaches to temp space.)
The quickest way around it is setenv TMPDIR /usr/tmp or even setenv TMPDIR . (or the equivalent incantation with your shell of choice).
The best way around it is to say
in your build.mk file. Then GHC and the other tools will use the appropriate directory in all cases.
Warning "warning: assignment from incompatible pointer type"
You may occasionally see a warning from the C compiler when compiling some Haskell code, eg. "warning: assignment from incompatible pointer type". These are usually harmless, but it's a good idea to report it on the mailing list so that we can fix it.
Warning "ar: filename GlaIOMonad__1_2s.o truncated to GlaIOMonad_"
Similarly, archiving warning messages like the following are not a problem:
ar: filename GlaIOMonad__1_2s.o truncated to GlaIOMonad_ ar: filename GlaIOMonad__2_2s.o truncated to GlaIOMonad_ ...
GHC's sources go through cpp before being compiled, and cpp varies a bit from one Unix to another. One particular gotcha is macro calls like this:
Some cpps treat the comma inside the string as separating two macro arguments, so you get
:731: macro `SLIT' used with too many (2) args
Alas, cpp doesn't tell you the offending file! Workaround: don't put weird things in string args to cpp macros.
Cabal/Distribution/Compat/FilePath.hs: No such file or directory
You may see this:
Distribution/Compat/FilePath.hs:2: error: Cabal/Distribution/Compat/FilePath.hs: No such file or directory make: *** [depend] Error 1 make: *** [stage1] Error 1
Possible Solution:: Be sure you have run sh darcs-all get to get all necessary packages. Don't forget to run sh boot again after you pull in new packages.
xargs: /usr/bin/ar: terminated by signal 11
You may see this when compiling libraries:
(echo Control/Concurrent_stub.o System/CPUTime_hsc.o System/Time_hsc.o ; /usr/bin/find Control/Applicative_split Control/Arrow_split Control/Concurrent_split Control/Concurrent/Chan_split ...long mess... Text/PrettyPrint/HughesPJ_split Text/Printf_split Text/Read_split Text/Read/Lex_split Text/Show_split Text/Show/Functions_split -name '*.o' -print) | xargs /usr/bin/ar q libHSbase.a /usr/bin/ar: creating libHSbase.a xargs: /usr/bin/ar: terminated by signal 11 make: *** [libHSbase.a] Error 125 make: *** Deleting file `libHSbase.a' make: *** [all] Error 1
What is happening is that the ghc build system is linking thousands and thousands of tiny .o files into libHSbase.a. GNU ar isn't optimised for this use-case and it takes far more memory than it really needs to. So what happens is that ar takes >500Mb of memory and your virtual machine / virtual server probably isn't configured with that much memory and so the linux kernel OOM killer terminates the ar process.
To make this worse, since there are so many .o files, it takes several invocations of ar to link them all. On each invocation ar is building the symbol index (-q is ignored) and this is what takes the most time and memory. It's a good deal quicker to use a custom program (100 lines of Haskell) to build libHSbase.a and then use ranlib just once to build the symbol index.
[Duncan Coutts] I submitted a patch to gnu binutils to make ar take less memory when linking 1000's of files so it now only takes around 100Mb rather than 500Mb when linking libHSbase.a. That patch is included in version 2.17 I think (in other words most systems don't have it yet).
What you can do in the mean time is either configure your virtual machine with more memory or turn off the split-objs feature when you configure ghc. Just add SplitObjs=NO to your mk/build.mk file (which may not exist to start with). (The Gentoo ebuild does this automatically)