Opened 7 years ago

Closed 10 months ago

#3242 closed bug (fixed)

GHCi linker does not correctly locate static libraries under Windows

Reported by: jeffz1
Owned by: Phyx-
Priority: high
Milestone: 7.10.3
Component: GHCi
Version: 7.4.1
Keywords:
Cc: felipe.lessa@…, jwlato@…, hvr, oerjan@…, bjorn.buckwalter@…, vagarenko, wren@…, RyanGlScott
Operating System: Windows
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case: T3242
Blocked By: #3658
Blocking:
Related Tickets: #7097
Differential Rev(s): Phab:D1455
Wiki Page:

Description (last modified by igloo)

On Windows, when attempting to use Hipmunk from ghci, it produces the error:

ghci: can't load .so/.DLL for: m (addDLL: could not load DLL)

To reproduce:

cabal install Hipmunk
ghci
:m + Physics.Hipmunk
initChipmunk

Presumably this has something to do with there being no libm on Windows.

Attachments (1)

hipmunk-mingw.patch (519 bytes) - added by nus 4 years ago.


Change History (48)

comment:1 Changed 7 years ago by igloo

  • difficulty set to Unknown
  • Resolution set to invalid
  • Status changed from new to closed

Thanks for the report.

The hipmunk cabal file unconditionally says

Extra-Libraries: m

so this is a bug in hipmunk, not GHC.

comment:2 Changed 7 years ago by jeffz1

Ah, if it's a bug in hipmunk, do you have any idea of the correct approach?

I tried modifying Hipmunk's .cabal to instead be:

if !os(windows) {
  Extra-Libraries: m
}

After running cabal configure, cabal build, and cabal install, if I try initChipmunk again in ghci, I get:

Loading package syb ... linking ... <interactive>: C:\Program Files\Haskell\Hipmunk-0.2.2\ghc-6.10.2\HSHipmunk-0.2.2.o: unknown symbol `_fmodf' : unable to load package `syb'

So I guess that isn't correct either. GHC works fine with m or without m, it's just ghci that has the issue.

comment:3 Changed 7 years ago by meteficha

  • Resolution invalid deleted
  • Status changed from closed to reopened

Hello, igloo!

I'm Hipmunk's maintainer and it has been almost two months since you've closed this bug saying that this is a bug in Hipmunk but not saying what should be done on Hipmunk's side. I'm reopening it because I too believe that this is a bug on GHCi's side. I'll be happy to "fix" Hipmunk if you say how to do so, though.

Thanks, Felipe.

comment:4 Changed 7 years ago by meteficha

  • Cc felipe.lessa@… added

comment:5 Changed 7 years ago by igloo

  • Description modified (diff)

comment:6 Changed 7 years ago by igloo

  • Resolution set to invalid
  • Status changed from reopened to closed

I'm afraid I don't know how to do this on Windows, but I still don't see a GHC bug here: If you tell GHC that you need a library that doesn't exist then you will get the above error message.

Please reopen this ticket and give more details if you believe that there is a GHC bug causing this problem.

comment:7 Changed 6 years ago by jwlato

  • Resolution invalid deleted
  • Status changed from closed to new
  • Type of failure set to None/Unknown
  • Version changed from 6.10.2 to 7.0.1

I'm reopening this because I'm getting the same behavior with ghc-7.0.1 and the package "ieee-0.7", which also specifies a dependency on "m".

It appears that libm is distributed as part of the MinGW stuff with ghc; I see the file "/e/ghc/ghc-7.0.1/mingw/lib/libm.a". Compiling with ghc passes the flag "-lm" during the linking step and works properly. It's only when using ghci that the library can't be found. If I install the ieee package without the libm dependency programs still build with ghc, but ghci complains about missing symbols.

comment:8 Changed 6 years ago by jwlato

  • Cc jwlato@… added

comment:9 Changed 6 years ago by igloo

  • Milestone set to 7.0.3

comment:10 Changed 5 years ago by fryguybob

After a little investigation I have discovered that on MinGW libm.a is a dummy library to satisfy -lm. The desired code is in libmingwex.a, which gets linked when compiling with ghc. Changing to Extra-Libraries: mingwex does not help, as ghci looks for a DLL to satisfy that. I'm not sure how to get libmingwex.a to load with ghci. If you turn libmingwex.a into a DLL:

ar -x libmingwex.a
gcc -shared *.o -o mingwex.dll

Then everything works.

comment:11 Changed 5 years ago by keloglan2011

  • Status changed from new to patch

fryguybob's solution worked. Thanks fryguybob. :)

(Renaming mingwex.dll to m.dll and putting it in your path solves it without recompiling anything.)

comment:12 Changed 5 years ago by igloo

  • Milestone changed from 7.2.1 to 7.4.1

comment:13 Changed 5 years ago by simonmar

  • Resolution set to fixed
  • Status changed from patch to closed

I think this should work with 7.4.1. Loading .a files in GHCi on Windows now works (it didn't before). Please re-open if you find there's still a bug.

comment:14 Changed 4 years ago by schernichkin

  • Resolution fixed deleted
  • Status changed from closed to new
  • Version changed from 7.0.1 to 7.4.1

The problem still exists on Windows. It is mostly caused by ieee754 and other libraries that refer to m. The compiler works just fine; the problem occurs only with ghci.

comment:15 follow-up: Changed 4 years ago by simonmar

  • Milestone changed from 7.4.1 to 7.4.2

Can you give us instructions to reproduce the problem please?

comment:16 in reply to: ↑ 15 Changed 4 years ago by schernichkin

Replying to simonmar:

Can you give us instructions to reproduce the problem please?

Sure.

  1. Install any library relying on m. Let's take ieee754.
  2. Start GHCi, import Numeric.IEEE.
  3. Type "minNormal :: Float"

You will get the following output:

Loading package ieee754-0.7.3 ... <interactive>: m: The specified module could not be found. can't load .so/.DLL for: m.dll (addDLL: could not load DLL)

I got it on GHCi, version 7.4.1, but I believe the bug exists on more recent versions too.

comment:17 Changed 4 years ago by igloo

  • Milestone changed from 7.4.2 to 7.4.3

Changed 4 years ago by nus

comment:18 Changed 4 years ago by nus

  • Status changed from new to patch

MinGW's libm is a dummy object file to satisfy Unix-like linkers; the actual symbols are provided by the C runtime library (msvcrt.dll), which gets linked in both when compiling and when loading dynamically in GHCi. There's no need to mention libm in 'extra-libraries'. The patch was tested on GHC 7.4.2.

comment:19 Changed 4 years ago by nus

Pardon, the patched Hipmunk library was tested. The test was

import Physics.Hipmunk

main = initChipmunk

which didn't fail.

comment:20 Changed 4 years ago by nus

  • Status changed from patch to new

Turns out that besides msvcrt, mingwex also carries some actual symbols needed for IEEE754 compliance (e.g. 'copysignf'). When we mention libm in 'extra-libraries' (which is not necessary), the dummy libm.a gets linked in, but the symbol references are satisfied by mingwex and msvcrt. In the compiled case the libraries are introduced into the linkage by the built-in gcc specs. In the ghci case the linkage succeeds if the symbol references get satisfied by msvcrt (which is always preloaded by ghci), and fails if the symbols happen to live in mingwex (which ghci knows nothing about). Also see ticket #1883.

comment:21 Changed 4 years ago by nus

The reference to an undefined symbol _copysignf is a result of a foreign import in IEEE.hs:

$ nm ieee754-0.7.3/ghc-7.4.2/HSieee754-0.7.3.o |grep copysignf
         U _copysignf
[..snip...]
ieee754-0.7.3/Numeric/IEEE.hs:
[...snip...]
foreign import ccall unsafe "copysignf"
[...snip...]
$ cabal build
[...snip...]
Building ieee754-0.7.3...
[...snip...]
compile: input file .\Numeric\IEEE.hs
[...snip...]
*** Stg2Stg:
*** CodeGen:
*** CodeOutput:
*** Assembler:
"C:\mnt\data1\ghc32b\lib/../mingw/bin/gcc.exe" "-fno-stack-protector" "-Wl,--hash-size=31" "-Wl,--reduce-memory-overheads" "-I.\Numeric" "-Idist\build" "-Idist\build\autogen" "-Idist\build" "-c" "C:\Users\adm\AppData\Local\Temp\ghc3328_0\ghc3328_0.s" "-o" "dist\build\Numeric\IEEE.o"
[...snip...]
$ grep _copysignf /tmp/ghc3588_0/ghc3588_0.s
        call _copysignf
        call _copysignf
$ nm dist/build/Numeric/IEEE.o |grep copysignf
         U _copysignf
00000078 D _ieee754zm0zi7zi3_NumericziIEEE_czucopysignf_closure
00001038 T _ieee754zm0zi7zi3_NumericziIEEE_czucopysignf_info
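
To illustrate where such an undefined reference comes from, here is a minimal sketch of a foreign import of `copysignf`. The names `c_copysignf` and `copySign` are hypothetical and this is not the actual ieee754 source; the sketch only assumes the C function `copysignf` declared in `math.h`.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CFloat (..))

-- Assumed binding name. The import makes the object file reference the
-- C symbol copysignf, which a linker must resolve when the code is loaded.
foreign import ccall unsafe "math.h copysignf"
  c_copysignf :: CFloat -> CFloat -> CFloat

-- | Hypothetical wrapper: magnitude of the first argument, sign of the second.
copySign :: Float -> Float -> Float
copySign x y = realToFrac (c_copysignf (realToFrac x) (realToFrac y))
```

When this is compiled with ghc, the reference is resolved by the system linker; in GHCi it has to be resolved by GHCi's own linker, which is where the failure discussed in this ticket shows up on Windows.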

comment:22 Changed 4 years ago by nus

comment:23 Changed 4 years ago by igloo

  • Blocked By 3658 added

comment:24 Changed 4 years ago by igloo

  • Milestone changed from 7.4.3 to 7.6.2

comment:25 Changed 2 years ago by thoughtpolice

  • Milestone changed from 7.6.2 to 7.10.1

Moving to 7.10.1.

comment:26 Changed 2 years ago by tejon

  • Cc hvr added

I am attempting to run the test code for helm, which depends on gtk and sdl2. As in the main ticket case, compiling with GHC works fine. I ran into the "GHCi can't find m.dll" issue on the pango dependency, but fixed it using fryguybob's suggestion. Sadly, I run into a similar issue on the sdl2 dependency:

Loading package sdl2-1.1.0 ... <interactive>: mingw32: The specified module could not be found.
can't load .so/.DLL for: mingw32.dll (addDLL: could not load DLL)

Relinking libmingw32.a into a dll was problematic: the .a contains both dllmain.o and a spurious main.o (which I removed), and also a pair of objects crtst.o and tlsmcrt.o which each do nothing but declare the same integer (a threading state flag), leading to a duplicate declaration error; I have tried favoring each, with the same result. Linking the remainder into mingw32.dll, I now get this:

Loading package sdl2-1.1.0 ... <interactive>: Unknown PEi386 section name `.rdata$IID_IBindStatusCallback' (while processing: C:/msys/lib\libSDL2.a)
ghc.exe: panic! (the 'impossible' happened)
  (GHC version 7.8.3 for i386-unknown-mingw32):
        loadArchive "C:/msys/lib\\libSDL2.a": failed

Perhaps mangling the relink broke everything, but I don't believe it should have gotten that far in the first place: like the m.dll issue this seems to arise from GHCi's inability to identify libfoo.a with foo.dll.

comment:27 Changed 2 years ago by oerjan

I suspect I just hit this bug when trying to cabal install random-extras from Haskell Platform 2014.2.0.0 on Windows 8.1. The dependency random-fu uses the logfloat library from Template Haskell while compiling other modules, which gives me the following error message (the Norwegian text is the Windows message "The specified module could not be found."):

[ 7 of 29] Compiling Data.Random.Distribution.Uniform ( src\Data\Random\Distribution\Uniform.hs, dist\build\Data\Random\Distribution\Uniform.o )
Loading package ghc-prim ... linking ... done.
[snipped long package list]
Loading package logfloat-0.12.1 ... ghc.exe: m: Den angitte modulen ble ikke funnet.
<command line>: can't load .so/.DLL for: m.dll (addDLL: could not load DLL)
Failed to install random-fu-0.2.6.1
cabal: Error: some packages failed to install:
random-extras-0.19 depends on random-fu-0.2.6.1 which failed to install.
random-fu-0.2.6.1 failed during the building phase. The exception was:
ExitFailure 1

logfloat.cabal contains precisely such an Extra-Libraries: m line that is used in its default configuration. And trying to use logfloat from (Win)GHCi gives the same kind of error message.

I am able to get around this for logfloat by giving --flags -useFFI, but I still think it's a shame if this bug prevents a straightforward cabal install from working.
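
A minimal sketch of why a plain compile can run into a GHCi linker error: a Template Haskell splice is evaluated at compile time, so GHC has to load the code the splice calls with its runtime linker. The example uses only base and template-haskell; `answer` is a hypothetical name and nothing here is taken from random-fu or logfloat.

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH (integerL, litE)

-- The expression inside $( ... ) is run by the compiler itself, so every
-- package it calls into must be loadable while compiling.
answer :: Integer
answer = $(litE (integerL (product [1 .. 5])))
```

If the splice called into a package whose extra libraries cannot be loaded, the build would stop during compilation, much like the log above.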

comment:28 Changed 2 years ago by oerjan

  • Cc oerjan@… added

comment:29 Changed 2 years ago by dkwright

Happening for me on Windows 8.1 64 bit with 64 bit GHC 2014.2.0.0 as well. Loading this code (stripped down from something more practical), example.hs, in ghci and typing main, gives the error:

module Main where
 
import Criterion.Main

main :: IO()
main = putStrLn "example"

Without the import of Criterion.Main, no error. It looks like it's a dependency on ieee754 from Criterion.Main that does it.

Error:

SNIP
Loading package ansi-wl-pprint-0.6.7.1 ... linking ... done.
Loading package blaze-builder-0.3.3.4 ... linking ... done.
Loading package cassava-0.4.2.0 ... linking ... done.
Loading package MonadRandom-0.3 ... linking ... done.
Loading package exceptions-0.6.1 ... linking ... done.
Loading package nats-0.2 ... linking ... done.
Loading package semigroups-0.15.3 ... linking ... done.
Loading package transformers-compat-0.3.3.4 ... linking ... done.
Loading package void-0.6.1 ... linking ... done.
Loading package contravariant-1.2 ... linking ... done.
Loading package tagged-0.7.2 ... linking ... done.
Loading package distributive-0.4.4 ... linking ... done.
Loading package comonad-4.2.2 ... linking ... done.
Loading package semigroupoids-4.2 ... linking ... done.
Loading package bifunctors-4.1.1.1 ... linking ... done.
Loading package prelude-extras-0.4 ... linking ... done.
Loading package profunctors-4.2.0.1 ... linking ... done.
Loading package free-4.9 ... linking ... done.
Loading package transformers-base-0.4.3 ... linking ... done.
Loading package monad-control-0.3.3.0 ... linking ... done.
Loading package either-4.3.1 ... linking ... done.
Loading package ieee754-0.7.3 ... <interactive>: m: The specified module could not be found.
can't load .so/.DLL for: m.dll (addDLL: could not load DLL)
*Main>

comment:30 Changed 23 months ago by bjornbm

  • Cc bjorn.buckwalter@… added

comment:31 Changed 23 months ago by dreixel

I hit this problem too with criterion on Windows. criterion depends on hastache, which depends on ieee754 without any version bounds. ieee754-0.7.4 is good, ieee754-0.7.3 isn't. So I fixed it with the following sequence of commands:

ghc-pkg unregister criterion-1.0.1.0
ghc-pkg unregister hastache-0.6.0
ghc-pkg unregister ieee754-0.7.3
cabal install criterion --constraint="ieee754-0.7.4"

Hopefully others can do something similar to this. Also, package authors should update their packages to require ieee754 >= 0.7.4.

comment:32 Changed 21 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:33 Changed 12 months ago by vagarenko

  • Cc vagarenko added

comment:34 Changed 12 months ago by thoughtpolice

  • Milestone changed from 7.12.1 to 8.0.1

Milestone renamed

comment:35 Changed 12 months ago by WrenThornton

  • Cc wrengr added
  • Priority changed from normal to high

@oerjan

Yes, I've long known about and documented this problem in the INSTALL/README file for the logfloat package. It's always been there in the repo and tarball, but hopefully is more visible now that Hackage has links to readme files. There used to be another ticket for this problem, though I lost the ticket number over the years so the readme only gives a short description of the problem (I'll add this ticket number to the readme once I get home). The long description of the situation so far as I'm aware/concerned is:

For *compiled* code, everything works perfectly fine, without the need for Cygwin or MinGW/MSYS or any of that. The problem is specifically with GHCi. The C functions that logfloat needs live in what POSIX calls "libm", thus we have "Extra-Libraries: m" in the cabal file. However, as nus mentions, MinGW's so-called "libm" file (which ships with GHC) is just a dummy library with no actual content; therefore the cabal file is useless here. The actual symbols/C code live in the libmingwex.a library (which also ships with GHC). When compiling, libmingwex.a is picked up automagically and so everything works; whereas when using the interpreter, libmingwex.a is ignored and the loading error is generated.

This bug has been a thorn in my side for years. I don't use Windows, but I have many users who do and who have to work around the problem on a daily basis. The only feasible workaround at present is to compile logfloat without FFI support (thereby eliminating most of the precision which is the entire point of the library) in order to do any sort of interactive debugging or development, and then hopefully remembering to recompile logfloat and everything that depends on it (to get precision) before shipping. I really wish we could just get this fixed already so I don't have to explain the sorry state of things to yet another machine learning / natural language processing person just trying out Haskell. This really makes a terrible impression on folks :(

comment:36 Changed 12 months ago by WrenThornton

  • Cc winterkoninkje+ghc@… added; wrengr removed

comment:37 Changed 12 months ago by WrenThornton

  • Cc wren@… added; winterkoninkje+ghc@… removed

comment:38 Changed 11 months ago by Phyx-

  • Owner set to Phyx-

comment:39 Changed 10 months ago by RyanGlScott

  • Cc RyanGlScott added

comment:40 follow-up: Changed 10 months ago by Phyx-

It seems the issue is caused by findArchive being unable to find any archives that are shipped using the in-place GCC.

  • It works on Linux because findArchive would search the standard Linux include path.
  • It works during compilation because GCC can find its own libraries (we explicitly tell it where to look for libraries using the gcc wrapper around realgcc)

So fixing the issue means using searchForLibUsingGcc in findArchive as well, which will then find the correct file.

The error takes its current form because, if we can't locate the library using any of the methods we have, we assume it is a system DLL or something on the system search path, e.g. when trying to load kernel32.dll.

There is a slight issue in that the GHCi code (incorrectly) favors static archives over dynamic ones:

findDll        `orElse` 
findArchive    `orElse` 
tryGcc         `orElse` 
tryGccPrefixed `orElse` 
assumeDll

This has the unwanted effect that, when kernel32 is specified as a lib, it will try to load kernel32.a instead of kernel32.dll.

To solve this I have added another search function that is able to search the Windows search paths using SearchPath in order to find if it is a dll on the system search path.

The new search order is:

findDll     `orElse` 
findSysDll  `orElse` 
tryGcc      `orElse` 
findArchive `orElse` 
assumeDll

(tryGccPrefixed was rolled into tryGcc so it is no longer needed at top level)

This seems to be working well. It depends on another patch of mine, so I have to push that one through before this one, but a patch is coming soon.
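
A minimal sketch of how a fallback chain like the new search order can be expressed, assuming each strategy has type `IO (Maybe LibLocation)`. The strategy names mirror the list above, but `LibLocation`, `probe`, `locateLib`, and the simulated file lists are hypothetical stand-ins, not GHC's actual linker code. In particular, `tryGcc` here only looks in a list of files rather than asking GCC.

```haskell
import Data.List (find)

-- | Simplified, hypothetical result type for a library search.
data LibLocation
  = DLL FilePath       -- a DLL that was actually found
  | Archive FilePath   -- a static archive (.a)
  | AssumedDLL String  -- nothing found: hand the name to the OS loader
  deriving (Eq, Show)

type Search = IO (Maybe LibLocation)

-- | Run the second search only if the first one found nothing.
orElse :: Search -> Search -> Search
orElse first second = first >>= maybe second (return . Just)

-- | Stand-in for a real file-system probe: look the name up in a file list.
probe :: (FilePath -> LibLocation) -> [FilePath] -> FilePath -> Search
probe wrap files name = return (wrap <$> find (== name) files)

-- | The new search order over three simulated locations: user library
-- directories, the Windows system path, and GCC's library directories.
locateLib :: [FilePath] -> [FilePath] -> [FilePath] -> String -> Search
locateLib userFiles systemFiles gccFiles lib =
  findDll     `orElse`
  findSysDll  `orElse`
  tryGcc      `orElse`
  findArchive `orElse`
  assumeDll
  where
    dllName     = lib ++ ".dll"
    archiveName = "lib" ++ lib ++ ".a"
    findDll     = probe DLL userFiles dllName
    findSysDll  = probe DLL systemFiles dllName
    tryGcc      = probe Archive gccFiles archiveName
    findArchive = probe Archive userFiles archiveName
    assumeDll   = return (Just (AssumedDLL lib))
```

With these stand-ins, `locateLib [] ["kernel32.dll"] ["libkernel32.a"] "kernel32"` yields `Just (DLL "kernel32.dll")`, because the system DLL is tried before any archive.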

comment:41 in reply to: ↑ 40 Changed 10 months ago by Kludgy

Great insight! I'm really looking forward to this patch. I presume findSysDll is the new function?


comment:42 Changed 10 months ago by Phyx-

Yup, findSysDll on Windows will try to see if assumeDll would be loading anything. On other platforms it is just a no-op.
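
A rough sketch of that platform split, assuming the system-path lookup is passed in as a function. The signature and the `searchSystemPath` parameter are hypothetical; the real patch uses the Windows `SearchPath` call, which is not reproduced here.

```haskell
import System.Info (os)

-- | Hypothetical shape of findSysDll: the first argument stands in for a
-- lookup on the Windows system search path.
findSysDll :: (FilePath -> IO (Maybe FilePath)) -> String -> IO (Maybe FilePath)
findSysDll searchSystemPath lib
  | os == "mingw32" = searchSystemPath (lib ++ ".dll")  -- Windows builds of GHC
  | otherwise       = return Nothing                    -- no-op elsewhere
```

Returning `Nothing` on other platforms lets the next strategy in the chain run unchanged.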

comment:43 Changed 10 months ago by Phyx-

  • Summary changed from ghci: can't load .so/.DLL for: m (addDLL: could not load DLL) to GHCi linker does not correctly locate static libraries under Windows

comment:44 Changed 10 months ago by Phyx-

  • Architecture changed from x86 to Unknown/Multiple
  • Differential Rev(s) set to Phab:D1455
  • Status changed from new to patch
  • Test Case set to T3242

comment:45 Changed 10 months ago by lukexi

  • Milestone changed from 8.0.1 to 7.10.3

We should get this into 7.10.3 if possible, as it's another showstopper for linking certain C/C++ libraries on Windows in the presence of Template Haskell (see #10726).

comment:46 Changed 10 months ago by Ben Gamari <ben@…>

In acce37f/ghc:

Fix archive loading on Windows by the runtime loader

The runtime loader is unable to find archive files `.a` shipping
with the inplace `GCC`.

It seems the issue is caused by `findArchive` being unable to
find any archives that are shipped using the in-place `GCC`.

- It works on Linux because `findArchive` would search
  the standard Linux include path.
- It works during compilation because `GCC` can find its own libraries
  (we explicitly tell it where to look for libraries using the `gcc`
  wrapper around `realgcc`)

So fixing the issue means using `searchForLibUsingGcc` in `findArchive`
as well, which will then find the correct file.

The reason for the error as it is, is because if we can't locate the
library using any of the methods we have, we assume it is a system dll,
or something on the system search path.  e.g. if trying to load
`kernel32.dll`.

There is a slight issue in that the `GHCi` code (incorrectly) favors
`static archives` over `dynamic` ones

```
findDll        `orElse`
findArchive    `orElse`
tryGcc         `orElse`
tryGccPrefixed `orElse`
assumeDll
```
This has the unwanted effect of when `kernel32` is specified as a lib,
it will try to load `kernel32.a` instead of `kernel32.dll`.

To solve this I have added another search function that is able to
search the Windows search paths using `SearchPath` in order to find if
it is a dll on the system search path.

The new search order is:

```
findDll     `orElse`
findSysDll  `orElse`
tryGcc      `orElse`
findArchive `orElse`
assumeDll
```

(`tryGccPrefixed` was rolled into `tryGcc` so it is no longer needed at
top level)

Test Plan: ./validate added new windows tests T3242

Reviewers: thomie, erikd, hvr, austin, bgamari

Reviewed By: thomie, erikd, bgamari

Differential Revision: https://phabricator.haskell.org/D1455

GHC Trac Issues: #3242

comment:47 Changed 10 months ago by bgamari

  • Resolution set to fixed
  • Status changed from patch to closed