Opened 4 years ago

Last modified 2 months ago

#4012 new bug

Compilation results are not deterministic

Reported by: kili Owned by:
Priority: high Milestone: 7.6.2
Component: Compiler Version: 6.12.2
Keywords: Cc: mail@…, the.dead.shall.rise@…, id@…, shacka@…, mail@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Other Difficulty: Difficult (2-5 days)
Test Case: Blocked By:
Blocking: Related Tickets:

Description (last modified by simonmar)

There are some issues with non-determinism in the output of GHC, which means that compilations are not repeatable. This affects some users (e.g. Debian packagers) who need to be able to get repeatable hashes for the packages of a GHC build.

The cases we know about that lead to non-deterministic results are:

  • The spec_ids (specialised Ids) attached to an Id have a non-deterministic ordering
  • CSE can give different results depending on the order in which the bindings are considered, and since the ordering is non-deterministic, the result of CSE is also non-deterministic. e.g. in x = z; y = z; z = 3, where y and x are exported, we can end up with either x = y; y = 3 or y = x; x = 3.
  • There seems to be something unpredictable about the order of arguments to SpecConstr-generated specialisations, see http://www.haskell.org/pipermail/glasgow-haskell-users/2011-April/020287.html
  • The wrappers generated by the CApiFFI extension have non-deterministic names. (see comment:15 below).
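
The CSE case above can be illustrated with a toy model. This is only a sketch, not GHC's CSE pass: the `Rhs` type and the `cse` function are made up for illustration, and it assumes z has already been inlined so that x and y both have the right-hand side 3.

```haskell
import qualified Data.Map.Strict as Map

-- A binding is a name paired with a right-hand side, which is either a
-- literal or a reference to another binding (toy model, not GHC Core).
data Rhs = Lit Int | Var String
  deriving (Eq, Show)

type Binding = (String, Rhs)

-- Toy CSE: walk the bindings in the given order. The first binding with a
-- given literal keeps it; every later one is rewritten to refer to the first.
cse :: [Binding] -> [Binding]
cse = go Map.empty
  where
    go _ [] = []
    go seen ((name, Lit n) : rest) =
      case Map.lookup n seen of
        Just first -> (name, Var first) : go seen rest
        Nothing    -> (name, Lit n) : go (Map.insert n name seen) rest
    go seen (b : rest) = b : go seen rest

main :: IO ()
main = do
  -- Same two bindings, considered in two different orders.
  print (cse [("x", Lit 3), ("y", Lit 3)])
  print (cse [("y", Lit 3), ("x", Lit 3)])
```

The first call keeps x = 3 and rewrites y to refer to x; the second keeps y = 3 and rewrites x to refer to y. Both results are correct, but they produce different interfaces.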

Old ticket description follows

Short story: if you use ghc-6.12.1.20100318 (or similar, probably
ghc-6.12.1 release will produce the same results) to bootstrap
ghc-6.12, and then use that ghc-6.12 to bootstrap another ghc-6.12,
those two instances of ghc-6.12 will have different ABI hashes and
interfaces in the ghc package. If you use ghc-6.10 for the
bootstrapping, you'll even get differences in the ghc, base and
Cabal packages.

Long story: see logfiles and descriptions at http://darcs.volkswurst.de/boot-tests/ (note that the logfiles are quite large, I really don't want to attach 150 MB of logs to this ticket).

Attachments (8)

zap-cbooterversion_-in-an-attempt-to-fix-_4012.dpatch (3.6 KB) - added by kili 4 years ago.
warp-abi-diff (2.8 KB) - added by nomeata 18 months ago.
Changed ABI only due to alpha renaming
warp-no-exposed-change.patch (3.0 KB) - added by nomeata 18 months ago.
This change changes the ABI of warp-1.2.1.1 unexpectedly
Warp-before.hi (20.5 KB) - added by nomeata 18 months ago.
Interface file before the patch
Warp-after.hi (20.5 KB) - added by nomeata 18 months ago.
Interface file after the patch
Warp.hs (4.0 KB) - added by nomeata 18 months ago.
Attempt to minimize the problem (still needs conduit) (before patch)
Warp.2.hs (4.1 KB) - added by nomeata 18 months ago.
Attempt to minimize the problem (still needs conduit) (after patch)
Test.hs (175 bytes) - added by nomeata 18 months ago.
Small testcase


Change History (53)

comment:1 Changed 4 years ago by kili

This is much fun... after running diff(1) with some sanitizing -I options on two *.ghc.ifaces lists, the most interesting difference was in the interface of the module Panic, in the unfolding: section of showGhcException. Here's the hunk with that difference:

@@ -187032,7 +187029,7 @@
                                 Panic.showGhcException5
                                 (GHC.Base.++
                                    @ GHC.Types.Char
-                                   Config.cProjectVersion
+                                   Config.cBooterVersion
                                    (GHC.Base.++
                                       @ GHC.Types.Char
                                       Panic.showGhcException4

So what happens here seems to be some optimization replacing cProjectVersion by cBooterVersion iff those two strings are equal (i.e. when building 6.12 with 6.12 as the booter).

All other interface and ABI hash changes are just triggered by this change; indeed I could `fix' the problem by removing all occurrences of cBooterVersion. See attached patch (but don't apply, because it's *not* a proper fix but just a workaround).

I'm not sure whether the real bug is this optimization (substitute cBooterVersion for cProjectVersion) or the fact that the values of those strings make it into the ABI hash.

comment:2 follow-up: Changed 4 years ago by simonmar

  • Resolution set to invalid
  • Status changed from new to closed

See Commentary/Compiler/RecompilationAvoidance, in particular I think the point right at the bottom about CSE is probably what's causing the differences you see.

In a sense it's not a bug, in that we know interfaces are not stable. Of course we'd prefer it if interfaces were stable, and moving towards stability has been a goal of mine over the past year or so. I don't think it'll help to have a bug open on this right now, though.

comment:3 Changed 4 years ago by simonmar

Oops, I forgot: thanks for the report, and you were right to be suspicious! Interface stability is a reasonable thing to expect. The fact that you got different results when changing the booting compiler is really a red herring: you can get different results just by saying 'make clean' and recompiling.

comment:4 in reply to: ↑ 2 ; follow-up: Changed 4 years ago by kili

Replying to simonmar:

In a sense it's not a bug, in that we know interfaces are not stable. Of course we'd prefer it if interfaces were stable, and moving towards stability has been a goal of mine over the past year or so. I don't think it'll help to have a bug open on this right now, though.

But this means if two people build packages, where one used a different bootstrapper for ghc than the other, they can't use each other's packages. I'd consider that a bug. (Actually, the OpenBSD port has a knob for using an already installed ghc-6.12.2 for bootstrapping instead of the prebuilt bindist I provide, but I'd better remove that knob for now.)

comment:5 in reply to: ↑ 4 ; follow-up: Changed 4 years ago by simonmar

  • Component changed from Build System to Compiler
  • Difficulty set to Difficult (2-5 days)
  • Operating System changed from OpenBSD to Unknown/Multiple
  • Resolution invalid deleted
  • Status changed from closed to new
  • Summary changed from Different stage2 results depending on the version of the bootstrapping compiler to Compilation results are not deterministic

Replying to kili:

But this means if two people build packages, where one used a different bootstrapper for ghc than the other, they can't use each other's packages. I'd consider that a bug.

The bootstrapping GHC is a red herring, as I mentioned - the fact is that compilation results aren't deterministic. They never have been, in fact.

Even if compilation results were deterministic, under what circumstances would you want to have two systems "use each others packages", when they're both using a GHC independently built from source? If they independently build GHC from source, why wouldn't they build packages from source too?

Note that packages are registered with an MD5 hash of the interface, so the package system will spot if you try to register an incompatible package.

Anyway, I do agree that having deterministic compilation results is a desirable thing, so on second thoughts I've re-opened the bug and retitled it.

comment:6 in reply to: ↑ 5 ; follow-ups: Changed 4 years ago by kili

Replying to simonmar:

The bootstrapping GHC is a red herring, as I mentioned - the fact is that compilation results aren't deterministic. They never have been, in fact.

And it's good that the problems are detected now with the hashes.

Even if compilation results were deterministic, under what circumstances would you want to have two systems "use each others packages", when they're both using a GHC independently built from source? If they independently build GHC from source, why wouldn't they build packages from source too?

For example, if two persons are working on a ghc package for their operating system, and also on updates for all the stuff that needs ghc (xmonad, alex, happy, monadius, to mention the most important ones), it's a pity if they're not able to exchange packages.

You mentioned that even a `make clean; make' may change the interfaces. Indeed, I remember at least one case where ghc segfaulted during the build (this doesn't happen often, only every 20th or 30th build) and I just restarted the build -- and got interface changes.

Note that packages are registered with an MD5 hash of the interface, so the package system will spot if you try to register an incompatible package.

And that's good, but it's just a workaround, IMHO.

Anyway, I do agree that having deterministic compilation results is a desirable thing, so on second thoughts I've re-opened the bug and retitled it.

Thanks. But is this really *one* Bug? There are the problems with non-determinisms, but I think that the CSE on inlined constants (like cBooterVersion and cProjectVersion) is a separate problem.

comment:7 in reply to: ↑ 6 Changed 4 years ago by igloo

Replying to kili:

Thanks. But is this really *one* Bug? There are the problems with non-determinisms, but I think that the CSE on inlined constants (like cBooterVersion and cProjectVersion) is a separate problem.

I don't think that's a bug at all. You are compiling different source code, so it's entirely reasonable that the ABI should be different.

If the ABI was otherwise stable then it might be worth trying to work around cBooterVersion's value affecting the ABI, though.

comment:8 in reply to: ↑ 6 Changed 4 years ago by simonmar

Replying to kili:

For example, if two persons are working on a ghc package for their operating system, and also on updates for all the stuff that needs ghc (xmonad, alex, happy, monadius, to mention the most important ones), it's a pity if they're not able to exchange packages.

For this to work I think you'd really need stable ABIs, not just deterministic compilation. If you only had deterministic compilation, it would be hard to guarantee that all the inputs to the compilation were identical between the two systems: there are a lot of ways that differences can accidentally creep in (different optimisation flags, system configuration settings, etc.).

See Commentary/Packages for more on this.

You mentioned that even a `make clean; make' may change the interfaces. Indeed, I remember at least one case where ghc segfaulted during the build (this doesn't happen often, only every 20th or 30th build)

FWIW, we never see random segfaults in GHC here, so I expect that really is a bug specific to your OS or system that needs investigating.

and I just restarted the build -- and got interface changes.

Yes - unfortunate, but not unexpected.

Anyway, I do agree that having deterministic compilation results is a desirable thing, so on second thoughts I've re-opened the bug and retitled it.

Thanks. But is this really *one* Bug? There are the problems with non-determinisms, but I think that the CSE on inlined constants (like cBooterVersion and cProjectVersion) is a separate problem.

Ian pointed out that you really have source differences here, which I hadn't spotted. (Actually, I'm not sure why cBooterVersion isn't set to the version of the stage1 compiler when building stage2, it doesn't seem useful to record the stage0 version in the stage2 compiler.)

So the underlying problem leading to the CSE issue is that the compiler doesn't use a deterministic ordering for bindings internally. It uses the Unique ordering, which is semi-random. The results of CSE may depend on the order of bindings, but it's entirely possible that there are other non-deterministic consequences of the same kind elsewhere. Making the order deterministic would fix all of them.
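
As a rough sketch of why the ordering matters (a toy model, not GHC's Unique machinery; the `Binder` type and the unique values are made up), compare ordering by unique with ordering by a stable key such as the name:

```haskell
import Data.List (sortOn)

-- Toy model: each binder carries a compiler-assigned unique and a source name.
-- The uniques here are invented; in a real compiler they depend on
-- allocation order and can differ from one run to the next.
data Binder = Binder { binderUnique :: Int, binderName :: String }
  deriving (Eq, Show)

-- Ordering by unique: the result changes when the uniques change.
byUnique :: [Binder] -> [String]
byUnique = map binderName . sortOn binderUnique

-- Ordering by a stable key (here the name): independent of the uniques.
byName :: [Binder] -> [String]
byName = map binderName . sortOn binderName

main :: IO ()
main = do
  let run1 = [Binder 17 "x", Binder 42 "y"]
      run2 = [Binder 42 "x", Binder 17 "y"]  -- same program, different uniques
  print (byUnique run1, byUnique run2)
  print (byName run1, byName run2)
```

With the unique ordering the two runs disagree on which binding comes first; with the name ordering they agree. Any pass whose output depends on traversal order inherits that difference.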

comment:9 Changed 4 years ago by igloo

  • Milestone set to 6.14.1

comment:10 Changed 3 years ago by igloo

  • Milestone changed from 7.0.1 to 7.0.2

comment:11 Changed 3 years ago by igloo

  • Milestone changed from 7.0.2 to 7.2.1

comment:12 Changed 3 years ago by nomeata

  • Cc mail@… added

Just a minor comment: This is hurting distributions quite a bit. Is there a chance to at least avoid this particular problem (cBooterVersion vs cProjectVersion) in the next ghc release?

Thanks,
Joachim

comment:13 Changed 3 years ago by igloo

  • Milestone changed from 7.2.1 to 7.4.1

comment:14 Changed 2 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1
  • Priority changed from normal to low

comment:15 Changed 2 years ago by nomeata

This has bitten us again with 7.4.1. I forgot about this issue and we built lots of libraries against the first upload of GHC 7.4.1 (built with 7.0.4), all of which will have to be rebuilt after the next minor GHC upload, because that will be built with 7.4.1 and the base ABI will likely change again.

This is the interface diff causing the ABI change in base, in case anybody wonders:

/usr/lib/ghc/base-4.5.0.0/System/Posix/Internals.hi
--- /dev/fd/63  2012-02-10 20:33:16.717639938 +0000
+++ /dev/fd/62  2012-02-10 20:33:16.717639938 +0000
@@ -5,11 +5,11 @@
 Way: Wanted [],
      got    []
 interface base:System.Posix.Internals 7041
-  interface hash: fef49c410428b674b72ebd8b1a93bd62
-  ABI hash: 33b2adf3f92d97c87fbcbd52d7f22781
+  interface hash: 13159d537315369ddfe00efa59167f8a
+  ABI hash: fe56115a605d2758561d089b972bb8bb
   export-list hash: 83b224804aef34838bb7767a01e8aaa7
   orphan hash: 693e9af84d3dfcc71e640e005bdc5e2e
-  flag hash: 28bf87c2d7df91e45dad874fc0a5931f
+  flag hash: 865402a98b08183763ca20b5e9837ae0
   used TH splices: False
   where
 exports:
@@ -241,7 +241,7 @@
 import  -/  integer-gmp:GHC.Integer.Type 254721fe3c053c778036ed1a1fa4248d
 addDependentFile "libraries/base/include/HsBaseConfig.h"
 addDependentFile "libraries/base/dist-install/build/autogen/cabal_macros.h"
-3e0a4be3b609e3ac62226d26fd8dbfa2
+9e62726eeeec5155bee7f03712b80c79
   $wa :: GHC.Prim.Int#
          -> GHC.Prim.State# GHC.Prim.RealWorld
          -> (# GHC.Prim.State# GHC.Prim.RealWorld, GHC.IO.IOMode.IOMode #)
@@ -258,7 +258,7 @@
                           System.Posix.Internals.fdGetMode3
                           System.Posix.Internals.fdGetMode2
                           (\ ds2 :: GHC.Prim.State# GHC.Prim.RealWorld ->
-                           case {__pkg_ccall base ghc_wrapper_d2ju_fcntl GHC.Prim.Int#
+                           case {__pkg_ccall base ghc_wrapper_d2jn_fcntl GHC.Prim.Int#
                                                                          -> GHC.Prim.Int#
                                                                          -> GHC.Prim.State#
                                                                                 GHC.Prim.RealWorld
@@ -386,7 +386,7 @@
                                               0
                                               -> System.Posix.Internals.fileType2
                                                    w } } } } } } } } } } } }) -}
-178f30c6a50e12552309d6f35e96cfa3
+1e8eed47078d0056cc7209ae4be4cd83
   $wa2 :: GHC.Prim.Int#
           -> GHC.Prim.State# GHC.Prim.RealWorld
           -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
@@ -404,7 +404,7 @@
                                                                         GHC.Prim.RealWorld,
                                                                     GHC.Prim.Int# #)}
                           GHC.Prim.realWorld# of wild1 { (#,#) ds2 ds3 ->
-                   case {__pkg_ccall base ghc_wrapper_d2ji_fcntl GHC.Prim.Int#
+                   case {__pkg_ccall base ghc_wrapper_d2jb_fcntl GHC.Prim.Int#
                                                                  -> GHC.Prim.Int#
                                                                  -> GHC.Prim.Int#
                                                                  -> GHC.Prim.State#
@@ -440,7 +440,7 @@
                              (GHC.Types.NTCo:IO (Refl Foreign.C.Types.CInt))
                                ds6 of wild5 { (#,#) new_s1 a76 ->
                         (# new_s1, GHC.Tuple.() #) } } } } } }) -}
-20c72f17c3f40d20ed52caaaa7a2c116
+ce1394a8e53df017e302d2cc305d0231
   $wa3 :: GHC.Prim.Int#
           -> GHC.Types.Bool
           -> GHC.Prim.State# GHC.Prim.RealWorld
@@ -459,7 +459,7 @@
                           System.Posix.Internals.fdGetMode3
                           System.Posix.Internals.setNonBlockingFD3
                           (\ ds2 :: GHC.Prim.State# GHC.Prim.RealWorld ->
-                           case {__pkg_ccall base ghc_wrapper_d2ju_fcntl GHC.Prim.Int#
+                           case {__pkg_ccall base ghc_wrapper_d2jn_fcntl GHC.Prim.Int#
                                                                          -> GHC.Prim.Int#
                                                                          -> GHC.Prim.State#
                                                                                 GHC.Prim.RealWorld
@@ -494,7 +494,7 @@
                                                                                GHC.Prim.RealWorld,
                                                                            GHC.Prim.Int# #)}
                                     GHC.Prim.realWorld# of wild6 { (#,#) ds2 ds3 ->
-                             case {__pkg_ccall base ghc_wrapper_d2ji_fcntl GHC.Prim.Int#
+                             case {__pkg_ccall base ghc_wrapper_d2jb_fcntl GHC.Prim.Int#
                                                                            -> GHC.Prim.Int#
                                                                            -> GHC.Prim.Int#
                                                                            -> GHC.Prim.State#
@@ -530,7 +530,7 @@
                                                                                GHC.Prim.RealWorld,
                                                                            GHC.Prim.Int# #)}
                                     GHC.Prim.realWorld# of wild6 { (#,#) ds4 ds5 ->
-                             case {__pkg_ccall base ghc_wrapper_d2ji_fcntl GHC.Prim.Int#
+                             case {__pkg_ccall base ghc_wrapper_d2jb_fcntl GHC.Prim.Int#
                                                                            -> GHC.Prim.Int#
                                                                            -> GHC.Prim.Int#
                                                                            -> GHC.Prim.State#
@@ -751,7 +751,7 @@
                        ((->)
                             (Sym (Foreign.C.Types.NTCo:CInt))
                             (GHC.Types.IO (Sym (Foreign.C.Types.NTCo:CInt))))) -}
-285fd75770912c1d8f7839e3a5f7913c
+5f043b97e154508a95f4d6814442f737
   c_fcntl_lock :: Foreign.C.Types.CInt
                   -> Foreign.C.Types.CInt
                   -> GHC.Ptr.Ptr System.Posix.Internals.CFLock
@@ -765,7 +765,7 @@
                    case ds1 of ds5 { GHC.Int.I32# ds6 ->
                    case ds2 of ds7 { GHC.Ptr.Ptr ds8 ->
                    (\ ds9 :: GHC.Prim.State# GHC.Prim.RealWorld ->
-                    case {__pkg_ccall base ghc_wrapper_d2j3_fcntl GHC.Prim.Int#
+                    case {__pkg_ccall base ghc_wrapper_d2iW_fcntl GHC.Prim.Int#
                                                                   -> GHC.Prim.Int#
                                                                   -> GHC.Prim.Addr#
                                                                   -> GHC.Prim.State#
@@ -788,7 +788,7 @@
                             ((->)
                                  (Refl (GHC.Ptr.Ptr System.Posix.Internals.CFLock))
                                  (GHC.Types.IO (Sym (Foreign.C.Types.NTCo:CInt)))))) -}
-344994480e335140ee30d5da0c864863
+f98068efddbcdfb5e3350b8ad3fa7be6
   c_fcntl_read :: Foreign.C.Types.CInt
                   -> Foreign.C.Types.CInt
                   -> GHC.Types.IO Foreign.C.Types.CInt
@@ -798,7 +798,7 @@
                    case ds of ds2 { GHC.Int.I32# ds3 ->
                    case ds1 of ds4 { GHC.Int.I32# ds5 ->
                    (\ ds6 :: GHC.Prim.State# GHC.Prim.RealWorld ->
-                    case {__pkg_ccall base ghc_wrapper_d2ju_fcntl GHC.Prim.Int#
+                    case {__pkg_ccall base ghc_wrapper_d2jn_fcntl GHC.Prim.Int#
                                                                   -> GHC.Prim.Int#
                                                                   -> GHC.Prim.State#
                                                                          GHC.Prim.RealWorld
@@ -817,7 +817,7 @@
                        ((->)
                             (Sym (Foreign.C.Types.NTCo:CInt))
                             (GHC.Types.IO (Sym (Foreign.C.Types.NTCo:CInt))))) -}
-6387e106fea1f6634ffe8fe759bd7748
+c8ec806642726c5ab7f4c951173d88ac
   c_fcntl_write :: Foreign.C.Types.CInt
                    -> Foreign.C.Types.CInt
                    -> Foreign.C.Types.CLong
@@ -829,7 +829,7 @@
                    case ds1 of ds5 { GHC.Int.I32# ds6 ->
                    case ds2 of ds7 { GHC.Int.I64# ds8 ->
                    (\ ds9 :: GHC.Prim.State# GHC.Prim.RealWorld ->
-                    case {__pkg_ccall base ghc_wrapper_d2ji_fcntl GHC.Prim.Int#
+                    case {__pkg_ccall base ghc_wrapper_d2jb_fcntl GHC.Prim.Int#
                                                                   -> GHC.Prim.Int#
                                                                   -> GHC.Prim.Int#
                                                                   -> GHC.Prim.State#
@@ -1584,7 +1584,7 @@
                   ((->)
                        (Refl (GHC.Ptr.Ptr Foreign.C.Types.CChar))
                        (GHC.Types.IO (Sym (Foreign.C.Types.NTCo:CInt)))) -}
-b6e75b90c0bc4bc5d2e2ae1b63b9846f
+236fe7b0bd958dd442a1f699d1345485
   c_utime :: Foreign.C.String.CString
              -> GHC.Ptr.Ptr System.Posix.Internals.CUtimbuf
              -> GHC.Types.IO Foreign.C.Types.CInt
@@ -1595,7 +1595,7 @@
                    case ds of ds2 { GHC.Ptr.Ptr ds3 ->
                    case ds1 of ds4 { GHC.Ptr.Ptr ds5 ->
                    (\ ds6 :: GHC.Prim.State# GHC.Prim.RealWorld ->
-                    case {__pkg_ccall base ghc_wrapper_d2hj_utime GHC.Prim.Addr#
+                    case {__pkg_ccall base ghc_wrapper_d2hc_utime GHC.Prim.Addr#
                                                                   -> GHC.Prim.Addr#
                                                                   -> GHC.Prim.State#
                                                                          GHC.Prim.RealWorld
@@ -1931,7 +1931,7 @@
                         (Foreign.C.Types.NTCo:CInt) of wild { GHC.Int.I32# a76 ->
                    case a76 of wild1 {
                      DEFAULT -> GHC.Types.False (-1) -> GHC.Types.True } }) -}
/usr/lib/ghc/base-4.5.0.0/Control/Concurrent.hi
--- /dev/fd/63  2012-02-10 20:33:24.365677867 +0000
+++ /dev/fd/62  2012-02-10 20:33:24.365677867 +0000
@@ -5,11 +5,11 @@
 Way: Wanted [],
      got    []
 interface base:Control.Concurrent 7041
-  interface hash: 859a60ba0f58963d0b32951e150a9384
-  ABI hash: ae249f19a927403afe6459ced1ed82bd
+  interface hash: bbbd6db6b81349eeb98d6de1d9533e0e
+  ABI hash: a2da1fbff0edb4b1499650aa0180adeb
   export-list hash: 43716dbfa877e75f4e9112114a6a1572
   orphan hash: 693e9af84d3dfcc71e640e005bdc5e2e
-  flag hash: 44b3db9df460d6cd34b4094fa836f3ef
+  flag hash: 9d38727ed6a9e7ba8d387df844cc1d1e
   used TH splices: False
   where
 exports:
@@ -174,7 +174,7 @@
   otherwise 82850e7a148d17e005ebe89173cad100
 import  -/  GHC.Conc 62a874d9e31bc9eb426ba181adbd36a1
   exports: 7f1de5b156bcfc21b036b8be089d0862
-import  -/  GHC.Conc.IO 9c0a53bd8477f330b48bd95ab54465be
+import  -/  GHC.Conc.IO a46105f75fe740b6d84dba2a24195977
   threadDelay 3f2ea80f962aa726a69f0a26c2cd72e8
   threadWaitRead ee2b193ffe18e7cc4e8ed3b078e64f0c
   threadWaitWrite 892414941939665c2ba67b0eb12ffb9b
@@ -238,6 +238,16 @@
 import  -/  ghc-prim:GHC.Classes 9f526208b19b2511259880485f6b7413
 import  -/  ghc-prim:GHC.Types d58bb266a5f6fd38ade7006bcfc6ede5
 addDependentFile "libraries/base/dist-install/build/autogen/cabal_macros.h"
+0909aa1928600224faf9e000c5f6b0e0
+  $fforkOS_entry_a1T0 :: GHC.Stable.StablePtr (GHC.Types.IO ())
+                         -> GHC.Types.IO ()
+    {- Arity: 2, HasNoCafRefs, Strictness: U(L)L,
+       Unfolding: InlineRule (0, True, True)
+                  Control.Concurrent.$fforkOS_entry_a1T1
+                    `cast`
+                  ((->)
+                       (Refl (GHC.Stable.StablePtr (GHC.Types.IO ())))
+                       (Sym (GHC.Types.NTCo:IO (Refl ())))) -}
 3ec1a60cf71f47d982cf274d264b867d
   $fforkOS_entry_a1T1 :: GHC.Stable.StablePtr (GHC.Types.IO ())
                          -> GHC.Prim.State# GHC.Prim.RealWorld
@@ -252,16 +262,6 @@
                           sp
                           eta of wild1 { (#,#) new_s a ->
                    a `cast` (GHC.Types.NTCo:IO (Refl ())) new_s } }) -}
-cf9f1b73372486eb748c4c2a9120bf5b
-  $fforkOS_entry_a1T7 :: GHC.Stable.StablePtr (GHC.Types.IO ())
-                         -> GHC.Types.IO ()
-    {- Arity: 2, HasNoCafRefs, Strictness: U(L)L,
-       Unfolding: InlineRule (0, True, True)
-                  Control.Concurrent.$fforkOS_entry_a1T1
-                    `cast`
-                  ((->)
-                       (Refl (GHC.Stable.StablePtr (GHC.Types.IO ())))
-                       (Sym (GHC.Types.NTCo:IO (Refl ())))) -}
 fcb1cef8d2a42a26479f83226cc08633
   type Buffer a
       = (GHC.MVar.MVar (GHC.MVar.MVar [a]), Control.Concurrent.QSem.QSem)

And similarly in /usr/lib/ghc/base-4.5.0.0/GHC/IO/Handle/Internals.hi:

@@ -2181,7 +2181,7 @@
     {- Arity: 3, HasNoCafRefs, Strictness: SU(LLLLLL)L,
        Inline: INLINE[0],
        Unfolding: Worker(ext0: GHC.IO.Handle.Internals.$wa3 (arity 3) -}
-"SC:a_s31q0" [ALWAYS] forall @ dev
+"SC:a_s31j0" [ALWAYS] forall @ dev
                              @ enc_state
                              @ dec_state
                              sc :: GHC.IO.Device.IODevice dev

Maybe the hashes over the inlining expressions could be made independent of the names, by applying some normalization before hashing the expression?
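
A minimal sketch of what such a normalization could look like, on a toy lambda calculus rather than on GHC's interface expressions. The `Expr` type and `normalise` are hypothetical, and the sketch assumes free variables never use the `v<number>` naming scheme chosen for bound variables.

```haskell
import qualified Data.Map.Strict as Map

-- Toy expression type (not GHC Core).
data Expr
  = V String
  | App Expr Expr
  | Lam String Expr
  deriving (Eq, Show)

-- Rename every bound variable to v0, v1, ... according to how many binders
-- enclose it. Free variables are left alone. Alpha-equivalent terms end up
-- syntactically equal, so hashing the result ignores the original names.
normalise :: Expr -> Expr
normalise = go Map.empty 0
  where
    go :: Map.Map String String -> Int -> Expr -> Expr
    go env _ (V x)     = V (Map.findWithDefault x x env)
    go env d (App f a) = App (go env d f) (go env d a)
    go env d (Lam x b) =
      let x' = "v" ++ show d
      in Lam x' (go (Map.insert x x' env) (d + 1) b)

main :: IO ()
main = do
  let before = Lam "lvl2" (Lam "z" (App (App (V "append") (V "z")) (V "lvl2")))
      after  = Lam "lvl3" (Lam "z" (App (App (V "append") (V "z")) (V "lvl3")))
  print (before == after)
  print (normalise before == normalise after)
```

The two terms differ only in the name of a bound variable, so they compare unequal as written but equal after normalisation.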

comment:16 Changed 2 years ago by simonmar

  • Description modified (diff)
  • Priority changed from low to high

I've updated the description following the latest evidence. Let's try to do something about this for the next major release.

comment:17 Changed 22 months ago by nomeata

Let me illustrate one consequence of this problem. In Debian, the freeze for wheezy is near. We have just managed to migrate ghc-7.4.1 to wheezy, but there are a few remaining bugs of various severity (#5991, #6156) and having GHCi on ARM would be nice as well. Alas, we cannot upgrade to ghc-7.4.2 which provides all this, as it would require rebuilding everything, which is not possible at this stage any more (http://lists.debian.org/debian-haskell/2012/06/msg00038.html).

What we will probably do is to backport fixes from 7.4.2 onto 7.4.1, but this is not nice either, as it might introduce new bugs and makes our ghc deviate more and more from upstream.

Ideally, if 7.4.2 does not change anything about the actual ABI, it would not force it to change completely. Currently, it does, by encoding the version number in every hash (#5328; granted, I filed that, but that does not mean it's the best solution :-)). It should do something like haddock: keep an internal counter, separate from the version number, that is increased only if everything really needs to be rebuilt, e.g. changes in the .hi file format, in the calling convention etc. (← identifying this list is probably non-trivial). And if a new compiler version is released that does not do any of these, the number stays the same and previously compiled packages can be re-used.

This would also require the file names of installed packages to not include the ghc version but the ghc abi interface number; but configuring this can be left to the distributions.
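
The counter idea can be sketched as follows. Everything here is hypothetical: the `Compiler` record, the `abiEpoch` field and the epoch numbers are invented for illustration and do not correspond to anything in GHC.

```haskell
-- Hypothetical description of a compiler build.
data Compiler = Compiler
  { compilerVersion :: String  -- e.g. "7.4.2"
  , abiEpoch        :: Int     -- bumped only on incompatible changes
  } deriving (Eq, Show)

-- Packages built by one compiler can be reused by another when the epochs
-- agree, regardless of the version string.
canReuse :: Compiler -> Compiler -> Bool
canReuse builder user = abiEpoch builder == abiEpoch user

main :: IO ()
main = do
  let ghc741 = Compiler "7.4.1" 12
      ghc742 = Compiler "7.4.2" 12  -- bug-fix release, same epoch (assumed)
      ghc761 = Compiler "7.6.1" 13  -- e.g. new .hi format, new epoch (assumed)
  print (canReuse ghc741 ghc742)
  print (canReuse ghc741 ghc761)
```

Under this scheme the hashes would be seeded with the epoch instead of the version number, so a bug-fix release would leave them unchanged.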

(Hmm, this is not _really_ this bug, but closely related.)

comment:18 Changed 19 months ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2

comment:19 Changed 18 months ago by nomeata

Again this is biting me. We are trying to backport a fix in warp to Debian testing, without having to rebuild everything based on it. And indeed the patch does not really change any of the exported information (e.g. no function eligible for inlining). But nevertheless the ABI hash changes. Judging from the diff of ghc --show-iface (attached), this is only due to renaming of internal variables.

Maybe the ABI hashing should first alpha-normalize the expression?

Changed 18 months ago by nomeata

Changed ABI only due to alpha renaming

comment:20 Changed 18 months ago by simonmar

We do alpha-normalise the whole code of the module before we hash it - that's why you're seeing a60 rather than the internal names that look like a_s3fj. It's a mystery to me why there are differences here, so we ought to look into it. Ian has agreed to fix the differences due to ghc_wrapper Ids that were mentioned above, and to look into the differences in "SC:..." rule names, which should fix two of the known issues here.

Fixing the CSE ordering issue is a tricky one, because it means keeping the bindings in a deterministic order throughout the compiler.

comment:21 follow-up: Changed 18 months ago by simonpj

So to summarise Simon M's comments

  • A full and total fix is beyond us at the moment
  • But it seems likely that some modest fixes that are relatively easy (notably ccall wrappers and the "SC.." rule names) may go a long way. Ian is going to have a go at those.

As he says, the a59/a60 diff in your example is a bit mysterious. Perhaps show us both interface files? And preferably a way to reproduce.

Simon

comment:22 Changed 18 months ago by igloo

  • Owner set to igloo

comment:23 in reply to: ↑ 21 Changed 18 months ago by nomeata

Replying to simonpj:

As he says, the a59/a60 diff in your example is a bit mysterious. Perhaps show us both interface files? And preferably a way to reproduce.

To reproduce, fetch haskell-warp-1.2.1.1 and compile with 7.4.1; once unmodified and once with the (to be attached) patch applied. Interface files also attached.

Changed 18 months ago by nomeata

This change changes the ABI of warp-1.2.1.1 unexpectedly

Changed 18 months ago by nomeata

Interface file before the patch

Changed 18 months ago by nomeata

Interface file after the patch

Changed 18 months ago by nomeata

Attempt to minimize the problem (still needs conduit) (before patch)

Changed 18 months ago by nomeata

Attempt to minimize the problem (still needs conduit) (after patch)

comment:24 Changed 18 months ago by ian@…

commit 095b9bf4ed418c43216cfca2ae271c143e555f1d

Author: Ian Lynagh <ian@well-typed.com>
Date:   Fri Nov 2 03:45:15 2012 +0000

    Don't put uniqs in ghc wrapper function names; part of #4012
    
    The wrapper functions can end up in interface files, and thus are
    part of the ABI hash. But uniqs easily change for no good reason
    when recompiling, which can lead to an ABI hash needlessly changing.

 compiler/deSugar/DsForeign.lhs |   20 +++++++++++++++-----
 compiler/main/DynFlags.hs      |   11 ++++++++---
 2 files changed, 23 insertions(+), 8 deletions(-)
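
The commit message does not spell out the replacement naming scheme. One plausible approach, sketched here under that assumption, is to number the wrappers within each module in source order, so that the name depends only on stable inputs. The function `stableWrapperNames` and the exact name layout are hypothetical.

```haskell
import Data.List (intercalate)

-- Build wrapper names from stable inputs only: the module name, a running
-- number within that module, and the C function being wrapped.
-- No compiler-assigned unique is involved.
stableWrapperNames :: String -> [String] -> [String]
stableWrapperNames modName cFuns =
  [ intercalate "_" ["ghc_wrapper", modName, show n, f]
  | (n, f) <- zip [0 :: Int ..] cFuns
  ]

main :: IO ()
main = mapM_ putStrLn (stableWrapperNames "Internals" ["fcntl", "utime"])
```

Recompiling the same module yields the same names every time, so the wrapper names no longer perturb the ABI hash.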

comment:25 Changed 18 months ago by ian@…

commit ba38f995d6312aa0cfe15873c8e5e9475e03f19c

Author: Ian Lynagh <ian@well-typed.com>
Date:   Fri Nov 2 22:54:12 2012 +0000

    Avoid putting uniqs in specconstr rules; part of #4012
    
    There's no need to have the uniq in the rule, but its presence can
    cause spurious ABI changes.

 compiler/specialise/SpecConstr.lhs |   10 +++++++---
 1 files changed, 7 insertions(+), 3 deletions(-)

comment:26 follow-up: Changed 18 months ago by simonpj

Do Ian's changes help?

comment:27 in reply to: ↑ 26 Changed 18 months ago by nomeata

Replying to simonpj:

Do Ian's changes help?

Please excuse me if I don’t set up HEAD in a way to build all the packages required to test the code above. But I did find a small test case that could potentially be added to the test suite. With the attached Test.hs and GHC-7.4.1, I get this result:

$ ghc --make -O Test.hs && cp Test.hi Test1.hi && ghc --make -DVARIANT -O Test.hs && diff -u <(ghc --show-iface Test1.hi) <(ghc --show-iface Test.hi)
[1 of 1] Compiling Test             ( Test.hs, Test.o )
[1 of 1] Compiling Test             ( Test.hs, Test.o )
--- /dev/fd/63	2012-11-07 20:45:57.719385850 +0100
+++ /dev/fd/62	2012-11-07 20:45:57.719385850 +0100
@@ -5,11 +5,11 @@
 Way: Wanted [],
      got    []
 interface main:Test 7041
-  interface hash: f70ea3ebb89cd1fb5fd38e711dc3a86b
-  ABI hash: 3d46e40ea31fe2b5209a9cfa0e8c55a2
+  interface hash: 0c3a556b0b6f4bd91ecf6b9097dd0914
+  ABI hash: fbf17c77263ac40a0a46a1da9debd563
   export-list hash: 10500eb025dddf2ab63ae606468fa772
   orphan hash: 693e9af84d3dfcc71e640e005bdc5e2e
-  flag hash: fc61acebd75ed0e7e110c288131c5ae4
+  flag hash: eda85656077ab2758c42e66fbdb9f82a
   used TH splices: False
   where
 exports:
@@ -33,15 +33,15 @@
                      ds :: ((), a) ->
                    case ds of wild { (,) ds1 z ->
                    case ds1 of wild1 { () -> Test.x @ a $dEq $dNum z } }) -}
-9ca884031f6b6682163a35c8b033ebe1
+3170054a493c32af50e7d3df196daeae
   b :: forall a. [a] -> [a] -> [a]
     {- Arity: 1, HasNoCafRefs, Strictness: L,
        Unfolding: (\ @ a x1 :: [a] ->
                    let {
-                     lvl2 :: [a]
+                     lvl3 :: [a]
                      = let { y :: [a] = GHC.Base.++ @ a x1 x1 } in GHC.Base.++ @ a y y
                    } in
-                   \ z :: [a] -> GHC.Base.++ @ a z lvl2) -}
+                   \ z :: [a] -> GHC.Base.++ @ a z lvl3) -}
 2d4ae0d356c36e2c96109ac55a0ff610
   x :: forall a.
        GHC.Classes.Eq a -> GHC.Num.Num a -> a -> GHC.Types.Bool

Changed 18 months ago by nomeata

Small testcase

comment:28 Changed 18 months ago by simonmar

Note that compiling different code and hoping to get the same ABI hash is a much more difficult proposition than getting the same ABI hash by repeatedly compiling the same code (which is what this ticket was originally about). I realise it's a reasonable thing to want to do, but I just wanted to point out the distinction. We need to solve the easier problem first before we can tackle the harder one.

comment:29 Changed 18 months ago by nomeata

Sure. But it is desirable, and low-hanging fruit that stabilizes the ABI across different code (such as normalizing the interface with respect to alpha-renaming) will probably also make it more robust against odd changes when compiling the same code twice.

comment:30 Changed 18 months ago by simonmar

Actually you're right. I said that we do alpha-renaming, but in fact the current algorithm keeps the original names when they don't clash, and that gives rise to the lvl2/lvl3 difference in your example.

We should not be using the original OccName when tidying a local binder, we should just always use "x", or use a different OccName for different kinds of binders (e.g. "x" for let-binders, "y" for lambdas, and "z" for case binders). Relevant code is in coreSyn/CoreTidy.lhs:tidyIdBndr.
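
To make the suggestion concrete, here is a minimal sketch on a toy expression type. It is not GHC's Core and not the actual tidyIdBndr code: `Expr`, `tidy` and `unfolding` are hypothetical names invented for illustration. Every local binder is renamed from its kind and binding depth ("x" for let-binders, "y" for lambdas), so the original OccName no longer influences the result. The sketch assumes free variables never look like "x0" or "y0".

```haskell
-- Toy stand-in for Core; all names here are hypothetical.
data Expr
  = Var String
  | Lam String Expr
  | Let String Expr Expr   -- non-recursive let
  | App Expr Expr
  deriving (Eq, Show)

-- Rename each local binder from its kind and binding depth,
-- ignoring the original name. Free variables are left alone.
tidy :: Expr -> Expr
tidy = go 0 []
  where
    go :: Int -> [(String, String)] -> Expr -> Expr
    go _ env (Var v) = Var (maybe v id (lookup v env))
    go d env (Lam v b) =
      let v' = "y" ++ show d
      in Lam v' (go (d + 1) ((v, v') : env) b)
    go d env (Let v rhs b) =
      let v' = "x" ++ show d
      in Let v' (go (d + 1) env rhs) (go (d + 1) ((v, v') : env) b)
    go d env (App f a) = App (go d env f) (go d env a)

-- The unfolding of b from the test case above, with the name of the
-- floated let-binder (lvl2 or lvl3) as a parameter.
unfolding :: String -> Expr
unfolding lvl =
  Lam "x1"
    (Let lvl
       (Let "y" (append (Var "x1") (Var "x1")) (append (Var "y") (Var "y")))
       (Lam "z" (append (Var "z") (Var lvl))))
  where
    append a b = App (App (Var "GHC.Base.++") a) b
```

With these definitions, `unfolding "lvl2"` and `unfolding "lvl3"` differ as written, but `tidy (unfolding "lvl2") == tidy (unfolding "lvl3")` holds, so anything hashed after tidying would no longer see the lvl2/lvl3 difference.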

comment:31 Changed 18 months ago by simonmar

... or, if we like to keep the original names for readability, we could do local renaming only when computing the hash, and throw away the renamed version afterwards (basically hash the DeBruijn representation).
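
A rough sketch of that alternative, again on a toy expression type rather than real Core: the term is converted to a nameless de Bruijn form only for hashing, and the named term is kept for display. `Expr`, `DB`, `toDeBruijn`, `alphaHash` and `unfolding` are hypothetical names, and FNV-1a is used only as a dependency-free stand-in for whatever fingerprint the compiler really computes.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (elemIndex, foldl')
import Data.Word (Word32)

-- Named terms, as a pretty-printer would want them (toy stand-in for Core).
data Expr
  = Var String
  | Lam String Expr
  | Let String Expr Expr   -- non-recursive let
  | App Expr Expr
  deriving (Eq, Show)

-- Nameless terms: bound variables become indices counting enclosing binders.
data DB
  = Bound Int
  | Free String
  | DLam DB
  | DLet DB DB
  | DApp DB DB
  deriving (Eq, Show)

toDeBruijn :: Expr -> DB
toDeBruijn = go []
  where
    -- elemIndex finds the innermost binder first, so shadowing is respected.
    go env (Var v)       = maybe (Free v) Bound (elemIndex v env)
    go env (Lam v b)     = DLam (go (v : env) b)
    go env (Let v rhs b) = DLet (go env rhs) (go (v : env) b)
    go env (App f a)     = DApp (go env f) (go env a)

-- 32-bit FNV-1a over a string; an assumption made for illustration only.
fnv1a :: String -> Word32
fnv1a = foldl' step 2166136261
  where
    step h c = (h `xor` fromIntegral (ord c)) * 16777619

-- Hash the nameless form; the named term itself is never modified.
alphaHash :: Expr -> Word32
alphaHash = fnv1a . show . toDeBruijn

-- The unfolding of b from the test case above, parameterised by the
-- name of the floated let-binder (lvl2 or lvl3).
unfolding :: String -> Expr
unfolding lvl =
  Lam "x1"
    (Let lvl
       (Let "y" (append (Var "x1") (Var "x1")) (append (Var "y") (Var "y")))
       (Lam "z" (append (Var "z") (Var lvl))))
  where
    append a b = App (App (Var "GHC.Base.++") a) b
```

Here `alphaHash (unfolding "lvl2")` and `alphaHash (unfolding "lvl3")` agree because both terms have the same nameless form, while the dumps shown to the user keep the original names.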

comment:32 Changed 18 months ago by igloo

I think this is only really a problem when ghc generates the same name at both the top level and in an expression. Is it feasible to generate e.g. 'tlvl' for top-level names, and 'lvl' for let/lambda-bound names?

comment:33 Changed 18 months ago by igloo

  • Owner igloo deleted

comment:34 Changed 17 months ago by simonmar

@igloo: I'm not sure if that would fix it. A name that was generated at the top-level might be floated in, and conversely a non-top-level name might be floated out. So wouldn't the same problem still occur?

comment:35 Changed 17 months ago by igloo

Hmm, OK. I was going to suggest a final renaming pass (in which only generated names get renamed) and not throwing the result away, so that it's easier to understand when the hash is or isn't changing.

But I think that would be worse, because then the -ddump-simpl etc names wouldn't match the final names.

So your DeBruijn idea sounds best to me.

comment:36 Changed 14 months ago by refold

  • Cc the.dead.shall.rise@… added

comment:37 Changed 14 months ago by isaacdupree

  • Cc id@… added

comment:38 Changed 10 months ago by shacka

  • Cc shacka@… added

comment:39 follow-up: Changed 3 months ago by nh2

Is this related to https://ghc.haskell.org/trac/ghc/ticket/8144, and does its fix maybe improve the situation?

comment:40 Changed 3 months ago by nh2

  • Cc mail@… added

comment:41 in reply to: ↑ 39 ; follow-up: Changed 3 months ago by nomeata

Replying to nh2:

Is this related to https://ghc.haskell.org/trac/ghc/ticket/8144, and does its fix maybe improve the situation?

Not very much, I believe, as the cases discussed here do not necessarily involve header files. But the fix for #8144 is of course a general improvement.

comment:42 in reply to: ↑ 41 Changed 3 months ago by nh2

Replying to nomeata:

Not very much, I believe, as the cases discussed here do not necessarily involve header files. But the fix for #8144 is of course a general improvement.

It's not only about header files: I believe a UsageFile can be introduced in multiple ways, not only via #include, and maybe some other timestamp makes it into those hashes, as in that bug.

comment:43 Changed 3 months ago by simonmar

Things got worse for a while when we were including timestamps in the interface, but then #8144 fixed that. There is still a set of underlying problems that cause non-determinism.

comment:44 Changed 2 months ago by nomeata

Debian is (slowly, as usual) working towards reproducible builds, and while some Haskell packages satisfy that requirement, others do not – see this message: https://lists.debian.org/debian-haskell/2014/02/msg00011.html

What I find interesting is that --show-iface shows no change at all, apart from a different interface hash. It would make debugging these issues easier if everything that takes part in the hash calculation were visible in --show-iface.

Off the top of your head, any idea what else could contribute to an interface change?

Build paths and timestamps do not contribute to the hash, right?

comment:45 Changed 2 months ago by nomeata

Ah, I just see that #8144 was actually fixed after 7.6. I’ll shut up and report back when 7.8 has hit the Debian archives...
