Opened 12 years ago

Closed 11 years ago

Last modified 10 years ago

#1372 closed bug (fixed)

Recompilation checker should consider package versions (and other factors)

Reported by: bringert Owned by: simonmar
Priority: normal Milestone: 6.10 branch
Component: Compiler Version: 6.8.1
Keywords: Cc: dons@…, sjanssen@…, chevalier@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

Currently, if an installed package is upgraded, the recompilation checker does not think that modules which use that package need to be recompiled. Since the package version is part of the internal symbol names, and because of cross-package optimizations, recompilation really is needed in this case.

If the interface file listed which packages (including version numbers) a module uses, we could recompile whenever the packages used by the module differ from those used in the current compilation.
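
As a minimal sketch of that check (illustrative only; the types and names below are made up, not GHC's):

    import Data.List (sort)

    -- A package dependency as it might be recorded in a .hi file.
    data PkgDep = PkgDep
      { pkgName    :: String
      , pkgVersion :: String
      } deriving (Eq, Ord, Show)

    -- Recompile whenever the set of (name, version) pairs recorded in the
    -- old interface differs from the set this compilation would use.
    needsRecompile :: [PkgDep] -> [PkgDep] -> Bool
    needsRecompile usedPackages currentPackages =
      sort usedPackages /= sort currentPackages

    -- e.g. needsRecompile [PkgDep "X11-extras" "0.2"] [PkgDep "X11-extras" "0.3"] == True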

Change History (28)

comment:1 Changed 12 years ago by bringert

Version: 6.6.1 → 6.6

comment:2 Changed 12 years ago by simonmar

Milestone: 6.1
Summary: Recompilation checker should consider package versions → Recompilation checker should consider package versions (and other factors)

Some comments following discussion with bringert on IRC:

There's a more general underlying problem here. We should trigger recompilation if the results could be in any way different to what we got before. Ways in which the results could be different, that we don't currently track:

  • the package flags are different and/or the set of exposed packages has changed
  • header files that are #included (with -cpp) have changed
  • any flags that may affect the compilation results are different
  • the state of the world has changed at all (with -fth, see #481).

It's obviously not practical to cover all the possibilities here, but we could do better than we currently are.
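
Not something GHC did at the time, but as a rough sketch of what tracking those extra inputs could look like, one could store a fingerprint of the flags and of the #included header contents in the interface and recompile when it changes (the toy fingerprint below just stands in for a real hash, such as the MD5 used in the eventual fix):

    import Data.Char (ord)
    import Data.List (foldl')

    -- A toy fingerprint; a real implementation would use MD5 or similar.
    fingerprint :: String -> Integer
    fingerprint = foldl' (\h c -> h * 33 + fromIntegral (ord c)) 5381

    -- Fingerprint the flags together with the contents of every #included
    -- header; store this in the .hi file and recompile when it differs.
    environmentFingerprint :: [String] -> [String] -> Integer
    environmentFingerprint flags headerContents =
      fingerprint (unwords flags ++ concat headerContents)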

comment:3 Changed 12 years ago by simonmar

Milestone: 6.1 → 6.8
Priority: normal → high

After discussion with dons on IRC, I think we should do something to force recompilation when packages have changed for 6.8. Apparently a lot of people are getting broken builds (link failures, or worse segfaults) when developing multiple packages simultaneously.

comment:4 Changed 12 years ago by simonmar

Distilled conversation with Simon PJ this morning, concerning how to detect when dependent packages have "changed" in order to force recompilation.

First, we'll store a timestamp with each package in the database, being the time at which the package was registered (or perhaps, built). This timestamp will be recorded with each dependent package in the .hi file. We can now tell when a package that we depend on has been re-registered, which will catch cases where you recompile a Cabal package, install it, and then continue building another package.

We also want to notice when:

  • a new version of a package has been installed. The old one may have been either left in place or removed.
  • You're using -hide-all-packages, and the package flags have changed, meaning that modules you import now resolve differently. This happens when you re-configure a Cabal package after installing new packages, and build without cleaning first.
  • You're using --make, and the exposed packages have changed, meaning that modules you import now resolve differently. This happens when not using Cabal.

All these cases are covered by "imported modules now resolve differently", so for each module we compile we'll check that condition. We know the imported modules: they're in the ModSummary (for one-shot compilation we'll have to generate a ModSummary, for make-mode we already have it from the downsweep).

For each directly imported module, resolve it using findModule (the ModSummary could record the results of that, since the downsweep already needs to do it). If the resulting module or package is not in the dependencies listed in the .hi file, or the resolution is ambiguous, then we recompile.
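
A minimal sketch of that resolution check, assuming we can ask how each import resolves now and compare against what the old .hi file recorded; the types and the findModule argument below are illustrative stand-ins, not GHC's actual API:

    import qualified Data.Map as Map
    import Data.Map (Map)

    type ModuleName = String
    type PackageId  = String    -- e.g. "X11-extras-0.3"
    type Timestamp  = Integer   -- registration time from the package database

    -- What the old interface recorded: the timestamp each package
    -- dependency had when we last compiled.
    type RecordedDeps = Map PackageId Timestamp

    -- How an import resolves today: Nothing if it is ambiguous or missing.
    type Resolver = ModuleName -> Maybe (PackageId, Timestamp)

    -- Recompile if any direct import now resolves to a package (or a package
    -- timestamp) that the old interface did not record, or fails to resolve.
    importsResolveDifferently :: Resolver -> RecordedDeps -> [ModuleName] -> Bool
    importsResolveDifferently findModule recorded = any changed
      where
        changed m = case findModule m of
          Nothing          -> True   -- ambiguous or not found: recompile
          Just (pkg, time) -> Map.lookup pkg recorded /= Just time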

This will not catch some very obscure and highly unlikely cases, such as when a module moves from one package to another package, and the package timestamps haven't changed.

This will also not catch the case when a package is registered in-place using Cabal, it is rebuilt, but not re-registered. The timestamp won't be updated, but the package has changed, and things will go wrong. To handle this case we'd have to track module versions across package boundaries, which is something we definitely don't want to do: one advantage of package boundaries is that you can avoid doing fine-grained dependency tracking across them.

comment:5 Changed 12 years ago by simonpj

See also #106, which describes a similar problem.

Simon

comment:6 Changed 12 years ago by igloo

I'm not entirely sure I understand correctly, but I think that what you propose will mean that with GHC's current build system we'll rebuild all the libraries (except base) when you do "make" in a built tree, as we do (roughly) "setup/Setup build && setup/Setup register --inplace" for each library in turn.

comment:7 Changed 12 years ago by simonmar

Yes, registering a package would cause all the dependent packages to be recompiled. In the case of the GHC build system that may not be what you want. Perhaps we can avoid re-registering if nothing was recompiled and the package is already registered.

comment:8 Changed 12 years ago by simonmar

Owner: set to simonmar

I'm looking at this.

comment:9 Changed 11 years ago by simonmar

Priority: high → normal

We're unlikely to do any more on this for 6.8.

comment:10 Changed 11 years ago by dons

Cc: dons@… added

Just keeping this alive.

We've had a large number of build failures in the latest xmonad release from people upgrading and reinstalling the X11-extras package but not running clean in their xmonad directory. GHC then happily continues compiling xmonad against the old library, leading to type errors.

The API has changed enough that at least this doesn't produce linking errors, but *lots* of people are falling over this problem. The solution, whenever they hit a build failure, is always to make clean first.

It is the single most frequently reported problem with xmonad!

See for example this bug report:

http://thread.gmane.org/gmane.comp.lang.haskell.xmonad/2855

And this xmonad FAQ entry:

http://xmonad.org/faq.html#upgrade

comment:11 in reply to:  10 ; Changed 11 years ago by dons

See also this complaint on reddit about xmonad failing to build:

http://programming.reddit.com/info/5ym5w/comments/c02a1o5

comment:12 in reply to:  11 Changed 11 years ago by dons

Another user reporting a build failure:

http://hpaste.org/3401

And yet another, this time with segfaults fixed by recompiling:

    rickson> Hi! I upgraded to xmonad 0.4 yesterday, and since then it been regularly segfaulting
    dons> rickson: ok, can you rebuild X11, X11-extras and xmonad from scratch, making sure you 'runhaskell Setup.lhs clean' in each one first?
    rickson> dons: rebuilding the libs seems to fix the problem (i'm still here!)

comment:13 in reply to:  10 ; Changed 11 years ago by simonmar

Replying to dons:

I'd like to understand what happened a bit better, because I think there are actually several different issues. Correct me if I'm wrong, but the error in the thread you referred to was caused by:

  • X11-extras was updated, the API was changed, the version was not bumped
  • xmonad was updated to match
  • it failed to compile against the old X11-extras

that's expected, right?

And there seems to be some problem with Cabal not noticing when a .hsc file has been modified; that sounds like a Cabal bug.

The problem you described seems slightly different:

  • X11-extras is modified, the version is bumped
  • xmonad is updated to match (with a dependency on the new version)
  • darcs pull xmonad
  • without cleaning first, setup build

Now, Cabal doesn't notice that xmonad.cabal has been updated, and hence doesn't notice that it needs to depend on the new version of X11-extras. If this is the case, it's a Cabal bug - Cabal should refuse to build if the .cabal file has been modified since the last configure.

I've been thinking about how to solve the original recompilation problem. I even hacked up the fix proposed earlier, but then decided it was wrong. We can't rely on packages being registered to notice when things change - the compiler really needs to look at the .hi files. Ideally, we need to create a fingerprint of the interface (hence Simon PJ's message to the Haskell mailing list about fingerprints).

None of this will help unless Cabal invokes GHC to do recompilation checking: so if Cabal starts doing its own dependency analysis, we'll be in trouble again.

comment:14 Changed 11 years ago by duncan

Yes, currently Cabal does not notice when the .cabal file has been updated without doing configure again. So cabal-setup build is not enough, you have to cabal-setup configure && cabal-setup build. I'd like to include this feature in the Cabal dependency analysis project, ie it should automatically rebuild the appropriate things depending on what has been changed in the .cabal file.

As for Cabal doing dependency analysis for packages, it does not need to be as accurate as ghc, so long as it is conservatively inaccurate. Then it'll invoke ghc which might decide that there's nothing to do after all.

comment:15 in reply to:  14 ; Changed 11 years ago by simonmar

Replying to duncan:

Yes, currently Cabal does not notice when the .cabal file has been updated without doing configure again. So cabal-setup build is not enough, you have to cabal-setup configure && cabal-setup build. I'd like to include this feature in the Cabal dependency analysis project, ie it should automatically rebuild the appropriate things depending on what has been changed in the .cabal file.

This would be pretty easy to do in Cabal as it stands, right? I can submit a patch if you agree.

As for Cabal doing dependency analysis for packages, it does not need to be as accurate as ghc, so long as it is conservatively inaccurate. Then it'll invoke ghc which might decide that there's nothing to do after all.

To be conservative and still not invoke GHC on every file, you'd have to check the modification times on the .hi files of all package dependencies (possibly all the .hi files rather than just the ones that are directly imported, I'm not sure).
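
A rough sketch of that conservative check, assuming Cabal keeps some build-stamp file it can compare against; the file layout and names below are illustrative, not Cabal's implementation:

    import Control.Monad (filterM)
    import System.Directory (getModificationTime)

    -- Conservative check: if any dependency's .hi file is newer than our own
    -- build stamp, hand the module to GHC and let its recompilation checker
    -- make the precise decision.
    anyDepNewer :: FilePath -> [FilePath] -> IO Bool
    anyDepNewer buildStamp depHiFiles = do
      stampTime <- getModificationTime buildStamp
      newer <- filterM (\hi -> do t <- getModificationTime hi
                                  return (t > stampTime))
                       depHiFiles
      return (not (null newer))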

comment:16 in reply to:  15 Changed 11 years ago by duncan

Replying to simonmar:

This would be pretty easy to do in Cabal as it stands, right? I can submit a patch if you agree.

To reconfigure automatically, you mean? Or just tell users they need to reconfigure? The latter would be easy and I'd gladly accept a patch to do that.

To be conservative and still not invoke GHC on every file, you'd have to check the modification times on the .hi files of all package dependencies (possibly all the .hi files rather than just the ones that are directly imported, I'm not sure).

Checking modification times is cheap.

comment:17 in reply to:  13 Changed 11 years ago by dons

Replying to simonmar:

  • X11-extras was updated, the API was changed, the version was not bumped
  • xmonad was updated to match
  • it failed to compile against the old X11-extras

that's expected, right?

So in the past we've had failures of this sort, yes: no version bump for X11-extras, then users building without cleaning and getting strange errors. However, now we bump the version on any API change, especially so for the released stable branch, which leads to problems of the second kind:

The problem you described seems slightly different:

  • X11-extras is modified, the version is bumped
  • xmonad is updated to match (with a dependency on the new version)
  • darcs pull xmonad
  • without cleaning first, setup build

Now, Cabal doesn't notice that xmonad.cabal has been updated, and hence doesn't notice that it needs to depend on the new version of X11-extras. If this is the case, it's a Cabal bug - Cabal should refuse to build if the .cabal file has been modified since the last configure.

Right, now that we bump versions properly, the problem is the lack of recompilation in repositories, typically when users darcs pull, and the xmonad .cabal file changes, and then some kind of build failure occurs, or runtime segfault in the worst case.

comment:18 in reply to:  13 Changed 11 years ago by dons

Cc: sjanssen@… added
Version: 6.6 → 6.8.1

In the xmonad head branch we've split things so that xmonad extensions are a library which depends on the xmonad core. Now we get a new set of segfaulting xmonad binaries when people forget to clean their extensions library after rebuilding the core library:

My xmonad segfaulted today [at `noBorders' function] since I had forgotten to `runhaskell Setup.lhs clean' in XMonadContrib directory.

-- Reported by Valery V. Vorotyntsev

Any suggestions on how we can avoid this bad user experience?

comment:19 Changed 11 years ago by dons

    17:21  mauke> XMonad/Config/Droundy.hs:91:37: Couldn't match expected type 
              `Graphics.X11.Xlib.Types.Rectangle' against inferred type `Rectangle'
    17:22  sjanssen> mauke: separate compilation issue, clean and rebuild

More recompilation troubles when libraries change.

comment:20 in reply to:  19 Changed 11 years ago by simonmar

Replying to dons:

Ok, in GHC 6.8.1 and Cabal 1.2 we added various measures to prevent accidental mismatches such as this. I need to establish whether these measures are not working for some reason, or whether you're running into cases that we just don't catch yet (there are definitely such cases, but hopefully much fewer than before).

Please could you describe exactly the sequence of events that leads to the crash?

To be clear: I believe all compile-time errors can be avoided by bumping versions when APIs change. If you have evidence to the contrary, I need to hear about it.

comment:21 in reply to:  20 ; Changed 11 years ago by dons

Replying to simonmar:

Please could you describe exactly the sequence of events that leads to the crash?

To be clear: I believe all compile-time errors can be avoided by bumping versions when APIs change. If you have evidence to the contrary, I need to hear about it.

Yes, the situation is improving. We bump versions for most libraries now, and Cabal is spotting changes to .cabal files, which is great.

The current xmonad architecture, where xmonad's core is a library (much like ghc-api), is creating more trouble. User extensions live in a library that depends on the xmonad library. When xmonad changes, which it does fairly often, and XMonadContrib users don't remember to clean, they sometimes get linker errors or segfaults (see this bug report http://code.google.com/p/xmonad/issues/detail?id=93&can=1 for example).

This, I think, is the only case where we currently have link problems: no .cabal version change; one library changes and is reinstalled; the library that uses it doesn't see that the package.conf file has changed, isn't cleaned first, builds, and then won't link (or it does, and crashes).

Could cabal just require a clean whenever the package.conf changes? (Or does it do that already, and we need to point users to a newer cabal version)?

comment:22 in reply to:  21 Changed 11 years ago by dons

We continue to get user reports of linker failures, and funny runtime behaviour, when:

  • a user updates xmonad darcs library
  • then rebuilds a previously built xmonad-contrib library (the extension repo)

Since we can't bump xmonad's version number on every internal change, cabal doesn't spot that 'xmonad', the library, has been updated, and the result is linker errors or worse. For example, today one user reports:

10:13  gwern> '/home/gwern/bin/lib/xmonad-0.4/ghc-6.8.1.20071117/libHSxmonad-0.4.a(Main.o): In function `stEm_info':
10:14  gwern> (.text+0x8990): undefined reference to 
              `X11zm1zi4zi0_GraphicsziX11ziXlibziExtras_a181_info'
10:14  gwern> rebuilding and install x11 and using --make didn't work... 
              better try cleaning everything
10:15  dons> oh, clean everything.
10:16  gwern> yes, that worked. geez.

Another user reported massive memory use that disappeared after cleaning.

My suggestion would be that cabal requires a 'clean' to be run if the global or user package.conf has changed since the last build. That would solve this problem, wouldn't it?
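
A small sketch of the suggested check, assuming Cabal records some stamp file at the last build (the paths and names here are illustrative only):

    import System.Directory (doesFileExist, getModificationTime)

    -- True if the package database has changed since we last built,
    -- in which case a 'clean' (or at least a full rebuild) is required.
    needsClean :: FilePath -> FilePath -> IO Bool
    needsClean packageConf lastBuildStamp = do
      haveStamp <- doesFileExist lastBuildStamp
      if not haveStamp
        then return True
        else do
          confTime  <- getModificationTime packageConf
          stampTime <- getModificationTime lastBuildStamp
          return (confTime > stampTime)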

comment:23 Changed 11 years ago by simonmar

Yes, we could force a clean if the package.conf has changed. One reason I didn't want to go this route was that it doesn't help if you're using setup register --in-place, where you don't actually need to register after building a package. But I guess you aren't doing that, so depending on the modification time of package.conf would work for you.

A workaround is to use -fomit-interface-pragmas when compiling xmonad, BTW. That will affect inter-module optimisations, though.

comment:24 Changed 11 years ago by simonmar

Simon PJ pointed out to me that I should say a bit more about what our long-term plans are to address this problem.

We understand the issues, and we want to implement the right fix. We believe the right fix is to

  • include a fingerprint in each .hi file, calculated over the whole contents of the interface.
  • in the dependencies of an interface, give the fingerprints of modules depended on, including fingerprints of package modules (perhaps only directly imported package modules, I'm not sure of the details yet).
  • when deciding whether to recompile, compare the fingerprints of dependencies the last time we compiled against current fingerprints. This will catch changes to packages, but only those changes that force a recompile.

However, before we do this we want to take a critical look at the whole recompilation checking infrastructure, and write some commentary about it. I want to investigate, for example, whether it would be reasonable to simplify it by recording module fingerprints only, rather than versions on every entity as we do now. This would entail more recompilation, but perhaps not significantly more, and it would vastly simplify the implementation.
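
A minimal sketch of that module-fingerprint variant, using MD5 as the eventual fix does; the types below and the use of the pureMD5 package are assumptions for illustration, not GHC's interface format:

    import qualified Data.ByteString.Lazy as BL
    import Data.Digest.Pure.MD5 (MD5Digest, md5)   -- from the pureMD5 package
    import qualified Data.Map as Map
    import Data.Map (Map)

    type ModName = String

    -- Fingerprint of a module: the MD5 of its whole interface contents.
    fingerprintIface :: BL.ByteString -> MD5Digest
    fingerprintIface = md5

    -- Recompile if any dependency's current fingerprint differs from the one
    -- recorded at the last compilation (or if a dependency has disappeared).
    depsChanged :: Map ModName MD5Digest   -- recorded last time
                -> Map ModName MD5Digest   -- current
                -> Bool
    depsChanged recorded current =
      any (\(m, old) -> Map.lookup m current /= Just old) (Map.toList recorded)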

<rant> Recompilation checking is a maintenance nightmare. It's hard to test, because testing it involves recompiling multiple times with changing sources. It's hard to report bugs, because reproduction is so hard, so when it goes wrong people tend to do 'make clean' and carry on, throwing away the evidence. Perhaps they report the bug anecdotally, but by then it's too late. This is why I want to take a serious look at the whole idea and see whether we can engineer it to be simpler and more testable/maintainable. </rant>

comment:25 Changed 11 years ago by tim

Cc: chevalier@… added

comment:26 Changed 11 years ago by simonmar

Milestone: 6.8 branch → 6.10 branch
Resolution: fixed
Status: new → closed

Now fixed (in HEAD, not STABLE, and merging is probably not practical):

Wed May 28 05:52:58 PDT 2008  Simon Marlow <marlowsd@gmail.com>
  * Use MD5 checksums for recompilation checking (fixes #1372, #1959)

comment:27 Changed 10 years ago by simonmar

Architecture: Unknown → Unknown/Multiple

comment:28 Changed 10 years ago by simonmar

Operating System: Unknown → Unknown/Multiple