Opened 10 years ago

Closed 8 years ago

Last modified 8 years ago

#2089 closed bug (fixed)

reading the package db is slow

Reported by: duncan
Owned by:
Priority: normal
Milestone: 6.12 branch
Component: Package system
Version: 6.8.2
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: x86_64 (amd64)
Type of failure: Compile-time performance bug
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description

With a large number of registered packages, ghc takes a long time to read the package db. It does this every time it is run, so the cost adds up.

I have a rather fast x86-64 machine and 160 registered packages. Here are some timings:

$ time ghc-pkg list > /dev/null
user    0m1.164s

$ time ghc -c does-not-exist.c 2> /dev/null
real    0m0.612s

$ time hsc2hs does-exist.hsc --cflag=--version 2> /dev/null
user    0m0.572s

Since cabal configure runs all of the above, it starts to take a while:

$ time cabal configure
Configuring cabal-install-0.4.3...
real    0m2.241s
user    0m1.916s

The obvious solution is to use a binary cache of the package db containing the most commonly needed mappings, such as module name -> package.
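
As a rough illustration of the idea (not ghc-pkg's actual implementation or file format), the sketch below builds a module name -> package index once, writes it as a binary file, and reads it back with a single load. The package names, the `package.cache` file name, and the use of Python's `pickle` as the binary encoding are all assumptions made for the example.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical package records: package id -> modules it exposes.
PACKAGES = {
    "base-4.0": ["Data.List", "Data.Maybe"],
    "containers-0.2": ["Data.Map", "Data.Set"],
}


def build_module_index(packages):
    """Invert package -> modules into module -> list of packages."""
    index = {}
    for pkg, modules in packages.items():
        for module in modules:
            index.setdefault(module, []).append(pkg)
    return index


def write_cache(index, path):
    """Serialise the index in a binary form (pickle stands in for the real encoding)."""
    path.write_bytes(pickle.dumps(index, protocol=pickle.HIGHEST_PROTOCOL))


def read_cache(path):
    """Load the whole index in one read, with no per-package text parsing."""
    return pickle.loads(path.read_bytes())


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "package.cache"  # hypothetical cache file name
        write_cache(build_module_index(PACKAGES), cache)
        return read_cache(cache)["Data.Map"]
```

The point of the design is that the expensive work (parsing every package description and inverting it) happens only when the database changes, while each compiler run pays only for one binary read.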

Change History (8)

comment:1 Changed 10 years ago by simonmar

difficulty: Moderate (1 day)
Milestone: 6.10 branch

The scheme we discussed before is something like this:

  • ghc-pkg would automatically generate a binary cache of the package DB whenever it changed.
  • we want to transition to using a directory of files for the package DB, where a new package can be installed by dropping a file into it and running ghc-pkg update to update the binary cache.
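
The two points above can be sketched as follows, under stated assumptions: the `.conf` suffix, the `package.cache` name, the simplified `key: value` file format, and the function names are all hypothetical, and `pickle` again stands in for whatever binary library is eventually chosen.

```python
import pickle
import tempfile
from pathlib import Path

CACHE_NAME = "package.cache"  # hypothetical cache file name


def parse_conf(path):
    """Parse a simplified 'key: value' package description file."""
    fields = {}
    for line in path.read_text().splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def update_cache(db_dir):
    """Rebuild the binary cache from every *.conf file in the directory."""
    packages = {p.stem: parse_conf(p) for p in sorted(db_dir.glob("*.conf"))}
    (db_dir / CACHE_NAME).write_bytes(pickle.dumps(packages))
    return packages


def load_db(db_dir):
    """Fast path: read only the cache; fall back to parsing if it is missing."""
    cache = db_dir / CACHE_NAME
    if cache.exists():
        return pickle.loads(cache.read_bytes())
    return {p.stem: parse_conf(p) for p in sorted(db_dir.glob("*.conf"))}


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        db = Path(tmp)
        # "Install" two packages by dropping files in, in any order.
        (db / "foo-1.0.conf").write_text("name: foo\nversion: 1.0\n")
        (db / "bar-2.1.conf").write_text("name: bar\nversion: 2.1\n")
        update_cache(db)
        before = sorted(load_db(db))
        # "Uninstall" one package by deleting its file, then refresh the cache.
        (db / "foo-1.0.conf").unlink()
        update_cache(db)
        after = sorted(load_db(db))
        return before, after
```

In this sketch the cache is trusted whenever it exists, so it only reflects the directory as of the last `update_cache` call; a real tool would also need some way to notice a stale cache.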

The main sticking point here is what binary library to use. In GHC we have our own binary library, but it currently isn't available for ghc-pkg - we'd have to extract it, which is difficult because it has dependencies on other GHC datatypes, or use a different binary library.

If GHC were first modified to use Data.Binary for its interface files, this would be a lot easier.

comment:2 Changed 9 years ago by duncan

Here is an example from Debian where the current ghc-pkg register/unregister system causes excess complexity:

https://bugs.launchpad.net/ubuntu/+source/gtk2hs/+bug/229489

The problem is that there are a lot of dependencies on the order in which actions are performed. When upgrading ghc one has to unregister all the old packages, then install the new ghc and then install and register all the new packages. When packages are not written perfectly we end up with corner cases where things go wrong (like the above bug about packages being uninstallable).

With the proposed system it would be much simpler. The files could be uninstalled and installed in any old order so long as we run ghc-pkg update at the end. There are many other examples of similar systems in Linux distros, so this mode is reasonably well supported (info caches, font registration, gtk+ icon cache etc).

comment:3 Changed 9 years ago by simonmar

Operating System: Multiple → Unknown/Multiple

comment:4 Changed 9 years ago by igloo

Milestone: 6.10 branch → 6.12 branch

comment:5 Changed 9 years ago by simonmar

Component: Driver → Package system

comment:6 Changed 8 years ago by simonmar

Resolution: fixed
Status: new → closed

Fixed. Reading the package DB is several times quicker for me now.

Thu Sep 10 03:27:03 PDT 2009  Simon Marlow <marlowsd@gmail.com>
  * Change the representation of the package database
  
   - the package DB is a directory containing one file per package
     instance (#723)
  
   - there is a binary cache of the database (#593, #2089)
  
   - the binary package is now a boot package
  
   - there is a new package, bin-package-db, containing the Binary
     instance of InstalledPackageInfo for the binary cache.
  
  Also included in this patch
  
   - Use colour in 'ghc-pkg list' to indicate broken or hidden packages
    
     Broken packages are red, hidden packages are 
    
     Colour support comes from the terminfo package, and is only used when
      - not --simple-output
      - stdout is a TTY
      - the terminal type has colour capability
  
   - Fix the bug that 'ghc-pkg list --user' shows everything as broken

comment:7 Changed 8 years ago by simonmar

difficulty: Moderate (1 day) → Moderate (less than a day)

comment:8 Changed 8 years ago by simonmar

Type of failure: Compile-time performance bug