Opened 9 years ago

Closed 7 years ago

#2722 closed bug (invalid)

<<loop> when compiling with -O option with ghc-6.10.0.20081019

Reported by: uwe Owned by:
Priority: normal Milestone: 7.2.1
Component: libraries (other) Version: 7.0.1
Keywords: arrows Cc: ross@…
Operating System: Linux Architecture: x86_64 (amd64)
Type of failure: Runtime crash Test Case: tyepcheck/should_run/T2722
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

When trying to compile HXT with ghc-6.10 rc1 all examples, even the simplest ones give the following result. Here is an example:

theo@brett:~/haskell/hxt-ghc10/examples/arrows/hparser> ./HXmlParser example1.xml
HXmlParser: <<loop>>

this only occurs, when compiling with -O (or -O2). When compiling whithout any optimizaions, the programs run as expected

Here is a more complete log of one example

The running example without -O

theo@brett:~/haskell/hxt-ghc10/examples/arrows/hparser> ghc --version
The Glorious Glasgow Haskell Compilation System, version 6.10.0.20081019
theo@brett:~/haskell/hxt-ghc10/examples/arrows/hparser> ghc-pkg list
/home/theo/lib/ghc-6.10.0.20081019/./package.conf:
    Cabal-1.6.0.1, HUnit-1.2.0.2, QuickCheck-1.2.0.0, array-0.2.0.0,
    base-3.0.3.0, base-4.0.0.0, bytestring-0.9.1.4, containers-0.2.0.0,
    curl-1.3.2.1, directory-1.0.0.2, (dph-base-0.3), (dph-par-0.3),
    (dph-prim-interface-0.3), (dph-prim-par-0.3), (dph-prim-seq-0.3),
    (dph-seq-0.3), filepath-1.1.0.1, (ghc-6.10.0.20081019),
    ghc-prim-0.1.0.0, haddock-2.2.2, haskell-src-1.0.1.3,
    haskell98-1.0.1.0, hpc-0.5.0.2, html-1.0.1.2, hxt-8.2.0,
    integer-0.1.0.0, mtl-1.1.0.2, network-2.2.0.0, old-locale-1.0.0.1,
    old-time-1.0.0.1, packedstring-0.1.0.1, parallel-1.0.0.1,
    parsec-2.1.0.1, pretty-1.0.1.0, process-1.0.1.0, random-1.0.0.1,
    regex-base-0.72.0.2, regex-compat-0.71.0.1, regex-posix-0.72.0.3,
    rts-1.0, stm-2.1.1.1, syb-0.1.0.0, tagsoup-0.6,
    template-haskell-2.3.0.0, time-1.1.2.2, unix-2.3.1.0,
    xhtml-3000.2.0.1
theo@brett:~/haskell/hxt-ghc10/examples/arrows/hparser> find ../../../src -name '*.hi' | xargs rm -f
theo@brett:~/haskell/hxt-ghc10/examples/arrows/hparser> make local GHCFLAGS=-Wall
ghc --make -o ./HXmlParser -Wall -fglasgow-exts -ignore-package hxt -i../../../src ./HXmlParser.hs
[  1 of 112] Compiling Text.XML.HXT.XPath.XPathKeywords ( ../../../src/Text/XML/HXT/XPath/XPathKeywords.hs, ../../../src/Text/XML/HXT/XPath/XPathKeywords.o )
[  2 of 112] Compiling Text.XML.HXT.IO.GetFILE ( ../../../src/Text/XML/HXT/IO/GetFILE.hs, ../../../src/Text/XML/HXT/IO/GetFILE.o )
[  3 of 112] Compiling Text.XML.HXT.DTDValidation.RE ( ../../../src/Text/XML/HXT/DTDValidation/RE.hs, ../../../src/Text/XML/HXT/DTDValidation/RE.o )
...
[111 of 112] Compiling Text.XML.HXT.Arrow ( ../../../src/Text/XML/HXT/Arrow.hs, ../../../src/Text/XML/HXT/Arrow.o )
[112 of 112] Compiling Main             ( HXmlParser.hs, HXmlParser.o )
Linking ./HXmlParser ...
theo@brett:~/haskell/hxt-ghc10/examples/arrows/hparser> ./HXmlParser example1.xml
<?xml version="1.0" encoding="UTF-8"?>
<?pi  value="a processing instruction"?><a att1="test äöüß test" att2="root">
    <b btt2="b1" btt3="root"/>
    <cü>hello world äöüß test</cü>
</a>

The same with -O

theo@brett:~/haskell/hxt-ghc10/examples/arrows/hparser>
theo@brett:~/haskell/hxt-ghc10/examples/arrows/hparser> find ../../../src -name '*.hi' | xargs rm -f
theo@brett:~/haskell/hxt-ghc10/examples/arrows/hparser> make clean
rm -f ./HXmlParser *.o *.hi
theo@brett:~/haskell/hxt-ghc10/examples/arrows/hparser> make local GHCFLAGS="-Wall -O"
ghc --make -o ./HXmlParser -Wall -O -fglasgow-exts -ignore-package hxt -i../../../src ./HXmlParser.hs
[  1 of 112] Compiling Text.XML.HXT.XPath.XPathKeywords ( ../../../src/Text/XML/HXT/XPath/XPathKeywords.hs, ../../../src/Text/XML/HXT/XPath/XPathKeywords.o )
[  2 of 112] Compiling Text.XML.HXT.IO.GetFILE ( ../../../src/Text/XML/HXT/IO/GetFILE.hs, ../../../src/Text/XML/HXT/IO/GetFILE.o )
[  3 of 112] Compiling Text.XML.HXT.DTDValidation.RE ( ../../../src/Text/XML/HXT/DTDValidation/RE.hs, ../../../src/Text/XML/HXT/DTDValidation/RE.o )
[  4 of 112] Compiling Text.XML.HXT.RelaxNG.Unicode.Blocks ( ../../../src/Text/XML/HXT/RelaxNG/Unicode/Blocks.hs, ../../../src/Text/XML/HXT/RelaxNG/Unicode/Blocks.o )
...
[111 of 112] Compiling Text.XML.HXT.Arrow ( ../../../src/Text/XML/HXT/Arrow.hs, ../../../src/Text/XML/HXT/Arrow.o )
[112 of 112] Compiling Main             ( HXmlParser.hs, HXmlParser.o )
Linking ./HXmlParser ...
theo@brett:~/haskell/hxt-ghc10/examples/arrows/hparser> ./HXmlParser example1.xml
HXmlParser: <<loop>>

The only difference is the -O.

My guess is, that there could be problems with the overloaded version of id and . (function comp.). This is used in HXT because of various arrow types.

In the example going wrong, there is a list arrow involved, that manages a state and runs in the IO monad.

It's rather hard to strip down this test case to a smaller example, because I've no idea where to search for this loop.

If neccessary I can prepare a tar archive with all sources needed to reproduce this error.

Attachments (2)

HXmlParser.hs (3.1 KB) - added by igloo 9 years ago.
T2722.hs (23.7 KB) - added by igloo 7 years ago.

Download all attachments as: .zip

Change History (26)

comment:1 Changed 9 years ago by igloo

difficulty: Unknown
Milestone: 6.10.1
Priority: normalhigh

A tar archive would be very useful, thanks!

comment:2 Changed 9 years ago by uwe

I've prepared a tar archive with all sources needed to reproduce the error. The URL is http://www.fh-wedel.de/~si/HXmlToolbox/hxt-ghc10-error.tar.gz

Please untar the archive, cd into hxt-ghc10 and run "make error". This will cd into examples/arrows/hparser, compile the test prog with -O0 and execute a simple test run.

Then the build is repeated but with -O and the test prog issues <<loop>>.

There are a lot of sources, that shouldn't have any influence on this error. Interesting modules could be the Control.Arrow.* modules.

comment:3 Changed 9 years ago by simonpj

That's a big program! Can you say why you suspect id and (.), and/or Control.Arrow.*?

Changed 9 years ago by igloo

Attachment: HXmlParser.hs added

comment:4 Changed 9 years ago by igloo

I've attached a single module case:

$ ghc -fforce-recomp --make HXmlParser.hs -Wall && ./HXmlParser
[1 of 1] Compiling Main             ( HXmlParser.hs, HXmlParser.o )
Linking HXmlParser ...
Foo 1
Foo 2
HXmlParser: XXX
$ ghc -fforce-recomp --make HXmlParser.hs -O -Wall && ./HXmlParser
[1 of 1] Compiling Main             ( HXmlParser.hs, HXmlParser.o )
Linking HXmlParser ...
Foo 1
HXmlParser: <<loop>>

comment:5 Changed 9 years ago by simonpj

OK I know what is happening. Sigh.

I boiled it down still more:

module Foo (main, arid1, arid2) where

import Prelude hiding (id, (.))
import qualified Prelude
import Control.Category
import Control.Arrow
import System.IO
import Debug.Trace


main = runXIOState arid1	-- Loop with arid1
                                -- Works with arid2

arid1 :: Arrow m => m a a
arid1 = arr id

arid2 :: Arrow m => m a a
arid2 = arr Prelude.id

runXIOState :: IOSLA () c -> IO [c]
runXIOState f = runIOSLA f ()

newtype IOSLA a b = IOSLA { runIOSLA :: a -> IO [b] }

instance Arrow IOSLA where
    arr f = IOSLA $ \ x -> return [f x]

instance Category IOSLA where
    id = arr id

-- arr :: Arrow m => (b->c) -> m b c
-- id  :: Category m => m b b
-- (arr id) :: Arrow m => m a a

What is happening is this.

  • In Control.Arrow we find the following rewrite rule
    {-# RULES "identity" arr id = id
    
  • But, as you see above, id is defined to be arr id.

So the result is (unsurprisingly) a loop.

I'm not sure whether the fault lies with the person who wrote the RULE in Control.Arrow, or the person who wrote the instance of IOSLA above. (The instance does look suspicious, because Category is a superclass of Arrow, and yet uses the arrow arr method to define id.)

Very tricky to track down. At all events I say that GHC is not at fault.

I'm leaving this open because someone should

  • Decide whether the rules in Control.Arrow are valid in general.
  • If so, write some documentation somewhere about the constraints that these rules impose on instance declarations.

Simon

comment:6 Changed 9 years ago by igloo

Cc: ross@… added
Component: Compilerlibraries (other)
Keywords: arrows added; Optimization Loop removed
Milestone: 6.10.1Not GHC
Priority: highnormal

comment:7 Changed 9 years ago by uwe

The guy who wrote the instance of IOSLA that was me. I took the advice from Don Stewarts wiki page for porting arrow code to ghc-6.10.

As far as I see, in the definition of

id = arr id

the id on the right hand side is another one than id on the left. The optimisation RULE above only holds for the instance Category (->), not for others. So the application of this rule should be restricted to (->).

By the way, would it be a workaround for me to define the id explicitly, not using the pure function id?

comment:8 Changed 9 years ago by simonpj

Well the Arrow class in Control.Arrow is immediately followed by a whole bunch of rules:

{-# RULES
"identity"
                arr id = id
"compose/arr"   forall f g .
                (arr f) . (arr g) = arr (f . g)
"first/arr"     forall f .
                first (arr f) = arr (first f)
"second/arr"    forall f .
                second (arr f) = arr (second f)
"product/arr"   forall f g .
                arr f *** arr g = arr (f *** g)
"fanout/arr"    forall f g .
                arr f &&& arr g = arr (f &&& g)
"compose/first" forall f g .
                (first f) . (first g) = first (f . g)
"compose/second" forall f g .
                (second f) . (second g) = second (f . g)
 #-}

If you are saying that they are not valid in general (eg you say the first is valid only for (->)), then these rules had better be speicialised for the types they work for.

Can you and Ross discuss this, and figure out what to do? It's not a GHC issue any more: rather a question of the Arrow/Category library design.

Simon

PS: Yes if you define id explicitly that ought to be ok. But remember that the RULE arr id ---> id is still going to apply (perhaps bogusly?) at all arrows. So you want to do more than work around the problem.

comment:9 Changed 9 years ago by ross

Resolution: fixed
Status: newclosed

It never occurred to me before that using a valid equation as a rule could change the semantics of a recursive definition, but it's obvious in retrospect.

So with the Category split, this rule has to go, and I've pushed a patch to delete it.

comment:10 Changed 7 years ago by litoh

Architecture: x86x86_64 (amd64)
Resolution: fixed
Status: closednew
Type of failure: Runtime crash
Version: 6.10.17.0.1

I'm getting the same error with another Arrow library (FRP.Yampa). What's strange though is that the error disappears under certain circumstances:

# use Bool instead of Event () for input type # use Int instead of (String, Int) for output type # - # inline constant function

WTF? :)

{-# LANGUAGE Arrows #-}

module Main (main) where

import FRP.Yampa

type ObjIn = Event () -- loop #1
--type ObjIn = Bool -- no loop #1

type ObjOut = (String, Int) -- loop #2
--type ObjOut = Int         -- no loop #2

type GameObj = SF ObjIn ObjOut

testObj :: GameObj
testObj = proc hit -> do
    returnA -< ("testObj", 1) -- loop #2
--    returnA -< 1            -- no loop #2

process :: [GameObj] -> SF () [ObjOut]
process objs = proc _ -> do
    rec
        gamestate <- par logic objs
            -< gamestate -- loop #3 (recursive definition!)
--            -< [] -- no loop #3

    returnA -< gamestate

logic :: [ObjOut] -> [sf] -> [(ObjIn, sf)]
logic gamestate objs = map route objs
  where
    route obj =
        (if null (foo gamestate) then NoEvent else NoEvent, obj) -- loop #1
--        (if null (foo gamestate) then False else False, obj) -- no loop #1

foo :: [ObjOut] -> [ObjOut]
foo [] = []
foo objs = concat (collisions objs)
  where
    collisions [] = []
    collisions (out:objs') = 
        [[out, out'] | out' <- objs, out `collide` out'] -- loop #4
--        [[out, out'] | out' <- objs, True] -- no loop #4

collide :: ObjOut -> ObjOut -> Bool
collide (_, p) (_, p') = True -- loop #2
--collide p p' = True         -- no loop #2


main :: IO ()
main = do
    putStrLn . show $ embed (process [testObj]) ((), [(1.0, Nothing)])

comment:11 Changed 7 years ago by igloo

Milestone: Not GHC7.0.3

comment:12 Changed 7 years ago by simonmar

Owner: set to simonpj

comment:13 Changed 7 years ago by simonpj

Status: newinfoneeded
Test Case: tyepcheck/should_run/T2722

OK I think this one is properly fixed now

Wed Jan 12 06:56:04 PST 2011  simonpj@microsoft.com
  * Major refactoring of the type inference engine
  
  This patch embodies many, many changes to the contraint solver, which
  make it simpler, more robust, and more beautiful.  But it has taken
  me ages to get right. The forcing issue was some obscure programs
  involving recursive dictionaries, but these eventually led to a
  massive refactoring sweep.

  Main changes are:
   * No more "frozen errors" in the monad.  Instead "insoluble
     constraints" are now part of the WantedConstraints type.
  
   * The WantedConstraint type is a product of bags, instead of (as
     before) a bag of sums.  This eliminates a good deal of tagging and
     untagging.
  
   * This same WantedConstraints data type is used
       - As the way that constraints are gathered
       - As a field of an implication constraint
       - As both argument and result of solveWanted
       - As the argument to reportUnsolved
  
   * We do not generate any evidence for Derived constraints. They are
     purely there to allow "impovement" by unifying unification
     variables.
  
   * In consequence, nothing is ever *rewritten* by a Derived
     constraint.  This removes, by construction, all the horrible
     potential recursive-dictionary loops that were making us tear our
     hair out.  No more isGoodRecEv search either. Hurrah!
  
   * We add the superclass Derived constraints during canonicalisation,
     after checking for duplicates.  So fewer superclass constraints
     are generated than before.
  
   * Skolem tc-tyvars no longer carry SkolemInfo.  Instead, the
     SkolemInfo lives in the GivenLoc of the Implication, where it
     can be tidied, zonked, and substituted nicely.  This alone is
     a major improvement.
  
   * Tidying is improved, so that we tend to get t1, t2, t3, rather
     than t1, t11, t111, etc
  
     Moreover, unification variables are always printed with a digit
     (thus a0, a1, etc), so that plain 'a' is available for a skolem
     arising from a type signature etc. In this way,
       (a) We quietly say which variables are unification variables,
           for those who know and care
       (b) Types tend to get printed as the user expects.  If he writes
               f :: a -> a
               f = ...blah...
           then types involving 'a' get printed with 'a', rather than
           some tidied variant.
  
   * There are significant improvements in error messages, notably
     in the "Cannot deduce X from Y" messages.

    M ./compiler/basicTypes/Name.lhs -1 +1
    M ./compiler/basicTypes/OccName.lhs -2 +4
    M ./compiler/basicTypes/Var.lhs -1 +4
    M ./compiler/ghci/RtClosureInspect.hs -5 +2
    M ./compiler/main/HscTypes.lhs -5 +9
    M ./compiler/main/InteractiveEval.hs -18 +32
    M ./compiler/typecheck/FamInst.lhs -2 +1
    M ./compiler/typecheck/Inst.lhs -42 +113
    M ./compiler/typecheck/TcArrows.lhs -1 +1
    M ./compiler/typecheck/TcBinds.lhs -16 +9
    M ./compiler/typecheck/TcCanonical.lhs -110 +162
    M ./compiler/typecheck/TcClassDcl.lhs -3 +2
    M ./compiler/typecheck/TcEnv.lhs +1
    M ./compiler/typecheck/TcErrors.lhs -298 +248
    M ./compiler/typecheck/TcExpr.lhs -3 +8
    M ./compiler/typecheck/TcHsSyn.lhs -1 +1
    M ./compiler/typecheck/TcHsType.lhs -2 +2
    M ./compiler/typecheck/TcInstDcls.lhs -23 +14
    M ./compiler/typecheck/TcInteract.lhs -336 +304
    M ./compiler/typecheck/TcMType.lhs -79 +119
    M ./compiler/typecheck/TcMatches.lhs -1 +1
    M ./compiler/typecheck/TcPat.lhs -6 +6
    M ./compiler/typecheck/TcRnDriver.lhs -4 +16
    M ./compiler/typecheck/TcRnMonad.lhs -7 +22
    M ./compiler/typecheck/TcRnTypes.lhs -103 +305
    M ./compiler/typecheck/TcRules.lhs -2 +3
    M ./compiler/typecheck/TcSMonad.lhs -270 +82
    M ./compiler/typecheck/TcSimplify.lhs -378 +427
    M ./compiler/typecheck/TcSplice.lhs -9 +10
    M ./compiler/typecheck/TcType.lhs -155 +87
    M ./compiler/typecheck/TcUnify.lhs -38 +39
    M ./compiler/types/FamInstEnv.lhs +3
    M ./compiler/types/Unify.lhs -31 +41
    M ./compiler/utils/Bag.lhs -1 +14

I've added a test taken from an earlier comment. Could you try the HEAD (which will turn into GHC 7.0.2 soon) on your FRP example, please?

comment:14 Changed 7 years ago by litoh

I really tried to help but after 6 hrs of unsuccessfully compiling ghc-HEAD-2011-01-10-ghc.tar.bz2 on Ubuntu x64 Linux I gave up. Sorry.

Maybe there is a better place to discuss this but I get the following error when doing $ ghc/make

Configuring haddock-2.8.0... make[1]: * No rule to make target `libraries/dph/ghc.mk'. Stop. make: * [all] Error 2

Another problem appears to be that darcs is constantly freezing $ ./darcs-all get

Copying pristine 486 done, 6 queued. legacy

$ darcs -v 2.4.4 (release)

comment:15 Changed 7 years ago by simonpj

Ah, yes, I recall this darcs-freezing thing happened to me to. What fixed it was adding --no-cache, I think. try that.

As to your build error, you have done ./darcs-all get haven't you? I think you'd be better off with HEAD than with something from 10 Jan.

Simon

comment:16 Changed 7 years ago by litoh

The problem remains, here's what I did:

Installed new darcs to avoid sleeping: $ darcs -v

2.5.1 (release)

I followed the GHC build guides at http://hackage.haskell.org/trac/ghc/wiki/Building/Hacking (with --no-cache option) $ darcs get --lazy --no-cache http://darcs.haskell.org/ghc $ ./darcs-all --testsuite get

in mk/build.mk: BuidlFlavour = prof

$ perl boot $ ./configure --with-ghc=/usr/local/bin/ghc-7.0.1.20110217 (I had to recompile for profiling support and define an older GHC build. I used the current RC build) $ make -j2 $ make install $ ghc --version

The Glorious Glasgow Haskell Compilation System, version 7.1.20110223

$ cabal install yampa $ ghc --make Test.hs -o test -O2 -fforce-recomp

test: <<loop>>

:/

Changed 7 years ago by igloo

Attachment: T2722.hs added

comment:17 Changed 7 years ago by igloo

I've attached a dependency-free testcase:

$ ghc --make -O T2722 -fforce-recomp && ./T2722; ghc --make -O0 T2722 -fforce-recomp && ./T2722 
[1 of 1] Compiling Main             ( T2722.hs, T2722.o )
Linking T2722 ...
T2722: <<loop>>
[1 of 1] Compiling Main             ( T2722.hs, T2722.o )
Linking T2722 ...
[[("testObj",1)],[("testObj",1)]]

comment:18 Changed 7 years ago by simonpj

OK, my comment with the patch "Major refactoring of the type inference engine" is a complete red herring; this ticket is about something different.

The original bug turned out to be that a RULE in Control.Arrow was incompatible with a client's instance declaration. It's possible (probable, even) that the same thing is happening here.

  • Try with -O -fno-enable-rewrite-rules which will switch off all rewrite rules but enable other optimisations.
  • Try -ddump-rule-firings to see which rules are firing
  • If that is OK, try disabling rules in Control.Arrow to try to binary-chop your way to the offending RULE.

Let me know if I can help, but currently I believe this is a library/client problem rather than a GHC one.

Simon

comment:19 Changed 7 years ago by litoh

$ ghc --make test.hs -o test -O2 -fno-enable-rewrite-rules -ddump-rule-firings

Rule fired: Class op show
Rule fired: Class op arr
Rule fired: Class op first
Rule fired: concat
Rule fired: map
Rule fired: Class op arr
Rule fired: arrPrim/arrEPrim
Rule fired: unpack
Rule fired: Class op loop

$ ./test
> testloop: <<loop>>

If I understood "-fno-enable-rewrite-rules" correctly, it should turn off rules. Thus the bug doesn't derive from rule rewriting?

comment:20 Changed 7 years ago by simonpj

Harumph. It turns out that -fno-enable-rewrite-rules doesn't work. This patch fixes it

* Make -fno-enable-rewrite-rules work properly
  
  I'd failed to propagate the Opt_EnableRewriteRules flag properly,
  which meant that -fno-enable-rewrite-rules didn't disable all
  rewrites.  This patch fixes it.

    M ./compiler/simplCore/CoreMonad.lhs -4 +1
    M ./compiler/simplCore/SimplCore.lhs -1 +1
    M ./compiler/simplCore/SimplUtils.lhs -7 +13
    M ./compiler/simplCore/Simplify.lhs -1 +2

Once that is done, T2722 works fine with -O -fno-enable-rewrite-rules, and loops with plain -O. (You don't need to recompile the libraries to see this effect.)

So it's definitely a rewrite rule. It seems to me very likely that it's a similar bug to the one I tracked down at the beginning of this ticket. It's probably not strictly a GHC bug, but still a bug in a RULE in a library shipped with GHC is much the same from a customer point of view.

To help find it,

  • -ddump-rule-firings will show which rules are firing
  • -ddump-rule-rewrites will additionally show before and after
  • You can comment out rules to disable them individually. (But some are in libraries, so that will require you to recompile the library.)

Thanks for helping

Simon

comment:21 Changed 7 years ago by litoh

Resolution: invalid
Status: infoneededclosed

comment:22 Changed 7 years ago by igloo

Owner: simonpj deleted
Resolution: invalid
Status: closednew

We still need to fix the rules in the library that comes with GHC, so reopening this.

comment:23 in reply to:  22 Changed 7 years ago by litoh

Turned out to be a bug in the Yampa library Concerning my test case, it was "a complete red herring" ;)

comment:24 Changed 7 years ago by igloo

Resolution: invalid
Status: newclosed

Ah, you mean there is no bug in the arrow library after all? OK, thanks for looking into it and letting us know; I'll close this ticket again then.

Note: See TracTickets for help on using tickets.