#14918 closed bug (fixed)

GHC 8.4.1 regression: derived Read instances with field names containing # no longer parse

Reported by: RyanGlScott Owned by:
Priority: high Milestone: 8.4.2
Component: Compiler Version: 8.4.1
Keywords: deriving Cc: tdammers
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime Test Case: deriving/should_run/T14918
Blocked By: Blocking:
Related Tickets: #5041, #14364 Differential Rev(s): Phab:D4502
Wiki Page:

Description (last modified by RyanGlScott)

(Originally noticed here.)

Consider the following program:

{-# LANGUAGE MagicHash #-}
module Bug where

data T a = MkT { runT# :: a }
  deriving (Read, Show)

t1, t2 :: T Int
t1 = MkT 1
t2 = read $ show t1

main :: IO ()
main = print t2

In GHC 8.2.1, this runs without issue:

$ /opt/ghc/8.2.2/bin/runghc Bug.hs
MkT {runT# = 1}

In GHC 8.4.1, however, this produces a runtime error:

$ ~/Software/ghc-8.4.1/bin/runghc Bug.hs
Bug.hs: Prelude.read: no parse

Change History (7)

comment:1 Changed 11 months ago by RyanGlScott

Description: modified (diff)

comment:2 Changed 11 months ago by RyanGlScott

Cc: tdammers added

Comparing the results of -ddump-deriv between GHC 8.2.1 and 8.4.1 is interesting. In GHC 8.2.1, we have:

  instance GHC.Read.Read a => GHC.Read.Read (Bug.T a) where
      = GHC.Read.parens
             (do GHC.Read.expectP (Text.Read.Lex.Ident "MkT")
                 GHC.Read.expectP (Text.Read.Lex.Punc "{")
                 GHC.Read.expectP (Text.Read.Lex.Ident "runT")
                 GHC.Read.expectP (Text.Read.Lex.Symbol "#")
                 GHC.Read.expectP (Text.Read.Lex.Punc "=")
                 a1_a2wO <- Text.ParserCombinators.ReadPrec.reset GHC.Read.readPrec
                 GHC.Read.expectP (Text.Read.Lex.Punc "}")
                 GHC.Base.return (Bug.MkT a1_a2wO)))
    GHC.Read.readList = GHC.Read.readListDefault
    GHC.Read.readListPrec = GHC.Read.readListPrecDefault

But in GHC 8.4.1, we have:

  instance GHC.Read.Read a => GHC.Read.Read (Bug.T a) where
      = GHC.Read.parens
             (do GHC.Read.expectP (Text.Read.Lex.Ident "MkT")
                 GHC.Read.expectP (Text.Read.Lex.Punc "{")
                 a1_a2Tm <- GHC.Read.readField
                              "runT#" (Text.ParserCombinators.ReadPrec.reset GHC.Read.readPrec)
                 GHC.Read.expectP (Text.Read.Lex.Punc "}")
                 GHC.Base.return (Bug.MkT a1_a2Tm)))
    GHC.Read.readList = GHC.Read.readListDefault
    GHC.Read.readListPrec = GHC.Read.readListPrecDefault

This likely has something to do with commit dbd81f7e86514498218572b9d978373b1699cc5b (Factor out readField (#14364)). tdammers, do you know what is going on here?

comment:3 Changed 11 months ago by RyanGlScott

Ah, I know what is happening here. #5041 is quite relevant, as is this comment from the GHC source:

    -- For constructors and field labels ending in '#', we hackily
    -- let the lexer generate two tokens, and look for both in sequence
    -- Thus [Ident "I"; Symbol "#"].  See Trac #5041

Now let's look at what readField does:

readField :: String -> ReadPrec a -> ReadPrec a
readField fieldName readVal = do
        expectP (L.Ident fieldName)
        expectP (L.Punc "=")

Alas, it attempts to treat the field name as a single Ident. But if that field name contains a # (e.g., runT#, as in the original example), then this will fail.

It looks like we'll need a variant of readField that takes hashes into account.

comment:4 Changed 11 months ago by RyanGlScott

Differential Rev(s): Phab:D4502
Status: newpatch

comment:5 Changed 11 months ago by Ryan Scott <ryan.gl.scott@…>

In d5577f44/ghc:

Special-case record fields ending with hash when deriving Read

In commit dbd81f7e86514498218572b9d978373b1699cc5b, a
regression was inadvertently introduced which caused derived `Read`
instances for record data types with fields ending in a `#` symbol
(using `MagicHash`) would no longer parse on valid output. This
is ultimately due to the same reasons as #5041, as we cannot parse
a field name like `foo#` as a single identifier. We fix this issue
by employing the same workaround as in #5041: first parse the
identifier name `foo`, then then symbol `#`.

This is accomplished by the new `readFieldHash` function in
`GHC.Read`. This will likely warrant a `base-` release.

Test Plan: make test TEST=T14918

Reviewers: tdammers, hvr, bgamari

Reviewed By: bgamari

Subscribers: rwbarton, thomie, carter

GHC Trac Issues: #14918

Differential Revision: https://phabricator.haskell.org/D4502

comment:6 Changed 11 months ago by RyanGlScott

Status: patchmerge
Test Case: deriving/should_run/T14918

comment:7 Changed 11 months ago by bgamari

Resolution: fixed
Status: mergeclosed

Merged to master and ghc-8.4.

Note: See TracTickets for help on using tickets.