Opened 14 months ago

Last modified 9 months ago

#12665 new feature request

Make Read instances for Integral types faster, and make them fail fast

Reported by: dfeuer Owned by: dfeuer
Priority: high Milestone: 8.4.1
Component: Core Libraries Version: 8.0.1
Keywords: Cc: ekmett
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description (last modified by dfeuer)

At present, the Read instances for standard types are generally written as though Read were a typical parser for a programming language. However, it most assuredly is not. Specifically, we currently follow the pattern of lexing and then parsing. The first problem with that is that while we follow that model, we don't actually get the big benefit of that model; the class methods don't mention or respect token boundaries, so lexing first doesn't prevent backtracking. More to the point, perhaps, lexing looks at the string and attempts to identify the next token, whatever it may be. But for Read, we don't need that. Thanks to the types, we have a very clear sense of what characters we expect to encounter.

I've started to sketch out an improvement, which reads Int and Word around seven times as fast (9.5 times as fast with parentheses and negation), and will fail immediately on something like read (fix ('a':)) :: Int rather than going into an infinite loop.

To begin with, modify the definition of paren thus:

expectP' :: Char -> ReadPrec ()
expectP' c = lift (expect' c)
{-# INLINE expectP' #-}

expect' :: Char -> ReadP ()
expect' c = do
  ReadP.skipSpaces
  thing <- ReadP.get
  if thing == c
    then pure ()
    else ReadP.pfail
{-# INLINE expect' #-}


paren :: ReadPrec a -> ReadPrec a
-- ^ @(paren p)@ parses \"(P0)\"
--      where @p@ parses \"P0\" in precedence context zero
paren p = do expectP' '('
             x <- reset p
             expectP' ')'
             return x

This allows fast failure when looking for parentheses, so we don't have to scan to the end of the first token (whatever it may be) before concluding that it is not '('.

Now we can parse Word and Int very efficiently. I had to specialize earlier than I wanted to convince GHC that I don't want to convert through Integer. I'm not sure why the fromIntegral rule doesn't fire reliably around here. The code below (temporarily) uses the current definition for base 16, because that's a bit fussy; I'll rewrite it soon.

charDiff :: Char -> Char -> Word
charDiff c1 c2 = fromIntegral (ord c1 - ord c2)
{-# INLINE charDiff #-}

readHexOct :: ReadP Word
readHexOct = do
  _ <- ReadP.char '0'
  baseId <- lexBaseChar
  case baseId of
    Oct -> readBaseP 8
    Hex -> L.readHexP
{-# INLINE readHexOct #-}

data BaseId = Oct | Hex

lexBaseChar :: ReadP BaseId
lexBaseChar = do { c <- ReadP.get;
                   case c of
                     'o' -> pure Oct
                     'O' -> pure Oct
                     'x' -> pure Hex
                     'X' -> pure Hex
                     _ -> ReadP.pfail }

readWord :: ReadP Word
readWord = readNumber (readHexOct ReadP.<++ readBaseP 10)

readInt :: ReadP Int
readInt = fromIntegral <$> readWord

readBaseP :: Integral a => Word -> ReadP a
readBaseP !base = do
  c <- ReadP.get
  let diff = charDiff c '0'
  if diff < base
    then readBaseP' base (fromIntegral diff)
    else ReadP.pfail
{-# INLINE readBaseP #-}

readBaseP' :: Integral a => Word -> a -> ReadP a
readBaseP' !base !acc0 = ReadP.look >>= go acc0
  where
    go !acc (c:cs) | diff < base = ReadP.get *> go (fromIntegral base * acc + fromIntegral diff) cs
      where diff = charDiff c '0'
    go !acc _ = pure acc
{-# INLINE readBaseP' #-}

readNumber :: Num a => ReadP.ReadP a -> ReadPrec a
readNumber p = parens $ do
  cs <- lift skipSpaces *> look
  case cs of
    ('-': _) -> get *> lift (skipSpaces *> (negate <$> p))
    _ -> lift $ skipSpaces *> p
{-# INLINE readNumber #-}

Other Word and Int-like types can be built on top of that foundation. I haven't yet attempted to deal with other instances, but I think there are probably a lot of opportunities for similar improvements.

WARNING: I know very little about parsing, and less about ReadPrec. It's conceivable that I've made some semantic errors here, although I don't think I have.

Change History (20)

comment:1 Changed 14 months ago by dfeuer

Description: modified (diff)

comment:2 Changed 14 months ago by dfeuer

Owner: set to dfeuer

comment:3 Changed 14 months ago by dfeuer

So I don't forget: when we improve Read Integer and Read Natural, we probably want to make the radix-doubling thing start out coarser. I think we basically want to start with base 10^19 for decimal, etc., so that we work with chunks that fit in a Word. Essentially, start by parsing to a list of Words, each of which represents a base-10^19 digit, and then apply an appropriate algorithm to combine these all into an Integer or Natural.

comment:4 Changed 14 months ago by dfeuer

Description: modified (diff)

comment:5 Changed 14 months ago by dfeuer

The code I gave does have a bug. If a string can be parsed as a fractional value, it seems it's not supposed to produce any results for integral types. I think we can fix this when we get to the end of the digits by looking ahead for . or e followed by a digit, and failing if we see one.

comment:6 Changed 14 months ago by dfeuer

Let me take that back; I think my modified implementation actually does what Haskell98 says, whereas the current implementation does something else. I don't know for sure what should be considered the correct behavior.

P.S., to match the current behavior, we'd need to accept "122e-" as an integer, but not "122e-3", because the current implementation lexes "122e-3" as a fractional number and thus not an integer, whereas it lexes "122e-" as an integer followed by an identifier followed by a symbol.

comment:7 Changed 14 months ago by int-e

The following is a lazier variant of valInteger that does a chunking pass on Ints. (Resulting from discussion on #haskell)

-- Digits' represents a stream of digits where the last digit carries its own
-- base. For example, in base 100, we could represent 12345 as either one of
--   DCons 12 (DCons 34 (DNil 5 10))  or
--   DCons 1 (DCons 23 (DCons 45 (DNil 0 1)))
data Digits' = DCons !Integer Digits' | DNil !Integer Integer

valInteger :: Int -> [Int] -> Integer
valInteger _ [] = 0
valInteger b ds = go b1 (chunks 0 0 ds) where
    -- chunking pass: collect digits in base b^chunkSize
    chunkSize :: Int
    chunkSize = 7 -- b^chunkSize must fit into an Int
    b1 :: Integer
    b1 = fromIntegral (b^chunkSize)
    chunks :: Int -> Int -> [Int] -> Digits'
    chunks l d ds | l == chunkSize = DCons (fromIntegral d) (chunks 0 0 ds)
    chunks l d [] = DNil (fromIntegral d) (fromIntegral (b^l))
    chunks l d (d' : ds) = chunks (l+1) (d*b + d') ds
    -- bottom-up combination pass, squaring the base in each pass
    go :: Integer -> Digits' -> Integer
    go _ (DNil d _) = d
    go _ (DCons d' (DNil d b)) = d'*b + d
    go b (DCons d' ds) = go (b*b) (combine b d' ds)
    combine :: Integer -> Integer -> Digits' -> Digits'
    combine b d (DNil d' b') = DNil (d*b' + d') (b * b')
    combine b d (DCons d' (ds@DNil{})) = DCons (d*b + d') ds
    combine b d (DCons d' (DCons d'' ds)) = DCons (d*b + d') (combine b d'' ds)

comment:8 in reply to:  7 ; Changed 14 months ago by dfeuer

Replying to int-e:

chunkSize = 7 -- b^chunkSize must fit into an Int

Surely we should set the chunk size based on the actual word size, rather than assuming every machine is 32-bit or whatever.

comment:9 in reply to:  8 Changed 14 months ago by int-e

Replying to dfeuer:

Replying to int-e:

chunkSize = 7 -- b^chunkSize must fit into an Int

Surely we should set the chunk size based on the actual word size, rather than assuming every machine is 32-bit or whatever.

Yes, I intended to leave this as an exercise to the reader though, and instead focussed on streamlining the interesting bits of the algorithm. (In principle the chunk size could depend on the base as well, but only some serious benchmarking effort will show whether that is worthwhile.)

comment:10 Changed 14 months ago by int-e

more prototypes for conversion from digits to an integer can be found here: http://int-e.eu/~bf3/haskell/FromDigits.hs ... valInteger' is the most interesting one of the bunch, and according to some preliminary testing, also the most efficient.

comment:11 in reply to:  10 ; Changed 14 months ago by dfeuer

Replying to int-e:

more prototypes for conversion from digits to an integer can be found here: http://int-e.eu/~bf3/haskell/FromDigits.hs ... valInteger' is the most interesting one of the bunch, and according to some preliminary testing, also the most efficient.

There's an unfortunate problem with the memoized base list. Once GHC knows the base, it pulls the bs base list up to the top level. If someone parses a very large integer, I believe we could be stuck with its base list indefinitely. Now it turns out that ReadP is defined newtype ReadP a = R (forall b . (a -> P b) -> P b). If Text.ParserCombinators.ReadP exposed the R constructor and the P type (which it does not) then we could almost certainly trick GHC into thinking that bs depends on the passed a -> P b function. Since it does not, we're in a bit of a pickle. I think the best approach is probably to add an "internal" module that exposes all these details. For experimental purposes, I guess I can just use unsafeCoerce.

comment:12 Changed 14 months ago by dfeuer

Argh! I was stupid there. I figured out a way. Also, I managed to forget that we have arbitrary look-ahead, so if we prefer to work from a list, we actually can.

comment:13 in reply to:  11 ; Changed 14 months ago by int-e

Replying to dfeuer:

There's an unfortunate problem with the memoized base list. [...]

One can also store the bases in the stack itself, see valInteger'' in the (updated) http://int-e.eu/~bf3/haskell/FromDigits.hs file. This causes some extra allocations, so the code becomes slower, but almost imperceptibly so.

In my crude measurements (code compiled with -O2, run by using the test* functions in ghci), converting a list of 10,000,000 digits in base 10 takes 8.1s with the code currently in base (Data.Read.Lex.valInteger), 2.0s with valInteger' and 2.1s with valInteger''.

Converting 218 18 digit integers takes 0.47s with the original code, 0.23s with valInteger' and valInteger''.

Note that no actual lexing is done by this code.

comment:14 in reply to:  13 Changed 14 months ago by dfeuer

Replying to int-e:

One can also store the bases in the stack itself, see valInteger'' in the (updated) http://int-e.eu/~bf3/haskell/FromDigits.hs file. This causes some extra allocations, so the code becomes slower, but almost imperceptibly so.

I wrote up something essentially equivalent to Numeric.readDec based on your latest code and the shape of the lexer:

decNat :: ReadP Natural
decNat = valNat 10 19 $ \c ->
  let diff = charDiff c '0'
  in if diff < 10
     then Just diff
     else Nothing


valNat :: Word -> Word -> (Char -> Maybe Word) -> ReadP Natural
valNat base chunkSize terp = do
  s <- ReadP.look
  valNat' base chunkSize terp s
{-# INLINE valNat #-}


data Stack = SNil | SSkip Natural !Stack | SCons Natural !Natural !Stack

{-# INLINE valNat' #-}
valNat' :: Word -> Word -> (Char -> Maybe Word) -> [Char] -> ReadP Natural
valNat' base chunkSize terp cs0@(c0 : _)
  | Just _ <- terp c0 = goChunks SNil 0 0 cs0
  where
    b1 :: Natural
    b1 = fromIntegral base * fromIntegral (base^(chunkSize-1))

    goChunks :: Stack -> Word -> Word -> [Char] -> ReadP Natural
    goChunks !s !l !chunk !ds
      | l == chunkSize = goChunks (step s (fromIntegral chunk)) 0 0 ds
    goChunks !s !l !chunk (c:cs)
      | Just d <- terp c
          = ReadP.get *> goChunks s (l+1) (chunk*base+d) cs
    goChunks !s !l !chunk _ =
       pure $ fromStack (fromIntegral chunk) (fromIntegral (base^l)) s

    step :: Stack -> Natural -> Stack
    step SNil d = SCons b1 d SNil
    step (SSkip b s) d = SCons b d s
    step (SCons b d' s) d = SSkip b (step' b s (d + d'*b))

    step' :: Natural -> Stack -> Natural -> Stack
    step' b SNil d = SCons (b*b) d SNil
    step' _ (SSkip b s) d = SCons b d s
    step' _ (SCons b d' s) d = SSkip b (step' b s (d + d' * b))

    fromStack :: Natural -> Natural -> Stack -> Natural
    fromStack d' _ SNil = d'
    fromStack d' b' (SSkip b s) = fromStack d' b' s
    fromStack d' b' (SCons b d s) = fromStack (d*b'+d') (b*b') s

valNat' _ _ _ _ = ReadP.pfail

-- We could write a parser more precisely imitating GHC's current
-- `Read` instance by using `valNat` and then performing a look-ahead
-- to check for illegal termination sequences involving `.` or `e`,
-- but I'm really not convinced that we should.

This seems to be very fast indeed when used for base 10 with a chunk size of 19 (on a 64-bit system). Oddly, Numeric.readDec is much, much slower than read, which looks likely to be a RULES issue. That makes it a bit hard to compare the algorithms fairly. I haven't yet wired your simpler version up in quite this fashion, and I'm not actually sure if it can be wired up so, but it would be worth checking.

comment:15 Changed 14 months ago by YitzGale

Does this still guarantee

read "xxxx" == xxxx

for every Fractional literal xxxx that parses, for all built-in instances of Fractional? I believe we have that guarantee now.

comment:16 in reply to:  15 Changed 14 months ago by dfeuer

Replying to YitzGale:

Does this still guarantee

read "xxxx" == xxxx

for every Fractional literal xxxx that parses, for all built-in instances of Fractional? I believe we have that guarantee now.

Thanks to the type-directed nature of Read, it shouldn't affect them one bit. And if it does, we can certainly fix that.

comment:17 Changed 14 months ago by YitzGale

Summary: Make Read instances faster, and make them fail fastMake Read instances for Integral types faster, and make them fail fast

Sorry, I was confused by a parallel conversation on the mailing list about parsers for Float and Double. Now I see that this ticket is only about Integral types. Perhaps the title of this ticket should be changed to make that clear.

comment:18 Changed 14 months ago by int-e

It's bad enough for integral types:

> reads "\t\n\v\f\r \160\5760\6158\8192\8193\8194\8195\8196\8197\8198\8199\8200\8201\8202\8239\8287\12288 ( ( - 1 ) )  " :: [(Integer,String)]
[(-1,"  ")]

(Interestingly, read "-(1)" fails)

comment:19 Changed 13 months ago by Ben Gamari <ben@…>

In bef7e78/ghc:

Read parentheses better

Instead of pulling a token and looking for `'('` or `')'`,
just look for the character itself. This prevents us from
lexing every single item twice, once to see if it's a
left parenthesis and once to actually parse it.

Partially fixes #12665

Make parens faster more aggressively

* Strip spaces before parsing, so we never have to strip
the same spaces twice.

* String parsers together manually, to try to avoid unnecessary closure
creation.

Test Plan: Validate

Reviewers: austin, hvr, bgamari

Reviewed By: bgamari

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D2630

GHC Trac Issues: #12665

comment:20 Changed 9 months ago by bgamari

Milestone: 8.2.18.4.1

This won't happen for 8.2.

Note: See TracTickets for help on using tickets.