Opened 4 years ago

Closed 3 years ago

Last modified 3 years ago

#10444 closed bug (fixed)

Text.Read.Lex.lex broken

Reported by: strake888
Owned by:
Priority: normal
Milestone: 8.0.1
Component: Core Libraries
Version: 7.10.1
Keywords: report-impact
Cc: core-libraries-committee@…, mfdyck@…, hvr, ekmett
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s): Phab:D1122, Phab:D1480
Wiki Page:

Description (last modified by bgamari)

Prelude> lex "&) = mempty"
[("&",") = mempty")]
Prelude> lex "∘) = mempty"
[]

I traced this problem to Text.Read.Lex.lex.
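The two calls above can be replayed outside GHCi with a small program. This is only a sketch: `firstToken` is a hypothetical helper introduced here for illustration, and the comment about the second call reflects what this ticket reports (an empty result on 7.10.1, fixed for milestone 8.0.1) rather than output checked against a particular release.

```haskell
import Data.Maybe (listToMaybe)

-- Hypothetical helper (not part of base): the first token that
-- Prelude's `lex` finds, or Nothing when `lex` returns no parse.
firstToken :: String -> Maybe String
firstToken = fmap fst . listToMaybe . lex

main :: IO ()
main = do
  -- ASCII operator character: the ticket reports the token "&".
  print (firstToken "&) = mempty")
  -- Unicode operator character: the ticket reports no parse on 7.10.1;
  -- a release carrying the fix should yield a token here instead.
  print (firstToken "∘) = mempty")
```

Comparing the two printed values on a given compiler shows whether that compiler's base library still has the behaviour described here.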

Attachments (3)

0001-unbreak-Text.Read.Lex.lex.patch (3.7 KB) - added by mfdyck.google 4 years ago.
0001-unbreak-Text.Read.Lex.lex.2.patch (16.2 KB) - added by mfdyck.google 4 years ago.
0001-unbreak-Text.Read.Lex.lex.3.patch (1.9 KB) - added by mfdyck.google 3 years ago.
Behave as GHC lexer

Change History (19)

comment:1 Changed 4 years ago by mfdyck.google

I have a patch for this, but I need Google to release it; I am asking.

Changed 4 years ago by mfdyck.google

comment:2 Changed 4 years ago by mfdyck.google

Status: new → patch

comment:3 Changed 4 years ago by mfdyck.google

Cc: mfdyck@… added

comment:4 Changed 4 years ago by simonpj

Milestone: 7.12.1

comment:5 Changed 4 years ago by rwbarton

Cc: hvr added

Hi,

I agree that this is a deviation from the Report-specified behavior of lex, since ∘ is a Unicode symbol character; and thanks for the patch. However, we generally try fairly hard not to introduce new boot files in base.

ccing hvr, who is most likely to know off-hand: can this import cycle be worked around easily (probably by moving isPunctuation and isSymbol into GHC.Unicode)?
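The claim that ∘ is a Unicode symbol character can be checked with the Data.Char predicates mentioned in this comment. The sketch below is an approximation of the Report's `symbol` production, not the code from any of the attached patches; `isReportSymbol` and `reportExcluded` are hypothetical names introduced here.

```haskell
import Data.Char (generalCategory, isPunctuation, isSymbol)

-- Characters the Haskell 2010 Report excludes from `symbol`:
-- the `special` characters plus underscore, double quote and single quote.
reportExcluded :: String
reportExcluded = "(),;[]`{}_\"'"

-- Hypothetical helper: approximates the Report's `symbol` production as
-- "any Unicode symbol or punctuation that is not explicitly excluded".
isReportSymbol :: Char -> Bool
isReportSymbol c = (isSymbol c || isPunctuation c) && c `notElem` reportExcluded

main :: IO ()
main = mapM_ describe "&∘)"
  where
    -- Print each character with its general category and the verdict.
    describe c =
      putStrLn (show c ++ ": " ++ show (generalCategory c)
                ++ ", symbol = " ++ show (isReportSymbol c))
```

Under this predicate both `&` and `∘` count as operator characters, while `)` does not, which matches the expectation that `lex` should treat the two inputs in the description alike.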

comment:6 Changed 4 years ago by mfdyck.google

isPunctuation and isSymbol are defined in terms of GeneralCategory, which derives Read, and GHC.Read imports Text.Read.Lex. We could move GeneralCategory and generalCategory to GHC.Unicode and standalone derive Read instance in GHC.Read or Data.Char; acceptable?
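A minimal sketch of the standalone-deriving idea follows. It uses a hypothetical, abbreviated `Category` type in a single module; in the proposal the data declaration would sit in GHC.Unicode and the `deriving instance` line in a module that is allowed to import the Read machinery, such as GHC.Read or Data.Char.

```haskell
{-# LANGUAGE StandaloneDeriving #-}

-- Stand-in for a type declared in a low-level module (hypothetical and
-- abbreviated; the real GeneralCategory has many more constructors).
-- Note that Read is deliberately absent from the deriving clause.
data Category = MathSymbol | OtherPunctuation | ClosePunctuation
  deriving (Show, Eq)

-- The Read instance is derived separately. Across modules this works as
-- long as the constructors are in scope where the instance is declared.
deriving instance Read Category

main :: IO ()
main = print (read "MathSymbol" :: Category)
```

Separating the instance from the data declaration is what would let the type live below Text.Read.Lex in the import graph while still offering Read to users.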

comment:7 in reply to:  6 Changed 4 years ago by rwbarton

Replying to mfdyck.google:

> isPunctuation and isSymbol are defined in terms of GeneralCategory, which derives Read, and GHC.Read imports Text.Read.Lex.

Ah, gotcha.

> We could move GeneralCategory and generalCategory to GHC.Unicode and standalone derive Read instance in GHC.Read or Data.Char; acceptable?

Probably more acceptable than the boot file; I'll defer to hvr on this subject though.

Changed 4 years ago by mfdyck.google

comment:8 Changed 3 years ago by bgamari

Differential Rev(s): Phab:D1122.

I have opened Phab:D1122 to track an updated version of this patch. The refactoring of GeneralCategory is tracked in Phab:D1121.

comment:9 Changed 3 years ago by bgamari

Description: modified (diff)
Summary: Tex.Read.Lex.lex broken → Text.Read.Lex.lex broken

comment:10 Changed 3 years ago by Austin Seipp <austin@…>

In e4a73f4f/ghc:

Move GeneralCategory et al to GHC.Unicode

This allows these to be used from Text.Read.Lex without import cycles.

Reviewed By: thomie, austin

Differential Revision: https://phabricator.haskell.org/D1121

GHC Trac Issues: #10444

comment:11 Changed 3 years ago by thoughtpolice

Milestone: 7.12.1 → 8.0.1

Milestone renamed

comment:12 Changed 3 years ago by thomie

Owner: ekmett deleted

Changed 3 years ago by mfdyck.google

Behave as GHC lexer

comment:13 Changed 3 years ago by mfdyck.google

Cc: ekmett added
Differential Rev(s): Phab:D1122. → Phab:D1122, Phab:D1480.

comment:14 Changed 3 years ago by Ben Gamari <ben@…>

In fce0465/ghc:

Unbreak Text.Read.Lex.lex on Unicode symbols

Reviewers: thomie, hvr, austin, bgamari

Reviewed By: bgamari

Subscribers: bgamari, thomie

Differential Revision: https://phabricator.haskell.org/D1480

GHC Trac Issues: #10444

comment:15 Changed 3 years ago by bgamari

Resolution: fixed
Status: patch → closed

comment:16 Changed 3 years ago by ekmett

Keywords: report-impact added