Opened 8 years ago

Last modified 21 months ago

#3645 new feature request

Layout and pragmas

Reported by: igloo Owned by:
Priority: low Milestone:
Component: Compiler (Parser) Version: 6.10.4
Keywords: Cc: erik.flister@…, michal.terepeta@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

With this module:

{-# LANGUAGE DeriveDataTypeable,
             FlexibleContexts
#-}

module Foo where

GHC 6.12 says:

    Cannot parse LANGUAGE pragma
    Expecting comma-separated list of language options,
    each starting with a capital letter
      E.g. {-# LANGUAGE RecordPuns, Generics #-}

but this should probably be allowed. See #3519, #3616.

Attachments (2)

3645.dpatch (21.0 KB) - added by boris 7 years ago.
allow-pragma-closing-directly-after-_n_.dpatch (15.4 KB) - added by michalt 7 years ago.

Download all attachments as: .zip

Change History (23)

comment:1 Changed 8 years ago by eflister

just to be super pedantic, i want to make sure the following would be ok too. :) afaik, lots of people use this kind of layout to make toggling lines via comments as easy as possible.

{-# LANGUAGE
      EmptyDataDecls
    , MultiParamTypeClasses
#-} 

comment:2 Changed 8 years ago by eflister

whoops, to be really explicit, i should include a comment example:

{-# LANGUAGE
      EmptyDataDecls
    , MultiParamTypeClasses
--  , RecordPuns
    , Generics
--  , DeriveDataTypeable
    , FlexibleContexts
#-} 

comment:3 Changed 8 years ago by eflister

Cc: erik.flister@… added

comment:4 Changed 8 years ago by guest

Type: bugfeature request
Type of failure: None/Unknown

Changed 7 years ago by boris

Attachment: 3645.dpatch added

comment:5 Changed 7 years ago by boris

Please, review the patch. The problem was that line

<0,option_prags> \n				{ begin bol }

switched alex state to to bol. If there is space between newline and "#-}", the previous state is popped in do_bol function and end of pragma successfully matches in option_prags state. But without any space the beginning of "#-}" will be matched in bol state and the rest will fail to match.

Perhaps the patch is a little hacky and it is better to modify bol part.

Changed 7 years ago by michalt

comment:6 Changed 7 years ago by michalt

Cc: michal.terepeta@… added
Status: newpatch

I'm not sure if this is the right approach either. But if it is, I'd prefer to ensure that we match any number of newlines/whitespace..

comment:7 Changed 7 years ago by boris

Matching all whitespaces is redundant, although it looks not as hacky. Normally whitespaces are matched in another place. We have to match \n only to prevent entering into bol state. I did some more tracing and found that in function do_bol clause

	    GT -> do
		_ <- popLexState
		lexToken

is executed. I think that in proper solution lexToken should match "#-}" and close the pragma.

comment:8 Changed 7 years ago by michalt

What do you mean redundant? Example:

{-# LANGUAGE EmptyDataDecls
           , DeriveDataTypeable

#-}

module Foo where

import Data.Typeable

data Test = Test
  deriving (Typeable)

With

  "#-}"                                { endPrag }
  \n"#-}"                              { endPrag }

I'm getting:

Test.hs:2:14:
    Cannot parse LANGUAGE pragma
    Expecting comma-separated list of language options,
    each starting with a capital letter
      E.g. {-# LANGUAGE RecordPuns, Generics #-}

And with

  $whitechar* "#-}"              { endPrag }

It compiles just fine.

Also I'm really not convinced that it should be lexToken to match the closing. After all this function doesn't really match any actual tokens - it just checks for lexing errors/eof, updates lexer state, etc.

comment:9 Changed 7 years ago by boris

You are right, \n"#-}" is not enough. I missed the case when several \n occur. Function lexToken calls alexScanUser, so it should match some tokens, should not it? I am new to Alex and can be mistaken. Can you explain why it does not match any tokens?

It would be nice to allow entering into bol state on \n(as before any patches) and exit it on some condition(without hardcoding "#-}") so that "#-}" can be matched. If it is not possible, I support the whitespace solution.

comment:10 in reply to:  9 Changed 7 years ago by michalt

Replying to boris:

You are right, \n"#-}" is not enough. I missed the case when several \n occur. Function lexToken calls alexScanUser, so it should match some tokens, should not it? I am new to Alex and can be mistaken. Can you explain why it does not match any tokens?

Sorry for confusion, I meant that lexTokens only checks for eof/errors and calls an appropriate action to create token, but does not really directly inspect the input (it just uses alexScanUser). And that's why I don't think it is the right place to put some special case for closing the pragma.

Anyway, I guess the best thing right now is to wait for some GHC dev to have a look at this.

comment:11 Changed 7 years ago by igloo

Milestone: 7.0.17.0.2

comment:12 Changed 7 years ago by igloo

Milestone: 7.0.27.2.1

comment:13 Changed 7 years ago by igloo

Status: patchnew

I'm not sure what the spec is.

Should this be rejected (as the #-} starts a new decl)?

foo = 'a'

{-# INLINE bar
#-}
bar = 'b'

Should this be rejected (as the {-# decl doesn't have the correct indentation)?

foo = 'a'

   {-# INLINE bar #-}
bar = 'b'

The INLINE pragma looks like a declaration, but the SCC pragma looks like an expression, so how about this?:

foo = {-# SCC bar
#-} 'a'

?

Should the layout rule apply to the LANGUAGE pragma in any way? I assume not, as we haven't had the where from the module line yet.

IMO, the particular bug in this ticket is that GHC thinks it is doing implicit layout, when it shouldn't be.

comment:14 Changed 6 years ago by igloo

Milestone: 7.2.17.4.1

One possibility is something along these lines:

diff --git a/compiler/main/HeaderInfo.hs b/compiler/main/HeaderInfo.hs
index a3f7e79..738f4f8 100644
--- a/compiler/main/HeaderInfo.hs
+++ b/compiler/main/HeaderInfo.hs
@@ -39,6 +39,7 @@ import Exception
 import Control.Monad
 import System.IO
 import System.IO.Unsafe
+import Data.Char
 import Data.List
 
 ------------------------------------------------------------------------------
@@ -227,24 +228,31 @@ getOptions' toks
               | ITdocOptionsOld str <- getToken open
               = map (L (getLoc open)) ["-haddock-opts", removeSpaces str]
                 ++ parseToks xs
-          parseToks (open:xs)
-              | ITlanguage_prag <- getToken open
-              = parseLanguage xs
+          parseToks (open:close:xs)
+              | ITlanguage_prag str <- getToken open
+              , ITclose_prag     <- getToken close
+              = parseLanguage (getLoc open) str
+                ++ parseToks xs
           parseToks (x:xs)
               | ITdocCommentNext _ <- getToken x
               = parseToks xs
           parseToks _ = []
-          parseLanguage (L loc (ITconid fs):rest)
-              = checkExtension (L loc fs) :
-                case rest of
-                  (L _loc ITcomma):more -> parseLanguage more
-                  (L _loc ITclose_prag):more -> parseToks more
-                  (L loc _):_ -> languagePragParseError loc
-                  [] -> panic "getOptions'.parseLanguage(1) went past eof token
-          parseLanguage (tok:_)
-              = languagePragParseError (getLoc tok)
-          parseLanguage []
-              = panic "getOptions'.parseLanguage(2) went past eof token"
+
+          parseLanguage loc str
+              = map (checkExtension loc) $ splits (dropWhile isSpace str)
+              where isSepChar c = isSpace c || c == ','
+                    splits [] = languagePragParseError loc
+                    splits (',' : _) = languagePragParseError loc
+                    splits xs0 = case break isSepChar xs0 of
+                                 (extension, xs1) ->
+                                     extension
+                                   : (case dropWhile isSpace xs1 of
+                                      ',' : xs2 ->
+                                          splits (dropWhile isSpace xs2)
+                                      [] ->
+                                          []
+                                      _ ->
+                                          languagePragParseError loc)
 
 -----------------------------------------------------------------------------
 
@@ -263,14 +271,13 @@ checkProcessArgsResult flags
 
 -----------------------------------------------------------------------------
 
-checkExtension :: Located FastString -> Located String
-checkExtension (L l ext)
+checkExtension :: SrcSpan -> String -> Located String
+checkExtension l ext
 -- Checks if a given extension is valid, and if so returns
 -- its corresponding flag. Otherwise it throws an exception.
- =  let ext' = unpackFS ext in
-    if ext' `elem` supportedLanguagesAndExtensions
-    then L l ("-X"++ext')
-    else unsupportedExtnError l ext'
+ =  if ext `elem` supportedLanguagesAndExtensions
+    then L l ("-X" ++ ext)
+    else unsupportedExtnError l ext
 
 languagePragParseError :: SrcSpan -> a
 languagePragParseError loc =
diff --git a/compiler/parser/Lexer.x b/compiler/parser/Lexer.x
index 90e1e66..754247b 100644
--- a/compiler/parser/Lexer.x
+++ b/compiler/parser/Lexer.x
@@ -480,7 +480,7 @@ data Token
   | ITclose_prag
   | IToptions_prag String
   | ITinclude_prag String
-  | ITlanguage_prag
+  | ITlanguage_prag String
   | ITvect_prag
   | ITvect_scalar_prag
   | ITnovect_prag
@@ -2233,7 +2233,7 @@ linePrags = Map.singleton "line" (begin line_prag2)
 fileHeaderPrags = Map.fromList([("options", lex_string_prag IToptions_prag),
                                  ("options_ghc", lex_string_prag IToptions_prag
                                  ("options_haddock", lex_string_prag ITdocOptio
-                                 ("language", token ITlanguage_prag),
+                                 ("language", lex_string_prag ITlanguage_prag),
                                  ("include", lex_string_prag ITinclude_prag)])
 
 ignoredPrags = Map.fromList (map ignored pragmas)

This handles LANGUAGE in the same way we handle OPTIONS, just reading the contents as a String, and happily gets rids of a couple of panics in the process.

However, locations are slightly worse (as we don't have locations for the individual extensions), and you can no longer have nested pragmas or comments. This might be OK, except that currently hsc2hs turns

{-# LANGUAGE CPP
           , ForeignFunctionInterface
           , GeneralizedNewtypeDeriving
           , NoImplicitPrelude
           , BangPatterns
  #-}

in libraries/base/GHC/Event/EPoll.hsc into

{-# LINE 1 "libraries/base/./GHC/Event/EPoll.hsc" #-}
{-# LANGUAGE CPP
{-# LINE 2 "libraries/base/./GHC/Event/EPoll.hsc" #-}
           , ForeignFunctionInterface
           , GeneralizedNewtypeDeriving
           , NoImplicitPrelude
           , BangPatterns
  #-}

comment:15 Changed 6 years ago by igloo

Milestone: 7.4.17.6.1
Priority: normallow

comment:16 Changed 5 years ago by igloo

Milestone: 7.6.17.6.2

comment:17 Changed 3 years ago by thoughtpolice

Milestone: 7.6.27.10.1

Moving to 7.10.1.

comment:18 Changed 3 years ago by thomie

#5401 is a duplicate of this ticket.

comment:19 Changed 3 years ago by thoughtpolice

Milestone: 7.10.17.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:20 Changed 2 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:21 Changed 21 months ago by thomie

Milestone: 8.0.1
Note: See TracTickets for help on using tickets.