Changes between Version 2 and Version 3 of HaddockComments


Timestamp: Oct 21, 2006 9:21:40 PM
Author: waern
  • HaddockComments

    v2 v3  
    11= Work in Progress =
    22= A description of the Haddock comment support in GHC =
    3 Haddock comment support was added to GHC as part of a [http://code.google.com/soc Google Summer Of Code] project. The aim of the project was to  port the existing Haddock program to use the GHC API. The project is now over -- GHC can understand Haddock comments and they are available through the GHC API.
     3Haddock comment support was added to GHC as part of a [http://code.google.com/soc Google Summer Of Code] project. The aim of the project was to port the existing Haddock program to use the GHC API. The project is now over: GHC can understand Haddock comments (here called doc comments), and they are available through the GHC API. This is a very rough overview of the implementation.
    44
     5To turn this extension on, supply the -haddock flag on the command line. Doc comments are then lexed, parsed and renamed, and end up in both the parsed and the renamed abstract syntax. Without the -haddock flag, GHC behaves as usual, i.e. doc comments are treated just like ordinary comments.
    56
     7= Lexer details =
     8{{{
     9data Token =
     10  ..
     11  ..
     12  | ITdocCommentNext  String     -- something beginning '-- |'
     13  | ITdocCommentPrev  String     -- something beginning '-- ^'
     14  | ITdocCommentNamed String     -- something beginning '-- $'
     15  | ITdocSection      Int String -- a section heading
     16  | ITdocOptions      String     -- doc options (prune, ignore-exports, etc)
     17}}}
     18In the lexer, doc comments are recognized as tokens. There are four types of doc comments at this level (next, previous, named and section heading), each having its own token, plus a fifth token for doc options. Each token contains the entire comment string.
     19
     20Just like the old Haddock, we support "next" and "previous"-type comments, "named" comments and section headings. The options token is used for specifying Haddock options. Options are specified using a pragma, like this: {-# DOCOPTIONS prune, ignore-exports #-}. You can no longer specify them using dash comments (e.g. -- # prune).
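The classification described above can be illustrated with a small standalone sketch. This is not GHC's lexer: `DocToken` and `classifyComment` are hypothetical names, the section-heading form `-- *` is assumed from Haddock's usual syntax, and the options pragma is left out.

```haskell
import Data.Char (isSpace)

-- Hypothetical stand-in for the doc-related constructors of GHC's Token type.
data DocToken
  = DocCommentNext  String     -- text after '-- |'
  | DocCommentPrev  String     -- text after '-- ^'
  | DocCommentNamed String     -- text after '-- $'
  | DocSection      Int String -- heading depth and title (assumed '-- *' syntax)
  deriving (Eq, Show)

-- Classify the body of a line comment, i.e. everything after the leading "--".
-- Returns Nothing for an ordinary comment.
classifyComment :: String -> Maybe DocToken
classifyComment body =
  case dropWhile isSpace body of
    '|' : rest -> Just (DocCommentNext rest)
    '^' : rest -> Just (DocCommentPrev rest)
    '$' : rest -> Just (DocCommentNamed rest)
    s@('*' : _) ->
      -- The number of stars gives the heading depth.
      let (stars, rest) = span (== '*') s
      in Just (DocSection (length stars) (dropWhile isSpace rest))
    _ -> Nothing
```

As in the real tokens, each result keeps the whole comment string; interpreting its contents is left to a later stage.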
     21
     22= Parser details =
     23The doc tokens can appear in many places in the grammar, and having a look at [[GhcFile(compiler/parser/Parser.y.pp)]] is probably the best way to get an overview of this.
     24
     25When a doc token is encountered by the parser, it tries to parse the content of the token. This is done by invoking a special Alex lexer ([[GhcFile(compiler/parser/HaddockLex.x)]]) and Happy parser ([[GhcFile(compiler/parser/HaddockParse.y)]]), taken directly from the old Haddock sources. This process turns the token into a value of type {{{HsDoc RdrName}}}, representing the (internal structure of the) comment. It can then be stored in the Haskell AST by the parser at the appropriate place. A lot of places (constructors) in the AST definition ([[GhcFile(compiler/hsSyn)]]) allow {{{HsDoc}}}s, and more can be added. 
     26
     27= Binding groups =
     28Before the renaming phase, GHC restructures function definitions into binding groups. This is done by going through the list of {{{HsDecl}}}s representing the top declarations of the source file, grouping different types of declarations together.
     29
     30We do this with the top-level doc comments as well. There's a problem though: an external program must be able to use the GHC API to associate multiple "next" and "prev" style comments with the right Haskell binding. This can be done by looking at the parsed syntax tree, where the file structure is preserved. But because of this restructuring, the renamed syntax loses that structure. We want to be able to use the renamed syntax, so instead of just grouping the comments together, we let the grouping process return a list of {{{DocEntity}}}:
     31{{{
     32-- source code entities, for representing the module structure
     33data DocEntity name
     34  = DeclEntity name
     35  | DocEntity (DocDecl name)
     36}}}
     37
     38An external program can now figure out which doc comment belongs to which "entity", i.e. which Haskell binding. This solution is also used for the method declarations in classes.
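A minimal sketch of how an external program might walk such a list is shown here. It is an illustration under simplifying assumptions, not GHC code: `DocDecl` is reduced to two string-carrying constructors without a type parameter, and `attachDocs` is a hypothetical helper.

```haskell
-- Simplified stand-in: in GHC, DocDecl is parameterised by the name type
-- and carries a parsed HsDoc rather than a plain String.
data DocDecl
  = DocCommentNext String
  | DocCommentPrev String
  deriving (Eq, Show)

-- Source code entities, in source order.
data DocEntity name
  = DeclEntity name
  | DocEntity DocDecl
  deriving (Eq, Show)

-- Pair every declaration with its doc comments: "next" comments attach to
-- the declaration that follows, "prev" comments to the one that precedes.
attachDocs :: [DocEntity name] -> [(name, [String])]
attachDocs = reverse . fst . foldl step ([], [])
  where
    -- State: (finished declarations, most recent first; pending "next" comments, newest first).
    step (done, pending) (DocEntity (DocCommentNext s)) = (done, s : pending)
    step (done, pending) (DocEntity (DocCommentPrev s)) =
      case done of
        (n, ds) : rest -> ((n, ds ++ [s]) : rest, pending)
        []             -> (done, pending) -- no preceding declaration: dropped
    step (done, pending) (DeclEntity n) = ((n, reverse pending) : done, [])
```

This only works because the list preserves source order, which is the property that is lost when the comments are merely grouped together.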
     39
     40== The renamer ==
     41Doc comments go through the renamer because an {{{HsDoc}}} can contain a reference to an identifier. It can be important for users of the GHC API to get hold of comments whose references carry the original name ({{{HsDoc Name}}}).
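The idea of renaming a doc comment can be sketched as a traversal that swaps the identifier type. Everything here is a simplified assumption: this `HsDoc` has only four constructors, `RdrName` and `Name` are toy stand-ins for GHC's types, and falling back to plain text for unresolved identifiers is an illustrative choice rather than GHC's documented behaviour.

```haskell
-- Heavily simplified stand-in for GHC's HsDoc, parameterised by identifier type.
data HsDoc id
  = DocEmpty
  | DocString String
  | DocIdentifier id
  | DocAppend (HsDoc id) (HsDoc id)
  deriving (Eq, Show)

-- Toy versions of an unresolved name and a resolved (original) name.
newtype RdrName = RdrName String
  deriving (Eq, Show)

data Name = Name { nameModule :: String, nameOcc :: String }
  deriving (Eq, Show)

-- Rename every identifier reference in a doc comment using a lookup function.
-- Assumption: identifiers that cannot be resolved are kept as plain text.
renameDoc :: (RdrName -> Maybe Name) -> HsDoc RdrName -> HsDoc Name
renameDoc lookupName = go
  where
    go DocEmpty        = DocEmpty
    go (DocString s)   = DocString s
    go (DocIdentifier r@(RdrName occ)) =
      maybe (DocString occ) DocIdentifier (lookupName r)
    go (DocAppend a b) = DocAppend (go a) (go b)
```

After this pass, a client can follow each `DocIdentifier` to its defining module, which is not possible with the parsed form alone.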