|Version 8 (modified by chevalier@…, 9 years ago) (diff)|
The ExternalCore type
The ExternalCore data type is used by GHC to communicate code represented in the Core data type with the outside world. It comes with an external syntax, a parser, a pretty printer, and code to convert between Core and External Core. Unfortunately, External Core has not been widely used, and the code has bit-rotted. The recent changes in Core to use System FC have exacerbated the problem. This page documents the process of getting External Core and Core back in sync.
Once the process is finished, this page will just describe the design.
The current plan is to use an "extended" version of interface files for External Core, which contains unfoldings for all functions, not just functions GHC has decided to unfold.
Reading in External Core
The pipeline looks like:
file -> GHC parser -> IfaceSyn -> tcRnExtCore -> ModGuts -> (the rest of the compiler)
This is a change from the current External Core implementation, where HsSyn is used to represent types from External Core files and IfaceSyn is used for terms. In the new implementation, IfaceSyn is used for both.
Goals and questions
- Well-defined external format with stand-alone tools
- External tools will have to be maintained in order to stay in sync with the interface file format
- How external is "external"? There is a tension between re-using code from GHC, and having a truly independent file format that can be processed with completely stand-alone tools.
- It's already possible to use the GHC API to generate Core (though not yet to read it back in), which might be enough for some users. On the other hand, the external format allows for writing tools to manipulate Core in languages other than Haskell.
- External format should be readable by humans (though perhaps only after processing it with a pretty-printing tool)
- Not too redundant (for example, only print out type information that is necessary to reconstruct types)
- Don't export information that's internal to GHC (i.e., IdInfo fields), since external transformations probably won't preserve it anyway
- Corollary -- include only just enough information for external tools to be useful
- Does it still make sense to have a separate External Core datatype?
- Primitives have to be documented properly in order to write an stand-alone Core interpreter (which would eventually be desirable.)
- External toolset: typechecker, interpreter (operational semantics), ...
- We want to show that External Core is truly "independent", but on the other hand, maintaining these tools is a challenge.
- How likely are major changes to Core in the future?
- Should the external format look like -ddump-simpl output (as it does now), or should it be an easier-to-parse format like s-expressions (perhaps with a pretty-printer to help with debugging)?
The main source files related to External Core:
- compiler/coreSyn/ExternalCore.lhs: The definition of the External Core data type.
- compiler/coreSyn/MkExternalCore.lhs: Some code to convert Core to External Core.
- compiler/coreSyn/PprExternalCore.lhs: Some code to pretty-print External Core.
- compiler/parser/LexCore.hs: The lexer for External Core.
- compiler/parser/ParserCore.y: The parser for External Core.
- compiler/parser/ParserCoreUtils.hs: Some additional utility functions used by ParserCore.hs.
- utils/ext-core/: Old code intended as an executable specification of External Core.
Other files that contain some reference to External Core or are otherwise relevant:
- compiler/coreSyn/PprCore.lhs: Some code to pretty-print the Core data type.
- compiler/hsSyn/HsSyn.lhs: Top-level syntax tree representations for various things GHC can read, including External Core.
- compiler/main/DriverPhases.hs: Includes code to decide how to parse things based on file extension.
- compiler/main/HscMain.lhs: The main compiler pipeline.
- http://research.microsoft.com/~simonpj/papers/ext-f: Description of the System FC language which GHC now uses internally.
- docs/ext-core/: The current documentation for External Core, which should eventually become a chapter in the GHC User's Guide.
- http://www.haskell.org/ghc/docs/latest/html/users_guide/ext-core.html: What the User's Guide currently has to say about External Core.
- External Core originally parsed into a list of TyClDecl and a list of IfaceBinding. It now seems as though it might be better to replace the IfaceBinding with LHsDecl. This would require us to:
- Add a new data constructor for HsBind: data HsBind id = ... | CoreBind id (ExtCore id)
- Extend the renamer to rename ExtCore RdrName to ExtCore Name
- Extend the type checker to typecheck ExtCore Name to generate ExtCore Id
- Extend the desugarer to desugar ExtCore Id to Core
- We probably want to represent all data types as GADTs, even if they can be represented in Haskell 98 form, so that we only have one representation.
- Define an external text representation for External Core (which will probably be simply a minor modification of the old format) (mostly done?)
- Update the External Core data type to be compatible with the current Core data type. (mostly done)
- Update PprExternalCore.lhs to print stuff that LexCore and ParserCore can understand. (mostly done)
- Update MkExternalCore.lhs to support both the current Core and the new External Core. (mostly done)
- Update the parser to recognize the new external syntax, generating an empty module at first. (partly done)
- Update the parser to generate LHsBind rather than IfaceBinding
- Convert the current External Core documentation (in LaTeX) into a chapter (in XML) in the User's Guide.
- The LaTeX documentation describes PrimOps in some detail. This information is now in the library documentation, so it is probably not needed in the External Core chapter.