Changes between Version 46 and Version 47 of SafeHaskell


Ignore:
Timestamp:
May 10, 2011 11:28:26 PM (4 years ago)
Author:
dterei
Comment:

Update to reflect current implementation

Legend:

Unmodified
Added
Removed
Modified
  • SafeHaskell

    v46 v47  
    2626
    2727
    28 == Safe Language ==
     28== Safe Language Overview ==
    2929
    3030As long as no module compiled with -XTrustworthy contains a vulnerability, the goal of the Safe dialect (used through either `-XSafe` or `-XSafeLanguage`) is to guarantee the following properties:
     
    3434 * '''Constructor access control.'''  Safe code must not be able to examine or synthesize data values using data constructors the module cannot import.
    3535
    36  * '''Semantic consistency.'''  Any expression that compiles both with and without the import of a Safe module must have the same meaning in both cases.  (E.g., {{{1 + 1 == 3}}} must remain `False` when you add the import of a Safe module.)
     36 * '''Semantic consistency.'''  Any expression that compiles both with and without the import of a Safe module must have the same meaning in both cases.  (E.g., {{{1 + 1 == 3}}} must remain `False` when you add the import of a Safe module.) Safe Haskell reduces the scope of compilable Haskell code but it shouldn't change its meaning.
    3737
    3838The Safe dialect is intended to be of use for both trusted and untrusted code. It can be used for trusted code as a way to enforce good programming style (through `-XSafeLanguage`). It is also useful on untrusted code to allow that code to be trusted (through `-XSafe`). Please keep in mind though that the issue of trust is at a higher level than the safe dialect. Using the safe dialect doesn't automatically imply trust, trust is defined separately below.
     
    5454The !SafeHaskell project will introduce two new GHC LANGUAGE options. Intuitively:
    5555  * `-XSafe`: enables the Safe Language dialect of Haskell in which GHC rejects any module that might produce unsafe effects or otherwise subvert the type system.
    56   * `-XTrustworthy`: means that, though a module may invoke unsafe functions internally, the module's author claims that the set of exported symbols cannot be used in an unsafe way.
     56  * `-XTrustworthy`: means that, though a module may invoke unsafe functions internally, the module's author claims that the exported API cannot be used in an unsafe way.
    5757
    5858A '''client''' (C) is someone compiling a source module with GHC.
     
    8080When required we will differentiate between `-XSafe` and `-XTrustworthy` using '''safe''' and '''trustworthy''' respectively.
    8181
     82
    8283=== Intuition ===
    8384
    8485The intuition is this. The '''author''' of a package undertakes the following obligations:
    8586  * When the author of code compiles it with -XSafe, he asks the compiler to check that it is indeed safe. He takes on no responsibility himself. Although he must trust imported packages in order to compile his package, he takes not responsibility for them.
    86   * When the author of code compiles it with -XTrustworthy he takes on responsibility for the stafety of that code, under the assumption that safe imports are indeed safe.
     87  * When the author of code compiles it with -XTrustworthy he takes on responsibility for the safety of that code, under the assumption that safe imports are indeed safe.
    8788
    8889When a '''client''' C trusts package P, he expresses trust in the author of that code. But since the author makes no guarantees about safe imports, C may need to chase dependencies to decide which modules in P should be trusted by C.
     
    110111
    111112
    112 === Safe Language & Imports (Without Trust) ===
     113=== Safe Language & Imports Without Trust ===
    113114
    114115We also want to be able to enable the safe dialect and safe import extensions without any corresponding trust assertion for the code:
    115116
    116  * `-XSafeImports` ('''previously''' `-XUntrustworthy`) enables the safe import extension. Module M is left untrusted though. (See [#UsecasesforSafeImports use cases])
    117  * `-XSafeLanguage` ('''previously''' `-XUntrustworthy` `-XSafe`) enables the safe language (and therefore safe imports). Module M is left untrusted though. (See [#UsecasesforSafeLanguage use cases])
     117 * `-XSafeImports` enables the safe import extension. Module M is left untrusted though. (See [#UsecasesforSafeImports use cases])
     118 * `-XSafeLanguage` enables the safe language (and therefore safe imports). Module M is left untrusted though. (See [#UsecasesforSafeLanguage use cases])
    118119
    119120We see these being used both for good coding style and more flexibility during development of trusted code. We have this relation between the flags:
     
    124125
    125126In Summary we have the following LANGUAGE options and affects:
    126   * `-XSafe`:
     127  * '''`-XSafe`''':
    127128     To be trusted, all of the module's direct imports must be
    128129     trusted, but the module itself need not reside in a trusted
     
    132133     tagged.
    133134
    134   * `-XSafeLanguage`:
     135     '''Module Trusted''': Yes [[BR]]
     136     '''Haskell Language''': Restricted to Safe Language [[BR]]
     137     '''Imported Modules''': All forced to be safe imports, all must be trusted. [[BR]]
     138
     139  * '''`-XSafeLanguage`''':
    135140     The module is never trusted, because the author does not claim
    136141     it is trustworthy.  As long as the module compiles both ways,
    137142     the result is identical whether or not the `-XSafeLanguage` flag
    138      is supplied.  As with `-XSafe`, "safe" is allowed but
    139      meaningless--all imports must be safe.
    140 
    141   * `-XSafeLanguage -XTrustworthy`:
    142      To be trusted, the module itself must reside in a trusted
    143      package.  However, not all of its imports need be trusted, only
    144      those flagged safe.  Thus, the "safe" keyword has no effect on
    145      whether or not compilation succeeds, but it has an effect on
    146      when the resulting file may be trusted.
    147 
    148   * `-XTrustworthy`:
    149      The effect is identical to `-XSafeLanguage -XTrustworthy`, except
    150      that the compiler accepts a wider range of programs.  In
    151      particular, you can import System.IO.Unsafe, do foreign imports
    152      without IO, and write any code that would ordinarily compile.
    153 
    154   * `-XSafeImport`:
     143     is supplied.  As with `-XSafe`, the "safe" import keyword is
     144     allowed but meaningless--all imports must be safe.
     145
     146     '''Module Trusted''': No [[BR]]
     147     '''Haskell Language''': Restricted to Safe Language [[BR]]
     148     '''Imported Modules''': All forced to be safe imports, all must be trusted. [[BR]]
     149
     150  * '''`-XTrustworthy`''':
     151     This establishes that the module is trusted, but the guarantee is
     152     provided by the module's author. A client of this module then
     153     specifies that they trust the module author by specifying they
     154     trust the package containing the module. '-XTrustworthy' has
     155     no effect on the accepted range of Haskell programs or their
     156     semantics.
     157
     158     '''Module Trusted''': Yes but only if Package the module resides in is also trusted. [[BR]]
     159     '''Haskell Language''': Unrestricted [[BR]]
     160     '''Imported Modules''': Under control of module author which ones must be trusted. [[BR]]
     161
     162  * '''`-XSafeLanguage -XTrustworthy`''':
     163     For the trust property this has the same effect as '-XTrustworthy'
     164     by itself. However unlike `-XTrustworthy` it also restricts the
     165     range of acceptable Haskell programs to the Safe language. The
     166     difference from this and using `-XSafe` is the different trust
     167     type and that not all imports are forced to be safe imports, they
     168     are instead optionally specified by the module author.
     169
     170     '''Module Trusted''': Yes but only if Package the module resides in is also trusted. [[BR]]
     171     '''Haskell Language''': Restricted to Safe Language [[BR]]
     172     '''Imported Modules''': Under control of module author which ones must be trusted. [[BR]]
     173
     174  * '''`-XSafeImport`''':
    155175     Enable the Safe Import extension so that a module can require
    156176     a dependency to be trusted without asserting any trust about itself.
    157177
     178     '''Module Trusted''': No [[BR]]
     179     '''Haskell Language''': Unrestricted [[BR]]
     180     '''Imported Modules''': Under control of module author which ones must be trusted. [[BR]]
     181
     182
    158183=== Specifying Package Trust  ===
    159184
     
    171196=== Interaction of Options ===
    172197
    173 The `-XSafe`, `-XTrustworthy`, `-XSafeLanguage` and `-XSafeImport` GHC LANGUAGE options are all order independent. When they are used they disable certain other GHC LANGUAGE and OPTIONS_GHC options. There are some options though that while disabled for source file pragmas are allowed when used on the command line. The idea behind this is that in source pragmas are generally specified by the module author, who is untrusted, while command line options are specified by the client since they are compiling the module, who has to be trusted. In the case of Cabal files, while they are specified by the untrusted module author, since it is a single source file it is easy to validate by hand. Below follow the new !SafeHaskell options and what they disallow:
     198The `-XSafe`, `-XTrustworthy`, `-XSafeLanguage` and `-XSafeImport` GHC LANGUAGE options are all order independent. When they are used they disable certain other GHC LANGUAGE and OPTIONS_GHC options.
     199
    174200
    175201  * '''`-XSafe`''':
    176     * '''Disallowed completely''': `GeneralizedNewtypeDeriving`, `TemplateHaskell`, {{{RULES}}}, `-XSafeLanguage`
    177     * '''Only allowed on command line''': `-cpp`, `-pgm{L,P,lo,lc,m,s,a,l,dll,F,windres}`, `-opt{L,P,lo,lc,m,s,a,l,dll,F,windres}`, `-F`, `-l''lib''`, `-framework`, `-L''dir''`, `-framework-path''dir''`, `-main-is`, `-package-name`, `-D''symbol''`, `-U''symbol''`, `-I''dir''`, `-with-rts-opts`, `-dylib-install-name`, `-hcsuf`, `-hidir`, `-hisuf`, `-o`, `-odir`, `-ohi`, `-osuf`, `-stubdir`, `-outputdir`, `-tmpdir`
    178     * '''Restricted functionality''':
    179       * `OverlappingInstances` (requires that Overlapping instance declarations must either all reside in modules compiled without -XSafe, or else all reside in the same module.)
    180       * `ForeignFunctionInterface` (foreign imports must have an `IO` return type)
    181     * '''Doesn't Matter''': all remaining flags.
    182 
     202    * Enables the Safe Language dialect which disallows the use of some LANGUAGE and OPTIONS, as well as restricting how certain Haskell language features operate. See [#SafeLanguage] below for details.
    183203
    184204  * '''`-XTrustworthy`''' has no special interactions, except for
    185205    * If `-XSafeLanguage`: See summary of !SafeHaskell options at bottom of [#SafeLanguageImportsWithoutTrust Safe Language & Imports (Without Trust)]
    186206
     207  * '''`-XSafeLanguage`''' has the exact same restrictions as `-XSafe`. `-XSafe` and `-XSafeLanguage` can't be used together though simply as it is most likely a mistake by the module author of client if they do.
     208
    187209  * '''`-XSafeImports`''' has no special interactions. `-XSafe`, `-XTrustworthy` and `-XSafeLanguage` are all compatible with it as they all imply `-XSafeImports` anyway.
    188  
    189   * '''`-XSafeLanguage`''' has the exact same restrictions as `-XSafe`. `-XSafe` and `-XSafeLanguage` can't be used together though simply as it is most likely a mistake by the module author of client if they do.
    190 
    191 
    192 == Safe Language Threats ==
    193 
    194 '''SLPJ note''': we should enumerate precisely what is and is not allowed with `-XSafe`.  '''End of note'''
    195 
    196 The following aspects of Haskell can be used to violate the safety goal, and thus need to be disallowed or modified for the Safe dialect. ''Please add more issues to this list, as some are likely missing.''
     210
     211
     212== Safe Language ==
     213
     214The Safe Language restricts things in three different ways:
     215
     216  1. Certain GHC OPTIONS and LANGUAGE extensions are only allowed on the command line and not in source PRAGMAS.
     217  2. Certain GHC LANGUAGE extensions are disallowed completely.
     218  3. Certain GHC LANGUAGE extensions are restricted in functionality.
     219
     220The idea behind this divide is that source pragmas are generally specified by the module author, who is untrusted, while command line options are specified by the client since they are compiling the module, who has to be trusted. In the case of Cabal files, while they are specified by the untrusted module author, since it is a single source file it is easy to validate by hand. Below is precisely what flags and extensions fall into each category:
     221
     222    * '''Only allowed on command line''': `-cpp` and `-XCPP`, `-pgm{L,P,lo,lc,m,s,a,l,dll,F,windres}`, `-opt{L,P,lo,lc,m,s,a,l,dll,F,windres}`, `-F`, `-l''lib''`, `-framework`, `-L''dir''`, `-framework-path''dir''`, `-main-is`, `-package-name`, `-D''symbol''`, `-U''symbol''`, `-I''dir''`, `-with-rts-opts`, `-rts-opts=`, `-dylib-install-name`, `-hcsuf`, `-hidir`, `-hisuf`, `-o`, `-odir`, `-ohi`, `-osuf`, `-stubdir`, `-outputdir`, `-tmpdir`, `-trust`, `-distrust`, `-distrust-all-packages`
     223    * '''Disallowed completely''': `GeneralizedNewtypeDeriving`, `TemplateHaskell`, `-XSafeLanguage`
     224    * '''Restricted functionality''': `OverlappingInstances`, `ForeignFunctionInterface`, {{{RULES}}}
     225      * See [#RestrictedandDisabledGHCHaskellFeatures Restricted Features below]
     226    * '''Doesn't Matter''': all remaining flags.
     227
     228=== Restricted and Disabled GHC Haskell Features ===
     229
     230In the Safe language dialect we restrict the following Haskell language features:
     231
     232 * {{{ForeignFunctionInterface}}}: This is mostly safe, but {{{foreign import}}} declarations that import a function with a non-`IO` type are be disallowed. All FFI imports must reside in the IO Monad.
     233
     234 * {{{RULES}}}: As they can change the behaviour of trusted code in unanticipated ways, violating semantic consistency they are restricted in function. Specifically any {{{RULES}}} defined in a module M compiled with `-XSafe` or `-XSafeLanguage` are dropped. {{{RULES}}} defined in trustworthy modules that M imports are still valid and will fire as usual.
     235
     236 * {{{OverlappingInstances}}}: This extension can be used to violate semantic consistency, because malicious code could redefine a type instance (by containing a more specific instance definition) in a way that changes the behaviour of code importing the untrusted module. The extension is not disabled for a module M compiled with `-XSafe` or `-XSafeLanguage` but restricted. While M can defined overlapping instance declarations, they can only be used in M. If in a module N that imports M, at a call site that uses a type-class function there is a choice of which instance to use (i.e overlapping) and the most specific choice is from M (or any other Safe compiled module), then compilation will fail. It is irrelevant if module N is considered Safe, or Trustworthy or neither.
     237
     238In the Safe language dialect we disable completely the following Haskell language features:
     239
     240 * {{{GeneralizedNewtypeDeriving}}}: It can be used to violate constructor access control, by allowing untrusted code to manipulate protected data types in ways the data type author did not intend. I.e can be used to break invariants of data structures.
     241
     242 * {{{TemplateHaskell}}}: Is particularly dangerous, as it can cause side effects even at compilation time and can be used to access abstract data types. It is very easy to break module boundaries with TH.
     243
     244
     245=== Safe Language Threats (Libraries) ===
     246
     247The following aspects of Haskell can be used to violate the safety goal and thus need to be disallowed in the Safe dialect. Not that these are all threats that reside in libraries and as such are dealt with simply through correctly leaving their modules as untrusted. Threats that can't be dealt with this way are those that are either language features that are disallowed completely or restricted as list above.
    197248
    198249 * Some symbols in {{{GHC.Prim}}} can be used to do very unsafe things.  At least one of these symbols, {{{realWorld#}}}, is magically exported by {{{GHC.Prim}}} even though it doesn't appear in the {{{GHC.Prim}}} module export list.  ''Are there other such magic symbols in this or other modules?''
     
    204255 * Certain exposed constructors of otherwise mostly safe data types allow unsafe actions.  For instance, the {{{PS}}} constructor of {{{Data.ByteString.ByteString}}} contains a pointer, offset, and length.  Code that can see the pointer value can act in a non-deterministic way by depending on the address rather than value of a {{{ByteString}}}.  Worse, code that can use {{{PS}}} to construct {{{ByteString}}}s can include bad lengths that will lead to stray pointer references.
    205256
    206  * The {{{ForeignFunctionInterface}}} extension is mostly safe, but {{{foreign import}}} declarations that import a function with a non-`IO` type must be disallowed.
    207 
    208  * {{{TemplateHaskell}}} is also particularly dangerous, as it can cause side effects even at compilation time.
    209 
    210  * The {{{OverlappingInstances}}} extension can be used to violate semantic consistency, because malicious code could redefine a type instance (by containing a more specific instance definition) in a way that changes the behaviour of code importing the untrusted module. The extension is not disabled under `-XSafe` or `-XSafeLanguage`, instead it just requires that Overlapping instance declarations must either all reside in modules compiled without -XSafe, or else all reside in the same module.
    211 
    212  * Likewise, the {{{RULES}}} pragma can change the behavior of trusted code in unanticipated ways, violating semantic consistency.
    213 
    214  * {{{OPTIONS_GHC}}} is dangerous in unfiltered form.  Among other things, it could use `-trust` to trust packages the invoking user doesn't in fact trust.
    215 
    216  * Similarly, {{{GeneralizedNewtypeDeriving}}} can violate constructor access control, by allowing untrusted code to manipulate protected data types in ways the data type author did not intend.
    217 
    218257
    219258== Implementation details ==
    220259
    221 Determining trust requires two modifications to the way GHC manages modules.  First, the interface file format must change to record each module's trust type (Safe, Trustworthy, Untrusted). Second, we need compiler options to specify which packages are trusted by an application.
     260Determining trust requires two modifications to the way GHC manages modules.  First, the interface file format must change to record each module's trust type. Second, we need compiler options to specify which packages are trusted by an application.
    222261
    223262Currently, in any given run of the compiler, GHC classifies each package as either exposed or hidden.  To incorporate trust, we add a second bit specifying whether each package is trusted or untrusted. This bit will be controllable by two new options to `ghc-pkg`, `trust` and `distrust`, which are analogous to `expose` and `hide`.
    224263
    225  * We want to be able to change a package P from trusted to untrusted and then have compilation of code that directly or transversely depends on it to fail accordingly if it relies on that package being trusted. That is trust should be checked recursively at link time and not just for code being compiled. Having the interface file format record each modules trust type should be enough for this.
     264 * We want to be able to change a package P from trusted to untrusted and then have compilation of code that directly or transversely depends on it to fail accordingly if it relies on that package being trusted.
    226265   * If a module M is Untrusted then no further processing needs to be done.
    227    * If a module M is Safe then
     266   * If a module M is Safe then:
    228267     * At compile time we check each of M's imports are trusted
    229    * If a module M is Trustworthy then we handle it differently when linking than compiling:
    230      * At both link time and compile time M itself must be in a trusted package.
    231      * At compile time we check each of M's safe imports are trusted
    232      * At link time we don't check that M's safe imports are still considered trusted. The reasoning behind this is that at compile time we had a guarantee that any modules marked Trustworthy did indeed reside in a package P that was trusted. If at link time some of M's safe imports that are marked Trustworthy now reside in a package marked untrusted this is because the client C changed the package trust. Since C is the one guaranteeing trustworthy modules we believe its fine to not fail.
    233      * Guaranteeing trustworthy at link time wouldn't be too hard, it would just require we also record in the interface file format for modules marked as trustworthy, which of their dependencies were safe imports.
    234 
    235  * {{{GHC.Prim}}} will need to be made (or just kept) unsafe.
    236 
    237  * {{{-XSafe}}} should disallow the {{{TemplateHaskell}}}, {{{GeneralizedNewtypeDeriving}}}, and {{{CPP}}} language extensions, as well as the {{{RULES}}} pragma. (See [#InteractionofOptions Interaction of Options] above for details).
    238 
    239  * Overlapping instance declarations must either all reside in modules compiled without `-XSafe`, or else all reside in the same module.  It violates semantic consistency to allow Safe code to change the instance definition associated with a particular type.
    240 
    241  * {{{OPTIONS_GHC}}} pragmas will have to be filtered.  Some options, (e.g., `-fno-warn-unused-do-bind`) are totally fine, but many others are likely problematic (e.g., {{{-cpp}}}, which provides access to the local file system at compilation time, or {{{-F}}} which allows an arbitrary file to be executed, possibly even one named {{{/afs/}}}... and hence entirely under an attacker's control).
     268   * If a module M is Trustworthy then:
     269     * At compile time we check that M resides in a trusted package and that all of M's safe imports are trusted
     270
     271 * Need to modify GHC to implement the restrictions of the Safe Language as described [#RestrictedandDisabledGHCHaskellFeatures above]
    242272
    243273 * Libraries will progressively need to be updated to export trustable interfaces, which may require moving unsafe functions into separate modules, or adding new `{-# LANGUAGE Trustworthy #-}` modules that re-export a safe subset of symbols.  Ideally, most modules in widely-used libraries will eventually contain either `{-# LANGUAGE Safe -#}` or `{-# LANGUAGE Trustworthy -#}` pragmas, except for internal modules or a few modules exporting unsafe symbols.  Maybe haddock can add some indicator to make it obvious which modules are trustable and show the trust dependencies.
    244274
    245  * When `-XTrustworthy` and `-XSafeLanguage` are used together, the language is restricted to the Safe dialect and the module is marked as trusted.  Unlike `-XSafe` though all imports aren't forced to be safe imports and so trust is provided by the client C, not ghc.  A plausible use for this is to prune the list of trusted modules -- for instance, if a module imports a bunch of trusted modules but does not use any of their trusted features, or only uses those features in a very limited way.  If the code happens also to be safe, the programmer may want to add `-XSafeLanguage` to catch accidental unsafe actions.
     275 * Would be worthwhile modifying Hackage and Haddock to display Safe Haskell information.
    246276
    247277
     
    409439 * [http://hackage.haskell.org/trac/ghc/ticket/1338#comment:29]
    410440
    411