Changes between Version 13 and Version 14 of SafeHaskell


Ignore:
Timestamp:
Nov 22, 2010 1:28:57 AM (3 years ago)
Author:
David
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • SafeHaskell

    v13 v14  
    11= Safe Haskell = 
    22 
    3 This is a proposal for a Haskell extension through which people can safely execute untrusted Haskell code, much the way web browsers currently run untrusted Java and !JavaScript, or the way the Spin and Singularity operating systems ran untrusted Modula-3 and C#/Sing#.  
     3This is a proposal for a Haskell extension through which people can safely execute untrusted Haskell code, much the way web browsers currently run untrusted Java and !JavaScript, or the way the Spin and Singularity operating systems ran untrusted Modula-3 and C#/Sing#. 
    44 
    55== Setup == 
    66 
    7 Safe Haskell assumes the following setup. 
    8  * A server S wants to run code provided by untrusted (and perhaps malicious) clients X. 
    9  * Clients X may send untrusted Haskell code to S ''in source form'' 
    10  * The server compiles this untrusted code, with the `-XSafe` flag. 
    11  * If compilation succeeds, S can safely run the code, knowing that it cannot cause unsafe effects. 
    12  * The server S trusts  
    13    * GHC, its supporting tools 
    14    * Any Haskell modules that S chooses to compile without `-XSafe` 
    15  * The server S does not trust client code. That is why it compiles it with `-XSafe`. 
     7The proposal addresses security in the following scenario. 
     8 * An application A needs to incorporate a module M provided by an untrusted (and perhaps malicious) programmer. 
     9 * Module M is made available ''in source form'' to the user constructing A. 
     10 * The user constructing A compiles M with a new flag, `-XSafe`. 
     11 * If compilation succeeds, A can safely import M, knowing that M cannot cause effects that are not visible in the types of M's functions. 
     12 * The user constructing A must trust: 
     13   * GHC, its supporting tools, and 
     14   * Any Haskell modules of A compiled without `-XSafe`. 
     15 * The user does not trust M, which is why it compiles M with `-XSafe`. 
    1616 
    17 More specifically, there are two parts to this proposed extension: 
     17More specifically, there are two parts to the proposed extension: 
    1818 
    19  1. An option to GHC ({{{-XSafe}}}) that causes it to reject any source code that might produce unsafe effects. 
     19 1. A new GHC option (`-XSafe`) enabling a "Safe" dialect of Haskell in which GHC rejects any source code that might produce unsafe effects. 
    2020 
    21  2. An option to GHC ({{{-XTrusted}}}) indicating that, even though a module might invoke unsafe functions internally, the set of exported symbols cannot be used in an unsafe way. 
     21 2. Another new GHC option (`-XTrustworthy`) indicating that, though a module may invoke unsafe functions internally, the set of exported symbols cannot be used in an unsafe way.  The `-XTrustworthy` option makes a small extension to the syntax of import statements, adding a `safe` keyword: 
     22   
     23  impdecl -> `import` [`safe`] [`qualified`] modid [`as` modid] [impspec] 
    2224 
    23 A module compiled with `-XSafe` can only import modules compiled with `-XTrusted` or `-XSafe`. 
     25Roughly speaking, import declarations are either safe or unsafe, while modules are classified as either trusted or untrusted.  A safe import declaration causes compilation to fail if the imported module is not trusted.  In the Safe dialect, all import declarations are implicitly safe.  When compiling with `-XTrustworthy`, import declarations with the `safe` keyword are safe, and those without unsafe.  A module can be trusted only if it was compiled with `-XSafe` or `-XTrustworthy`, but there are additional restrictions as discussed below under '''module trust'''. 
    2426 
    2527== Safety Goal == 
    2628 
    27 As long as no module compiled with {{{-XTrusted}}} contains a vulnerability, the goal of {{{-XSafe}}} is to guarantee the following properties: 
     29As long as no module compiled with `-XTrustworthy` contains a vulnerability, the safe dialect's goal is to guarantee the following properties: 
    2830 
    29  * '''Referential transparency.'''  Functions in a module compiled {{{-XSafe}}} must be deterministic.  Moreover, evaluating them should have no side effects, and should not halt the program (except by throwing uncaught exceptions or looping forever). 
     31 * '''Referential transparency.'''  Functions in the Safe dialect must be deterministic.  Moreover, evaluating them should have no side effects, and should not halt the program (except by throwing uncaught exceptions or looping forever). 
    3032 
    31  * '''Constructor access control.'''  Code in a module compiled {{{-XSafe}}} must not be able to examine or synthesize data values using constructors the module cannot import. 
     33 * '''Constructor access control.'''  Safe code must not be able to examine or synthesize data values using data constructors the module cannot import. 
    3234 
    33 Note that {{{-XSafe}}} should not prevent use of the symbol {{{IO}}}.  Authors of normal (trusted) code may wish to use {{{ {-# LANGUAGE Safe #-} }}} as a means of ensuring they do not accidentally invoke unsafe actions, directly or indirectly.   
     35 * '''Semantic consistency.'''  Any expression that compiles both with and without the import of a Safe module must have the same meaning in both cases.  (E.g., {{{1 + 1 == 3}}} must remain `False` when you add the import of a Safe module.) 
    3436 
    35 Of course, if an untrusted module exports an {{{IO}}} action, that action may have arbitrary side effects.  Compiling the module with {{{-XSafe}}} does not meaningfully restrict the effects of exported {{{IO}}} actions.  Hence, an application importing an untrusted but safe module may safely invoke pure functions from the untrusted module, but must avoid executing {{{IO}}} actions from the module. 
     37The Safe dialect is intended to be of use for both normal (trusted) and untrusted code.  Authors of trusted modules may wish to include `{-# LANGUAGE Safe #-}` pragmas to ensure they do not accidentally invoke unsafe actions (directly or indirectly), or to allow other Safe code to import their modules. 
     38 
     39Note that the safe dialect does not prevent use of the symbol `IO`. If an untrusted module exports an `IO` action, that action may have arbitrary side effects.  Simply compiling a module with `-XSafe` does not meaningfully restrict the effects of exported `IO` actions. Hence, an application A importing an untrusted but safe module M may safely invoke pure functions from M, but must avoid executing `IO` actions unless some other mechanism has been used to ensure M can only construct `IO` actions that conform to A's security goal. (See `-ultrasafe` below for one such mechanism.) 
     40 
     41== Module trust == 
     42 
     43Recall that a safe import of a module M fails unless M is trusted. M is trusted when two conditions hold: 
     44 
     45 1. M is compiled with `-XSafe` or `-XTrustWorthy`, in which case we say M is ''trustable'', and 
     46 
     47 2. The modules on which M depends (transitively) for safety all reside in packages that the application trusts. 
     48 
     49Determining trust requires two modifications to the way GHC manages modules.  First, the interface file format must change to record whether each module is trustable, and if so what its trust dependencies are.  Second, we need run-time options to specify which packages are trusted by an application. 
     50 
     51We therefore change the interface file format to record for each module M whether or not M is trustable.  If M is trustable, the interface file contains a list of ''trust dependencies'', which are (package, module) pairs on which the current application depends for safety.  The trust dependencies of a module are determined as follows: 
     52 
     53 * ''If a module is compiled with `-XTrustworthy`'', then its trust dependencies always include itself, plus the union of trust dependencies of all modules imported with the `safe` keyword. 
     54 
     55 * ''If a module is compiled with `-XSafe` but not `-XTrustworthy`'', then its trust dependencies are the union trust dependencies of all modules it imports. 
     56 
     57 * If a module was compiled with neither `-XSafe` nor `-XTrustworthy`, or, regardless of these options, if a module contains a `{-# LANGUAGE Untrustworthy #-}` pragma, then the interface file marks the module as not trustable, so there are no trust dependencies. 
     58 
     59Currently, in any given run of the compiler, GHC classifies each package as either exposed or hidden.  To incorporate trust, we now allow a second bit specifying whether each package is trusted or untrusted.  This bit will be controllable by two new options to `ghc-pkg`, `trust` and `distrust`, which are analogous to `expose` and `hide`.  (The trust bit is meaningless while a package is hidden, but will stay in the package database for when packages are exposed.) 
     60 
     61On the command line, several new options control which packages are trusted: 
     62 
     63 * `-trust` __P__ - exposes package __P__ (if it was hidden), and considers it a trusted package regardless of the contents of the package database. 
     64 
     65 * `-distrust` __P__ - exposes package __P__ (if it was hidden), and considers it an untrusted package, regardless of the contents of the package database. 
     66 
     67 * `-distrust-all-packages` - considers all packages untrusted unless they are explicitly trusted by subsequent command-line options. 
     68 
     69 * A convenience option `-ultrasafe` is equivalent to `-distrust-all-packages` `-XNoForeignFunctionInterface`. 
     70 
     71None of these options can be specified or overwritten by `OPTIONS_GHC` pragmas in the Safe dialect. 
    3672 
    3773== Threats == 
    3874 
    39 The following aspects of Haskell can be used to violate the safety goal, and thus would need to be disallowed when {{{-XSafe}}} is in effect.  ''Please add more examples to this list, as some are likely missing.'' 
     75The following aspects of Haskell can be used to violate the safety goal, and thus need to be disallowed or modified for the Safe dialect.  ''Please add more issues to this list, as some are likely missing.'' 
    4076 
    4177 * Some symbols in {{{GHC.Prim}}} can be used to do very unsafe things.  At least one of these symbols, {{{realWorld#}}}, is magically exported by {{{GHC.Prim}}} even though it doesn't appear in the {{{GHC.Prim}}} module export list.  ''Are there other such magic symbols in this or other modules?'' 
    4278 
    43  * A number of functions can be used to violate safety.  Many of these have names prefixed with {{{unsafe}}} (e.g., {{{unsafePerformIO}}}, {{{unsafeIterleaveIO}}}, {{{unsafeCoerce}}} {{{unsafeSTToIO}}}, ...).  However, not all unsafe functions fit this pattern.  For instance, {{{inlinePerformIO}}} and {{{fromForeignPtr}}} from the {{{bytestring}}} package are unsafe. 
     79 * A number of functions can be used to violate safety.  Many of these have names prefixed with `unsafe` (e.g., {{{unsafePerformIO}}}, {{{unsafeIterleaveIO}}}, {{{unsafeCoerce}}} {{{unsafeSTToIO}}}, ...). However, not all unsafe functions fit this pattern.  For instance, {{{inlinePerformIO}}} and {{{fromForeignPtr}}} from the `bytestring` package are unsafe. 
    4480 
    4581 * Code that defines hand-crafted instances of {{{Typeable}}} can violate safety by causing {{{typeOf}}} to return identical results on two distinct types, then using {{{cast}}} to coerce between the two unsafely.  ''Are there other classes?  Perhaps {{{Data}}} should also be restricted?  Simon says {{{Ix}}} doesn't need to be protected any more.'' 
     
    4783 * Certain exposed constructors of otherwise mostly safe data types allow unsafe actions.  For instance, the {{{PS}}} constructor of {{{Data.ByteString.ByteString}}} contains a pointer, offset, and length.  Code that can see the pointer value can act in a non-deterministic way by depending on the address rather than value of a {{{ByteString}}}.  Worse, code that can use {{{PS}}} to construct {{{ByteString}}}s can include bad lengths that will lead to stray pointer references. 
    4884 
    49  * The {{{ForeignFunctionInterface}}} extension is mostly safe, but `foreign import` declarations that import a function with a non-`IO` type must be disallowed. 
     85 * The {{{ForeignFunctionInterface}}} extension is mostly safe, but {{{foreign import}}} declarations that import a function with a non-`IO` type must be disallowed. 
    5086 
    5187 * {{{TemplateHaskell}}} is also particularly dangerous, as it can cause side effects even at compilation time. 
    5288 
    53  * The {{{OverlappingInstances}}} extension may allow unsafe actions, because untrusted code can potentially redefine a type instance (by containing a more specific instance definition) in a way that changes the behaviour of code importing the untrusted module. To avoid that, overlapping instances declarations must come only from modules compiled with {{{-XTrusted}}} or modules compiled with {{{-XSafe}}}. It is not safe to allow an overlapping instance declaration for a given class in a modules compiled with {{{-XTrusted}}} and another overlapping instance declaration for the same class in a module compiled with {{{-XSafe}}}. It is also not safe to allow overlapping instances placed in different modules compiled with {{{-XSafe}}}. '''SLPJ: this may be undesirable, but does it violate the Safety Goal?''' 
     89 * The {{{OverlappingInstances}}} extension can be used to violate semantic consistency, because untrusted code can potentially redefine a type instance (by containing a more specific instance definition) in a way that changes the behaviour of code importing the untrusted module. 
    5490 
    55  * Likewise, {{{RULES}}} and {{{SPECIALIZE}}} pragmas can change the behavior of trusted code in unanticipated ways. '''SLPJ: same question''' 
     91 * Likewise, {{{RULES}}} and {{{SPECIALIZE}}} pragmas can change the behavior of trusted code in unanticipated ways, violating semantic consistency. 
    5692 
    57  * {{{OPTIONS_GHC}}} is dangerous in unfiltered form, as it could potentially expose packages with trusted but not trustworthy modules. {{{-XSafe}}} must be processed last after all other options.  If previous options conflict with {{{-XSafe}}}, they must be overrided or compilation must fail. 
     93 * {{{OPTIONS_GHC}}} is dangerous in unfiltered form, as it could potentially make packages trusted that the invoking user doesn't trust. 
    5894 
    5995 * The {{{StandaloneDeriving}}} extension can be used to violate constructor access control by defining instances of {{{Read}}} and {{{Show}}} to examine and construct data values with inaccessible constructors. 
     
    6399== Implementation details == 
    64100 
    65  * An interface file should record whether a module is safe.  When the module is safe, the interface file should additionally include a set of trusted modules on which the module depends.  '''SLPJ:what is the function of the "set of trusted modules on which it depends"?''' There could be some option like {{{--trust-only}}} that restricts the set of packages from which trusted modules may be imported.  Thus one could restrict what modules safe code imports in a way that is independent of whatever happens to be installed in a user's {{{~/.cabal}}} directory. 
    66  
    67  * A module compiled with {{{-XTrusted}}} should be marked safe; its set of trusted modules should contain itself and only itself. 
    68  
    69  * A module compiled with {{{-XSafe}}} should only be able to import modules that are marked safe.  Its set of trusted modules should be the union of the trusted sets of all the modules it imports. 
     101 * The `-XSafe` option must be processed last after all other options.  If previous options conflict with {{{-XSafe}}}, they must be overwritten or compilation must fail.  Alternatively, maybe `-XSafe` and `{-# LANGUAGE Safe #-)` should always be processed before all other pragmas, especially `LANGUAGE` and `OPTIONS_GHC`, and any subsequent dangerous pragmas should cause compilation to fail. 
    70102 
    71103 * {{{GHC.Prim}}} will need to be made (or just kept) unsafe. 
    72104 
    73  * {{{-XSafe}}} should disallow the {{{FFI}}}, {{{TemplateHaskell}}}, {{{OverlappingInstances}}}, {{{StandaloneDeriving}}}, {{{GeneralizedNewtypeDeriving}}}, and {{{CPP}}} language extensions, as well as {{{RULES}}} and {{{SPECIALIZE}}} pragmas. 
     105 * {{{-XSafe}}} should disallow the {{{TemplateHaskell}}}, {{{StandaloneDeriving}}}, {{{GeneralizedNewtypeDeriving}}}, and {{{CPP}}} language extensions, as well as {{{RULES}}} and {{{SPECIALIZE}}} pragmas. 
    74106 
    75  * {{{OPTIONS_GHC}}} pragmas will have to be filtered.  Some options, (e.g., -fno-warn-unused-do-bind) are totally fine, but many others are likely problematic (e.g., {{{-cpp}}}, which provides access to the local file system at compilation time, or {{{-F}}} which allows an arbitrary file to be executed, possibly even one named {{{/afs/}}}... and hence entirely under an attacker's control). 
     107 * Overlapping instance declarations must either all reside in modules compiled without `-XSafe`, or else all reside in the same module.  It violates semantic consistency to allow Safe code to change the instance definition associated with a particular type. 
    76108 
    77  * Libraries will progressively need to be updated to export safe interfaces, which may require moving unsafe functions into separate modules, or adding new {{{ {-# LANGUAGE Safe #-} }}} modules that re-export a safe subset of symbols.  Ideally, most modules in widely-used libraries would eventually contain either {{{ {-# LANGUAGE Safe -#} }}} or {{{ {-# LANGUAGE Trusted -#} }}} pragmas, except for internal modules or a few modules exporting unsafe symbols.  Maybe haddock could add some indicator to make it obvious which modules are safe. 
     109 * {{{OPTIONS_GHC}}} pragmas will have to be filtered.  Some options, (e.g., `-fno-warn-unused-do-bind`) are totally fine, but many others are likely problematic (e.g., {{{-cpp}}}, which provides access to the local file system at compilation time, or {{{-F}}} which allows an arbitrary file to be executed, possibly even one named {{{/afs/}}}... and hence entirely under an attacker's control). 
    78110 
    79  * The {{{-XTrusted}}} command-line option and corresponding pragma do not increase what a module can do--the only effect is to mark the module's interface file as safe.  In particular, if both {{{-XTrusted}}} and {{{-XSafe}}} are supplied, then any unsafe actions will still cause a compilation error.  A plausible use for both pragmas simultaneously is to prune the list of trusted modules--for instance if a module imports a bunch of trusted modules but does not use any of their trusted features, or only uses those features in a very limited way.  If the code happens also to be safe, the programmer may want to add {{{-XSafe}}} to catch accidental unsafe actions. 
     111 * Libraries will progressively need to be updated to export safe interfaces, which may require moving unsafe functions into separate modules, or adding new {{{ {-# LANGUAGE Safe #-} }}} modules that re-export a safe subset of symbols.  Ideally, most modules in widely-used libraries would eventually contain either {{{ {-# LANGUAGE Safe -#} }}} or {{{ {-# LANGUAGE Trustworthy -#} }}} pragmas, except for internal modules or a few modules exporting unsafe symbols.  Maybe haddock could add some indicator to make it obvious which modules are safe. 
    80112 
    81  * It might be nice to add a third option, {{{ {-# LANGUAGE Unsafe -#} }}}, which prevents a module from compiling with {{{-XSafe}}} and disables the {{{-XTrusted}}} option so that the interface file is guaranteed not to be marked safe.  This option could be used in seemingly safe modules that export constructors that would cause other modules to do unsafe things.  (The {{{PS}}} constructor discussed above is an example of a dangerous constructor that could potentially be defined in a module that happily compiles with {{{-XSafe}}}.)  This isn't strictly necessary, since one could always just enable another extension such as {{{FFI}}}, but having a specific {{{Unsafe}}} language option seems cleaner. 
     113 * The {{{-XTrustworthy}}} command-line option and corresponding pragma do not increase what a module can do--the only effect is to mark the module's interface file as trustable and add the module to its own trust dependency set.  If both {{{-XTrustworthy}}} and {{{-XSafe}}} are supplied, then the language is still restricted to the Safe dialect and any unsafe actions will still cause a compilation error.  A plausible use for both pragmas simultaneously is to prune the list of trusted modules--for instance, if a module imports a bunch of trusted modules but does not use any of their trusted features, or only uses those features in a very limited way.  If the code happens also to be safe, the programmer may want to add {{{-XSafe}}} to catch accidental unsafe actions. 
     114 
     115 * The option `{-# LANGUAGE Untrustworthy -#}` is not incompatible with `{-# LANGUAGE Safe -#}`.  The former causes the interface file to be marked not trustable, while the latter causes the source code to be confined to the safe dialect.  `Untrustworthy` should be used in seemingly safe modules that export constructors that would cause other modules to do unsafe things.  (The `PS` constructor discussed above is an example of a dangerous constructor that could potentially be defined in a module that happily compiles with `-XSafe`.) 
    82116 
    83117== References ==