Changes between Version 22 and Version 23 of SafeHaskell


Ignore:
Timestamp:
Jan 18, 2011 12:57:28 AM (5 years ago)
Author:
David
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • SafeHaskell

    v22 v23  
    1313   * GHC, its supporting tools, and
    1414   * Any Haskell modules of A compiled without `-XSafe`.
    15  * The user does not trust M, which is why it compiles M with `-XSafe`.
     15 * The user does not trust M, which is why he or she compiles M with `-XSafe`.
    1616
    1717More specifically, there are two parts to the proposed extension:
    1818
    19  1. A new GHC option (`-XSafe`) enabling a "Safe" dialect of Haskell  in which GHC rejects any source code that might produce unsafe effects or otherwise subvert the type system.
    20 
    21  2. Another new GHC option (`-XTrustworthy`) indicating that, though a module may invoke unsafe functions internally, the set of exported symbols cannot be used in an unsafe way.  The `-XTrustworthy` option makes a small extension to the syntax of import statements, adding a `safe` keyword:
     19 1. A new GHC option (`-XSafe`) enabling a "Safe" dialect of Haskell in which GHC rejects any source code that might produce unsafe effects or otherwise subvert the type system.
     20
     21 2. Another new GHC option (`-XTrustworthy`) indicating that, though a module may invoke unsafe functions internally, the module's author claims the set of exported symbols cannot be used in an unsafe way.
     22
     23The presence of either the `-XSafe` or `-XTrustworthy` option introduces a small extension to the syntax of import statements, adding a `safe` keyword:
    2224 
    2325  impdecl -> `import` [`safe`] [`qualified`] modid [`as` modid] [impspec]
    2426
    25 Roughly speaking, import declarations are either safe or unsafe, while modules are classified as either trusted or untrusted.  A safe import declaration causes compilation to fail if the imported module is not trusted.  In the Safe dialect, all import declarations are implicitly safe.  When compiling with `-XTrustworthy`, import declarations with the `safe` keyword are safe, while those without the keyword are unsafe.  A module can be trusted only if it was compiled with `-XSafe` or `-XTrustworthy`, but there are additional restrictions as discussed below under '''module trust'''.
     27Import declarations are either safe or unsafe, while modules are classified as either trusted or untrusted.  A safe import declaration causes compilation to fail if the imported module is not trusted.
     28
     29In the Safe dialect, all import declarations are implicitly safe, regardless of the presence of the `safe` keyword.  When compiling with `-XTrustworthy` but not `-XSafe`, import declarations with the `safe` keyword are safe, while those without the keyword are unsafe.  A module can be trusted only if it was compiled with `-XSafe` or `-XTrustworthy`, but there is an additional restriction detailed below under '''Trusted modules''':  roughly speaking, for a module M to be trusted, any modules reachable from M and compiled with `-XTrustworthy` must reside in packages trusted by the user invoking the compiler.
    2630
    2731== Safety Goal ==
    2832
    29 As long as no module compiled with `-XTrustworthy` contains a vulnerability, the goal of the Safe dialect (ie code compiled with `-XSafe`) is to guarantee the following properties:
     33As long as no module compiled with `-XTrustworthy` contains a vulnerability, the goal of the Safe dialect (i.e., code compiled with `-XSafe`) is to guarantee the following properties:
    3034
    3135 * '''Referential transparency.'''  Functions in the Safe dialect must be deterministic.  Moreover, evaluating them should have no side effects, and should not halt the program (except by throwing uncaught exceptions or looping forever).
     
    5660  rm = RM deleteAllFiles
    5761}}}
    58 The flag (and LANGUAGE pragma) `UltraSafe` is just like `Safe` except that it also disables `foreign import`.  This strengtens the safety guarantee, by esuring that a `UltraSafe` module can construct IO actions only by composing together IO actions that it imports from trusted modules.  Note that `UltraSafe` does not disable the use of IO itself. For example this is fine:
     62The flag (and LANGUAGE pragma) `UltraSafe` is just like `Safe` except that it also disables `foreign import`.  This strengtens the safety guarantee, by esuring that an `UltraSafe` module can construct IO actions only by composing together IO actions that it imports from trusted modules.  Note that `UltraSafe` does not disable the use of IO itself. For example this is fine:
    5963{{{
    6064{-# LANGUAGE UltraSafe #-}
     
    6670
    6771
    68 == Module trust ==
    69 
    70 ---------------
    71 '''SLPJ comment.'''  This section seems over-complicated.  For one thing it appears to define two terms: "trustable" and "trusted", and I can't tell if they are the same or not.  Second, the definitions seem over complicated. MOreover it mixes specification and implementation. Here's an alternative attempt:
    72 
    73  * A module is '''trusted''' iff it was compiled with `-XSafe` or `-XTrustworthy`.
    74  * A '''trusted package''' is one that is declared trusted by command-line flags (using the rules you give below).  But a module in an untrusted package can still be a trusted module (see above).
    75  * A '''safe import declaration''' is one that
    76    * Imports a '''trusted module''' module, or
    77    * Imports a module from a '''trusted package'''
    78  * In a module compiled with `-XSafe`, you may use only '''safe import declarations'''
    79  * In module compiled with `-XTrustworthy`, any import declaration marked "`safe`" must be a '''safe import declaration'''.
    80 
    81 '''End of SLPJ comment'''
    82 -------------------
    83 
    84 Recall that a safe import of a module M fails unless M is trusted.  M is trusted when two conditions hold:
    85 
    86  1. M was compiled with `-XSafe` or `-XTrustWorthy`, in which case we say M is ''trustable'', and
    87 
    88  2. The modules on which M depends (transitively) for safety all reside in packages that the application trusts.
    89 
    90 Determining trust requires two modifications to the way GHC manages modules.  First, the interface file format must change to record whether each module is trustable, and if so what its trust dependencies are.  Second, we need compiler options to specify which packages are trusted by an application.
    91 
    92 We therefore extend the interface file format to record for each module M whether or not M is trustable.  If M is trustable, the interface file contains a list of ''trust dependencies'', which are (package, module) pairs on which the current application depends for safety.  The trust dependencies of a module are determined as follows:
    93 
    94  * ''If a module is compiled with `-XTrustworthy`'', then its trust dependencies always include itself, plus the union of trust dependencies of all modules imported with the `safe` keyword.
    95 
    96  * ''If a module is compiled with `-XSafe` but not `-XTrustworthy`'', then its trust dependencies are the union of trust dependencies of all modules it imports.
    97 
    98  * If a module was compiled with neither `-XSafe` nor `-XTrustworthy`, or, regardless of these options, if a module contains a `{-# LANGUAGE Untrustworthy #-}` pragma, then the interface file marks the module as not trustable, so there are no trust dependencies.
    99 
    100 Currently, in any given run of the compiler, GHC classifies each package as either exposed or hidden.  To incorporate trust, we add a second bit specifying whether each package is trusted or untrusted.  This bit will be controllable by two new options to `ghc-pkg`, `trust` and `distrust`, which are analogous to `expose` and `hide`.
     72== Trusted modules ==
     73
     74Recall that a safe import of a module M fails unless M is trusted. Whether or not M is trusted depends on the following factors:
     75
     76 - Which packages the user invoking the compiler trusts,
     77
     78 - What modules M depends on, and
     79
     80 - How M was compiled.
     81
     82To capture the intentions of a user compiling potentially untrusted code, we introduce a notion of '''trusted package'''.  A trusted package is one that is declared trusted by command-line flags to the compiler.  The user must manually ensure that code in trusted packages does not violate the safety goal; the compiler assumes that responsibility for the rest of the program.
     83
     84Next, we define a notion of a module's '''trust dependency set'''. The trust dependency set of module M consists of either a) the names and packages of all the modules that the user must manually verify to ensure M satisfies the safety goal, or b) the singleton set containing pseudo-module name `untrusted` in pseudo-package `_untrusted`. (`_untrusted` is never a trusted package, regardless of command-line options.)  More specifically, the trust dependency set of module M is computed as follows:
     85
     86 * If M was compiled with neither `-XSafe` nor `-XTrustworthy`, or, regardless of these options, if M contains the `{-# LANGUAGE Untrustworthy #-}` pragma, then M's trust dependency set consists solely of the `untrusted` pseudo-module.
     87
     88 * ''If M was compiled with `-XTrustworthy`'', then its trust dependency set includes M plus the union of the trust dependency sets of all modules M imports with the `safe` keyword.
     89
     90 * ''If M was compiled with `-XSafe` but not `-XTrustworthy`'', then its trust dependency set is the union of the trust dependency sets of all modules M imports.
     91
     92Finally, we say an imported module M is '''trusted''' when every module in M's trust dependency set resides in a trusted package.  A safe import of module M fails unless M is trusted by this definition.
     93
     94=== Implementation details ===
     95
     96Determining trust requires two modifications to the way GHC manages modules.  First, the interface file format must change to record each module's trust dependency set.  Second, we need compiler options to specify which packages are trusted by an application.
     97
     98We therefore extend the interface file format to record the trust dependency set of each module.  The set is represented as a list of ''trust dependencies'', each of which is a (package, module) pair.
     99
     100Currently, in any given run of the compiler, GHC classifies each package as either exposed or hidden.  To incorporate trust, we add a second bit specifying whether each package is trusted or untrusted. This bit will be controllable by two new options to `ghc-pkg`, `trust` and `distrust`, which are analogous to `expose` and `hide`.
    101101
    102102On the command line, several new options control which packages are trusted:
     
    108108 * `-distrust-all-packages` - considers all packages untrusted unless they are explicitly trusted by subsequent command-line options.  (This option does not change the exposed/hidden status of packages, so is not equivalent to applying `-distrust` to all packages on the system.)
    109109
    110  * A convenience option `-ultrasafe` is equivalent to {{{-distrust-all-packages -XNoForeignFunctionInterface -XNoImplicitPrelude -XSafe}}}.  '''SLPJ note''' I don't agree.  An ultrasafe module should be able to import trusted packages, otherwise how could it do any IO?  It's just that an ultrasafe module should not do foreign-import.  '''End of SLPJ note'''.
    111 
    112 None of these options can be specified or overwritten by `OPTIONS_GHC` pragmas in the Safe dialect.
     110 * A convenience option `-ultrasafe` which is equivalent to specifying {{{-distrust-all-packages -XNoForeignFunctionInterface -XNoImplicitPrelude}}} at the point of the `-ultrasafe` option, and `-XSafe` at the end of the command line.
     111
     112=== Order of options ===
     113
     114Safety-critical options must not be specified or overwritten by `LANGUAGE` and `OPTIONS_GHC` pragmas in the Safe dialect.  To avoid such surprises, certain options and pragmas are '''restricted''', meaning they can only be supplied before the safe dialect is enabled. The order of options is considered to be:  first, all command-line options in the order they appear on the command line, second, all `LANGUAGE` and `OPTIONS_GHC` pragmas, in the order they appear in the module source.  Thus, the `-XSafe` command-line option disallows all restricted pragmas, but, in the absence of `-XSafe` on the command line, `{-# LANGUAGE Safe #-}` may appear below restricted pragmas in the source, just not above them.
     115
     116At least the following options (and their pragma equivalents) are restricted:
     117
     118 - `-XTrustworthy` - Rationale:  Trusted packages may wish to include untrusted code compiled with `-XSafe`.  Yet once `-XSafe` is applied, the module must not be able to prune its trust dependency set, which it could with `{-# Trustworthy #-}`.
     119 - `-XUntrustworthy` - Rationale: Unless -XUntrustworthy is applied first, if compilation does not fail, then `-XSafe` should produce code that can be trusted with the specified set of trusted packages.
     120 - `-XForeignFunctionInterface` - Rationale: Trustworthy code in the Safe dialect may wish to have foreign import declarations, but modules from untrusted sources do not need this feature.  Thus, `-XSafe` on the command line should disable `{-# ForeignFunctionInterface #-}` pragmas.
     121 - `-trust` - it is not safe to increase the set of trusted packages.
     122 - `-package`, `-package-id`, `-package-conf`, `-no-user-package-conf` - untrusted code should not have access to explicitly hidden packages.
     123 - `-package-name` - package name should be set only by trusted user
     124 - `-F`, `-cpp`, `-XCPP` - these options allow access to the local file system.
     125 - All of the linking options should be restricted (`-main-is`, `-l`__lib__, `-L`__dir__, `-framework`, etc.)
     126 - Several other options discussed below: `-XTemplateHaskell`, `-XStandaloneDeriving`, `-XGeneralizedNewtypeDeriving`.
     127 - The `RULES` and `SPECIALIZE` pragmas are also restricted and cannot appear below `{-# LANGUAGE Safe #-}` or when the `-XSafe` option has been specified on the command line.
     128
     129Note that `-ultrasafe` only enables Safe mode at the end of the command-line.  Thus, one can supply one or more `-trust` options after `-ultrasafe` to allow ultrasafe code to do I/O.
    113130
    114131== Threats ==
    115132
    116 The following aspects of Haskell can be used to violate the safety goal, and thus need to be disallowed or modified for the Safe dialect.  ''Please add more issues to this list, as some are likely missing.''
     133The following aspects of Haskell can be used to violate the safety goal, and thus need to be disallowed or modified for the Safe dialect. ''Please add more issues to this list, as some are likely missing.''
    117134
    118135 * Some symbols in {{{GHC.Prim}}} can be used to do very unsafe things.  At least one of these symbols, {{{realWorld#}}}, is magically exported by {{{GHC.Prim}}} even though it doesn't appear in the {{{GHC.Prim}}} module export list.  ''Are there other such magic symbols in this or other modules?''
    119136
    120  * A number of functions can be used to violate safety.  Many of these have names prefixed with `unsafe` (e.g., {{{unsafePerformIO}}}, {{{unsafeIterleaveIO}}}, {{{unsafeCoerce}}} {{{unsafeSTToIO}}}, ...). However, not all unsafe functions fit this pattern.  For instance, {{{inlinePerformIO}}} and {{{fromForeignPtr}}} from the `bytestring` package are unsafe.
     137 * A number of functions can be used to violate safety.  Many of these have names prefixed with `unsafe` (e.g., {{{unsafePerformIO}}}, {{{unsafeIterleaveIO}}}, {{{unsafeCoerce}}}, {{{unsafeSTToIO}}}, ...). However, not all unsafe functions fit this pattern.  For instance, {{{inlinePerformIO}}} and {{{fromForeignPtr}}} from the `bytestring` package are unsafe.
    121138
    122139 * Code that defines hand-crafted instances of {{{Typeable}}} can violate safety by causing {{{typeOf}}} to return identical results on two distinct types, then using {{{cast}}} to coerce between the two unsafely.  ''Are there other classes?  Perhaps {{{Data}}} should also be restricted?  Simon says {{{Ix}}} doesn't need to be protected anymore.''
     
    140157== Implementation details ==
    141158
    142  * The `-XSafe` option must be processed last after all other options.  If previous options conflict with {{{-XSafe}}}, they must be overwritten or compilation must fail.  Alternatively, maybe `-XSafe` and `{-# LANGUAGE Safe #-)` should always be processed before all other pragmas, especially `LANGUAGE` and `OPTIONS_GHC`, and any subsequent dangerous pragmas should cause compilation to fail.
    143 
    144159 * {{{GHC.Prim}}} will need to be made (or just kept) unsafe.
    145160
     
    152167 * Libraries will progressively need to be updated to export trustable interfaces, which may require moving unsafe functions into separate modules, or adding new `{-# LANGUAGE Trustworthy #-}` modules that re-export a safe subset of symbols.  Ideally, most modules in widely-used libraries will eventually contain either `{-# LANGUAGE Safe -#}` or `{-# LANGUAGE Trustworthy -#}` pragmas, except for internal modules or a few modules exporting unsafe symbols.  Maybe haddock can add some indicator to make it obvious which modules are trustable and show the trust dependencies.
    153168
    154  * When `-XSafe` and `-XTrustworthy` are used together, the language is restricted to the Safe dialect.  The effect of `-XTrustworthy` is to change the trust dependency set.  Specifically, the trust dependency set will include the module itself.  However, rather than include the union of trust dependency sets of all imported modules, only dependencies of modules imported with the `safe` keyword are added to the current module's set.  A plausible use for both pragmas simultaneously is to prune the list of trusted modules--for instance, if a module imports a bunch of trusted modules but does not use any of their trusted features, or only uses those features in a very limited way.  If the code happens also to be safe, the programmer may want to add `-XSafe` to catch accidental unsafe actions.
     169 * When `-XTrustworthy` and `-XSafe` are used together, the language is restricted to the Safe dialect.  The effect of `-XTrustworthy` is to change the trust dependency set.  Specifically, the trust dependency set will include the module itself.  However, rather than include the union of trust dependency sets of all imported modules, only dependencies of modules imported with the `safe` keyword are added to the current module's set.  A plausible use for both pragmas simultaneously is to prune the list of trusted modules--for instance, if a module imports a bunch of trusted modules but does not use any of their trusted features, or only uses those features in a very limited way.  If the code happens also to be safe, the programmer may want to add `-XSafe` to catch accidental unsafe actions.
    155170
    156171 * The option `{-# LANGUAGE Untrustworthy -#}` is also not incompatible with `{-# LANGUAGE Safe -#}`.  The former causes the interface file to be marked not trustable, while the latter causes the source code to be confined to the Safe dialect.  `Untrustworthy` should be used in seemingly safe modules that export constructors that would allow other modules to do unsafe things.  (The `PS` constructor discussed above is an example of a dangerous constructor that could potentially be defined in a module that happily compiles with `-XSafe`.)