Version 2 (modified by dterei, 3 years ago)

Base Package Safety

This page presents a module breakdown of the safety of the Base package.

  • Green: Made safe with no modifications
  • Blue: Made trustworthy with no modifications
  • Yellow: Split out some unsafe functions to Module.Unsafe, made Module trustworthy
  • Red: Left unsafe

Most blue squares are blue because they import GHC.Base, which is currently unsafe. Others also import unsafePerformIO operations.

To split modules that contain both safe and unsafe symbols, I moved the entire definition into a new module, say GHC.Arr.Imp, and then added two new modules, GHC.Arr.Safe and GHC.Arr.Unsafe. GHC.Arr now imports the Safe and Unsafe modules and, depending on a CPP flag, exports either just the Safe API or both the Safe and Unsafe APIs. This lets us choose at compile time whether the base package is safe by default. A simpler approach would have been to define the entire module in GHC.Arr.Unsafe and have no Imp module, but I preferred that the Safe and Unsafe modules have disjoint APIs rather than Safe being a subset.
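The layout described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual base source: the Demo.Arr.* module names, the safeIndex and unsafeIndex functions, and the BASE_UNSAFE_API flag name are all hypothetical stand-ins, and each "File:" section would be its own source file.

```haskell
-- File: Demo/Arr/Imp.hs
-- All definitions live here (hypothetical stand-in for GHC.Arr.Imp).
module Demo.Arr.Imp (safeIndex, unsafeIndex) where

-- Bounds-checked lookup: part of the safe API.
safeIndex :: [a] -> Int -> Maybe a
safeIndex xs i
  | i < 0     = Nothing
  | otherwise = case drop i xs of
      (x:_) -> Just x
      []    -> Nothing

-- Unchecked lookup: stand-in for an operation that must stay unsafe.
unsafeIndex :: [a] -> Int -> a
unsafeIndex xs i = xs !! i

-- File: Demo/Arr/Safe.hs
-- Re-exports only the safe symbols.
module Demo.Arr.Safe (safeIndex) where
import Demo.Arr.Imp (safeIndex)

-- File: Demo/Arr/Unsafe.hs
-- Re-exports only the unsafe symbols, so the two APIs are disjoint.
module Demo.Arr.Unsafe (unsafeIndex) where
import Demo.Arr.Imp (unsafeIndex)

-- File: Demo/Arr.hs
-- Facade: the CPP flag (hypothetical name) picks the exported API.
{-# LANGUAGE CPP #-}
module Demo.Arr
  ( module Demo.Arr.Safe
#ifdef BASE_UNSAFE_API
  , module Demo.Arr.Unsafe
#endif
  ) where

import Demo.Arr.Safe
#ifdef BASE_UNSAFE_API
import Demo.Arr.Unsafe
#endif
```

Under these assumptions, building with -DBASE_UNSAFE_API would make Demo.Arr export both APIs, and building without it would export only safeIndex.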

Base Package

  • Top Level: Foreign, Numeric, Prelude
  • Control: Applicative, Arrow, Category, Concurrent, Exception, Monad, OldException
  • Control.Concurrent: Chan, MVar, QSem, QSemN, SampleVar
  • Control.Exception: Base
  • Control.Monad: Fix, Group, Instances, ST, Zip
  • Control.Monad.ST: Lazy, Strict
  • Data: Bits, Bool, Char, Complex, Data, Dynamic, Either, Eq, Fixed, Foldable, Function, Functor, HashTable, IORef, Int, Ix, List, Maybe, Monoid, Ord, Ratio, STRef, String, Traversable, Tuple, Typeable, Unique, Version, Word
  • Data.STRef: Lazy, Strict
  • Debug: Trace
  • Foreign: C, Concurrent, ForeignPtr, Marshal, Ptr, StablePtr, Storable
  • Foreign.C: Error, String, Types
  • Foreign.Marshal: Alloc, Array, Error, Pool, Utils
  • System: CPUTime, Environment, Exit, IO, Info, Mem, Timeout
  • System.Console: GetOpt
  • System.IO: Error
  • System.Mem: StableName, Weak
  • System.Posix: Internals, Types
  • Text: Printf, Read, Show
  • Text.ParserCombinators: ReadP, ReadPrec
  • Text.Read: Lex
  • Text.Show: Functions
  • Unsafe: Coerce

GHC

Below is the breakdown for just the GHC modules in base:

  • GHC: Arr, Base, Classes, Conc, ConsoleHandler, Constants, Desugar, Enum, Environment, Err, Event, Exception, Exts, Float, Foreign, ForeignPtr, Handle, IO, IOArray, IOBase, IORef, Int, List, MVar, Num, PArr, Pack, Ptr, Prim, Read, Real, ST, STRef, Show, Stable, Storable, TopHandler, Unicode, Weak*, Windows, Word
  • GHC.Conc: IO, Signal, Sync, Windows
  • GHC.Float: ConversionUtils, RealFracMethods
  • GHC.IO: Buffer, BufferedIO, Device, Encoding, Exception, FD, Handle, IOMode
  • GHC.IO.Encoding: CodePage, Failure, Iconv, Latin1, Types, UTF16, UTF32, UTF8
  • GHC.IO.Encoding.CodePage: Table

*I tried to split Weak into Unsafe and Safe modules and have GHC.Weak expose just the Safe API (i.e. this would make it a yellow box like the others). However, I wasn't able to figure out how to move the definition of Weak. Many of the GHC modules are wired in and require changes to compiler/prelude/PrelNames.hs. For all other modules I was able to update their built-in location fine, but for Weak I continually got link errors when building libRts.a whenever I tried to move the definition of GHC.Weak.

Notes

These are notes on specific modules and why they are the colour they are, etc.

GHC.Base and GHC.Prim: Leaving unsafe. I had a go at making safe versions, but it gets pretty ugly and complex quickly. See Base Module for a more detailed discussion.

GHC.Conc: Is it safe to expose ThreadId's constructors?

For the moment I've hidden both

GHC.Conc.IO and GHC.Conc.IO.Windows: Made safe version that doesn't contain the asyncReadBA, asyncWriteBA functions. Perhaps these can be left in and GHC.Conc.IO just made trustworthy since their result is in the IO monad but they take a 'MutableByteArray# RealWorld' as a second parameter.

GHC.Event: Made trustworthy... Not sure of this though

GHC.Exts: Left unsafe, with no safe/unsafe split. Mostly seems fine; the only worry is access to the Ptr constructor. It also re-exports GHC.Prim.

GHC.Ptr: Made a safe/unsafe split. The module exposes the Ptr constructor. Cast operations from FunPtr to Ptr seem dangerous as well, so they are removed from the safe version.

GHC.ForeignPtr: Made the ForeignPtr type abstract. The 'unsafeForeignPtrToPtr' function is also excluded. The whole module seems a little dangerous (e.g. castForeignPtr). As long as pointers can only be dereferenced in the IO monad we should be OK though.

(Foreign.ForeignPtr - as above) (Foreign.Ptr - as above)
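As a small illustration of the point about dereferencing only in the IO monad, the sketch below uses the public Foreign.* modules: castPtr is a pure function that merely changes the type a pointer claims to point at, while actually reading through the recast pointer requires peek, which lives in IO. The asBytes and firstByte helpers are hypothetical names introduced only for this example.

```haskell
import Data.Word (Word8, Word32)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (peek, poke)

-- Pure: reinterprets the pointer type without touching memory.
asBytes :: Ptr Word32 -> Ptr Word8
asBytes = castPtr

-- Reading through the recast pointer is only possible in IO.
-- The byte returned depends on the machine's endianness.
firstByte :: Word32 -> IO Word8
firstByte w = alloca $ \p -> do
  poke p w          -- store the 32-bit value in temporary memory
  peek (asBytes p)  -- read back its first byte in memory order
```

The cast itself cannot observe anything; any effect of a bad cast only shows up once an IO action such as peek or poke is run.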

GHC.IO.Encoding.CodePage.Table: Exports raw Addr# arrays. It is also pretty specific code, so it doesn't seem that useful outside of the base package.

GHC.IOBase: Keeping unsafe, with no safe version, as it is a deprecated module.

GHC.IORef: Made safe version due to access to IORef constructor

GHC.Pack: Keeping unsafe, with no safe version. unpackCString#, among others, seems quite unsafe.

GHC.Weak: *Made a Safe version but I had to leave GHC.Weak alone. When I tried to move GHC.Weak to GHC.Weak.Imp I would constantly get link errors when linking the libRts library. I changed the values in compiler/prelude/PrelNames.hs for GHC.Weak but this didn't seem to work. So there is GHC.Weak.Safe and GHC.Weak.Unsafe but no GHC.Weak.Imp and GHC.Weak has to be unsafe.

GHC.Word: Left unmodified and made trustworthy. 'uncheckedShiftRL64' is a little scary sounding but seems fine.

Data.Data, Data.Dynamic and Data.Typeable: Left unsafe due to the whole Typeable issue.

Debug.Trace: Was left unsafe. It can leak information to the console without detection.
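A minimal sketch of the leak in question: the function below has a pure type, yet evaluating it writes its argument to stderr via trace. The checkPin function and the value 1234 are hypothetical, chosen only for illustration.

```haskell
import Debug.Trace (trace)

-- Looks pure from its type, but emits the pin on stderr when evaluated.
checkPin :: Int -> Bool
checkPin pin = trace ("leaked pin: " ++ show pin) (pin == 1234)
```

Nothing in the type Int -> Bool tells a caller that output happens, which is why the module cannot be treated as safe.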

Base Module

The root of the base package, and so of Haskell, is GHC.Base and GHC.Prim. Both contain a lot of code, and a lot of it is unsafe: some of it obviously so, some less so. For example:

  • Addr# and Array# types are basically C-style pointers, so there are no bounds checks. They can be used to access arbitrary memory, cause buffer overflows, etc.
  • divInt :: Int -> Int -> Int seems perfectly safe but division by zero throws an uncatchable exception that crashes the program. (Is this intentional or a bug?)
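For contrast with the raw divInt mentioned in the list above, the checked Prelude div on Int raises the DivideByZero arithmetic exception, which can be caught in IO. The sketch below exercises only div; the behaviour of GHC.Base.divInt is taken from the note above and is not tested here. The safeDiv helper is a hypothetical name.

```haskell
import Control.Exception (ArithException (..), evaluate, try)

-- Forces the division inside IO so a DivideByZero can be caught.
safeDiv :: Int -> Int -> IO (Either ArithException Int)
safeDiv x y = try (evaluate (x `div` y))
```

A safe subset of Base could presumably expose checked operations like this one while keeping the unchecked primitive hidden.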

It is also quite difficult to split this up since 1) GHC.Prim is defined inside of GHC not in any module text file, 2) GHC.Base is defined in a text file but extended by GHC (so GHC.Base exports Bool but Bool isn't defined in the actual GHC.Base text file).

This is potentially another argument for symbol level safety, it would make handling Base and Prim easier.

This does mean a lot of stuff is trustworthy though since they import Base. I'd be happy to deal with the complexity of making Safe versions but it seemed like the ongoing maintenance work wouldn't be worth the benefits.

The best solution might be to leave Base and Prim alone and make Base.Safe and Prim.Safe, both extended on demand (e.g. we just add safe symbols to them as needed to get modules that use Base and Prim in a safe way to work under -XSafe). A fine-grained total split of Base and Prim is doable but seems like it might be a maintenance problem.

Data.Typeable

I feel we could enable all of this except make Typeable abstract so that instances can't be defined. (Could also still allow deriving of these instances). My understanding is that all of this dynamic stuff works fine as long as the typeOf method basically doesn't lie and pretend two types are the same. The original SYB paper on Typeable from memory basically said this and said that allowing programmers to define their own instances of typeOf was really an implementation artifact and that it should be left up to the compiler.
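The reliance on an honest typeOf can be seen in a short sketch using only compiler-provided Typeable instances, which, as far as I know, is what recent GHC versions enforce. Both fromDynamic and cast succeed only when the type representations match, so an instance that lied about its representation would turn them into unchecked coercions. The asInt and boolAsInt helpers are hypothetical names.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Typeable (cast)

-- Recovers an Int only if the Dynamic really holds an Int.
asInt :: Dynamic -> Maybe Int
asInt = fromDynamic

-- cast compares type representations; Bool and Int differ, so this
-- always yields Nothing while the instances are compiler-generated.
boolAsInt :: Bool -> Maybe Int
boolAsInt = cast
```

For instance, asInt (toDyn (3 :: Int)) recovers the value, while asInt (toDyn "three") is rejected.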