Opened 18 months ago

Last modified 13 days ago

#8827 new bug

Inferring Safe mode with GeneralizedNewtypeDeriving is wrong

Reported by: goldfire Owned by:
Priority: normal Milestone: 7.12.1
Component: Compiler Version: 7.9
Keywords: Cc: mail@…, dterei, mazieres-i58umkudjfnfpfdx6jbbn3ymzi@…, oerjan
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: #8226, #8745 Differential Revisions:

Description

Consider the following modules:

module A (List, ints) where

data List a = Nil | Cons a (List a)
infixr 4 `Cons`

ints :: List Int
ints = 1 `Cons` 2 `Cons` 3 `Cons` Nil
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

module B where

import A

newtype Age = MkAge Int
  deriving C

class C a where
  cList :: List a

instance C Int where
  cList = ints
{-# LANGUAGE Safe #-}

module C (ages) where

import A
import B

ages :: List Age
ages = cList

Module C compiles without a hiccup. But, it shouldn't: the coercion between ages and ints (performed by GeneralizedNewtypeDeriving in module B) isn't Safe, as it breaks through List's abstraction. (Note that the constructors of List are not exported from module A!)

If module B includes {-# LANGUAGE Safe #-}, it duly doesn't compile, because of the stringent "constructors-must-be-in-scope" check done by the Coercible mechanism. The problem is that safety can be inferred without this check taking place.

You may also want to read the commentary on #8745 for related discussion.

Change History (46)

comment:1 Changed 18 months ago by simonpj

Why does Safe Haskell require that the constructors of List should be in scope in module B?

Point 3 of Note [Coercible Instances] in TcInteract doesn't explain.

Simon

comment:2 Changed 18 months ago by goldfire

From the manual, describing the guarantees of Safe Haskell:

Module boundary control — Haskell code compiled using the safe language is guaranteed to only access symbols that are publicly available to it through other modules export lists. An important part of this is that safe compiled code is not able to examine or create data values using data constructors that it cannot import. If a module M establishes some invariants through careful use of its export list then code compiled using the safe language that imports M is guaranteed to respect those invariants. Because of this, Template Haskell and GeneralizedNewtypeDeriving are disabled in the safe language as they can be used to violate this property.

Without the check to make sure that all relevant constructors are in scope, the above property would be false.

comment:3 Changed 18 months ago by simonpj

OK, but what does "access the symbols" mean, exactly? The kind of a type constructor may change if you change its implementation; and similarly its roles. I think it should be fine to coerce T Age to T Int, even if T's constructors are not visible, unless T's author declares T's role to be nominal.

Simon

comment:4 Changed 18 months ago by nomeata

  • Cc mail@… added

comment:5 Changed 18 months ago by thoughtpolice

  • Version changed from 7.8.1-rc1 to 7.8.1-rc2

comment:6 Changed 18 months ago by simonpj

  • Cc dterei added

David Terei may have an opinion.

comment:7 Changed 18 months ago by simonpj

In comment 3 I argue that it's right to infer Safe for B, and hence that B should compile fine with {-# LANGUAGE Safe #-}. That would mean removing the recursive check.

Does anyone object?

Simon

Version 1, edited 18 months ago by simonpj (previous) (next) (diff)

comment:8 Changed 18 months ago by nomeata

Under the assumption that library authors will get their role annotations right it is ok to remove that check.

comment:9 Changed 18 months ago by Richard Eisenberg <eir@…>

In 59722295bb8da8f01d37356fbed6aef7321a8195/ghc:

Remove "Safe mode" check for Coercible instances

We assume that library authors supply correct role annotations
for their types, and therefore we do not need to check for
the availability of data constructors in Safe mode. See
discussion in #8725. This effectively fixes #8827 and #8826.

comment:10 Changed 18 months ago by goldfire

  • Status changed from new to merge

comment:11 Changed 18 months ago by Richard Eisenberg <eir@…>

In 7602bd4de901e4304a3a45dca08fc630d1bb5bf2/ghc:

Remove code reporting issues with Safe Haskell and coerce.

This is a followup to the fix for #8827, and should be merged
with that change.

comment:12 Changed 18 months ago by thoughtpolice

  • Resolution set to fixed
  • Status changed from merge to closed

Merged in 7.8.

comment:13 Changed 18 months ago by dterei

  • Cc mazieres-i58umkudjfnfpfdx6jbbn3ymzi@… added
  • Resolution fixed deleted
  • Status changed from closed to new

Reopening ticket for now. David Mazieres and I don't think that this is the right solution. It means that a developer now needs to understand beyond Haskell2010 and some subtle details just to get correctly working module export control.

Including email DM sent describing our position:

At any rate, David and I just discussed the new Coerce typeclass.
Based on David's understanding of its behavior, it sounds pretty
dangerous for Safe Haskell.  At a minimum, the programmer is going to
need to understand a lot more than Haskell 2010 to write secure code.

Based on my possibly limited understanding of the new
feature--automatically generating instances of the Coerce type seems
very un-Haskell-like.  By analogy, we could automatically generate
instance of Read and Show (or add equivalent DebugRead/DebugShow
classes) wherever possible, but this would similarly break abstraction
by providing effective access to non-exported constructors.

I understand why there is a need for something better than
GeneralizedNewtypeDeriving.  However, implementing Coerce as a
typeclass has the very serious disadvantage that there is no Haskell
mechanism for controlling instance exports.  And if we are going to
add a new mechanism (roles) to control such exports, exporting an
instance that is never requested and that undermines modularity and
abstraction is an unfortunate default.

It may be too late for this, but a cleaner solution more in keeping
with other extensions would be to have a -XDeriveCoerce extension that
allows Coerce to be explicitly derived when safe.  This could be
combined with leaving the previous behavior of
GeneralizedNewtypeDeriving and just deprecating the language feature.

Though controlling instance exports does not have a precedent, another
option might be to special-case the Coerce class and only export
instances of Coerce when all constructors of a type are also exported.
This would prevent anyone from using Coerce to do things they couldn't
already do manually.

comment:14 Changed 18 months ago by nomeata

Just as a data point:

Though controlling instance exports does not have a precedent, another option might be to special-case the Coerce class and only export instances of Coerce when all constructors of a type are also exported. This would prevent anyone from using Coerce to do things they couldn't already do manually.

This is what we had originally; the check was removed in 59722295bb8da8f01d37356fbed6aef7321a8195/ghc. It wouldn’t be hard to re-introduce it; but it will mean that a lot of uses of coerce or GND in Safe Haskell will fail, including coerce :: Set Age -> Set Int.

Maybe requiring a deriving (Coercible), or -XDeriveCoerce, or a standalone deriving declaration to get the coerce-under-type constructor behaviour isn’t such a bad idea after all, even outside the context of Safe Haskell? It would turn the current blacklisting (“add role annotatoin if it is not safe”) into whitelisting (“tell us that Coercing is safe, and how”).

For additional convenience we could retain the previous behaviour of „instance available if all constructors are in scope“, so that the author of simple cases like [], Either, Maybe do not have to do something; only those who hide their constructors have to act to allow their users to coerce under their type constructor.

TL;DR: Coercing under a type constructor is allowed if (a) all involved constructors are in scope (“could write it by hand test”) or (b) someone who had access to the constructors (most likely the author) explicitly declared the instance, using some form of deriving. No special behaviour needed for Safe Haskell.

comment:15 Changed 18 months ago by simonpj

Coercible is not a regular type class, and its "instances" are not subject to the usual rules of "always-exported". So you can 100% ignore the rules for normal instances.

The question is "under what circumstances are the instances of Coercible visible?". Or, to make it sound less type-class-ish, you could ask "what rules are used to solve Coercible constraints?".

Section 3 of the paper discusses this in some detail -- I hope everyone in this conversation has read it carefully. Specifically:

  • The "newtype-unwrapping" instance is available only if the newtype data constructor is visible. This is uncontroversial, just what Safe Haskell wants, and is clearly not what a "normal instance" is like.
  • The "lifting" instances are controlled by roles. And here there is room for debate about the proper defaults for those roles. For example, we could say that with Safe Haskell the default roles (ie if you don't give an explicit signature) of any data type declared in that module are nominal rather than representational.

Simon

comment:16 Changed 18 months ago by nomeata

For example, we could say that with Safe Haskell the default roles (ie if you don't give an explicit signature) of any data type declared in that module are nominal rather than representational.

What would that mean for inferring a module as Safe? Is A in the example Safe or nor? Would a module only be safe if it annotates everything as “nominal”?

It seems to me that any special handling of Safe Haskell module is doomed to fail, because we want to be able to infer modules as Safe, i.e. we’ll have code where we do not know whether they are intended to be used as Safe modules or not.

comment:17 Changed 18 months ago by simonpj

I think it would be helpful first for David & David to say exactly what they don't like about this:

{-# LANGUAGE Safe #-}
module Lib( T, ... ) where

  data T a = Leaf a | Node (Tree a) (Tree a)

{-# LANGUAGE Safe #-}
module Client( main )

  import Lib

  newtype Age = MkAge Int

  foo :: T Age 
  foo = ....

  bar :: T Int
  bar = coerce foo

To me this seems fine. If (inside Lib) the functions over T make use of type-class constraints over T's parameter, then we can declare a nominal role:

{-# LANGUAGE Safe #-}
module Lib( T, ... ) where

  data T a = Leaf a | Node (Tree a) (Tree a)
                -- Balanced, based on Ord a
  role T nominal

I can see the following possible choices:

  • (A) Status quo. Data types without role signatures get representational roles by default. The above is accepted as-is
  • (B) Data types without role signatures get representational roles by default, except with {-# LANGUAGE SAFE #-} which changes the default to nominal. So Lib will compile without complaint, but Client will fail. Presumably, Lib without a Safe pragma would be inferred as Unsafe, since T would have representational role.
  • (C) Data types without role signatures always get nominal roles by default. That would mean that many many data types in libraries would have to have a role signature added, if you want to lift coercions over them.

David, David, can you say what you want and why?

Simon

comment:18 Changed 18 months ago by dterei

So I need to educate myself better still but based on my current understanding (from the wiki page on roles) and the example Simon, what David and I don't like about this is that you can no longer look at just an export list for your module to verify what a consumer of your module is capable of doing.

I.e., the Safe Haskell specific issue with GND that we cared about when we disabled GND was #5498. My understanding is that the new roles and coercion infrastructure that forms the base of 7.8's GND deals with all the type safety issues where you could easily segfault a program or inspect a bool as an int. However, by default it doesn't deal with the module boundary issue. In fact, it makes it worse as now phantom roles allow coercion between them.

For example, Ptr requires explicit use of roles to ensure safety. This isn't very satisfactory to DM and I. Now a developer needs to understand GHC specific Haskell and a some subtle details to correctly implement abstractions. We'd prefer if export lists were self-contained and authoritative, not export lists + roles.

Speaking just for myself, I think (C) would be ideal. It coincides most closely with my (and what I believe most programmers) intuitive expectations of data types and abstraction is and gives the strongest safety by default. (B) has the downside of making a lot of code by default not safe and we were very much hoping solving GND would expand the default world that is safe, not reduce it. (A) as explained above isn't great in our opinion.

P.S., I'm traveling right now so may be some delays in replying but I'll try to be prompt.

comment:19 Changed 18 months ago by simonpj

I don't have a strong opinion; (C) is fine with me if library authors are happy. Maybe they will be; after all, they didn't have lifted coercions before, and (by default) they still won't.

Note that (C) is not related to the visiblity or otherwise of a recursive set of data constructors; it's simply a question of what the default roles for an un-annotated data type are.

Does anyone else care?

It's a tremendous pity that all this is happening just as 7.8 is on the verge of release (indeed, I thought it had already happened). I am reluctant to delay it while we debate this issue further. (I am not optimistic about a rapid consensus.)

Simon

comment:20 Changed 18 months ago by carter

in what cases would (C) break pre 7.8 code? I think we should at least get Edwardk's take on this, at the very least..

comment:21 Changed 18 months ago by ekmett

Both the (A) and (C) solutions have bad cases. It comes down to picking a poison.

In (C), picking nominal as the default role means the GeneralizedNewtypeDeriving basically goes away with GHC 7.8. You will be unable to use GND for anything unless every author above you in the chain puts role annotations in. Moreover, we'll be stuck writing them forever for every data type. It still closes the GND loopholes, but it does so by nuking the site from orbit.

For (A), picking representational as the default role means that authors have to deal with unexpected encapsulation leaks. This required a half dozen annotations over base, and a couple in containers, maybe a dozen in lens and affects a relative minority of modules.

Both scenarios are bad.

Argument in favor of (A) is that many authors have already refactored to write their instances this way over the course of the last 2 release candidates, because we've been hammering home that they'll have to deal with it and that in the long term it is less work for everyone going forward, but it does mean that the folks who do care about SafeHaskell need to be conscious of the new language feature.

I rather don't like the fact that (A) puts a new burden on people silently, but I really don't like that (C) requires literally everyone to annotate everything forever. We'd never stop paying that tax, and everyone who thought they were ready for 7.8 is back to the drawing board, and needs to start preparing for the release anew.

Turning this around to try to find something productive here, I have two thoughts:

1.) One option might be to add a warning to any data type that doesn't export its constructors that has representational role. That isn't perfect. Some folks expose constructors in other-modules that they don't leak to to the end user in exposed-modules, but it'd be a start and covers for example, the cases in containers.

2.) Is there actually a SafeHaskell problem here at all? coerce itself isn't currently exposed in a Safe module, so it can't be used for a SafeHaskell exploit directly without trust. I am currently unaware of any path by which we can abuse GND alone to obtain a coercion.

If (2) holds, then while it is annoying to library authors that there is a new avenue to deal with that can violate their encapsulation, it does come into user-land from the same GHC.Prim module that you get unsafeCoerce# from, which can do everything coerce can do.

comment:22 Changed 18 months ago by ekmett

One option here is to just retain the status quo that GeneralizedNewtypeDeriving isn't available in Safe Haskell.

Then you'd retain the invariant that GND absolutely can't be used to violate library invariants, and since coerce isn't available in Safe mode without explicit trust, the interaction of these two features goes away completely.

comment:23 Changed 18 months ago by dterei

What of this:

  • nominal by default when constructors aren't all exported
  • representational by default when all constructors are exported

I understand the concern with (C) and the burden, I'm just not sure how high it is with some heuristics applied as suggested above. I like this approach (although I may be missing some corner cases) as it matches what I feel is a good high level starting point that GND and surrounding infrastructure (coercion) shouldn't allow you to write code that you couldn't write by hand unless explicitly opted-in by the library author.

comment:24 Changed 18 months ago by goldfire

I tend to agree with Edward's comments above, except on one perhaps-critical point: since RC 2, there is now a new module Data.Coerce in base that does export coerce Safe-ly. So, Edward's point (2) is incorrect. See #8745.

Separately, there has been a fair amount of debate between choices (A) and (C). I have advocated for (A), based on decreasing the pain for library authors. Perhaps in a clean-slate implementation of a language with coerce, I would favor (C), but that is not the case we have in front of us. (Yes, under (C), we would require more annotations, but we already require a host of instance declarations for many types, and this would just be yet another thing that Haskellers would know to do.)

The recursive-constructor-check that is described in the beginning of this ticket has lost its luster for me. If Safe inference worked, it would solve most of the problems brought up with the interaction between Safe and coerce. But, it also makes the coerce feature / GND very bowdlerized under Safe Haskell, and many desired uses of GND would no longer be possible. And, it's ugly from a user standpoint, requiring lots of importing of names unmentioned in code.

I see a two feasible ways forward, given the lateness of hour:

1) Keep the status quo.

2) Remove GND and coerce from the Safe subset for 7.8. We could look into ways of bringing them into Safe Haskell for 7.10.

In particular, I don't think that Simon's option (C) is viable, unless we want a Release Candidate 3, which I don't. Separately from the choice between (1) and (2), I definitely favor issuing warnings around missing role annotations, but it's unclear to me where the warnings should go. Edward's idea for warning when abstractly exporting a type with representational roles and no role annotation is a good start, and is perhaps the answer.

David's suggestion above (controlling roles in export lists) seems to present strange action-at-a-distance, in that if I don't export constructors from their defining module, the behavior is different from when I don't export the constructors from a different module. I worry that, while perhaps catching common cases, it would be very unexpected behavior in the details.

comment:25 Changed 18 months ago by thoughtpolice

Just as a note, I was talking with Edward about this for just a moment, and I'm inclined to agree with him and Richard that we should bar GND and Data.Coerce from -XSafe for 7.8 as this discussion goes on. It's a bit 11th hour, but the changes are easy and safe.

comment:26 Changed 18 months ago by ekmett

Ah. I had missed the addition of Data.Coerce.

I think the least pain would be temporarily marking that module Unsafe, and reverting to the pre 7.8 rule of no GND under Safe and then all we'll have "done no harm" as all code that worked before will continue to work and no new security problems will have been introduced.

I'd also be okay with looking into the 'if you don't export all your constructors you get a nominal role' rule for 7.10, but I am somewhat leery of introducing it in 7.8 after folks have already braced for a rather different impact.

comment:27 Changed 18 months ago by nomeata

May I throw the suggestion in comment:14 again into the ring? Most data types are exported with their constructors, so we want their users to coerce and GND as they wish; with that suggestion, no additional annotations are needed, and coerce can be replaced by hand-written code.

Abstract type constructors would, under this proposal, not be coercible by default, but this can easily be enabled using deriving (even instance deriving anywhere where the corresponding manual code can be written).

This is a variant of David’s suggestion (controlling roles in export lists) in that we correlate coercibility with constructor scope, but does not have the awkwardness of making the defining module’s export list special. In fact, it would quite precisely follow the guidance of “coerce (and GND) work exactly when you could write manual code for it”.

comment:28 Changed 18 months ago by dterei

So given the late hour, perhaps (sadly) as others have suggested, we keep the status quo for 7.8 and leave GND and Coercion unsafe. Be nicer if we could have solved for 7.8 but that's on me for not be reachable easily or following the mailing list.

Going forward, my suggestion or a similar proposal such as nomeata's in comment:27 sound the nicest to me. We could potentially argue for this with data as well by seeing how prevalent coercion and GND is on abstract types that don't have explicit roles.

If everyone can quickly reach consensus though on the above proposal then I'm very happy with that as well :)

comment:29 Changed 18 months ago by goldfire

I'll admit to liking the general flavor of comment:27, but the details elude me somewhat. What, precisely, would go after the word deriving? What's the interaction between derived instances and role annotations? Could a derived instance be more restrictive than the assigned roles? (It certainly couldn't be less restrictive!) How would a user specify that?

On a separate note, and with no intention to start a flame war, this process would have been easier if we had had a more accurate timeline on the 7.8 release. When some of this debate started in October, we debated under the "late hour" premise, as well. The RCs hadn't gone out, obviously, but any less-than-almost-fully-baked idea was discarded as "too much work too late". If we had known that we had a few months to work out a proposal and implement it, it's possible that we'd be in a better place today. Of course, schedules slip, but communicating a best guess (even as that guess changes) would have been helpful.

comment:30 Changed 18 months ago by nomeata

I'll admit to liking the general flavor of comment:27, but the details elude me somewhat. What, precisely, would go after the word deriving?

Well, either data MyList a = [a] deriving Coercible, which will give you the same instance as described in the paper, but independent of the scope of the constructor.

Or deriving instance Coercible a b => Coercible (Map k a) (Map k b), which allows you to control the coercing

This is even strictly more powerful that our current system: It would allow the author of Map to export the instance above, while still using coerce in his code (with constructors in scope and no abstraction to follow) to coerce the key, if he wishes to do so.

What's the interaction between derived instances and role annotations? Could a derived instance be more restrictive than the assigned roles? (It certainly couldn't be less restrictive!) How would a user specify that?

As you say: The derived instance have to adhere the roles, but may be more restrictive (if created using a standalone deriving instance). The non-standalone-deriving instance would be the one that we have currently.

OTOH, this gets hairy once we mix tycons with exported constructors and tycons without, but with exported Coercible instances, in a new data declaration and try to find out when we can coerce under that... and that issue is well handled by roles and role inference. *shrug*

comment:31 follow-up: Changed 18 months ago by ekmett

deriving Coercible a b => Coercible (Map k a) (Map k b)

is somewhat in line with what I've been playing around with to try to work out how we can lift coercions over complex data types.

e.g.

class Representational (t :: k1 -> k2) where
  rep :: Coercion a b -> Coercion (t a) (t b)
  default rep :: Phantom t => Coercion a b -> Coercion (t a) (t b)
  rep _ = phantom

class Representational t => Phantom t where
  phantom :: Coercion (t a) (t b)
  default phantom :: Coercible (t a) (t b) => Coercion (t a) (t b)
  phantom = Coercion

Then easy cases we can already handle work:

instance Representational Proxy
instance Phantom Proxy

instance Representational Tagged
instance Phantom Tagged
instance Representational (Tagged a) where rep Coercion = Coercion

instance Representational Const where rep Coercion = Coercion
instance Representational (Const a)
instance Phantom (Const a)

instance Representational Coercion     where rep = unsafeCoerce
instance Representational (Coercion a) where rep Coercion = Coercion

instance Representational (->)       where rep Coercion = Coercion
instance Representational ((->) a)   where rep Coercion = Coercion

But with a few helpers

coerce1 :: Coercible a b => Coercion a c -> Coercion b c
coerce1 = coerce

coerce2 :: Coercible b c => Coercion a b -> Coercion a c
coerce2 = coerce

-- from Control.Lens as a placeholder
new :: (Rewrapping s t, Coercible (Unwrapped s) s, Coercible (Unwrapped t) t) => Coercion (Unwrapped s) (Unwrapped t) -> Coercion s t
new = coerce1 . coerce2

-- I don't see how to implement this one directly
eta :: forall (f :: x -> y) (g :: x -> y) (a :: x). Coercion f g -> Coercion (f a) (g a)
eta = unsafeCoerce

we can write several hard cases that are currently beyond our reach:

instance (Representational f, Representational g) => Representational (Compose f g) where
  rep = new.rep.rep

instance Representational m => Representational (StateT s m) where
  rep = new.rep.rep.eta.rep

instance Representational m => Representational (ReaderT e m) where
  rep = new.rep.rep

instance Representational m => Representational (WriterT w m) where
  rep = new.rep.eta.rep

Then instead of lifting Coercible a b into Coercible (f a) (f b) based on the role of f's next argument, it'd lift using the Representational instance.

I'd been mostly exploring this as a straw man, but if we're throwing 'big changes' into the discussion, I felt it worth mentioning as a direction.

Mind you the code I have above is implementable and implemented with the existing Coercible machinery.

It isn't perfect though, e.g. I don't know a better way to implement

instance Representational f => Representational (Compose f) where
  rep = unsafeCoerce

comment:32 Changed 18 months ago by Austin Seipp <austin@…>

In 8f7303774237a8b0787d98c5ab6f605e3e897f19/ghc:

Revert "Fix #8745 - GND is now -XSafe compatible."

See #8827 - for now, we're making GND unsafe again.

This also fixes the tests since they were originally not using the new
unicode quote style we're using.

This reverts commit a8a01e742434df11b830ab99af12d9045dfcbc4b.

comment:33 Changed 18 months ago by Austin Seipp <austin@…>

In 2dbde340fae8122342a4d7c13fea7890ab963d11/base:

Mark Data.Coerce as Unsafe (#8827)

Signed-off-by: Austin Seipp <[email protected]>

comment:34 Changed 18 months ago by thoughtpolice

  • Milestone changed from 7.8.1 to 7.10.1
  • Version changed from 7.8.1-rc2 to 7.9

This is now taken care of in HEAD and 7.8, so I'm moving it out of the milestone.

comment:35 in reply to: ↑ 31 Changed 18 months ago by simonpj

Replying to ekmett:

is somewhat in line with what I've been playing around with to try to work out how we can lift coercions over complex data types.

Edward, I'm not following all the details of your comment here (like where is Coercion defined?), but it smells to me that you are hitting the the main limitation of the system as-implemented, namely the inability to abstract over a type constructor with a representational argument. Eg

data T f a = MkT (f a)

Can I do Coercible (T f Int) (T f Age) (where Age is a newtype for Int)? Only if we are guaranteed that f will be instantiated with type constructor with representational roles.

So you want higher-order roles. We know how to do that, but rejected it as Too Complicated. We could revisit that I guess.

I may also be misunderstanding your problem. Perhaps it would help to exhibit the simplest case that doesn't work.

Simon

comment:36 Changed 18 months ago by simonpj

OK, so we've essentially decided (for 7.8)

  • Keep the status quo (as presented in the paper) for 7.8
  • But GND and Data.Coerce are both out of bounds for Safe Haskell

That's ok with me. Subsequent to 7.8, the Davids (or other Safe Haskell folk) may want to propose a concrete plan for getting them back in. Or maybe it just doesn't matter because Safe Haskell clients don't care (enough) about GND or Coercible.

FWIW I am keen to avoid any solution based on direct user control of Coercible instances. We have roles (we need them regardless to guarantee type-soundness); what we want can be expressed through roles or role signatures; adding another mechanism of control that does almost the same thing should be avoided if at all possible.

Indeed, I don't think in terms of instance declarations for Coercible at all; instead it's just a matter of what rules are available for solving Coercible constraints. To take an analogy, implicit parameters are internally implemented as type-class constraints, with some special rules. They share some stuff in common with type class constraints, but are best thought of separately. Same with Coercible.

Simon

comment:37 Changed 8 months ago by thoughtpolice

  • Milestone changed from 7.10.1 to 7.12.1

Moving to 7.12.1 milestone; if you feel this is an error and should be addressed sooner, please move it back to the 7.10.1 milestone.

comment:38 Changed 5 months ago by simonpj

The GHC 7.6 user manual used to say: GeneralizedNewtypeDeriving — It can be used to violate constructor access control, by allowing untrusted code to manipulate protected data types in ways the data type author did not intend, breaking invariants they have established..

But that was removed in the 7.8 user manual, which seems to contradict the conclusion in comment:36. Would someone like to fix the manual to say what it should say, whatever that now is?

See this Haskell Cafe thread.

Simon

comment:39 follow-up: Changed 3 weeks ago by oerjan

  • Cc oerjan added

It seems to me that there have been several competing goals mentioned here. However, I do not think they are impossible to mostly satisfy simultaneously, except perhaps for simplicity of the design. (In particular, I think it requires reinstating the constructor check.)

  1. Code that is not annotated with roles should still largely enjoy the same module encapsulation as in H2010, so that module writers do not need to consider the implications of coerce or GND if they are not actually using them. With Safe Haskell, coerce and GND should not be able to create code based on such a module that couldn't be written "by hand".
  2. Even without role annotations, coerce and GND should still be possible to use in Safe Haskell for most code that can be written by hand. (Preferrably as much as today without Safe Haskell enabled).
  3. Safe Haskell should be inferrable without changing the semantics of a module.
  4. Exporting all the constructors of a type from an Unsafe "Internal" module should not prevent data encapsulation by not reexporting them from a Trustworthy one.
  5. Explicit role annotations, when used, should overrule all automatic restrictions on Safe mode, since that means the author has explicitly stated their intent.

Point 1 means that it is not ideal to make roles default to representational with no further checks. Point 2 means, similarly, that it is not ideal to make nominal the default. Point 4 means that any constructor export check cannot just be done by looking at the module defining the type.

Given this, I hope the following is compatible with all the goals above:

  • Default inferred role remains representational.
  • Any use of coerce (including via GND) must respect roles etc. as currently without Safe Haskell.
  • If allowed in general, a "lifting" use of coerce is compatible with Safe if either:
    • The type has an explicit role annotation, or
    • All of the type's data constructors are in scope.

comment:40 in reply to: ↑ 39 ; follow-up: Changed 3 weeks ago by goldfire

Replying to oerjan:

Given this, I hope the following is compatible with all the goals above:

  • Default inferred role remains representational.
  • Any use of coerce (including via GND) must respect roles etc. as currently without Safe Haskell.
  • If allowed in general, a "lifting" use of coerce is compatible with Safe if either:
    • The type has an explicit role annotation, or
    • All of the type's data constructors are in scope.

For this to work out, the last check above must be recursive, looking at all datatypes mentioned in those in-scope data constructors, out to the leaves. Otherwise, a programmer could write a trivial wrapper around a type; all the data constructors would be in scope for the wrapper, and then the programmer could coerce away. It's the recursiveness of this check that's annoying.

Another (small) problem with this is that it means redundant role annotations are no longer a no-op. For example:

data Maybe1 a = Just1 a | Nothing1

data Maybe2 a = Just2 a | Nothing2
type role Maybe2 representational

Maybe1 and Maybe2 will have subtly different behavior with respect to Safe Haskell under this proposal. And just because the author of Maybe2 wanted to add some documentation about roles, much like most programmers add easy-to-infer type signatures. I don't think the point I'm making should kill this proposal, but it is a downside.

comment:41 in reply to: ↑ 40 Changed 3 weeks ago by oerjan

Replying to goldfire:

For this to work out, the last check above must be recursive, looking at all datatypes mentioned in those in-scope data constructors, out to the leaves. Otherwise, a programmer could write a trivial wrapper around a type; all the data constructors would be in scope for the wrapper, and then the programmer could coerce away. It's the recursiveness of this check that's annoying.

Oops. What's worse, this interacts with role annotations:

module A (A) where

-- inferred type role A representable
data A a = A' a

...

{-# LANGUAGE Safe #-}
module B where

import A

type role B representative

newtype B b = B (A b)

By my literal proposal above, module B has just managed to achieve such a wrapper, just with a role annotation and without A' in scope. Which means my point 5 is unsupportable as given, and the closest option I can think of requires tracking Coercible separately from roles - B here should have the role given, since that's the inferred one, but it should not therefore automatically be Coercible to A, except where A' is in scope.

Or wait, hm. Perhaps a simpler option might be to distinguish representable from inferredRepresentable (better name hereby solicited), and have the rule that inferredRepresentable can be treated as representable only when all necessary constructors are in scope, recursively until you reach a non-inferred role. (Similarly for phantom, I guess.) That makes the role annotation for B wrong because A's role is inferredRepresentable, and A' is not in scope.

Also, there should then probably be a pragma to automatically remove the "inferred" when possible for types declared in a module.

Warning: the above was written in a bit of a hurry.

comment:42 Changed 2 weeks ago by oerjan

I think what was wrong with my first attempt was forgetting that role annotations also need the safety check. Also the safety of an explicit role annotation has two sides:

  • When using an explicit role annotation, the recursive check for that type is disabled at the use site. I.e. an explicit role annotation is to be interpreted as explicit permission to do the implied coercions Safely without further ado thereafter.
  • However, the annotation itself should not be Safe (nor the module containing it) unless the coercions implied by it can be written by hand using what's in scope in the module, plus coerce for other types with explicit annotations. This means a recursive check for all its non-nominal parameters.

Putting this together, the recursive check, when it happens, should cut off whenever an explicit role annotation is reached. A role annotation for a type transfers coercion Safety responsibility from the module using the role to the module declaring it.

So I'll modify point 5, and add a point 6 that became apparent:

  1. Explicit role annotations, when used, transfer the responsibility for the Safety of coerceing a type to the annotating module.
  2. To avoid breaking code, none of the new restrictions will apply outside of Safe mode.

So, to sum up the modified design:

  • Default inferred role remains representational.
  • Explicit role annotations do indeed have subtly different semantics from implicitly inferred ones: they are considered stated author permission to coerce a type, even when not all constructors are in scope at the use site.
  • Declaring a role annotation, coerceing types, or GND require checking the (non-nominal) parameter roles for Safety. Failure of the check means the module is not Safe, but has no other semantic effect.

My non-expert attempt to define the recursive Safety check, probably similar to what GHC used to do:

  • A nominal role is always Safe.
  • The role of a parameter from an explicit role annotation is Safe.
  • The role of a parameter from a type's implicitly inferred role annotation is Safe if:
    • The type's data constructors are all in scope, and
    • the roles of all nested type parameters containing the given parameter on the right side of the datatype declaration are Safe.

A few more thoughts/options:

  • As I mentioned, there probably should be an extension to make implicitly inferred roles behave as explicit ones for types declared in a module. Maybe RoleAnnotations itself will do, or at least imply it.
  • Since this proposal makes implicit and explicit role annotations subtly different, someone might want to have a way to explicitly declare a role annotation that behaves like an inferred one. Maybe the inferredRepresentational idea from my previous message.
  • By point 6, module encapsulation can still be broken nilly-willy by non-Safe modules, even by accident. Perhaps this is a bad idea, and an explicit extension (NoScopedRolesRestriction?) should be required to disable the recursive check for non-Safe modules. However, this would break backwards compatibility, and probably need a deprecation period.

comment:43 Changed 2 weeks ago by goldfire

A few reactions:

  • I think this proposal holds water, from a technical standpoint.
  • I'm worried about the efficiency of Safe-inference. This proposal requires doing a recursive check, potentially loading new interface files, just for inferring Safety. This makes me sad. I have no suggestion for improvement, however.
  • Along similar lines, just expanding a module's import list can change it from unSafe-inferred to Safe-inferred. Are the extra imported data constructors considered used in the module? That is, do they get "redundant import" warnings? Either answer to that last question seems wrong.
  • The reason that the default role is representational (phantom, actually) is solely for backward compatibility. In a newly-minted language, I think choosing nominal here would be rather uncontroversial. Is it worth recasting this debate as a move toward a nominal-default? With this change, the Safety issue goes away. This would be socially harder, but perhaps better in the long run.

Here's a stab at that last point:

Plan to introduce, in GHC 7.14, a new extension PhantomDefaultRoles. With this extension, roles would work as they do today. Roles would also be Safe, because the author declares their knowledge of roles by using this extension. Without the extension enabled, a nominal role is default.

We would introduce a warning in GHC 7.12 about this impending change. I don't have a good proposal for how to implement the warning, though -- it would amount to an inefficient recursive check. Maybe someone has a better idea.

By making this change via an extension, it's easy for library-writers to upgrade: include a straightforward conditional block in a .cabal file. All packages could be upgrade by something like (I didn't look up concrete syntax) if impl(ghc >= 7.12) default-extensions: PhantomDefaultRoles. Easy!


In sum, I think that the proposal in comment:42 is feasible, but I'm worried about efficiency. But I think that tackling the larger problem of default roles is better in the long run.

comment:44 Changed 2 weeks ago by oerjan

My hunch about nominal default is that it would be annoying in the same way that authors forgetting to provide a useful type class for a type is annoying, but worse because a user of the package cannot afterwards declare an orphan Coercible instance, even if they need it. You could write an explicit coercion function, or use unsafeCoerce, but if you were an intermediate package writer declaring a wrapping type, you couldn't then provide Coercible for the users of your type.

comment:45 Changed 2 weeks ago by oerjan

I was hoping that the cutoff at explicit role annotations would mitigate the efficiency issue - presumably most major packages would have them, or the pragma to make implicit ones explicit. Otherwise, perhaps the interface files could list the needed constructors for an implicit role parameter?

comment:46 Changed 13 days ago by goldfire

You're right about the annoyance of a nominal default, and that it's worse than with classes. Bah.

And, we may be scared about an efficiency issue that does not arise much in practice, especially with a user-controllable cutoff mechanism. I don't think adding the constructors to interface files helps much, because a client would still need to import the constructors from an exporting module.

Simon PJ has had strong opinions here in the past, and I'm curious to see what he thinks.

Note: See TracTickets for help on using tickets.