Opened 3 years ago

Closed 3 months ago

#5462 closed feature request (fixed)

Deriving clause for arbitrary classes

Reported by: simonpj
Owned by: dreixel
Priority: normal
Milestone: 7.10.1
Component: Compiler
Version: 7.2.1
Keywords: Generics
Cc: wadler@…, jpm@…, hvr@…, dterei, leather@…, illissius@…, erkokl@…, mail@…, v.dijk.bas@…
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets: #7346
Differential Revisions: Phab:D476

Description

Currently, you can use a deriving clause, or standalone-deriving declaration, only for

  • a built-in class like Eq or Show, for which GHC knows how to generate the instance code
  • a newtype, via the "newtype-deriving" mechanism.

However, Pedro's new generic-default mechanism means that it makes perfect sense to write this:

data T a = ...blah..blah... deriving( Generic )

instance C a => C (T a)  -- No 'where' clause

where C is some random user-defined class. Usually, an instance decl with no 'where' clause would be pretty useless, but now that we have default method signatures, in conjunction with deriving( Generic ), the instance can be useful.
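A minimal sketch of such a where-less instance, for illustration only. The class CountFields, its helper class GCountFields, and the shape of T are hypothetical and not part of this ticket; the generic default simply counts the fields stored in a constructor, so this particular instance needs no context.

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeOperators #-}

module EmptyInstanceSketch where

import GHC.Generics

-- Hypothetical user-defined class with a generic default method.
class CountFields a where
  countFields :: a -> Int
  default countFields :: (Generic a, GCountFields (Rep a)) => a -> Int
  countFields = gcount . from

-- Helper class that walks the generic representation.
class GCountFields f where
  gcount :: f p -> Int

instance GCountFields U1 where
  gcount _ = 0                          -- constructor without fields

instance GCountFields (K1 i c) where
  gcount _ = 1                          -- one field, whatever its type

instance (GCountFields f, GCountFields g) => GCountFields (f :*: g) where
  gcount (x :*: y) = gcount x + gcount y

instance (GCountFields f, GCountFields g) => GCountFields (f :+: g) where
  gcount (L1 x) = gcount x
  gcount (R1 x) = gcount x

instance GCountFields f => GCountFields (M1 i t f) where
  gcount (M1 x) = gcount x              -- skip metadata wrappers

data T a = Leaf | Node a (T a) a
  deriving (Generic)

-- No 'where' clause: the method comes from the default signature.
instance CountFields (T a)
```

In GHCi, countFields applied to a Node value counts its three fields, and to Leaf it counts none, although the instance body is empty.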

That in turn leads to a desire to say

data T a = ...blah..blah... deriving( Generic, C )

which is even more compact. Is the extra compactness worth it? Presumably we'd only want to do this for some specified classes C, so we'd need some way to say this at C's declaration site. Something like:

class C a where
  op :: a -> a
  default op :: blah => a -> a
  op = <code for the default method>

{-# DERIVABLE C #-}   -- This says you can say
                      -- data T = ... deriving( C )

I'm a bit dubious about whether the payoff justifies the cost, but I'm recording the idea in this ticket.
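For comparison with the pragma idea above: the change that closed this ticket shipped in GHC 7.10 as the DeriveAnyClass language extension, which needs no per-class marker. The sketch below assumes a hypothetical class Describe and datatype Colour, neither of which appears in this ticket.

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DeriveGeneric #-}

module DeriveAnyClassSketch where

import GHC.Generics (Generic)

-- Hypothetical class whose only method has a default signature.
class Describe a where
  describe :: a -> String
  default describe :: Show a => a -> String
  describe x = "value: " ++ show x

-- Describe sits in the deriving clause next to the built-in classes.
-- GHC generates the empty instance: instance Describe Colour
data Colour = Red | Green
  deriving (Show, Generic, Describe)
```

Here describe Red evaluates through the default method, exactly as it would for a hand-written empty instance.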

Change History (48)

comment:1 Changed 3 years ago by dreixel

  • Cc jpm@… added

comment:2 Changed 3 years ago by guest

I would argue that it is worth the extra compactness, in that I want
to be able to teach this to first years. I don't want to teach them
to use

... deriving Show

to get unpretty printing, but then they have to use the more long-winded

instance (Pretty a, Pretty b) => Pretty (Tree a b)

to get pretty printing. If it's good for the goose (your built-in
derived types) it's good for the gander (my custom derived types).

-- Philip Wadler (logged in as guest because the registration function is buggy)

comment:3 Changed 3 years ago by hvr

  • Cc hvr@… added

Could this be designed in such a way that one would also be able to optionally provide a TemplateHaskell code generator for user-defined auto-derivable instances?

comment:4 Changed 3 years ago by dterei

I would vote in favour of this. I was a little disappointed, when learning the new deriving mechanism, that you couldn't already do this. I like that this method presents no difference to the user between deriving Show and deriving C. I think that matters more than the compactness: reusing the same interface for both lowers the complexity.

comment:5 Changed 3 years ago by dterei

  • Cc dterei added

comment:6 follow-up: Changed 3 years ago by simonpj

I hate the {-# DERIVABLE C #-} pragma though -- it's not really a pragma at all (in the sense of affecting optimisation but not semantics or what programs are typeable). But we need some way to say that class C is one that can appear in a deriving clause. We can't allow just any old class in a deriving clause!

Any other ideas?

comment:7 in reply to: ↑ 6 Changed 3 years ago by dterei

Replying to simonpj:

We can't allow just any old class in a deriving clause!

I don't understand this issue. Why do we need to annotate certain classes as being allowed in a deriving clause? The annotation doesn't add any extra information (except a boolean to say it's allowed), so GHC must have everything it needs to derive. Why would I want to write a class and all the supporting code to allow it to be derived but not add the DERIVABLE pragma?

comment:8 Changed 3 years ago by dreixel

One extra issue to consider is that the instance head is not trivial to generate. Say we have an arbitrary deriving clause:

data MyDatatype a = C1 | C2 a (MyDatatype a) deriving MyClass

We want to convert it into an empty instance:

instance (...) => MyClass (MyDatatype a)

But how do we fill in the constraints? This depends on both MyClass and MyDatatype, and in a non-trivial way, as far as I can see. E.g., if MyClass implements a show function, we probably want the MyClass a constraint. But if a was actually a phantom type, then we don't. Nor if MyClass computes shallow equality, for instance. Things get even worse when MyDatatype is not a regular type (think of nested datatypes, or types with higher-order arguments).

This problem does not occur when standalone deriving is used, because then the user specifies the constraints directly. But I guess that limiting this new feature to standalone deriving would make it a bit inconsistent, and also defeat the purpose of extra compactness/simplicity.

comment:9 follow-up: Changed 3 years ago by simonpj

Standalone deriving is, in effect, already allowed for any class. You just write

instance C a => C (T a)  -- No 'where' clause

(not deriving instance, just plain instance). Like any instance decl, this will fill in with the default methods; and if the default methods use Pedro's new generic-programming infrastructure, the default methods will do the right thing.

The question for this ticket is: should we be able to generate this (where-clause-less) instance decl by saying data T a = ... deriving( Generic, C )?

You ask "Why would I want to write a class and all the supporting code to allow it to be derived but not add the DERIVABLE pragma?" By "all the supporting code" I assume you mean default methods a la Pedro (see http://www.haskell.org/ghc/dist/current/docs/html/users_guide/type-class-extensions.html#class-default-signatures). Well, you wouldn't. But what counts as "you have written all the supporting code"?

Perhaps your proposal is this? A class can appear in a deriving clause if (and only if)

  • the class has at least one default foo :: type signature, and
  • the class has a default method definition for every method

Thus

class C1 a where  -- NO (no default method)
  op1 :: a -> a

class C2 a where  -- NO (no default method signature)
  op2 :: a -> a
  op2 x = x

class C3 a where  -- YES (both are present)
  op3 :: a -> a
  default op3 :: Ord a => a -> a
  op3 x = max x x  -- uses Ord a and returns an a, so it type-checks

But I'm not sure whether that is what you meant.

comment:10 in reply to: ↑ 9 Changed 3 years ago by dterei

Replying to simonpj:

But I'm not sure whether that is what you meant.

Yes, that is what I meant. Basically GHC should set a list of requirements that must be met (chosen so GHC has enough information to correctly derive), and if a class meets them you can put it in a deriving clause. Otherwise, if you put it in a deriving clause, you'll get an error message like you currently do.

The proposed DERIVABLE pragma amounts to the same thing anyway, except that I guess you could throw an earlier and perhaps slightly more informative error if you encounter a class that has the pragma but doesn't meet the requirements.

comment:11 Changed 3 years ago by dterei

Oh, and as part of the change it would be good to add support to Haddock to produce docs that tell you whether a class can be derived or not. Could also add a command to GHCi.

comment:12 follow-ups: Changed 3 years ago by simonpj

The question is: what should the 'list of requirements' be?

Remember it is always legal to say

instance C (T a) where {}

because the compiler will fill in a bunch of defaults from the class decl, or error thunks if you supplied none. It's just not usually useful to do so.

When is it useful? Really only the programmer can tell us. I was proposing above that a clue might be that s/he provided at least one default-method signature (using the new Pedro stuff) since that is our main route to providing non-trivial defaults.
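A small sketch of the "error thunk" case, assuming a hypothetical class Pretty with no default method and a hypothetical type T. The empty instance compiles (GHC only warns about the missing method), and the failure shows up when the method is forced at run time.

```haskell
module EmptyInstanceThunk where

import Control.Exception (SomeException, evaluate, try)

-- Hypothetical class with no default method at all.
class Pretty a where
  pretty :: a -> String

data T = MkT

-- Legal: GHC warns about the missing method and fills in an error thunk.
instance Pretty T where {}

-- | True when forcing 'pretty MkT' raises an exception.
prettyIsErrorThunk :: IO Bool
prettyIsErrorThunk = do
  r <- try (evaluate (length (pretty MkT))) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)
```

Running prettyIsErrorThunk reports that the method call fails, which is why an empty instance of a class without defaults is rarely what the programmer wants.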

Yes it'd be a little smoother to allow deriving( Generic, C ). But it needs design.

comment:13 in reply to: ↑ 12 Changed 3 years ago by dterei

Replying to simonpj:

Remember it is always legal to say

instance C (T a) where {}

because the compiler will fill in a bunch of defaults from the class decl, or error thunks if you supplied none. It's just not usually useful to do so.

Ahh I wasn't aware you could do this. OK, well are there any downsides to the list of requirements you proposed? They are what I was imagining myself as my context for all of this is the new Pedro stuff.

comment:14 in reply to: ↑ 12 ; follow-up: Changed 3 years ago by guest

Replying to simonpj:

the compiler will fill in a bunch of defaults from the class decl, or error thunks if you supplied none. It's just not usually useful to do so.

What was proposed sounds perfectly reasonable, but just to ask about the more liberal option: while a collection of error thunks is far more likely a programmer mistake than the intended effect of putting an arbitrary class into the deriving clause (dterei didn't expect it could even work, nor did I), I'd think getting a boring instance of default methods is precisely what s/he would expect if deriving from a class w/o any default method signatures. So why not let him/her have it? Apart from probable uselessness, is there a price to be paid by such a choice?

comment:15 in reply to: ↑ 14 Changed 3 years ago by hvr

Replying to guest:

I'd think getting a boring instance of default methods is precisely what s/he would expect if deriving from a class w/o any default method signatures. So why not let him/her have it? Apart from probable uselessness, is there a price to be paid by such a choice?

How would this interact with the GeneralizedNewtypeDeriving extension? Could there be cases where the code compiles fine both with and without GeneralizedNewtypeDeriving, but in the first case the wrapped type's instance methods are "inherited", and in the second case the default methods of the typeclass are used? Would it be a bad thing if switching on a language extension changed the semantics of the program (w/o any visible compile warning)?
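The two possible meanings can be put side by side in a sketch. It assumes a later GHC (8.2 or newer) with the DerivingStrategies extension, which postdates this discussion, and a hypothetical class Describe; the strategy keywords make explicit which of the two instances is generated for a newtype.

```haskell
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

module NewtypeAmbiguitySketch where

-- Hypothetical class with an ordinary default method.
class Describe a where
  describe :: a -> String
  describe _ = "default description"

instance Describe Int where
  describe n = "Int " ++ show n

-- "Inherited" instance: reuses the Describe Int methods.
newtype ViaNewtype = ViaNewtype Int
  deriving newtype Describe

-- Empty instance: every method falls back to the class default.
newtype ViaDefault = ViaDefault Int
  deriving anyclass Describe
```

The same deriving clause therefore yields "Int 3" for one newtype and "default description" for the other, which is the silent change in semantics the question is about.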

comment:16 Changed 3 years ago by igloo

  • Milestone set to 7.6.1

comment:17 Changed 3 years ago by spl

  • Cc leather@… added

comment:18 Changed 3 years ago by mux

I would be in favor of this too; but since I'm not entirely sure what the original proposal meant exactly, let me state what I think the semantics should be.

I think that this should be strictly syntactic sugar, without any compiler logic behind it. So:

data T = ... deriving C

would get translated to:

instance C T

The intent is to keep things as simple as possible. As a consequence, trying to add some class to a deriving clause would fail if it requires a context, exactly in the same way as an "instance C T" without any context would.

comment:19 Changed 3 years ago by illissius

  • Cc illissius@… added

comment:20 Changed 3 years ago by lerkok

  • Cc erkokl@… added

comment:21 Changed 2 years ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2

comment:22 follow-up: Changed 13 months ago by lerkok

  • difficulty set to Unknown

Is there any progress on this ticket? Looks like milestone was changed to 7.6.2. I've got 7.6.3 installed, which does not seem to support such deriving instances. Do I need a particular flag to enable this?

comment:23 Changed 13 months ago by dreixel

  • Keywords Generics added
  • Milestone changed from 7.6.2 to
  • Owner set to dreixel

comment:24 in reply to: ↑ 22 ; follow-up: Changed 13 months ago by dreixel

Replying to lerkok:

Is there any progress on this ticket? Looks like milestone was changed to 7.6.2. I've got 7.6.3 installed, which does not seem to support such deriving instances. Do I need a particular flag to enable this?

No, this was not implemented yet. But I like the idea, and I think Simon's specification is good (repeating here):


A class can appear in a deriving clause if (and only if) the class has at least one default foo :: type signature, and the class has a default method definition for every method. Thus:

class C1 a where  -- NO (no default method)
  op1 :: a -> a

class C2 a where  -- NO (no default method signature)
  op2 :: a -> a
  op2 x = x

class C3 a where  -- YES (both are present)
  op3 :: a -> a
  default op3 :: Ord a => a -> a
  op3 x = max x x  -- uses Ord a and returns an a, so it type-checks

Only thing left to answer is how to determine the context (in case of standard deriving; in standalone deriving, the user provides the context). For that, I think that Andres's proposal in #7346 sounds reasonable:


I propose that if normal deriving is used, GHC uses the same heuristic for figuring out the class context that it uses for Eq in the case of *-kinded classes, and for Functor in the case of * -> *-kinded classes. That may not be optimal, or may even be wrong. But in such cases, standalone deriving can still be used.


If there is no opposition, I'm happy to have a go at implementing this. I agree with dterei that Haddock and GHCi support would be desirable too.

Last edited 13 months ago by hvr

comment:25 Changed 13 months ago by kosmikus

  • Cc mail@… added

comment:26 Changed 13 months ago by basvandijk

  • Cc v.dijk.bas@… added

comment:27 in reply to: ↑ 24 ; follow-up: Changed 5 months ago by hvr

Replying to dreixel:

If there is no opposition, I'm happy to have a go at implementing this. I agree with dterei that Haddock and GHCi support would be desirable too.

As I'd be excited to see this feature sooner rather than later... have you given it a shot yet? :)

comment:28 in reply to: ↑ 27 Changed 5 months ago by dreixel

Replying to hvr:

As I'd be excited to see this feature sooner rather than later... have you given it a shot yet? :)

No, but maybe I can do it (or convince someone else to do it) this weekend at HacBerlin!

comment:29 follow-up: Changed 5 months ago by rwbarton

Can't we determine the context for the derived instance from the contexts of the default method type signatures?

comment:30 in reply to: ↑ 29 Changed 5 months ago by dreixel

Replying to rwbarton:

Can't we determine the context for the derived instance from the contexts of the default method type signatures?

Not really. Look at the serialisation example from the wiki page:

putDefault :: (Generic a, GSerialize (Rep a)) => a -> [Bit]

Nothing here tells us that we need Serialize constraints on the datatype parameters.

More info comes from the K1 instance, though:

instance (Serialize a) => GSerialize (K1 i a) where
  gput (K1 x) = put x

Here we see that the constructor arguments will require a Serialize instance. So if parameter a appears as argument to a constructor, we should introduce the constraint Serialize a. But this is non-trivial in general... the constraint might be part in an instance of the form K1 i a :*: g, for example.
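To make the point concrete, here is a self-contained sketch along the lines of the wiki's serialisation example. It is an approximation rather than the wiki's exact code: putDefault is folded into a default signature, and the Bit type, the sum encoding, the Bool instance and the Pair datatype are assumptions added for illustration.

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeOperators #-}

module SerializeSketch where

import GHC.Generics

data Bit = O | I deriving (Eq, Show)

class Serialize a where
  put :: a -> [Bit]
  -- The default's context mentions only Generic a and GSerialize (Rep a).
  default put :: (Generic a, GSerialize (Rep a)) => a -> [Bit]
  put = gput . from

class GSerialize f where
  gput :: f p -> [Bit]

instance GSerialize U1 where
  gput U1 = []

instance (GSerialize f, GSerialize g) => GSerialize (f :*: g) where
  gput (x :*: y) = gput x ++ gput y

instance (GSerialize f, GSerialize g) => GSerialize (f :+: g) where
  gput (L1 x) = O : gput x              -- tag the chosen alternative
  gput (R1 x) = I : gput x

instance GSerialize f => GSerialize (M1 i t f) where
  gput (M1 x) = gput x

-- This is where the Serialize constraint on field types comes from.
instance Serialize a => GSerialize (K1 i a) where
  gput (K1 x) = put x

instance Serialize Bool where
  put False = [O]
  put True  = [I]

data Pair a = Pair a a
  deriving (Generic)

-- The context Serialize a is written by hand; it is demanded only
-- indirectly, through the K1 instance, once Rep (Pair a) is unfolded.
instance Serialize a => Serialize (Pair a)
```

Dropping the Serialize a context from the last instance makes GHC reject it, even though nothing in the default signature of put mentions that constraint.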

I think going for Andres's proposal is still the best idea, as it is simple and relates to something people already know.

comment:31 Changed 5 months ago by simonpj

You know what I'm going to say. Can you write a wiki page that specifies the feature, as seen by the programmer, as precisely as possible? When is it legal? What instance declaration is generated? What is the context on that instance declaration?

If there are any implementation wrinkles, they can appear in a separate section on the wiki page.

Thanks

Simon

comment:32 Changed 5 months ago by rwbarton

Well, I was imagining that you would simply reduce the constraints using whatever instances are in scope, just like when inferring the type of a top-level definition.

However, now I see that this falls apart in the presence of non-uniform recursion. Quite annoying... but in this case there seems to potentially be a genuine choice involved, so it might be sensible to just reject "deriving" then and require the user to write their own instance head.

comment:33 Changed 5 months ago by dreixel

  • Milestone changed from to 7.10.1

I'm working on this in branch wip/T5462. Coding is complete, now I just need to write the user manual description and an informative wiki page.

See 7a4cdef85b0fa03a22fda595ac92870465d8c727

comment:34 follow-up: Changed 5 months ago by simonpj

Goals:

  • Save a few characters: say deriving( C ) rather than deriving instance C a => C (T a)
  • Make it possible for a library author to make a class C which can be used in ..deriving( C ) just like built-in classes.
  • Ultimately: replace the built-in code for deriving Eq, Ord etc with generic code.

Concerns

  • Conflict between this feature and GND for newtypes.
    • Current proposal is that this new feature is not available for newtypes.
    • An alternative: a declared property of the class says "do not use GND" in deriving clauses. (Examples: Show, Read, and one or two others.) Maybe we don't even need to make this user-controllable; a handful of built-in classes may suffice.
    • Another alternative: fail if there is ambiguity.
    • Richard wants it to be a property of the class and over-rideable at the deriving site.
  • For a top-level declaration instance C a => C (T a), this is always an empty instance, filled in by generic defaults if they exist. But what about deriving instance C a => C (T a)? Does that mean precisely the same thing?
  • If you say data T a = ... deriving( C ), and change to data T a = ...; deriving instance C a => C (T a), does that make a difference?
  • Possible principle. Use per-class control for newtype ... deriving( C ) situation, but per-instance control for deriving instance and instance declarations.

Another question:

  • Just having one generic default method may not be enough. Maybe two are needed to give a well-founded instance.
  • Better: class author writes an explicit MINIMAL pragma with an empty minimal set. That says, fully explicitly, that the class author is happy with instances that have no explicit methods.
  • I think we converged on the latter.

comment:35 in reply to: ↑ 34 Changed 5 months ago by hvr

Replying to simonpj:

  • Make it possible for a library author to make a class C which can be used in ..deriving( C ) just like built-in classes.
  • Ultimately: replace the built-in code for deriving Eq, Ord etc with generic code.

IMHO the value of these two goals should not be underestimated, as this feature would allow custom Preludes to provide (as an example) custom Eq/Ord instances different from the default ones generated by GHC, while still using the automatic derivation syntax as if they were built-ins (which I don't think RebindableSyntax already allows at this point).

Last edited 5 months ago by hvr

comment:37 Changed 5 months ago by hvr

@dreixel, is your code ready for code review as a Phab:differential?

comment:38 follow-up: Changed 5 months ago by dreixel

No, not yet, I'm afraid.

comment:39 Changed 5 months ago by sjoerd_visscher

From the wiki page: "We could try to figure this out in a clever way from the definition of the class being derived, but this is very hard in general." Why is this very hard in general? If I leave out the context in instance MyClass (MyDatatype a) then GHC will complain that it is missing MyClass a. So can't we use the type checker to give us the required context?

comment:40 in reply to: ↑ 38 Changed 4 months ago by hvr

Replying to dreixel:

No, not yet, I'm afraid.

Is this still possible for GHC 7.10? Do you need help with something?

comment:41 Changed 4 months ago by dreixel

Oh, I definitely want it in for 7.10. But I thought we were doing a 7.8.4, so 7.10 got pushed back? What's the deadline for feature merge?

comment:43 Changed 4 months ago by hvr

  • Differential Revisions set to Phab:D476

It's up for review at Phab:D476!

comment:44 Changed 4 months ago by hvr

  • Status changed from new to patch

comment:46 Changed 4 months ago by dreixel

I need to update that page. On it.

comment:47 Changed 3 months ago by Austin Seipp <austin@…>

In 7ed482d909556c1b969185921e27e3fe30c2fe86/ghc:

Implement #5462 (deriving clause for arbitrary classes)

Summary: (this has been submitted on behalf of @dreixel)

Reviewers: simonpj, hvr, austin

Reviewed By: simonpj, austin

Subscribers: goldfire, thomie, carter, dreixel

Differential Revision: https://phabricator.haskell.org/D476

GHC Trac Issues: #5462

comment:48 Changed 3 months ago by thoughtpolice

  • Resolution set to fixed
  • Status changed from patch to closed