Opened 5 years ago

Closed 2 years ago

Last modified 4 months ago

#6018 closed feature request (fixed)

Injective type families

Reported by: lunaris Owned by: jstolarek
Priority: normal Milestone: 8.0.1
Component: Compiler Version: 7.4.1
Keywords: TypeFamilies, InjectiveFamilies Cc: simonpj@…, mokus@…, eir@…, nathanhowell@…, nfrisby, conal@…, shane@…, andy.adamsmoran@…, byorgey@…, douglas.mcclean@…, ollie@…, acowley@…, ekmett@…, jan.stolarek@…, ggreif@…, vogt.adam@…, haskell@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case: T6018, T6018fail, T6018rnfail, T6018th, T6018failclosed1, T6018failclosed2, T6018ghci, T6018ghcifail, T6018ghcirnfail
Blocked By: Blocking:
Related Tickets: #4259, #10832, #10833 Differential Rev(s): Phab:D202
Wiki Page:

Description

Injective type families have been discussed several times on the mailing list and identified as a potentially useful feature.

I've naively attempted a proof-of-concept in GHC. It's almost certainly incorrect and probably breaks the type system, but I thought it best to put it here and see if it can be made workable.

In summary, my changes are:

  • Add a new keyword, injective, which is available only when the TypeFamilies extension is enabled. Injective families may then be defined thus:
  injective family F a :: *
  type instance F Int  = Bool
  type instance F Bool = Int

  injective family G a :: *
  type instance G a = a

Syntax is of course completely changeable; I've simply picked one of the possible designs. Hopefully this won't be subjected to too much bike-shedding.

  • Add the constructor InjFamilyTyCon to TyConRhs and the family flavour InjectiveFamily to HsSyn. Again, I've opted to encode injectivity as a flavour rather than (say) a Bool attached to type families. This is a completely arbitrary choice and may be utterly stupid.
  • Altered the definition of decomposable TyCons to include injective families (isDecomposableTyCon). This effectively introduces a typing rule that says if we have (F a ~ F b) then we can deduce (a ~ b) (TcCanonical).
  • Modified the unifier so that it will attempt to replace saturated injective families with their left-hand sides where possible (TcUnify). This means that in a function such as:
  f :: F a -> F a
  f = ...

The type of f False is inferred as F Int (i.e., a is no longer ambiguous).

Some things work, some things don't. For example, the attached file typechecks, but if I actually try to evaluate f False I get nothing (not even a Segmentation fault).

See what you think.

Attachments (2)

Injective.hs (296 bytes) - added by lunaris 5 years ago.
Demonstration of injective families
injective-families.patch (55.1 KB) - added by lunaris 5 years ago.
Patch against GHC HEAD to support injective families

Download all attachments as: .zip

Change History (122)

Changed 5 years ago by lunaris

Attachment: Injective.hs added

Demonstration of injective families

Changed 5 years ago by lunaris

Attachment: injective-families.patch added

Patch against GHC HEAD to support injective families

comment:1 Changed 5 years ago by lunaris

Oops -- I forgot to list all that was wrong with the current implementation. Namely:

  • In the example, g's type is inferred as a -> a (and not G a -> G a). This is in some sense `correct', but seems a bit dodgy.
  • I don't actually *enforce* that the declared instances of a family *are* injective yet -- I've put that off until what's there is semi-stable.
  • I've messed up and inadvertently included whitespace changes in one of the patches. Apologies.

comment:2 Changed 5 years ago by simonpj

difficulty: Unknown

Some functions might be injective in one argument but not another. For example:

  F a b ~ F c d   ===>    a ~ c
                but not   b ~ d

comment:3 Changed 5 years ago by lunaris

Indeed; I'd not considered that case. I'm not aware of a well-defined semantics for injective type functions in the context of constraint solving so I've no idea how to proceed. There are of course syntactic complications also.

comment:4 Changed 5 years ago by mokus

Cc: mokus@… added

comment:5 Changed 5 years ago by mokus

An interesting use case is type-level co-naturals using -XDataKinds (which I'm using as the "height" type index for an experimental 2k-tree implementation):

data Nat 
    = Zero
    | Suc Nat

data CoNat
    = Co Nat
    | Infinity

type family Succ (t ::  CoNat) :: CoNat
type instance Succ (Co n)           = Co (Suc n)
type instance Succ Infinity         = Infinity

Succ can't be a constructor, because then Succ Infinity /~ Infinity. Succ as a type family half-works, but you end up needing (logically-)unnecessary type annotations all over the place because GHC doesn't know that Succ n ~ Succ m => n ~ m. It might be possible to devise some kind of type-level bisimilarity constraint, but I suspect it would actually increase the need for type annotations rather than decrease it.

I've also tried an alternate definition of the tree type using "Pred" instead of "Succ", but that seems to lead to really bizarre types - it accepts obviously non-flat trees at a type indicating that the tree's height is zero (or actually, any CoNat I want), because it accepts the existence of types such as "Pred (Co Zero)". Is this expected behavior (e.g., due to the open-world assumption) or should I file a bug report on that?

comment:6 Changed 5 years ago by goldfire

Cc: eir@… added

comment:7 Changed 5 years ago by nathanhowell

Cc: nathanhowell@… added

comment:8 Changed 5 years ago by nfrisby

Cc: nfrisby added

comment:9 Changed 5 years ago by igloo

Milestone: 7.8.1
Owner: set to simonpj

Simon, I'm no sure what the status of this is, but it looks like your area so I'm assigning it to you.

comment:10 Changed 5 years ago by conal

Cc: conal@… added

Note this discussion, which includes the subtlety of injective in various arguments (holding other fixed).

comment:11 Changed 5 years ago by goldfire

Also see the discussion in #4259, which is about strengthening type families along a different dimension. The two tickets are related.

comment:12 Changed 4 years ago by duairc

Cc: shane@… added

comment:13 Changed 4 years ago by morabbin

Cc: andy.adamsmoran@… added

comment:14 Changed 4 years ago by byorgey

Cc: byorgey@… added

comment:15 Changed 4 years ago by morabbin

Noting for completeness: possibly related to #7205 and #5591.

Why I'm interested: I'm modeling the Java type system in Haskell (overload resolution in particular) a la chak and Andrew's Salsa, using type families and type functions heavily. Their code worked under earlier versions of GHC, but not under 7.4 and beyond.

comment:16 Changed 4 years ago by simonpj

Andy, No version of GHC had injective families. If you have code that you think should work, and used to, and doesn't now, by all means submit a but report!

Simon

comment:17 Changed 4 years ago by morabbin

It turns out I was missing -XscopedTypeVariables. I'm still interested in injective type families, but it's not a blocker (and Manuel and Andrew's code still compiles).

comment:18 Changed 4 years ago by Artyom.Kazak

Cc: Artyom.Kazak@… added

comment:19 Changed 3 years ago by dmcclean

Cc: douglas.mcclean@… added

comment:20 Changed 3 years ago by thoughtpolice

Milestone: 7.8.37.10.1

Moving to 7.10.1.

comment:21 Changed 3 years ago by ocharles

Cc: ollie@… added

comment:22 Changed 3 years ago by acowley

Cc: acowley@… added

comment:23 Changed 3 years ago by simonpj

ollie, acowley: if you have a real-life use case for injective type families, do give the details here. Currently we don't have many compelling examples.

Thanks

Simon

comment:24 Changed 3 years ago by ocharles

It's a little hard to say whether or not injective type families would actually help with what I'm doing - it could well be that I don't just understand this type level magic enough! However, in a few cases where I've got stuck I've moved things to a GADT to witness the relation between types instead, which I believe does get me injectivity and from that points things seemed to "just work" - albeit requiring a bit more effort. I'll keep my eyes out, and try and provide examples here if I find anything concrete.

comment:25 Changed 3 years ago by MikeIzbicki

Here's a real worldish use case.

Let's try to write a modified Functor class that lets us map over not just the last type variable, but any type variable. This would be useful, for example, with the Either type.

The class might look something like:

class Functor loc f where
    fmap :: proxy loc -> (a -> b) -> SetLocation f loc a -> SetLocation f loc b

Where loc is a type that uniquely identifies the position of the argument we want to map over. We can't make an instance of this class unless the SetLocation type family is injective.

With a little syntactic sugar, we can convert the above class into something that looks very much like the current Functor class, but is quite a bit more powerful:

class Functor loc f where
    fmap :: proxy loc -> (a -> b) -> f { loc = a } -> f { loc = b }

I don't think you can get something nearly that clean without injective type families. And, of course, this applies to any parametric classes currently in use, not just Functor.

comment:26 Changed 3 years ago by spacekitteh

Here's another real world use case.

Given

class Manifold' a where
    type Field a
    type Base  a
    type Tangent a
    type TangentBundle a
    type Dimension a∷  Nat
    type UsesMetric a∷ Symbol
    project ∷ a → Base a
    unproject ∷ Base a→ a
    tangent ∷ a → TangentBundle a
    cotangent ∷ a → (TangentBundle a→ Field a)

then this works:

id' ∷ ∀ a. ( Manifold' a) ⇒ Base a → Base a
id' input= project out where 
    out∷a
    out = unproject input

whereas this requires injective type families:

id' ∷ ∀ a. ( Manifold' a) ⇒ Base a → Base a
id' = project ∘ unproject

comment:27 Changed 3 years ago by hpacheco

Is there a current version of the patch that works against the current GHC HEAD? The 2-year-old one fails.

I can say that I have been trying to define injective type families ever since TypeFamilies were introduced! Especially to define functions like the fmap that MikeIzbicki showed or very simple type-level aliases between different type classes.

It also increases modularity of data families. For example, it would allow defining datatypes in different modules, and combining them together using type family aliases in a parent module.

module A where

data A = ...
data FA = ...

module B where

data B = ...
data FB = ...

module Main where
import A
import B

type family F a :: *
type instance F A = FA
type instance F B = FB
Last edited 3 years ago by hpacheco (previous) (diff)

comment:28 Changed 3 years ago by acowley

My intended uses seem to arise quite often, but I've not found one that boils down into a small, self-contained, compelling example. In broad terms, I would like to use type families for decoupling, as hpacheco describes, and to do things like move back and forth between a set of distinct types and singleton Nats.

comment:29 Changed 3 years ago by simonpj

I don't think anyone is working on injective type families right now, but it would make a good project, and not too difficult.

Lunaris's patch has a lot of irrelevant white-space changes in it. But the biggest problem is that I don't think it implements the all-important check that a injective type family really is injective. Consider

injective type family F a

type instance F Int = Bool
type instance F Char = Bool

This isn't injective and should jolly well be rejected. And this should be the case even if the two type instance declarations are in different modules.

Similar code already exist for rejecting type-family overlap; I think it is in FamInst.checkFamInstConsistency. But it'd need to be beefed up to deal with the injectiveness check.

And you need to take care. What about

type instance F Int = Bool        -- (BAD)
type instance F Char = F Int      -- (BAD)

The two RHSs look different, but actually they are the same. The criterion for injectivity is that

If    (F t1 .. tn) ~ (F s1 .. sn)
then  t1 ~ s1
      ...
      tn ~ sn

That doesn't hold for the (BAD) instances, so they should be rejected. In the terminology of the "Closed type families" paper, we need the two RHSs to be apart.

Simon

comment:30 Changed 3 years ago by hpacheco

To get going, I wouldn't mind if the onus of proving that the type family is indeed injective fell on me, in an unsafe way. But I understand that it is clearly not a feature that you should blindly introduce in mainstream GHC... I may look into it in the future if I ever find the time, but I doubt it.

comment:31 Changed 3 years ago by ekmett

Cc: ekmett@… added

comment:32 in reply to:  2 Changed 3 years ago by jstolarek

I spent some time recently thinking about this one. It is not clear to me what should happen when we know that a type family F is injective. It would certainly allow us to infer from (F a ~ F b) that a ~ b (modulo corner cases). So instead of first reducing injective type family application and then working on the result we could instead reason about type family's arguments. We could even invert type families. Is there anything else we could reason about with injective type families?

The discussion here is only about open type families (obviously, closed type families were not implemented two years ago). I guess that now this change would affect both open and closed type families?

Replying to simonpj:

Some functions might be injective in one argument but not another. For example:

  F a b ~ F c d   ===>    a ~ c
                but not   b ~ d

That is not true according to the mathematical definition of injectivity (as stated in comment 29). Unless we want to introduce definition of type families injective only in some of the arguments, as proposed by Iavor.

comment:33 Changed 3 years ago by jstolarek

Cc: jan.stolarek@… added

comment:34 Changed 3 years ago by goldfire

I thinking about this ticket, I've imagined some syntax like this:

type family FullyInjective a | Result -> a
       -- use a fun-dep-like syntax, where "Result" is a keyword, capitalized to avoid name-clashes with type variables
type family Plus a b | Result a -> b, Result b -> a
type family Only1stInjective a b | Result -> a

I could even imagine having dependencies among the arguments, though that might be a little strange.

comment:35 Changed 3 years ago by Artyom.Kazak

Cc: Artyom.Kazak@… removed

comment:36 Changed 3 years ago by heisenbug

Cc: ggreif@… added

comment:37 Changed 3 years ago by jstolarek

Owner: changed from simonpj to jstolarek

Simon, I'm stealing this one from you :-)

comment:38 Changed 3 years ago by simonpj

Fine. But please let's have a wiki page that describes the design in detail, from the user's point of view, before you start implementing anything!

  • Syntax
  • The effect on which programs are well-typed

Simple is good!

Simon

comment:39 Changed 3 years ago by jstolarek

Simon, I just drafted this wiki page. I believe it gives a pretty good description of how the syntax will work, but it doesn't give too many details about the effects on type checking. Do you want this to be somehow formalized or would examples suffice? Do you feel that something else is missing on the wiki page?

Simple is good!

Are you referring to anything particular or is this just a general reminder to follow the KISS principle?

comment:40 Changed 3 years ago by goldfire

I'm still not sold on the concrete syntax involving result, despite having introduced that idea myself. I like the (**) suggestion here a bit more: http://www.haskell.org/pipermail/ghc-devs/2014-July/005492.html

Separately, a change to Core will be necessary to really make this work in all cases.

data X a where
  MkX :: (F b c ~ a) => b -> c -> X a

Say F is injective (in both arguments). After pattern-matching on MkX ... :: X (F Int Bool), we will probably want to discover that b is Int and c is Bool. To do so, we will need to decompose the (F b c ~ a) coercion, using the left and right (or perhaps nth) coercion forms. These currently don't work over type families, and for good reason. However, if a type family is injective, then these could be made to work over type families. With the update to Core, we would probably want to make sure we don't lose type safety, though!

All that said, this use case is obscure, and the primary push for injective type families does not seem to require a change to Core. It is worth implementing without this piece. But, the idea came up in conversation, and I thought I should record it.

comment:41 in reply to:  40 ; Changed 3 years ago by jstolarek

Replying to goldfire:

I'm still not sold on the concrete syntax involving result, despite having introduced that idea myself. I like the (**) suggestion here a bit more: http://www.haskell.org/pipermail/ghc-devs/2014-July/005492.html

So, to be precise, with this syntax the examples from the wiki page would look like this:

type family F a b c | F -> a b c
type family G a b c | G -> a b
type family H a b c | H a -> b c, H b -> a c, H c -> a b
type family J a b c | J a -> b c, J b -> a c

I did not propose to use the notation like H b -> a c because I thought it might mislead people to think that: a) H could be partially applied; b) some parameters of H could be skipped (ie. we can apply it to b without first applying to a). Luckily, this is not a blocking issue for the rest of implementation. It should only affect 3 files in the parser, so it can be changed later.

Separately, a change to Core will be necessary to really make this work in all cases.

(...)

All that said, this use case is obscure, and the primary push for injective type families does not seem to require a change to Core. It is worth implementing without this piece. But, the idea came up in conversation, and I thought I should record it.

This is beyond my knowledge - I would require some guidance here. But if it's possible to implement injective type families without this change then I would postpone it until later and possibly consider it a separate task.

comment:42 in reply to:  41 Changed 3 years ago by goldfire

Replying to jstolarek:

This is beyond my knowledge - I would require some guidance here. But if it's possible to implement injective type families without this change then I would postpone it until later and possibly consider it a separate task.

Yes, it's possible, but there are some circumstances in which users could say that GHC gives an erroneous type error, in that GHC would manifestly have enough knowledge to make progress but doesn't. I agree that you should go ahead without this piece, though!

comment:43 Changed 3 years ago by aavogt

Cc: vogt.adam@… added

comment:44 Changed 3 years ago by ibotty

Cc: haskell@… added

comment:45 Changed 3 years ago by simonpj

Before embarking on the implementation, is everyone convinced that the proposed syntax

type family F a :: * 
   | result -> a

is the best one? I understand the desire to support injectivity in some parameters but not others.

For closed type classes, can't we infer injectivity, rather than having to declare it? (Although I suppose that the ability to declare is always a Good Thing.)

Simon

comment:46 Changed 3 years ago by goldfire

That syntax precludes the possibility of result being the name of an argument to the function. Using Result avoids this problem, but that would be a capitalized keyword, which is strange. The other possibility that avoids stealing the name result is to use the type family itself to represent the result: type family F a | F -> a, but that's a little strange. I guess I favor the last syntax the most.

comment:47 Changed 3 years ago by rwbarton

We could also reuse an existing keyword, e.g., type family F a :: * | type -> a.

comment:48 Changed 3 years ago by goldfire

A conversation with Ulf Norell (the first implementor of Agda) led to this interesting thought: We may not want injectivity -- we may want a weaker property, which I'll term "head-injectivity".

The equations defining a head-injective function all have RHSs with different heads. What appears below the heads does not matter. Here is an example:

type family F a where
  F (Just x) = True
  F Nothing  = False

This function is not injective: F (Just 1) ~ F (Just 2). Yet, it is head-injective. If the type-inferencer knows that F x ~ True, it can then refine x ~ Just y for some fresh unification variable y. This improvement may be enough to make further progress.

Of course, injectivity does not imply head-injectivity. Consider

  type family G a where
    G True  = Just 1
    G False = Just 2

This function is injective but not head-injective. So, implementing only head-injectivity will miss this case. But, perhaps we can refine the notion of head-injectivity simply to mean that the RHSs don't unify.

The reason I'm bringing this up is that head-injectivity is, as I understand it, implemented in Agda and used during type inference. It intrigues me to think that we can make progress in the case of F, a non-injective function.

comment:49 Changed 3 years ago by jstolarek

Replying to simonpj:

Before embarking on the implementation, is everyone convinced that the proposed syntax

type family F a :: * 
  | result -> a

is the best one?

I think the bulk of the work with injectivity will be done under the hood. Changing the syntax later on will not be difficult and I don't want to be blocked by bikeshedding.

Inferring injectivity for closed type families is something that indeed requires more discussion. But again, I feel I can make at least some progress without any binding decissions on this matter being made.

I don't like any of Richard's proposals:

  • if we introduced Result this would be the first capitalized keyword (reserved word?) in Haskell. I don't like such inconsistency. Notice also that result might conflict with type variable names in type family head but that is a local conflict that restricts only the names possible to use for type variables. On the other hand Result might conflict with a name of a type (or type family), which is more of a global thing.
  • if we use syntax that uses type family name then we will get something like type family F a b c | F b -> a c and F b looks strange given that F's arity is 3. It looks as if F could be partially applied.

I don't think that restricting the use of result word in type family head is a big problem - we've already done that with role.

Last edited 3 years ago by jstolarek (previous) (diff)

comment:50 Changed 3 years ago by jstolarek

Richard, I don't understand your terminology here (though I do understand the F example):

The equations defining a head-injective function all have RHSs with different heads. What appears below the heads does not matter.

Could you explain what you mean by RHSs with different heads and below the heads? Up till now I thought that the term "head" refers to type family F a part of the declaration.

I also don't understand why G is not head-injective, but perhaps this will become clear to me once earlier questions are answered.

comment:51 in reply to:  49 ; Changed 3 years ago by goldfire

Replying to jstolarek:

I don't think that restricting the use of result word in type family head is a big problem - we've already done that with role.

The use of role doesn't steal syntax: it can be used only after the word type in a top-level declaration, where a capitalized identifier is expected.

comment:52 in reply to:  50 ; Changed 3 years ago by goldfire

Replying to jstolarek:

Could you explain what you mean by RHSs with different heads and below the heads? Up till now I thought that the term "head" refers to type family F a part of the declaration.

Here, "head" means the head of the RHS type, not the type declaration -- it's the root of the type's AST. Does that help?

I also don't understand why G is not head-injective, but perhaps this will become clear to me once earlier questions are answered.

G isn't head-injective because both RHSs are headed by Just.

comment:53 in reply to:  52 ; Changed 3 years ago by jstolarek

Replying to goldfire:

Here, "head" means the head of the RHS type, not the type declaration -- it's the root of the type's AST. Does that help?

Yes, now I fully understand.

Also, if I understood correctly it is possible to have both injectivity and head-injectivity in the compiler, right?

comment:54 in reply to:  51 Changed 3 years ago by jstolarek

Replying to goldfire:

Replying to jstolarek:

I don't think that restricting the use of result word in type family head is a big problem - we've already done that with role.

The use of role doesn't steal syntax: it can be used only after the word type in a top-level declaration, where a capitalized identifier is expected.

Oops, sorry for defamation then ;-) Still, I think that the price to pay for using result will not be that high.

comment:55 in reply to:  53 Changed 3 years ago by goldfire

Replying to jstolarek:

Also, if I understood correctly it is possible to have both injectivity and head-injectivity in the compiler, right?

Yes, but confusing, especially if there is syntax for both.

comment:56 in reply to:  48 Changed 3 years ago by heisenbug

Replying to goldfire:

A conversation with Ulf Norell (the first implementor of Agda) led to this interesting thought: We may not want injectivity -- we may want a weaker property, which I'll term "head-injectivity".

So when our goal is to make progress from the target expression towards the source expression, what hinders us to follow Omega and use narrowing to obtain the head? Tim Sheard has a nice 2006 paper on how to do it. It is really powerful and we could insist that exactly one solution exists.

comment:57 Changed 3 years ago by simonpj

(GHC does, in effect, use narrowing, I believe.)

Let's stand back a bit. What are we trying to achieve. Here is the Number 1 Goal: if the type inference engine is trying to prove F a ~ F beta, where a is a skolem constant and beta is a unification variable, is it justified in fixing beta := a? If F is not injective the answer is "no". If it's injective, the answer is "yes". We have multiple examples of the need for this.

More generally, our goal is to use injectivity to guide the decision about what type a unification varaible stands for. Injectivity also allows other such opportunities, beyond the Number 1 Goal above. Suppose F is injective, and we have

F Int = True

Then if we have F beta ~ True we can deduce beta := Int.

Richard raises further possiblities: possibly-partial information about the result might allow us to deduce possibly-partial information about the arguments. That is the stuff about head-injectivity and it seems to be potentially very complicated. Take a closed type family. If I tell you something (but perhaps not everything) about the result, perhaps you can tell me something (but perhaps not everything) about the argument. I don't know how clever you might be here, or how to declare the cleverness in an "injectivity signature" for open families.

But we have no actual use-caes for anything except Number 1 Goal, for equalities of form F t1 ~ F t2. It seems that equalities of form F t1 ~ some-type offer potential opportunities, but I don't see a sweet spot.

Simon

comment:58 in reply to:  57 ; Changed 3 years ago by jstolarek

Replying to heisenbug:

Tim Sheard has a nice 2006 paper on how to do it.

What's the title of this paper? I tried to google it but failed.

Replying to simonpj:

For closed type classes, can't we infer injectivity, rather than having to declare it?

I thought about it and I believe this should be possible. But we could only infer normal injectivity. I believe an algorithm for inferring injectivity in some of the parameters (ie. things like result a -> b c or result -> b c for a 3-arity type family) would be exponential in the number of arguments. Inferring normal injectivity should be O(n2) at most, where n is the number of equations.

Replying to simonpj:

But we have no actual use-caes for anything except Number 1 Goal, for equalities of form F t1 ~ F t2.

It looks like Conal Elliot has a use case for injectivity in only some arguments: http://www.haskell.org/pipermail/glasgow-haskell-users/2011-February/020027.html I'll contact him for more details.

comment:59 Changed 3 years ago by goldfire

Simon's (very straightforward) example clarifies this in my mind -- I agree about just heading for plain old injectivity for open type families. If/when we want more involved reasoning for closed type families, where inference is possible, we can address that.

comment:60 in reply to:  58 Changed 3 years ago by heisenbug

Replying to jstolarek:

Replying to heisenbug:

Tim Sheard has a nice 2006 paper on how to do it.

What's the title of this paper? I tried to google it but failed.

"Type-level Computation Using Narrowing in Ωmega", Elsevier, 2007

comment:61 in reply to:  59 Changed 3 years ago by jstolarek

Replying to goldfire:

plain old injectivity for open type families.

more involved reasoning for closed type families, where inference is possible

I'm not sure what design choices are hidden behind these two statements. Do you mean that for open type families we would have: a) simple injectivity in the result only (via annotations) or b) injectivity in some of the arguments (also via annotations); whereas for closed type families we would admit the same kind of injectivity declarations as for open type families and perhaps also injectivity inference as proposed by me in comment 58?

Last edited 3 years ago by jstolarek (previous) (diff)

comment:62 in reply to:  58 Changed 3 years ago by dfeuer

Replying to jstolarek:

I thought about it and I believe this should be possible. But we could only infer normal injectivity. I believe an algorithm for inferring injectivity in some of the parameters (ie. things like result a -> b c or result -> b c for a 3-arity type family) would be exponential in the number of arguments.

I don't know anything about this, but what's wrong with exponential time here? This inference only happens once per closed type family, right? How many variables are there likely to be?

comment:63 Changed 3 years ago by danilo2

Hello! Is there any patch which would allow me to use injective type families in current GHC? I saw on the top of this page, that a patch was provided, but about 2 years ago - are there any working implementations?

comment:64 Changed 3 years ago by jstolarek

what's wrong with exponential time here?

Compilation times. If a type family takes a single argument then obviously there is only one possible injectivity condition ("result -> a"). For two arguments there are 5 possible injectivity conditions. For 3 this number jumps to 19. 4 arguments => 65 combinations. 5 arguments => 211 combinations. 6 arguments => 665 combinations. Let's assume (for illustrative purposes) that verifying (ie. attempting to infer) a single injectivity condition for a type family takes 0.01s. Under such assumptions inferring all possible injectivity conditions for a type family with 6 arguments will take ~6.5s. We can't do that.

Is there any patch which would allow me to use injective type families in current GHC?

No. But if all goes well all this discussion here will culminate with such a patch :-) In the meantime, if you have any motivating use cases that require injectivity to work please share them if you can. This would be most helpful.

I've spent last several days thinking intensely about the design and made some progress. Hopefully, I'll incorporate my notes into the wiki page by the end of the week. Will get back with some ideas and proposals once this happens.

comment:65 Changed 3 years ago by danilo2

@jstolarek: Oh, I made a mistake, I've got no example regarding injective type families. I'm sorry for the confusion!

Last edited 3 years ago by danilo2 (previous) (diff)

comment:66 Changed 3 years ago by jstolarek

Differential Rev(s): Phab:D202

comment:67 Changed 3 years ago by jstolarek

I've made a major update to the wiki page. Here are a few things worth pointing out:

In this section I've listed all the syntax proposals made so far. I've also mailed Haskell-cafe so we can get some feedback from the community.

I christened various flavours of injectivity as A, B and C :-) -- see this section. I will refer to them as such in the rest of this comment.

I mailed Conal Elliot and he says he does not remember what was his use case for type families injective in only one argument. This means we only have use cases of the form A shown by Simon. Given that, I believe that it is best if we limit ourselves to implementing A. We can revise this design later if someone demonstrates compelling examples of B or C.

I have outlined my idea for the algorithm that would verify users injectivity declarations in this section. Simon, please tell me if there are any misconceptions about how GHC's type inference works.

After some thinking I am against inferring injectivity for closed type families - see my arguments in this section.

comment:68 Changed 3 years ago by jstolarek

EDIT: Please ignore earlier versions of this comment.

A nice syntax proposal showed up on Haskell-cafe. The idea is to use type variables introduced to name the result:

type family Id a :: result | result -> a where
type family F a b c :: d | d -> a b c
type family G a b c :: foo | foo -> a b where
type family Plus a b :: sum | sum a -> b, sum b -> a where
type family H a b c :: xyz | xyz a -> b c, xyz b -> a c
Last edited 3 years ago by jstolarek (previous) (diff)

comment:69 Changed 3 years ago by rwbarton

That reads very oddly to me, but how about...?

type family Id a = result | result -> a where

comment:70 in reply to:  69 Changed 3 years ago by jstolarek

Replying to rwbarton:

That reads very oddly to me, but how about...?

type family Id a = result | result -> a where

You mean that the :: result syntax reads oddly? It is already supported by GHC and the nice thing about this proposal is that we can piggyback on the syntax we already have.

comment:71 Changed 3 years ago by simonpj

The difficulty is that when you say

   type family F (a::*) :: *

you expect a to be a type variable and * to be a kind. So if you say

   type family F (a::*) :: (b :: *)

it looks as though b is a kind variable, and * is a sort. That's why Reid wants to use = rather than ::, and he is right. It's reasonable to have either of these

type family F a :: *
type family F a = (b :: *)

but not

type family F a :: (b :: *)

The merit of the "= blah" notation is that it allows us to give a name to the result, rather than use a magic name. I think it's probably just about worth the syntactic cost.

Simon

comment:72 Changed 3 years ago by jstolarek

Ah, of course. I was thinking about this problem earlier but then I somehow convinced myself that this will not be an issue.

comment:73 Changed 3 years ago by jstolarek

Simon, Richard, others,

I'm making progress on this one but before going too far I want to get a green light on my design proposal (points 1-3). I've also bumped into some problems and have a couple of questions.

My design proposal is to implement only injectivity of type A (ie. plain injectivity in all arguments). The consequences of this are:

  1. A user is allowed to write at most one injectivity condition.
  1. That condition must contain only the result type variable on the LHS of -> and all other type variables on the RHS of ->. In other words if I declare type family P a b = r then I'm only allowed to write r -> a b or r -> b a injectivity condition. Questions here: a) can I add a restriction that type variables on the RHS must be given in exactly the same order as they were given in the head declaration? This would make the implementation simpler; b) where should this check be done? In the renamer?
  1. Once we check 2) we must verify that the type family is indeed injective. Not sure where this should happen? My guess is that during typechecking of a type family equations. Algorithm is outlined here.

Does the implementation plan outlined so far look sensible? Are there any misconceptions?

  1. Once we pass these checks injectivity becomes a binary property, so we can discard injectivity conditions written by the user and replace it with a Bool. Richard, when we first spoke about injective type families you suggested that all the magic will go into isDecomposableTyCon in types/TyCon.lhs. I've added a Bool field to SynTyCon data constructor of TyCon data type and made isDecomposableTyCon use that field to tell whether a type family is injective or not. (Checks described in 2 and 3 are not implemented. When a user writes injectivity declaration for a type family I just assume it is correct.) Sadly, this does not work. Here's an example test case:
type family F a = r | r -> a where
    F Int = Bool
    F Bool = Int
    F a = a

foo :: F a -> F a
foo = id

This fails with:

Couldn't match type ‘F a0’ with ‘F a’
NB: ‘F’ is a type function, and may not be injective
The type variable ‘a0’ is ambiguous
Expected type: F a -> F a
  Actual type: F a0 -> F a0
In the ambiguity check for: forall a. F a -> F a
To defer the ambiguity check to use sites, enable AllowAmbiguousTypes
In the type signature for ‘foo’: foo :: F a -> F a

I added traces to verify that isDecomposableTyCon is called for F type family and it correctly returns True.

Out of curiosity I turned on AllowAmbiguousTypes and added a definition like this:

bar :: Int -> Int
bar = foo

That failed with:

Couldn't match type ‘F a0’ with ‘Int’
The type variable ‘a0’ is ambiguous
Expected type: Int -> Int
  Actual type: F a0 -> F a0
In the expression: foo
In an equation for ‘bar’: bar = foo

I imagine that getting this one to work definitely requires more changes than just isDecomposableTyCon. Then again GHC can already deal with some cases of injectivity:

type family F a = r | r -> a where
    F Int = Int
    F Bool = Bool
    F a = a

foo :: F a -> F a
foo = id

bar :: Int -> Int
bar = foo

That works perfectly fine in GHC 7.8.

I tried to analyse what's going on with -ddump-tc-trace but the dumps are 600-1000 lines long. Of course the source code itself is even longer, so I'd appreciate any directions where should I start looking.

One final question. Assuming that SynTyCon really should have a new field, I believe this field should be stored in interface files. After all we want injectivity information to propagate from one module to another. Now I wonder how this interacts with open type families. I understand that these are typechecked progressively as we find more equations. I haven't looked at the implementation but my intuition from reading GHC papers is that in a given module we import open type family equations from other modules, collect equations from current module and proceed with type checking of an open type family. Is that intuition correct? If so then I believe that checking of injectivity (point 3 of my outline) should be done during this stage (I believe it would be best to combine it with checking equation overlapping).

comment:74 in reply to:  73 Changed 3 years ago by heisenbug

Replying to jstolarek:

Simon, Richard, others,

I'm making progress on this one but before going too far I want to get a green light on my design proposal (points 1-3). I've also bumped into some problems and have a couple of questions.

Just a side question. I have a type level Map:

type family Map (f :: k -> l) (ks :: [k]) :: [l] where
 ...

When I declare this as injective (presumably) with

type family Map (f :: k -> l) (ks :: [k]) = result :: [l] | result -> ks where
 ...

then f must be injective for this to work. Concretely, will this invocation be considered injective by GHC (:set -XDataKinds) ?

Map IO '[Int, (), Bool]

Thanks,

Gabor

comment:75 Changed 3 years ago by jstolarek

Good question. I have not considered such case earlier. Here are some thoughts:

Currently your Map function will work only for type constructors because unsaturated type families are not allowed. These are certainly injective.

First step in my current outline of the algorithm that decides whether a type family is injective or not says: "If a RHS contains a call to a type family we conclude that the type family is not injective. I am not certain if this extends to self-recursion -- see discussion below." This would lead to conclusion that Map is not injective because Map calls itself.

Now let's say I manage to find a way to allow self-recursion (see discussion on the wiki page). I'm left with the problematic call to f. As stated earlier currently this must be an injective type constructor. So it seems that declaring Map injective should be safe because type constructors are generative and it seems possible to deduce f, a and b from Map f [a, b] ~ [Maybe Int, Maybe Char]. At the moment I have no idea how to formulate an algorithm for such a deduction. Could this be done by the current constraint solver?

What if we were allowed to have partially applied type families (say, because we've implemented ideas from singletons) and f could be a type family? Would knowing that f is injective allow us to declare Map as injective? Injectivity says you can deduce LHS from RHS but I think that it would not be a good idea to try to solve Map f [a, b] ~ [Int, Char]. And so the restriction on not having calls to type families on the RHS would rightly declare Map as not injective. So if we had partially applied type families and thus f was allowed to be either a type constructor a type family we should declare Map as not injective. This contrasts with earlier paragraph, where knowing that f must be a type constructor allowed us to declare Map as injective.

Does that make sense?

comment:76 Changed 3 years ago by simonpj

  • I think it's very straightforward to allow injectivity of form r -> <subset of args>. That is, the rhs is unconstrained but the lhs must be r.
  • So injectivity isn't a binary property; rather it's a Bool per argument. Making this list a field of SynTyCon is probably right, with ordinary synonyms being all false.
  • We are not (at least for now) changing System FC. So the only effect of injectivity is to add extra "Derived" unification constraints, very much like functional dependencies. If we see
     [W] F t ~ F s
    
    then we don't decopose it. Instead we add
     [D] t ~ s
    
    That guides inference, but does not produce proof terms. So, no, nothing to do with isDecomposableTyCon. Much more like the try_improvement code in TcInteract.doTopReactFunEq.
  • I don't understand the discussion of Map. Are you perhaps discussing how to verify some (not shown here) implementation of a closed type family Map is in fact injective?
  • Verifying injectivity for open families: if the RHSs unify then the LHSs should be identical under that substitution. Eg
      type instance F a Int = a -> Int
      type instance F Int a = Int -> a
    
    These are consistent (and hence allowed) and are injective in both arguments.
  • My brain is too small to think about verifying injectivity for closed families. Richard?

comment:77 in reply to:  76 Changed 3 years ago by jstolarek

Replying to simonpj:

  • We are not (at least for now) changing System FC. So the only effect of injectivity is to add extra "Derived" unification constraints, very much like functional dependencies. If we see
     [W] F t ~ F s
    
    then we don't decopose it. Instead we add
     [D] t ~ s
    

What is the meaning of [W] and [D]? "Wanted" and "derived"?

  • I don't understand the discussion of Map. Are you perhaps discussing how to verify some (not shown here) implementation of a closed type family Map is in fact injective?

Yes, I was discussing how to verify whether this definition of Map is injective:

type family Map (f :: k -> l) (ks :: [k]) :: [l] where
    Map f '[]       = '[]
    Map f (x ': xs) = f x ': Map f xs
  • Verifying injectivity for open families: if the RHSs unify then the LHSs should be identical under that substitution. Eg
      type instance F a Int = a -> Int
      type instance F Int a = Int -> a
    
    These are consistent (and hence allowed) and are injective in both arguments.

Ah, I haven't thought about allowing substitution when RHSs unify. Wiki updated.

  • My brain is too small to think about verifying injectivity for closed families. Richard?

I only remind that I outlined an algorithm on the wiki :-) Richard, it would be great if you could take a look and say whether it makes sense.

In the meantime I have almost implemented checking correctness of injectivity condition (point 2 from my earlier comment). Renamer seems to be the right place for doing this.

comment:78 Changed 3 years ago by simonpj

Yes "Wanted" and "Derived". You'll see lots of this with -ddump-tc-trace

comment:79 in reply to:  73 ; Changed 3 years ago by goldfire

Replying to jstolarek:

  1. That condition must contain only the result type variable on the LHS of -> and all other type variables on the RHS of ->. In other words if I declare type family P a b = r then I'm only allowed to write r -> a b or r -> b a injectivity condition. Questions here: a) can I add a restriction that type variables on the RHS must be given in exactly the same order as they were given in the head declaration? This would make the implementation simpler; b) where should this check be done? In the renamer?

I'm ambivalent on this design decision (that the variables have to be in the same order in the annotation). I suppose it simplifies the implementation. Yes, the renamer seems like a fine-enough place to do the check.

  1. Once we check 2) we must verify that the type family is indeed injective. Not sure where this should happen? My guess is that during typechecking of a type family equations. Algorithm is outlined here.

Please see updates to the wiki page, labeled RAE ... End RAE.

Does the implementation plan outlined so far look sensible? Are there any misconceptions?

  1. Once we pass these checks injectivity becomes a binary property, so we can discard injectivity conditions written by the user and replace it with a Bool. Richard, when we first spoke about injective type families you suggested that all the magic will go into isDecomposableTyCon in types/TyCon.lhs. I've added a Bool field to SynTyCon data constructor of TyCon data type and made isDecomposableTyCon use that field to tell whether a type family is injective or not. (Checks described in 2 and 3 are not implemented. When a user writes injectivity declaration for a type family I just assume it is correct.) Sadly, this does not work. Here's an example test case:
type family F a = r | r -> a where
    F Int = Bool
    F Bool = Int
    F a = a

foo :: F a -> F a
foo = id

This fails with:

Couldn't match type ‘F a0’ with ‘F a’
NB: ‘F’ is a type function, and may not be injective
The type variable ‘a0’ is ambiguous
Expected type: F a -> F a
  Actual type: F a0 -> F a0
In the ambiguity check for: forall a. F a -> F a
To defer the ambiguity check to use sites, enable AllowAmbiguousTypes
In the type signature for ‘foo’: foo :: F a -> F a

I added traces to verify that isDecomposableTyCon is called for F type family and it correctly returns True.

Out of curiosity I turned on AllowAmbiguousTypes and added a definition like this:

bar :: Int -> Int
bar = foo

That failed with:

Couldn't match type ‘F a0’ with ‘Int’
The type variable ‘a0’ is ambiguous
Expected type: Int -> Int
  Actual type: F a0 -> F a0
In the expression: foo
In an equation for ‘bar’: bar = foo

I imagine that getting this one to work definitely requires more changes than just isDecomposableTyCon.

Yes, Simon's comment above is correct -- I was just wrong about the isDecomposableTyCon thing. Actually, injective type families still need to say "no" to isDecomposableTyCon, because of the left and right coercion formers, which assume generativity.

Then again GHC can already deal with some cases of injectivity:

type family F a = r | r -> a where
    F Int = Int
    F Bool = Bool
    F a = a

foo :: F a -> F a
foo = id

bar :: Int -> Int
bar = foo

That works perfectly fine in GHC 7.8.

What works perfectly fine? Not the code you wrote above, because it contains a not-currently-parsed injectivity annotation. I tried it without the injectivity annotation, and it does indeed work. But not because of injectivity, at all: it's because GHC is clever enough to figure out that F is just an identity function, and so the F as in foo's type become a. If you swap the Int and Bool RHSs in the definition (which preserves injectivity), the code fails to compile.

I tried to analyse what's going on with -ddump-tc-trace but the dumps are 600-1000 lines long. Of course the source code itself is even longer, so I'd appreciate any directions where should I start looking.

Fair warning: 1000 lines of -ddump-tc-trace isn't long at all! :)

One final question. Assuming that SynTyCon really should have a new field, I believe this field should be stored in interface files. After all we want injectivity information to propagate from one module to another. Now I wonder how this interacts with open type families. I understand that these are typechecked progressively as we find more equations. I haven't looked at the implementation but my intuition from reading GHC papers is that in a given module we import open type family equations from other modules, collect equations from current module and proceed with type checking of an open type family. Is that intuition correct? If so then I believe that checking of injectivity (point 3 of my outline) should be done during this stage (I believe it would be best to combine it with checking equation overlapping).

Agreed.

comment:80 in reply to:  75 Changed 3 years ago by goldfire

Replying to jstolarek:

Good question. I have not considered such case earlier. Here are some thoughts:

Currently your Map function will work only for type constructors because unsaturated type families are not allowed. These are certainly injective.

First step in my current outline of the algorithm that decides whether a type family is injective or not says: "If a RHS contains a call to a type family we conclude that the type family is not injective. I am not certain if this extends to self-recursion -- see discussion below." This would lead to conclusion that Map is not injective because Map calls itself.

Now let's say I manage to find a way to allow self-recursion (see discussion on the wiki page). I'm left with the problematic call to f. As stated earlier currently this must be an injective type constructor. So it seems that declaring Map injective should be safe because type constructors are generative and it seems possible to deduce f, a and b from Map f [a, b] ~ [Maybe Int, Maybe Char]. At the moment I have no idea how to formulate an algorithm for such a deduction. Could this be done by the current constraint solver?

This should just work with the algorithm already stated. (Except for the self-recursion bit, which I responded to on the wiki page.) Unification should do the work for you.

What if we were allowed to have partially applied type families (say, because we've implemented ideas from singletons) and f could be a type family? Would knowing that f is injective allow us to declare Map as injective? Injectivity says you can deduce LHS from RHS but I think that it would not be a good idea to try to solve Map f [a, b] ~ [Int, Char]. And so the restriction on not having calls to type families on the RHS would rightly declare Map as not injective. So if we had partially applied type families and thus f was allowed to be either a type constructor a type family we should declare Map as not injective. This contrasts with earlier paragraph, where knowing that f must be a type constructor allowed us to declare Map as injective.

Does that make sense?

No. Under the singletons work, we gave a different kind to non-injective functions, so Gabor's kind -- using ->, not any other arrow -- requires an injective argument.

comment:81 in reply to:  76 Changed 3 years ago by goldfire

Replying to simonpj:

  • My brain is too small to think about verifying injectivity for closed families. Richard?

I've put some comments on the wiki page. But the whole thing makes me a little nervous, as closed type families are really squirrelly. I would want something resembling a proof before feeling truly confident. As a first pass, we could always treat the equations of a closed family as independent (like an open family). I believe that such a treatment would be fully conservative.

comment:82 in reply to:  79 ; Changed 3 years ago by jstolarek

Replying to goldfire:

I'm ambivalent on this design decision (that the variables have to be in the same order in the annotation). I suppose it simplifies the implementation.

I guess my argument about simpler implementation is not that important. I mean the code is quite simple in both cases. It's more like a choice between linear and quadratic algorithm. I'd went for linear, even if it adds an extra restriction on the user code.

Yes, Simon's comment above is correct -- I was just wrong about the isDecomposableTyCon thing.

OK. Reverting :-)

What works perfectly fine? Not the code you wrote above, because it contains a not-currently-parsed injectivity annotation. I tried it without the injectivity annotation, and it does indeed work.

Yes, that's what I meant.

But not because of injectivity, at all: it's because GHC is clever enough to figure out that F is just an identity function

Right. Still, I wonder how this works. I mean there are probably no special cases in the typechecker that try to guess whether a type family is an identity function?

Fair warning: 1000 lines of -ddump-tc-trace isn't long at all! :)

I imagine, given that my source file was 8 lines of code. Still, it's a bit steep learning curve.

Replying to goldfire:

Under the singletons work, we gave a different kind to non-injective functions

Ah, right.

Replying to wiki comments:

I believe this would also have to check that all variables mentioned in the LHS are mentioned in the RHS. Or, following along with Simon's idea in comment:76:ticket:6018, those variables that appear in the injectivity annotation must be mentioned in the RHS.

I haven't thought about that. I believe you're right.

We have to be careful with the word overlap here. I think we want "overlaps" to be "is subsumed by".

I don't see the difference between "overlaps" and "is subsumed by". :-(

Now, in 4c I wrote:

if unification [of the RHSs] succeeds and there are type variables involved we substitute unified type variables on the LHS and check whether this LHS overlaps with any of the previous equations. If it does we proceed to the next equation

You replied with this:

you want to know of the equation at hand is reachable, given the LHS substitution. Even if it is reachable, the (substituted) LHS may coincide with the LHS of the earlier equation whose RHS unified with the current RHS.

I don't see the difference between your version and mine.

Even subtler, it's possible that certain values for type variables are excluded if the current LHS is reachable (for example, some variable a couldn't be Int, because if a were Int, then a previous equation would have triggered). Perhaps these exclusions can also be taken into account.

Hm... can you give an example where this would be useful?

RAE: But it seems that, under this treatment, any self-recursion would automatically lead to a conclusion of "not injective", just like any other use of a type family. For example:

type family IdNat n where
  IdNat Z = Z
  IdNat (S n) = S (IdNat n)

IdNat is injective. But, following the algorithm above, GHC would see the recursive use of IdNat, not know whether IdNat is injective, and then give up, concluding "not injective". Is there a case where the special treatment of self-recursion leads to a conclusion of "injective"? End RAE

This example is just like my NatToMaybe. My idea here was that RHSs of these two equations won't unify - first returns Z, second returns S something. These are distinct constructors that have no chance of being unified. (I assumed that we are able to detect that.) There are no calls to type families other than self-recursion and so we can declare IdNat to be injective. I admit I am not 100% certain that allowing self-recursion is safe. It's just that I was unable to come up with an example that would show that my algorithm: a:) declares injective function to be non-injective; b) declares a non-injective function to be injective.

Last edited 3 years ago by jstolarek (previous) (diff)

comment:83 in reply to:  82 Changed 3 years ago by goldfire

Replying to jstolarek:

Right. Still, I wondering how this works. I mean there are probably no special cases in the typechecker that try to guess whether a type family is an identity function?

Of course, you're right -- GHC doesn't have a special case there. What it does do is so-called compatibility checking. Two equations of a type family (open or closed) are compatible if, whenever the LHSs unify with substitution s, applying s to the RHSs makes them the same. Two equations for an open type family do not have a malignant overlap if they are compatible. And, more relevant here, GHC does the apartness check in closed type families only among incompatible equations. In F as originally given, all the equations are compatible with one another, and so GHC always skips the apartness check. Thus, when it sees F a, the last equation triggers and reduces F a to a. This is all described in the closed type families paper, and in the manual.

We have to be careful with the word overlap here. I think we want "overlaps" to be "is subsumed by".

I don't see the difference between "overlaps" and "is subsumed by". :-(

I would say these equations overlap:

type instance G a Int = a type instance G Int a = a

but neither equation subsumes the other. On the other hand,

type instance H a a = a -- 1 type instance H Int Int = Int -- 2

equation 1 subsumes equation 2, because anything that matches 2 will also surely match 1.

Now, in 4c I wrote:

if unification [of the RHSs] succeeds and there are type variables involved we substitute unified type variables on the LHS and check whether this LHS overlaps with any of the previous equations. If it does we proceed to the next equation

You replied with this:

you want to know of the equation at hand is reachable, given the LHS substitution. Even if it is reachable, the (substituted) LHS may coincide with the LHS of the earlier equation whose RHS unified with the current RHS.

I don't see the difference between your version and mine.

See new example E1 on the wiki page.

Even subtler, it's possible that certain values for type variables are excluded if the current LHS is reachable (for example, some variable a couldn't be Int, because if a were Int, then a previous equation would have triggered). Perhaps these exclusions can also be taken into account.

Hm... can you give an example where this would be useful?

See new example E2 on the wiki page.

RAE: But it seems that, under this treatment, any self-recursion would automatically lead to a conclusion of "not injective", just like any other use of a type family. For example:

type family IdNat n where
  IdNat Z = Z
  IdNat (S n) = S (IdNat n)

IdNat is injective. But, following the algorithm above, GHC would see the recursive use of IdNat, not know whether IdNat is injective, and then give up, concluding "not injective". Is there a case where the special treatment of self-recursion leads to a conclusion of "injective"? End RAE

This example is just like my NatToMaybe. My idea here was that RHSs of these two equations won't unify - first returns Z, second returns S something. These are distinct constructors that have no chance of being unified. (I assumed that we are able to detect that.) There are no calls to type families other than self-recursion and so we can declare IdNat to be injective. I admit I am not 100% certain that allowing self-recursion is safe. It's just that I was unable to come up with an example that would show that my algorithm: a:) declares injective function to be non-injective; b) declares a non-injective function to be injective.

What are the steps your algorithm is taking? I don't see how IdNat can be considered injective while Ban is considered non-injective. A critical step in banning Ban is noting that we don't yet know that Ban is injective when checking it, so we conclude "not injective". Maybe it's because Ban's recursion is at the top-level? In IdNat and NatToMaybe, unification fails before seeing the recursive use of the type family. Hmm... maybe. There seems to be something going on here that, if unification succeeds up until it sees the recursive use of the type family, then the family either must be non-injective or non-terminating... because this situation can only arise when the function "collapses" two inputs into the same output through recursion. But I'd want a proof first.

comment:84 Changed 3 years ago by jstolarek

Richard, either I'm not getting the subtleties of your arguments or you're misunderstanding what I meant to say on the wiki page. Or both :-)

I now see the theoretical difference between "overlaps" and "subsumes" but I don't yet see how that is important in my algorithm outline. I re-read parts of the closed type families paper and now I'm not even sure whether I meant "overlaps" or "unifies". Anyway, I believe that my thinking is correct and in my algorithm I have addressed problems that you raised. It looks that I need help with sorting out the terminology so things are clear.

Now, looking at your examples:

type family E1 a = r | r -> a where
  E1 Int = Int -- 1
  E1 a   = a   -- 2

When my algorithm reaches (2) it will attempt to unify a with RHS of equation (1). It will succeed with substitution [ a -> Int ]. It will apply that substitution to the LHS of (2) and proceed with checking whether after substitution this equation is subsumed (?) by any of the earlier ones. Clearly, it is by the first equation and since this is the last equation we declare E1 to be injective. In other words I know that (2) will never produce an Int since this will equation will never be reached for a equal to Int.

Now E2:

type family E2 (a:: Bool) = r | r -> a where
  E2 False = True
  E2 True  = False
  E2 a     = False

I consider almost identical example on the wiki:

type family Bak a = r | r -> a where
     Bak Int  = Char
     Bak Char = Int
     Bak a    = a

The difference is the RHS of the last equation: a concrete type in your example, a type variable in mine. Yet I believe that both examples are intended to demonstrate the same problem - see discussion below that example.

In IdNat and NatToMaybe, unification fails before seeing the recursive use of the type family.

Right. I assumed that noticing that allows us to conlcude that equations of IdNat and NatToMaybe produce distinct types and thus these type families are injective. Your reasoning in that whole paragraph completely agrees with mine.

But I'd want a proof first.

Agreed. As I've said, I've been unable to produce a counter-example but that doesn't mean that one does not exist.

I have a question about closed type families paper. Paragraph just after Candidate Rule 2 (section 3.2, page 3) says: "As a notational convention, apart(ρ, τ) considers the lists ρ and τ as tuples of types; the apartness check does not go element-by-element.", where "not" is emphasized. Why is this important? Seems like the choice of representing patterns and types as tuples or lists is just an implementation detail.

Last edited 3 years ago by jstolarek (previous) (diff)

comment:85 in reply to:  84 Changed 3 years ago by goldfire

Replying to jstolarek:

Richard, either I'm not getting the subtleties of your arguments or you're misunderstanding what I meant to say on the wiki page. Or both :-)

I now see the theoretical difference between "overlaps" and "subsumes" but I don't yet see how that is important in my algorithm outline. I re-read parts of the closed type families paper and now I'm not even sure whether I meant "overlaps" or "unifies".

To be clear, I am considering the expressions "two types overlap" and "two types unify" to be synonymous. Subsumption is a finer, directed relation between types than overlap/unify.

type family E1 a = r | r -> a where ...

Yes yes yes. You're right here. I was wrong. The wiki page is updated.

Now E2:

type family E2 (a:: Bool) = r | r -> a where
  E2 False = True
  E2 True  = False
  E2 a     = False

I consider almost identical example on the wiki:

type family Bak a = r | r -> a where
     Bak Int  = Char
     Bak Char = Int
     Bak a    = a

The difference is the RHS of the last equation: a concrete type in your example, a type variable in mine. Yet I believe that both examples are intended to demonstrate the same problem - see discussion below that example.

I think these cases are different. In yours, the RHS unification (a unifies with Char) refines the LHS (a |-> Char) so that it is subsumed by an earlier equation. This action does not happen in my example, and different reasoning (noting that False and True are both impossible) is required.

I have a question about closed type families paper. Paragraph just after Candidate Rule 2 (section 3.2, page 3) says: "As a notational convention, apart(ρ, τ) considers the lists ρ and τ as tuples of types; the apartness check does not go element-by-element.", where "not" is emphasized. Why is this important? Seems like the choice of representing patterns and types as tuples or lists is just an implementation detail.

Agreed that choosing a list vs. a tuple is irrelevant, but choosing element-by-element vs. as-a-whole is quite important. For example (a, a) is quite apart from (Int, Bool), when considered as a whole, even though a is neither apart from Int nor Bool looking at individual elements.

Does that clarify?

comment:86 Changed 3 years ago by jstolarek

Does that clarify?

Yes.

I don't yet see how to deal with your E2 example. It looks like we would need to have pattern-exhaustiveness checking for type families. And this would only be possible for closed kinds I think. Well, for open kinds type family would simply be non-injective.

I have a few more questions about implementation. These most likely go to Simon. I've been trying to wrap my head around implementation of doTopReactFunEq and I have these questions:

  1. What does the term "react" mean? As in the name doTopReact and the comment below its type signature: "The work item does not react with the inert set..." (TcInteract, L1412).
  1. What is "interaction"? As in the module name TcInteract or the comment mentioned above: "... so try interaction with top-level instances".
  1. What does EvTerm data type represent?
  1. What are branches in axioms?
  1. I don't understand how canonicalization works. Is there an overview somewhere? Other than Note [Canonicalization], which unfortunately is too vague for me :-(

comment:87 in reply to:  86 Changed 3 years ago by goldfire

Replying to jstolarek:

I don't yet see how to deal with your E2 example. It looks like we would need to have pattern-exhaustiveness checking for type families. And this would only be possible for closed kinds I think. Well, for open kinds type family would simply be non-injective.

I don't know how to do this either. Luckily, we can reject E2 on a first pass. It's possible that the only examples like this are silly in the same way that E2 is silly, in that the last equation is utterly unreachable.

I have a few more questions about implementation. These most likely go to Simon. I've been trying to wrap my head around implementation of doTopReactFunEq and I have these questions:

  1. What does the term "react" mean? As in the name doTopReact and the comment below its type signature: "The work item does not react with the inert set..." (TcInteract, L1412).
  1. What is "interaction"? As in the module name TcInteract or the comment mentioned above: "... so try interaction with top-level instances".

Simon may be able to answer better than I, but I believe these refer to the actions taken in section 7 of the OutsideIn paper.

  1. What does EvTerm data type represent?

These are values of constraint types, including class dictionaries, type equalities, and implicit parameters. EvTerms are desugared in DsBinds. Write back if that's not enough of an explanation.

  1. What are branches in axioms?

A closed type family with multiple equations leads to a so-called branched axiom. The way overlap works in a closed type family means that we can't have separate axioms for each equation. So, a branched axiom contains potentially many branches; each branch asserts an equality, but later branches can be used only when no previous branch supersedes. Does that help? It's all in the closed type families paper.

  1. I don't understand how canonicalization works. Is there an overview somewhere? Other than Note [Canonicalization], which unfortunately is too vague for me :-(

I think this is also in OutsideIn.

comment:88 Changed 3 years ago by jstolarek

I don't know how to do this either. Luckily, we can reject E2 on a first pass.

Yes, I believe that's what we should do. If we ever have analysis of unreachable equations for type families then we can use it to deal with E2 but until then we should simply reject it.

As for answers to 1-5 I'll read some more and get back if that doesn't help.

comment:89 Changed 3 years ago by jstolarek

I've read section 7.4 of OutsideIn(X) paper. Code makes more sense now but still I have some questions. Most importantly I'm trying to understand FINST top-level reaction rule (OutsideIn paper, fig. 24 page 65), which according to my understanding is implemented by doTopReactFunEq. Since I don't have LaTeX in the wiki I'll be using following notation:

\Q - set of top-level axiom schemes \g - little gamma \d - little delta \th - little theta \t - little tau \e - little epsilon _0 - 0 in a subscript \vec{...} - a vector of values, in paper denoted using overline \in - set membership realtion

Now the rule:

forall \vec{a}. F \vec{Xi_0} ~ Xi_0 \in \Q             -- (1)
\vec{b} = ftv(\vec{Xi_0})                              -- (2)
\vec{c} = \vec{a} - \vec{b}                            -- (3)
\g fresh                                               -- (4)
\th = [\vec{b |-> \t_a},\vec{c |-> \g}]                -- (5)
\th \vec{Xi_0} = \vec{Xi}                              -- (6)
if (l = w) then \vec{\d} = \vec{\g} else \vec{\d} = \e -- (7)
-------------------------------------------------------------
topreact[l](\Q, F \vec{Xi}~Xi) = {\d, \th Xi_0 ~ Xi}   -- (8)

Here's how I understand (or misunderstand that rule):

Premise (1) represents a type family equation written by the user (either from a closed or open type family). \vec{Xi_0} represents the LHS of F ie. patterns in that equation. So for example if the equation is F [b] Bool c then \vec{Xi_0} is ([b],Bool, c). I'm not sure about [b] - I think that's not canonical? Are axioms canonicalized?

I'm also not sure about the meaning of \vec{a} in (1)'s forall. My intuition would be that these are type variables in the LHS (b and c in above example). But then I don't know how to understand premise (2). If \vec{b} represents free variables if \vec{Xi_0} then wouldn't this be identical to \vec{a}? This does not make sense because in that situation \vec{c} in premise (3) would always be empty.

Premise (5) is a substitution used in premise (6) to turn the LHS of the original axiom into LHS of a constraint we are trying to solve. But what does \t_a represent? Instantiation of variables in the original axiom to some particular values present in \vec{Xi}?

Premise (7) says that if a constraint is wanted then we introduce new variables. I'm not sure why. Maybe this will become clear once I understand the rest.

I'm not sure if I understand the conclusion (8). Reading the rule I see that if we have F \vec{Xi} ~ Xi constraint we use it to figure out the \th substitution and then use that substitution to create a new constraint that tries to equate the RHS of our current constraint with the RHS of original axiom. But why do we want this? I could use an example with more comments than in the paper (and the example could be more complicated too). Before proceeding with the implementation I'd like to understand how injectivity affects FINST rule.

What is not clear to me is whether the new constraint we are producing is wanted or given? My guess is that we produce a given if the original constraint was given and wanted if the original was wanted.

I also wonder whether injectivity gives us a new interaction rule:

INJFEQFEQ   interact[l] (F \vec{Xi_1} ~ Xi, F \vec{Xi_2} ~ Xi) =
                        (F \vec{Xi_1} ~ Xi, \vec{ Xi_1 ~ Xi_2 })

where \vec{ Xi_1 ~ Xi_2 } represents vector of pairwise constraints.

A few more questions unrelated to the paper:

  1. Are axioms for associated types treated differently than for open type families?
  1. What does BuiltInSynFamTyCon constructor represent?
  1. Note [Basic Simplifier Plan] in TcInteract mentions "inert reactions" and "spontaneous reactions". What's the difference between the two?

comment:90 in reply to:  76 Changed 3 years ago by jstolarek

Simon, I'm trying to understand your earlier comment:

  • We are not (at least for now) changing System FC. So the only effect of injectivity is to add extra "Derived" unification constraints, very much like functional dependencies. If we see
     [W] F t ~ F s
    
    then we don't decopose it. Instead we add
     [D] t ~ s
    
    That guides inference, but does not produce proof terms. So, no, nothing to do with isDecomposableTyCon. Much more like the try_improvement code in TcInteract.doTopReactFunEq.

First of all, do we ever get a chance of actually seeing F t ~ F s? I think this is not canonical and if such constraint ever appears it would be turned into F t ~ a, F s ~ a. I'm not sure which of the canonicalization rules in Fig. 21 in OutsideIn paper would be responsible for that conversion because I don't know the meaning of these stylized D, F and T used in the mentioned figure. My guess would be FFLAT[W/G]R rules. Anyway, once we have F t ~ a, F s ~ a we could apply my suggested INJFEQFEQ interaction rule to generate [D] s ~ t. If my thinking here is right then shouldn't the implementation go into TcInteract.interactFunEq (guessing from its name)?

I also wonder whether [W] is important in your [W] F t ~ F s example? I believe this rule is valid for both given and wanted constraints.

I don't understand what you mean by "we don't decopose it". What is this "decomposition" that you mention?

I believe that the only moment when injectivity interacts with top-level axioms is when we're deducing LHS from RHS. I suspect this might require a new reaction rule, separate from FINST.

Are my ideas here correct?

comment:91 Changed 3 years ago by simonpj

Re comment:89 and comment:90 I'm struggling with limited personal bandwidth, and daunted by the challenge of giving a tutorial about the type inference engine, which is one of the most complex parts of GHC.

Let me make this offer. If you do the rest, I'll do the bit in the type inference engine that exploits injectivity. It won't take long, and then you can see.

Decomposition in brief. If you want Maybe a ~ Maybe b then it's enough to prove a ~ b. From a proof of the latter we can get a proof of the former. No more than that.

Simon

comment:92 Changed 3 years ago by jstolarek

Can I get some more time before I throw in the towel? Say, until next Thursday? EDIT: Next Thursday = 30th October

Simon, if you were to start working on this what parts of my implementation need to be finished? Everything up to SynTyCon storing a list of Bools? In what form would you like my patch? A differential revision? A branch on github? Do you want my work rebased against latest HEAD?

Decomposition in brief. If you want Maybe a ~ Maybe b then it's enough to prove a ~ b. From a proof of the latter we can get a proof of the former. No more than that.

Obviously, this is what we currently have for data types and what we want for injective type families. But from the OutsideIn paper my understanding was that *current* treatment of *type families* is different.

Last edited 3 years ago by jstolarek (previous) (diff)

comment:93 Changed 3 years ago by simonpj

Sure. Take as long as you need.

I was offering to do the type inference piece, which seemed to be the bit you are stalled on. So if you do parser, renamer, and typechecking of the TyCon declaration, so we have the injectivity info in the TyCon, that would do it I think.

In practical terms, a branch in a github repo that I can pull and push to would work well.

All I plan to do is add a rule along the lines of

  [W] F ty1 ~ rhs
  [W] F ty2 ~ rhs  
===>  add a new constraint (if F is injective)
  [D] ty1 ~ ty2

very like functional dependencies for type classes.

I'm not sure that there is time to get this implemented, tested etc before the 7.10 freeze, but we'll see.

Simon

comment:94 in reply to:  93 Changed 3 years ago by jstolarek

If we're not aiming to get this into 7.10 then I guess there's no rush.

Replying to simonpj:

All I plan to do is add a rule along the lines of

  [W] F ty1 ~ rhs
  [W] F ty2 ~ rhs  
===>  add a new constraint (if F is injective)
  [D] ty1 ~ ty2

very like functional dependencies for type classes.

Yes, that's what I meant with my INJFEQFEQ interaction rule in comment 89:

INJFEQFEQ   interact[l] (F \vec{Xi_1} ~ Xi, F \vec{Xi_2} ~ Xi) =
                        (F \vec{Xi_1} ~ Xi, \vec{ Xi_1 ~ Xi_2 })

Sorry if that wasn't clear. (Now I realized that checking whether F is injective might be consider a reaction with top-level axiom and thus this should be a reaction rule.)

But it seems to me there should be one more rule. Consider this example:

type family F a = r | r -> a where
    F Char = Bool
    F Bool = Char
    F a    = a

idChar :: (F a ~ Bool) => a -> Char
idChar a = a

Here we have constraint F a ~ Bool. Knowing that F is injective and it's RHS is Bool we want to deduce that a must be Char. This one seems more difficult because we have to look through all the equations. My guess would be that we have to do this before constraint solving (in TcM monad?).

Last edited 3 years ago by jstolarek (previous) (diff)

comment:95 Changed 3 years ago by simonpj

Yes, we need to look through all the equations. Very like functional dependencies.

Before we forget it, the big un-implemented piece is checking that the type-family equations are indeed injective. Or is that done now? I can't see it in Phab. That's the main reason I'm doubtful we'll make 7.10, unless perhaps you have it ready to go.

comment:96 Changed 3 years ago by jstolarek

Before we forget it, the big un-implemented piece is checking that the type-family equations are indeed injective. Or is that done now?

No, that's not implemented yet - I'm distracted between this and two other GHC projects so I didn't yet have the time :-(

comment:97 Changed 3 years ago by simonpj

What's the status on injective type families? Stalled?

comment:98 Changed 3 years ago by jstolarek

I'm implementing injectivity check for open type families and got stuck on some nonsense problem (described on Phab) so I diverted to #8326 this week. Will resume work on this next week.

Last edited 3 years ago by jstolarek (previous) (diff)

comment:99 Changed 3 years ago by jstolarek

Oh, and while we're talking about open type families, there's a design decision to make. Verifying user-supplied injectivity condition requires comparing the same equations that are compared by compatibility check. Should we:

a) verify injectivity at the same time we verify compatibility? b) verify injectivity in a separate pass?

a) might seem like an obvious answer because it allows us to avoid traversing the equation pairs twice. However, that makes compatibility checking code in FamInstEnv.lookup_fam_inst_env' a bit more complicated. b) on the other hand can allow us to skip the injectivity checking altogether if the user did not supply any injectivity declarations. This would be a big win since compilation performance would not be degraded for already existing code (since there are no injectivity declarations). I am in favour of option b).

comment:100 Changed 3 years ago by simonpj

My instinct is (a) because finding all pairs is potentially quite expensive, and repeating all that code seems a shame

comment:101 Changed 3 years ago by jstolarek

Finding pairs would be done once. Only the traversal would be done twice - see my inline comment on Phab.

comment:102 Changed 3 years ago by jstolarek

Milestone: 7.10.17.12.1

comment:103 Changed 3 years ago by jberryman

I'm very late to the discussion, so this is probably not helpful at all, but I wanted to mention my interest. I've only had use for better type inference with (conceptually) closed, recursive type families. What I think I want is something like proposal C from the wiki, or the head-injectivity proposal where "possibly-partial information about the result might allow us to deduce possibly-partial information about the arguments."

And as a naive user I want injectivity to be inferred for closed type families, and don't find the arguments against it in the wiki convincing. Intuitively there doesn't seem to be much difference between the level of logic required to infer things about types with closed type synonym families, vs the sort of inference the typechecker does elsewhere (again, as a naive user who doesn't care about the trickiness of implementing such a scheme). My impression is that this is what many people assumed would be the point of closed type families; for what I've had in mind closed type families make only a cosmetic improvement.

Anyway most of my interest came from my work on https://github.com/jberryman/shapely-data which embarrassingly seems to dependent quite a lot on #1241, so it (and below) needs GHC 7.6. You can ignore the example that follows or consider it if it's helpful:

{-# LANGUAGE TypeOperators #-}  
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE UndecidableInstances #-}  -- for nested Type family: Reversed (Tail t) :> Head t
{-# LANGUAGE FunctionalDependencies #-}

-- | A class for the exponent laws with the 'Normal' form @abc@ in the exponent
-- place. See the instance documentation for concrete types and examples.
class Exponent abc where
    fanin :: (abc :=>-> x) -> (abc -> x)
    unfanin :: (abc -> x) -> (abc :=>-> x)

-- | The algebraic normal form 'Exponent' @abc@ distributed over the single
-- base variable @x@.
type family abc :=>-> x
-- | x¹ = x
type instance () :=>-> x = x
-- | x⁽ᵃᵇ⁾ = (xᵇ)ᵃ
type instance (a,bs) :=>-> x = a -> bs :=>-> x

instance Exponent () where
    fanin = const
    unfanin f = f ()

instance (Exponent bs)=> Exponent (a,bs) where
    fanin f = uncurry (fanin . f)
    unfanin f = unfanin . curry f


class Homogeneous a as | as -> a where
    -- An n-ary @codiag@.
    genericRepeat :: a -> as
    
instance Homogeneous a () where
    genericRepeat _ = ()

instance (Homogeneous a as)=> Homogeneous a (a,as) where
    genericRepeat a = (a,genericRepeat a)

So the following two type-check fine:

a = (3 ==) $ (\(x,(y,(z,())))-> x+y+z) $ genericRepeat 1
b = (3 ==) $ fanin (\x y z-> x+y+z) (genericRepeat 1 :: (Int,(Int,(Int,()))))

But this doesn't:

c = (3 ==) $ fanin (\x y z-> x+y+z) $ genericRepeat 1

...failing with

    Couldn't match expected type `a1 -> a1 -> a1 -> a1'
                with actual type `abc0 :=>-> a0'
    The type variables `a0', `abc0', `a1' are ambiguous
    Possible fix: add a type signature that fixes these type variable(s)
    The lambda expression `\ x y z -> x + y + z' has three arguments,
    but its type `abc0 :=>-> a0' has none
    In the first argument of `fanin', namely `(\ x y z -> x + y + z)'
    In the expression: fanin (\ x y z -> x + y + z)

That library is probably going in the garbage bin at this point, but in general this is the sort of thing I think I'd like to be able to do with type families eventually. And I haven't gotten to dig into y'all's singletons work yet, so it might be that that's the sort of thing I'm looking for and my use of type families and classes in shapely-data is ill-considered.

Thanks for your work on this!

comment:104 in reply to:  103 Changed 3 years ago by jstolarek

Replying to jberryman:

I've only had use for better type inference with (conceptually) closed, recursive type families

Calling type families from an injective type family will not be allowed (including recursion). The reason is that I don't see a way of verifying injectivity when type families are involved.

What I think I want is something like proposal C from the wiki

At the moment I am implementing proposals A and B. C is something to think about in the future, as it is not trivial.

And as a naive user I want injectivity to be inferred for closed type families, and don't find the arguments against it in the wiki convincing.

Yes, I suppose that focus of the wiki discussion is a bit off. I think we could infer injectivity of type A for closed type families. But I don't think we should try to infer injectivity of type B because the algorithm is exponential - see comment 64. That's the major reason for doing this. (Numbers in that comment were computed for type C injectivity, but type B is also exponential.)

-- | The algebraic normal form 'Exponent' @abc@ distributed over the single
-- base variable @x@.
type family abc :=>-> x
-- | x¹ = x
type instance () :=>-> x = x
-- | x⁽ᵃᵇ⁾ = (xᵇ)ᵃ
type instance (a,bs) :=>-> x = a -> bs :=>-> x

I don't think :=>-> type family is injective: both () :=>-> (Int -> Int) and (Int,()) :=>-> Int reduce to Int -> Int.

comment:105 Changed 3 years ago by dorchard

Hi, Sorry to throw in a late thought, but what about adding a kind of injectives >->? e.g.* >-> *is the kind of injective type functions (as proved by the compiler, and perhaps as explicitly annotated by the user). This would provide a way to propagate injectivity information within the type checker, and could also be an alternative (replacement?) for using the functional dependency syntax here (although, I think that syntax makes a lot of sense). I suppose it is not currently possible to specify this for a type family, unless kind signatures on type family names were possible, e.g.

    type family (Id :: * >-> *) a where Id a = a

but this wouldn't be hard to add I don't think.

Note that data families, and even data types, could then be given the injective kind too. This might provide some simplification in the type checker? I don't know for sure though; just a thought, but one I thought worth mentioning. Great success was had in the past by adding a new kind constructor rather than new syntax (cf. constraint kinds/constraint families!)

Last edited 3 years ago by dorchard (previous) (diff)

comment:106 Changed 3 years ago by dfeuer

I don't see the problem with letting injective families call other type families as long as there is no recursion—can't GHC follow all paths to the end when necessary, and use injectivity "lemmas" when possible for efficiency? Or does this end up not being useful/efficient/possible for some reason?

comment:107 Changed 3 years ago by jstolarek

Dominic, do throw in thoughts at any time. My patch will be ready Real Soon Now but after the merge there will be room for further development of this feature.

I will not comment on the proposal of having a separate kind for injective type families. I don't have sufficient knowledge of typechecker to judge whether this would make things easier. Perhaps Simon or Richard can offer some insight here. Regarding your proposed new syntax I believe it does not offer anything new compared to the syntax we decided to use (unless I missed something?). These pairs of definitions would be equivalent:

type family Id a = result | result -> a where {...}
type family (Id :: * >-> *) a where {...}

type family F a b c = d | d -> a b c
type family (F :: * >-> * >->) a b c

type family G a b c = foo | foo -> a b where
type family (G :: * >-> * -> *) a b c

Your syntax however does not extend to injectivity where knowing the result and some of the arguments allows to determine some other arguments:

type family Plus a b = (sum :: Nat) | sum a -> b, sum b -> a where
type family H a b c = xyz | xyz a -> b c, xyz b -> a c

This kind of injectivity is not implemented in my patch but we want to leave the door open for such a possibility in the future. I'm also not sure if I like how your proposed syntax interacts with PolyKinds: it seems that there would be two alternative places to specify kinds.

David, what do you mean by injectivity lemmas? At the moment I don't have a solid stance on the subject of calling type families in the RHS of injective type families. When I started this work it seemed to me that this is not possible. Now I have some more experience so I plan to revisit my earlier decisions. Incidentally, now it seems to me that self-recursion might actually be doable. Calling other type families in the RHS still seems very problematic. First of all called families would have to be injective to declare that a type family that calls them is also injective*. This implies that type families would have to be typechecked in dependency order, which sounds lika a non-trivial technical obstacle. Of course this would prevent type families that form cycles (including mutual recursion) from being injective. Another problem is issue of overlap:

type family Foo a = r | r -> a where
     Foo (Bar a)   = X a
     Foo (Baz a b) = Y a b

To determine whether Foo is injective or not we must know whether X a and Y a b can have identical values (well, types not values). Given that X and Y can also call other type families I don't see a simple way of checking this. Anyway, the ice here seems very thin and I don't feel confident proceeding further with calling type families by injective type families without formal proof.

*) This is a simplifying assumption. It is possible to write a type family that is injective despite calling non-injective type families.

comment:108 in reply to:  105 ; Changed 3 years ago by goldfire

Replying to dorchard:

what about adding a kind of injectives >->? e.g.* >-> *is the kind of injective type functions

This is very much along the lines of what Jan and I proposed in our "Promoting Functions to Type Families" paper (here). But, there is some subtlety.

GHC assumes all function types (that is, types with kinds that are headed by ->) are generative and injective. Generativity means that, given a b ~ c d, you can conclude a ~ c. Injectivity means that, given a b ~ a d, you can conclude b ~ d. Indeed, putting these together and assuming a b ~ c d, you can conclude b ~ d. Thus, the kind -> denotes a generative, injective function.

We could imagine a new classifier ~> that denotes an ordinary function, with no assumptions. GHC would then be careful not to decompose such things. This is exactly the situation today with type families. Note that, short of GHCi's :kind command, there is no way to work with the kind of an unapplied type family. If I say type family F (a :: *) :: *, we often colloquially say that F :: * -> *, but there is no way to use this knowledge in a Haskell program, because F can never appear unsaturated. Thus, it is more accurate to say that F is just ill-formed, and F a :: * as long as a :: *.

All of this is to say that the new arrow would have to denote non-injective functions. And, I could easy say that GHC today supports such a thing, because there's no way to take advantage of it anyway.

What we really need in this direction is unsaturated type functions, but that's a story for another day. (A story I'd love to think more about... on that other day!)

So, taking all of this into account, dorchard's proposal is a different syntax for denoting injectivity, and I personally prefer the functional-dependency-like one more.

comment:109 in reply to:  108 ; Changed 2 years ago by jstolarek

Replying to goldfire:

Replying to dorchard:

what about adding a kind of injectives >->? e.g.* >-> *is the kind of injective type functions

This is very much along the lines of what Jan and I proposed in our "Promoting Functions to Type Families" paper (here).

Not exactly, I think. At the moment -> kind classifies term-level functions (non-injective, non-generative, can be partially applied), type constructors (injective, generative, can be partially applied) and type families (non-injective, non-generative, cannot be partially applied). In our paper we proposed a new kind to classify type functions (families) that are still non-injective and non-generative but can be partially applied just like term-level functions. We addressed the problem of partial application, not injectivity.

GHC assumes all function types (that is, types with kinds that are headed by ->) are generative and injective. (...) Thus, the kind -> denotes a generative, injective function.

Are you sure? I would say this is true only for type constructors but perhaps I'm missing something. (I am assuming that also -> classifies type families although, as you point out, this is not entirely accurate.)

Anyway, I can imagine introducing a new kind to distinguish between injective and non-injective applications but, given that we might want a new kind to distinguish between things that can and can't be partially applied at the type level, this does not seem like a good choice.

comment:110 in reply to:  109 Changed 2 years ago by goldfire

Replying to jstolarek:

(I am assuming that also -> classifies type families although, as you point out, this is not entirely accurate.)

Right -- -> does not classify type families in Haskell today. The only place where -> describes a type family is in the output of GHCi's :kind operator. And, to get that to work, GHC has a gross hack in that GHCi relaxes the "type families must always be saturated" requirement, just to appease users who want to know the shape of their type families. (This gross hack is a great idea from a user's point of view, and I'm not at all suggesting removing it. But it is gross from a type system point of view.)

Nowhere in a Haskell source text can -> classify a type family. So, I think that fact describes the difference between our comments above.

comment:111 Changed 2 years ago by Jan Stolarek <jan.stolarek@…>

In 37445780/ghc:

Injective type families

For details see #6018, Phab:D202 and the wiki page:

https://ghc.haskell.org/trac/ghc/wiki/InjectiveTypeFamilies

This patch also wires-in Maybe data type and updates haddock submodule.

Test Plan: ./validate

Reviewers: simonpj, goldfire, austin, bgamari

Subscribers: mpickering, bgamari, alanz, thomie, goldfire, simonmar,
             carter

Differential Revision: https://phabricator.haskell.org/D202

GHC Trac Issues: #6018

comment:112 Changed 2 years ago by jstolarek

Resolution: fixed
Status: newclosed
Test Case: T6018, T6018fail, T6018rnfail, T6018th, T6018failclosed1, T6018failclosed2, T6018failclosed3, T6018failclosed4, T6018failclosed5, T6018failclosed6, T6018failclosed7, T6018failclosed8, T6018failclosed9, T6018failclosed10, T6018failclosed11, T6018failclosed12, T6018ghci, T6018ghcifail, T6018ghcirnfail

comment:113 Changed 2 years ago by jstolarek

comment:114 Changed 2 years ago by thoughtpolice

Milestone: 7.12.18.0.1

Milestone renamed

comment:115 Changed 2 years ago by Simon Peyton Jones <simonpj@…>

In a7f6909/ghc:

A CFunEqCan can be Derived

This fixes the ASSERTION failures in
  indexed-types/should_fail/T5439
  typecheck/should_fail/T5490
when GHC is compiled with -DDEBUG

See Phab:D202 attached to Trac #6018

comment:116 Changed 2 years ago by Jan Stolarek <jan.stolarek@…>

In f30a4925/ghc:

Testsuite cleanup

As a result of fixing #10836 it is now possible to merge 11 different
tests for #6018 into one

comment:117 Changed 2 years ago by jstolarek

Test Case: T6018, T6018fail, T6018rnfail, T6018th, T6018failclosed1, T6018failclosed2, T6018failclosed3, T6018failclosed4, T6018failclosed5, T6018failclosed6, T6018failclosed7, T6018failclosed8, T6018failclosed9, T6018failclosed10, T6018failclosed11, T6018failclosed12, T6018ghci, T6018ghcifail, T6018ghcirnfailT6018, T6018fail, T6018rnfail, T6018th, T6018failclosed1, T6018failclosed2, T6018ghci, T6018ghcifail, T6018ghcirnfail

comment:118 Changed 22 months ago by Jan Stolarek <jan.stolarek@…>

In 3cfe60a/ghc:

Abstract TFs can have injectivity information

Summary:
For abstract type families we incorrectly rejected their injectivity
annotation.  Fixes #11007.

Test Plan: #6018

Reviewers: goldfire, austin, bgamari

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D1453

GHC Trac Issues: #11007

comment:119 Changed 11 months ago by Iceland_jack

Use case of more complicated injectivity annotations from Type Families with Class, Type Classes with Family, simulating the type class:

class Collection c e | c -> e where
  empty :: c
  add   :: e -> c -> c

with a type family

data Defined = Yes | No

type family IsCollection t e = (result :: Defined) | result t -> e

Edit: Instances

instance        Collection [a] a

type instance IsCollection [a] a = Yes
Last edited 11 months ago by Iceland_jack (previous) (diff)

comment:120 Changed 4 months ago by simonpj

Keywords: InjectiveFamilies added; Injective removed
Note: See TracTickets for help on using tickets.