#6018 closed feature request (fixed)
Injective type families
Reported by: | lunaris | Owned by: | jstolarek |
---|---|---|---|
Priority: | normal | Milestone: | 8.0.1 |
Component: | Compiler | Version: | 7.4.1 |
Keywords: | TypeFamilies, Injective | Cc: | simonpj@…, mokus@…, eir@…, nathanhowell@…, nfrisby, conal@…, shane@…, andy.adamsmoran@…, byorgey@…, douglas.mcclean@…, ollie@…, acowley@…, ekmett@…, jan.stolarek@…, ggreif@…, vogt.adam@…, haskell@… |
Operating System: | Unknown/Multiple | Architecture: | Unknown/Multiple |
Type of failure: | None/Unknown | Test Case: | T6018, T6018fail, T6018rnfail, T6018th, T6018failclosed1, T6018failclosed2, T6018ghci, T6018ghcifail, T6018ghcirnfail |
Blocked By: | Blocking: | ||
Related Tickets: | #4259, #10832, #10833 | Differential Rev(s): | Phab:D202 |
Wiki Page: |
Description
Injective type families have been discussed several times on the mailing list and identified as a potentially useful feature.
I've naively attempted a proof-of-concept in GHC. It's almost certainly incorrect and probably breaks the type system, but I thought it best to put it here and see if it can be made workable.
In summary, my changes are:
- Add a new keyword,
injective
, which is available only when theTypeFamilies
extension is enabled. Injective families may then be defined thus:
injective family F a :: * type instance F Int = Bool type instance F Bool = Int injective family G a :: * type instance G a = a
Syntax is of course completely changeable; I've simply picked one of the possible designs. Hopefully this won't be subjected to too much bike-shedding.
- Add the constructor
InjFamilyTyCon
toTyConRhs
and the family flavourInjectiveFamily
toHsSyn
. Again, I've opted to encode injectivity as a flavour rather than (say) aBool
attached to type families. This is a completely arbitrary choice and may be utterly stupid.
- Altered the definition of decomposable
TyCon
s to include injective families (isDecomposableTyCon
). This effectively introduces a typing rule that says if we have(F a ~ F b)
then we can deduce(a ~ b)
(TcCanonical
).
- Modified the unifier so that it will attempt to replace saturated injective families with their left-hand sides where possible (
TcUnify
). This means that in a function such as:
f :: F a -> F a f = ...
The type of
f False
is inferred asF Int
(i.e.,a
is no longer ambiguous).
Some things work, some things don't. For example, the attached file typechecks, but if I actually try to evaluate f False
I get nothing (not even a Segmentation fault).
See what you think.
Attachments (2)
Change History (121)
Changed 5 years ago by
Attachment: | Injective.hs added |
---|
Changed 5 years ago by
Attachment: | injective-families.patch added |
---|
Patch against GHC HEAD to support injective families
comment:1 Changed 5 years ago by
Oops -- I forgot to list all that was wrong with the current implementation. Namely:
- In the example,
g
's type is inferred asa -> a
(and notG a -> G a
). This is in some sense `correct', but seems a bit dodgy.
- I don't actually *enforce* that the declared instances of a family *are* injective yet -- I've put that off until what's there is semi-stable.
- I've messed up and inadvertently included whitespace changes in one of the patches. Apologies.
comment:2 follow-up: 32 Changed 5 years ago by
difficulty: | → Unknown |
---|
Some functions might be injective in one argument but not another. For example:
F a b ~ F c d ===> a ~ c but not b ~ d
comment:3 Changed 5 years ago by
Indeed; I'd not considered that case. I'm not aware of a well-defined semantics for injective type functions in the context of constraint solving so I've no idea how to proceed. There are of course syntactic complications also.
comment:4 Changed 5 years ago by
Cc: | mokus@… added |
---|
comment:5 Changed 5 years ago by
An interesting use case is type-level co-naturals using -XDataKinds (which I'm using as the "height" type index for an experimental 2^{k-tree implementation): }
data Nat = Zero | Suc Nat data CoNat = Co Nat | Infinity type family Succ (t :: CoNat) :: CoNat type instance Succ (Co n) = Co (Suc n) type instance Succ Infinity = Infinity
Succ can't be a constructor, because then Succ Infinity /~ Infinity. Succ as a type family half-works, but you end up needing (logically-)unnecessary type annotations all over the place because GHC doesn't know that Succ n ~ Succ m => n ~ m. It might be possible to devise some kind of type-level bisimilarity constraint, but I suspect it would actually increase the need for type annotations rather than decrease it.
I've also tried an alternate definition of the tree type using "Pred" instead of "Succ", but that seems to lead to really bizarre types - it accepts obviously non-flat trees at a type indicating that the tree's height is zero (or actually, any CoNat I want), because it accepts the existence of types such as "Pred (Co Zero)". Is this expected behavior (e.g., due to the open-world assumption) or should I file a bug report on that?
comment:6 Changed 5 years ago by
Cc: | eir@… added |
---|
comment:7 Changed 5 years ago by
Cc: | nathanhowell@… added |
---|
comment:8 Changed 4 years ago by
Cc: | nfrisby added |
---|
comment:9 Changed 4 years ago by
Milestone: | → 7.8.1 |
---|---|
Owner: | set to simonpj |
Simon, I'm no sure what the status of this is, but it looks like your area so I'm assigning it to you.
comment:10 Changed 4 years ago by
Cc: | conal@… added |
---|
Note this discussion, which includes the subtlety of injective in various arguments (holding other fixed).
comment:11 Changed 4 years ago by
Related Tickets: | → #4259 |
---|
Also see the discussion in #4259, which is about strengthening type families along a different dimension. The two tickets are related.
comment:12 Changed 4 years ago by
Cc: | shane@… added |
---|
comment:13 Changed 4 years ago by
Cc: | andy.adamsmoran@… added |
---|
comment:14 Changed 4 years ago by
Cc: | byorgey@… added |
---|
comment:15 Changed 4 years ago by
Noting for completeness: possibly related to #7205 and #5591.
Why I'm interested: I'm modeling the Java type system in Haskell (overload resolution in particular) a la chak and Andrew's Salsa, using type families and type functions heavily. Their code worked under earlier versions of GHC, but not under 7.4 and beyond.
comment:16 Changed 4 years ago by
Andy, No version of GHC had injective families. If you have code that you think should work, and used to, and doesn't now, by all means submit a but report!
Simon
comment:17 Changed 4 years ago by
It turns out I was missing -XscopedTypeVariables. I'm still interested in injective type families, but it's not a blocker (and Manuel and Andrew's code still compiles).
comment:18 Changed 4 years ago by
Cc: | Artyom.Kazak@… added |
---|
comment:19 Changed 3 years ago by
Cc: | douglas.mcclean@… added |
---|
comment:21 Changed 3 years ago by
Cc: | ollie@… added |
---|
comment:22 Changed 3 years ago by
Cc: | acowley@… added |
---|
comment:23 Changed 3 years ago by
ollie, acowley: if you have a real-life use case for injective type families, do give the details here. Currently we don't have many compelling examples.
Thanks
Simon
comment:24 Changed 3 years ago by
It's a little hard to say whether or not injective type families would actually help with what I'm doing - it could well be that I don't just understand this type level magic enough! However, in a few cases where I've got stuck I've moved things to a GADT to witness the relation between types instead, which I believe does get me injectivity and from that points things seemed to "just work" - albeit requiring a bit more effort. I'll keep my eyes out, and try and provide examples here if I find anything concrete.
comment:25 Changed 3 years ago by
Here's a real worldish use case.
Let's try to write a modified Functor class that lets us map over not just the last type variable, but any type variable. This would be useful, for example, with the Either type.
The class might look something like:
class Functor loc f where fmap :: proxy loc -> (a -> b) -> SetLocation f loc a -> SetLocation f loc b
Where loc is a type that uniquely identifies the position of the argument we want to map over. We can't make an instance of this class unless the SetLocation type family is injective.
With a little syntactic sugar, we can convert the above class into something that looks very much like the current Functor class, but is quite a bit more powerful:
class Functor loc f where fmap :: proxy loc -> (a -> b) -> f { loc = a } -> f { loc = b }
I don't think you can get something nearly that clean without injective type families. And, of course, this applies to any parametric classes currently in use, not just Functor.
comment:26 Changed 3 years ago by
Here's another real world use case.
Given
class Manifold' a where type Field a type Base a type Tangent a type TangentBundle a type Dimension a∷ Nat type UsesMetric a∷ Symbol project ∷ a → Base a unproject ∷ Base a→ a tangent ∷ a → TangentBundle a cotangent ∷ a → (TangentBundle a→ Field a)
then this works:
id' ∷ ∀ a. ( Manifold' a) ⇒ Base a → Base a id' input= project out where out∷a out = unproject input
whereas this requires injective type families:
id' ∷ ∀ a. ( Manifold' a) ⇒ Base a → Base a id' = project ∘ unproject
comment:27 Changed 3 years ago by
Is there a current version of the patch that works against the current GHC HEAD? The 2-year-old one fails.
I can say that I have been trying to define injective type families ever since TypeFamilies were introduced! Especially to define functions like the fmap that MikeIzbicki showed or very simple type-level aliases between different type classes.
It also increases modularity of data families. For example, it would allow defining datatypes in different modules, and combining them together using type family aliases in a parent module.
module A where data A = ... data FA = ... module B where data B = ... data FB = ... module Main where import A import B type family F a :: * type instance F A = FA type instance F B = FB
comment:28 Changed 3 years ago by
My intended uses seem to arise quite often, but I've not found one that boils down into a small, self-contained, compelling example. In broad terms, I would like to use type families for decoupling, as hpacheco describes, and to do things like move back and forth between a set of distinct types and singleton Nats.
comment:29 Changed 3 years ago by
I don't think anyone is working on injective type families right now, but it would make a good project, and not too difficult.
Lunaris's patch has a lot of irrelevant white-space changes in it. But the biggest problem is that I don't think it implements the all-important check that a injective type family really is injective. Consider
injective type family F a type instance F Int = Bool type instance F Char = Bool
This isn't injective and should jolly well be rejected. And this should be the case even if the two type instance
declarations are in different modules.
Similar code already exist for rejecting type-family overlap; I think it is in FamInst.checkFamInstConsistency
. But it'd need to be beefed up to deal with the injectiveness check.
And you need to take care. What about
type instance F Int = Bool -- (BAD) type instance F Char = F Int -- (BAD)
The two RHSs look different, but actually they are the same. The criterion for injectivity is that
If (F t1 .. tn) ~ (F s1 .. sn) then t1 ~ s1 ... tn ~ sn
That doesn't hold for the (BAD) instances, so they should be rejected. In the terminology of the "Closed type families" paper, we need the two RHSs to be apart.
Simon
comment:30 Changed 3 years ago by
To get going, I wouldn't mind if the onus of proving that the type family is indeed injective fell on me, in an unsafe way. But I understand that it is clearly not a feature that you should blindly introduce in mainstream GHC... I may look into it in the future if I ever find the time, but I doubt it.
comment:31 Changed 3 years ago by
Cc: | ekmett@… added |
---|
comment:32 Changed 3 years ago by
I spent some time recently thinking about this one. It is not clear to me what should happen when we know that a type family F
is injective. It would certainly allow us to infer from (F a ~ F b)
that a ~ b
(modulo corner cases). So instead of first reducing injective type family application and then working on the result we could instead reason about type family's arguments. We could even invert type families. Is there anything else we could reason about with injective type families?
The discussion here is only about open type families (obviously, closed type families were not implemented two years ago). I guess that now this change would affect both open and closed type families?
Replying to simonpj:
Some functions might be injective in one argument but not another. For example:
F a b ~ F c d ===> a ~ c but not b ~ d
That is not true according to the mathematical definition of injectivity (as stated in comment 29). Unless we want to introduce definition of type families injective only in some of the arguments, as proposed by Iavor.
comment:33 Changed 3 years ago by
Cc: | jan.stolarek@… added |
---|
comment:34 Changed 3 years ago by
I thinking about this ticket, I've imagined some syntax like this:
type family FullyInjective a | Result -> a -- use a fun-dep-like syntax, where "Result" is a keyword, capitalized to avoid name-clashes with type variables type family Plus a b | Result a -> b, Result b -> a type family Only1stInjective a b | Result -> a
I could even imagine having dependencies among the arguments, though that might be a little strange.
comment:35 Changed 3 years ago by
Cc: | Artyom.Kazak@… removed |
---|
comment:36 Changed 3 years ago by
Cc: | ggreif@… added |
---|
comment:37 Changed 3 years ago by
Owner: | changed from simonpj to jstolarek |
---|
Simon, I'm stealing this one from you :-)
comment:38 Changed 3 years ago by
Fine. But please let's have a wiki page that describes the design in detail, from the user's point of view, before you start implementing anything!
- Syntax
- The effect on which programs are well-typed
Simple is good!
Simon
comment:39 Changed 3 years ago by
Simon, I just drafted this wiki page. I believe it gives a pretty good description of how the syntax will work, but it doesn't give too many details about the effects on type checking. Do you want this to be somehow formalized or would examples suffice? Do you feel that something else is missing on the wiki page?
Simple is good!
Are you referring to anything particular or is this just a general reminder to follow the KISS principle?
comment:40 follow-up: 41 Changed 3 years ago by
I'm still not sold on the concrete syntax involving result
, despite having introduced that idea myself. I like the (**)
suggestion here a bit more: http://www.haskell.org/pipermail/ghc-devs/2014-July/005492.html
Separately, a change to Core will be necessary to really make this work in all cases.
data X a where MkX :: (F b c ~ a) => b -> c -> X a
Say F
is injective (in both arguments). After pattern-matching on MkX ... :: X (F Int Bool)
, we will probably want to discover that b
is Int
and c
is Bool
. To do so, we will need to decompose the (F b c ~ a)
coercion, using the left and right (or perhaps nth) coercion forms. These currently don't work over type families, and for good reason. However, if a type family is injective, then these could be made to work over type families. With the update to Core, we would probably want to make sure we don't lose type safety, though!
All that said, this use case is obscure, and the primary push for injective type families does not seem to require a change to Core. It is worth implementing without this piece. But, the idea came up in conversation, and I thought I should record it.
comment:41 follow-up: 42 Changed 3 years ago by
Replying to goldfire:
I'm still not sold on the concrete syntax involving
result
, despite having introduced that idea myself. I like the(**)
suggestion here a bit more: http://www.haskell.org/pipermail/ghc-devs/2014-July/005492.html
So, to be precise, with this syntax the examples from the wiki page would look like this:
type family F a b c | F -> a b c type family G a b c | G -> a b type family H a b c | H a -> b c, H b -> a c, H c -> a b type family J a b c | J a -> b c, J b -> a c
I did not propose to use the notation like H b -> a c
because I thought it might mislead people to think that: a) H
could be partially applied; b) some parameters of H
could be skipped (ie. we can apply it to b
without first applying to a
). Luckily, this is not a blocking issue for the rest of implementation. It should only affect 3 files in the parser, so it can be changed later.
Separately, a change to Core will be necessary to really make this work in all cases.
(...)
All that said, this use case is obscure, and the primary push for injective type families does not seem to require a change to Core. It is worth implementing without this piece. But, the idea came up in conversation, and I thought I should record it.
This is beyond my knowledge - I would require some guidance here. But if it's possible to implement injective type families without this change then I would postpone it until later and possibly consider it a separate task.
comment:42 Changed 3 years ago by
Replying to jstolarek:
This is beyond my knowledge - I would require some guidance here. But if it's possible to implement injective type families without this change then I would postpone it until later and possibly consider it a separate task.
Yes, it's possible, but there are some circumstances in which users could say that GHC gives an erroneous type error, in that GHC would manifestly have enough knowledge to make progress but doesn't. I agree that you should go ahead without this piece, though!
comment:43 Changed 3 years ago by
Cc: | vogt.adam@… added |
---|
comment:44 Changed 3 years ago by
Cc: | haskell@… added |
---|
comment:45 Changed 3 years ago by
Before embarking on the implementation, is everyone convinced that the proposed syntax
type family F a :: * | result -> a
is the best one? I understand the desire to support injectivity in some parameters but not others.
For closed type classes, can't we infer injectivity, rather than having to declare it? (Although I suppose that the ability to declare is always a Good Thing.)
Simon
comment:46 Changed 3 years ago by
That syntax precludes the possibility of result
being the name of an argument to the function. Using Result
avoids this problem, but that would be a capitalized keyword, which is strange. The other possibility that avoids stealing the name result
is to use the type family itself to represent the result: type family F a | F -> a
, but that's a little strange. I guess I favor the last syntax the most.
comment:47 Changed 3 years ago by
We could also reuse an existing keyword, e.g., type family F a :: * | type -> a
.
comment:48 follow-up: 56 Changed 3 years ago by
A conversation with Ulf Norell (the first implementor of Agda) led to this interesting thought: We may not want injectivity -- we may want a weaker property, which I'll term "head-injectivity".
The equations defining a head-injective function all have RHSs with different heads. What appears below the heads does not matter. Here is an example:
type family F a where F (Just x) = True F Nothing = False
This function is not injective: F (Just 1) ~ F (Just 2)
. Yet, it is head-injective. If the type-inferencer knows that F x ~ True
, it can then refine x ~ Just y
for some fresh unification variable y
. This improvement may be enough to make further progress.
Of course, injectivity does not imply head-injectivity. Consider
type family G a where G True = Just 1 G False = Just 2
This function is injective but not head-injective. So, implementing only head-injectivity will miss this case. But, perhaps we can refine the notion of head-injectivity simply to mean that the RHSs don't unify.
The reason I'm bringing this up is that head-injectivity is, as I understand it, implemented in Agda and used during type inference. It intrigues me to think that we can make progress in the case of F
, a non-injective function.
comment:49 follow-up: 51 Changed 3 years ago by
Replying to simonpj:
Before embarking on the implementation, is everyone convinced that the proposed syntax
type family F a :: * | result -> ais the best one?
I think the bulk of the work with injectivity will be done under the hood. Changing the syntax later on will not be difficult and I don't want to be blocked by bikeshedding.
Inferring injectivity for closed type families is something that indeed requires more discussion. But again, I feel I can make at least some progress without any binding decissions on this matter being made.
I don't like any of Richard's proposals:
- if we introduced
Result
this would be the first capitalized keyword (reserved word?) in Haskell. I don't like such inconsistency. Notice also thatresult
might conflict with type variable names in type family head but that is a local conflict that restricts only the names possible to use for type variables. On the other handResult
might conflict with a name of a type (or type family), which is more of a global thing. - if we use syntax that uses type family name then we will get something like
type family F a b c | F b -> a c
andF b
looks strange given thatF
's arity is 3. It looks as ifF
could be partially applied.
I don't think that restricting the use of result
word in type family head is a big problem - we've already done that with role
.
comment:50 follow-up: 52 Changed 3 years ago by
Richard, I don't understand your terminology here (though I do understand the F
example):
The equations defining a head-injective function all have RHSs with different heads. What appears below the heads does not matter.
Could you explain what you mean by RHSs with different heads
and below the heads
? Up till now I thought that the term "head" refers to type family F a
part of the declaration.
I also don't understand why G
is not head-injective, but perhaps this will become clear to me once earlier questions are answered.
comment:51 follow-up: 54 Changed 3 years ago by
Replying to jstolarek:
I don't think that restricting the use of
result
word in type family head is a big problem - we've already done that withrole
.
The use of role
doesn't steal syntax: it can be used only after the word type
in a top-level declaration, where a capitalized identifier is expected.
comment:52 follow-up: 53 Changed 3 years ago by
Replying to jstolarek:
Could you explain what you mean by
RHSs with different heads
andbelow the heads
? Up till now I thought that the term "head" refers totype family F a
part of the declaration.
Here, "head" means the head of the RHS type, not the type declaration -- it's the root of the type's AST. Does that help?
I also don't understand why
G
is not head-injective, but perhaps this will become clear to me once earlier questions are answered.
G
isn't head-injective because both RHSs are headed by Just
.
comment:53 follow-up: 55 Changed 3 years ago by
Replying to goldfire:
Here, "head" means the head of the RHS type, not the type declaration -- it's the root of the type's AST. Does that help?
Yes, now I fully understand.
Also, if I understood correctly it is possible to have both injectivity and head-injectivity in the compiler, right?
comment:54 Changed 3 years ago by
Replying to goldfire:
Replying to jstolarek:
I don't think that restricting the use of
result
word in type family head is a big problem - we've already done that withrole
.The use of
role
doesn't steal syntax: it can be used only after the wordtype
in a top-level declaration, where a capitalized identifier is expected.
Oops, sorry for defamation then ;-) Still, I think that the price to pay for using result
will not be that high.
comment:55 Changed 3 years ago by
Replying to jstolarek:
Also, if I understood correctly it is possible to have both injectivity and head-injectivity in the compiler, right?
Yes, but confusing, especially if there is syntax for both.
comment:56 Changed 3 years ago by
Replying to goldfire:
A conversation with Ulf Norell (the first implementor of Agda) led to this interesting thought: We may not want injectivity -- we may want a weaker property, which I'll term "head-injectivity".
So when our goal is to make progress from the target expression towards the source expression, what hinders us to follow Omega and use narrowing to obtain the head? Tim Sheard has a nice 2006 paper on how to do it. It is really powerful and we could insist that exactly one solution exists.
comment:57 follow-up: 58 Changed 3 years ago by
(GHC does, in effect, use narrowing, I believe.)
Let's stand back a bit. What are we trying to achieve. Here is the Number 1 Goal: if the type inference engine is trying to prove F a ~ F beta
, where a
is a skolem constant and beta
is a unification variable, is it justified in fixing beta := a
? If F
is not injective the answer is "no". If it's injective, the answer is "yes". We have multiple examples of the need for this.
More generally, our goal is to use injectivity to guide the decision about what type a unification varaible stands for. Injectivity also allows other such opportunities, beyond the Number 1 Goal above. Suppose F
is injective, and we have
F Int = True
Then if we have F beta ~ True
we can deduce beta := Int
.
Richard raises further possiblities: possibly-partial information about the result might allow us to deduce possibly-partial information about the arguments. That is the stuff about head-injectivity and it seems to be potentially very complicated. Take a closed type family. If I tell you something (but perhaps not everything) about the result, perhaps you can tell me something (but perhaps not everything) about the argument. I don't know how clever you might be here, or how to declare the cleverness in an "injectivity signature" for open families.
But we have no actual use-caes for anything except Number 1 Goal, for equalities of form F t1 ~ F t2
. It seems that equalities of form F t1 ~ some-type
offer potential opportunities, but I don't see a sweet spot.
Simon
comment:58 follow-ups: 60 62 Changed 3 years ago by
Replying to heisenbug:
Tim Sheard has a nice 2006 paper on how to do it.
What's the title of this paper? I tried to google it but failed.
Replying to simonpj:
For closed type classes, can't we infer injectivity, rather than having to declare it?
I thought about it and I believe this should be possible. But we could only infer normal injectivity. I believe an algorithm for inferring injectivity in some of the parameters (ie. things like result a -> b c
or result -> b c
for a 3-arity type family) would be exponential in the number of arguments. Inferring normal injectivity should be O(n^{2}) at most, where n is the number of equations.
Replying to simonpj:
But we have no actual use-caes for anything except Number 1 Goal, for equalities of form
F t1 ~ F t2
.
It looks like Conal Elliot has a use case for injectivity in only some arguments: http://www.haskell.org/pipermail/glasgow-haskell-users/2011-February/020027.html I'll contact him for more details.
comment:59 follow-up: 61 Changed 3 years ago by
Simon's (very straightforward) example clarifies this in my mind -- I agree about just heading for plain old injectivity for open type families. If/when we want more involved reasoning for closed type families, where inference is possible, we can address that.
comment:60 Changed 3 years ago by
comment:61 Changed 3 years ago by
Replying to goldfire:
plain old injectivity for open type families.
more involved reasoning for closed type families, where inference is possible
I'm not sure what design choices are hidden behind these two statements. Do you mean that for open type families we would have: a) simple injectivity in the result only (via annotations) or b) injectivity in some of the arguments (also via annotations); whereas for closed type families we would admit the same kind of injectivity declarations as for open type families and perhaps also injectivity inference as proposed by me in comment 58?
comment:62 Changed 3 years ago by
Replying to jstolarek:
I thought about it and I believe this should be possible. But we could only infer normal injectivity. I believe an algorithm for inferring injectivity in some of the parameters (ie. things like
result a -> b c
orresult -> b c
for a 3-arity type family) would be exponential in the number of arguments.
I don't know anything about this, but what's wrong with exponential time here? This inference only happens once per closed type family, right? How many variables are there likely to be?
comment:63 Changed 3 years ago by
Hello! Is there any patch which would allow me to use injective type families in current GHC? I saw on the top of this page, that a patch was provided, but about 2 years ago - are there any working implementations?
comment:64 Changed 3 years ago by
what's wrong with exponential time here?
Compilation times. If a type family takes a single argument then obviously there is only one possible injectivity condition ("result -> a"). For two arguments there are 5 possible injectivity conditions. For 3 this number jumps to 19. 4 arguments => 65 combinations. 5 arguments => 211 combinations. 6 arguments => 665 combinations. Let's assume (for illustrative purposes) that verifying (ie. attempting to infer) a single injectivity condition for a type family takes 0.01s. Under such assumptions inferring all possible injectivity conditions for a type family with 6 arguments will take ~6.5s. We can't do that.
Is there any patch which would allow me to use injective type families in current GHC?
No. But if all goes well all this discussion here will culminate with such a patch :-) In the meantime, if you have any motivating use cases that require injectivity to work please share them if you can. This would be most helpful.
I've spent last several days thinking intensely about the design and made some progress. Hopefully, I'll incorporate my notes into the wiki page by the end of the week. Will get back with some ideas and proposals once this happens.
comment:65 Changed 3 years ago by
@jstolarek: Oh, I made a mistake, I've got no example regarding injective type families. I'm sorry for the confusion!
comment:66 Changed 3 years ago by
Differential Rev(s): | → Phab:D202 |
---|
comment:67 Changed 3 years ago by
I've made a major update to the wiki page. Here are a few things worth pointing out:
In this section I've listed all the syntax proposals made so far. I've also mailed Haskell-cafe so we can get some feedback from the community.
I christened various flavours of injectivity as A, B and C :-) -- see this section. I will refer to them as such in the rest of this comment.
I mailed Conal Elliot and he says he does not remember what was his use case for type families injective in only one argument. This means we only have use cases of the form A shown by Simon. Given that, I believe that it is best if we limit ourselves to implementing A. We can revise this design later if someone demonstrates compelling examples of B or C.
I have outlined my idea for the algorithm that would verify users injectivity declarations in this section. Simon, please tell me if there are any misconceptions about how GHC's type inference works.
After some thinking I am against inferring injectivity for closed type families - see my arguments in this section.
comment:68 Changed 3 years ago by
EDIT: Please ignore earlier versions of this comment.
A nice syntax proposal showed up on Haskell-cafe. The idea is to use type variables introduced to name the result:
type family Id a :: result | result -> a where type family F a b c :: d | d -> a b c type family G a b c :: foo | foo -> a b where type family Plus a b :: sum | sum a -> b, sum b -> a where type family H a b c :: xyz | xyz a -> b c, xyz b -> a c
comment:69 follow-up: 70 Changed 3 years ago by
That reads very oddly to me, but how about...?
type family Id a = result | result -> a where
comment:70 Changed 3 years ago by
Replying to rwbarton:
That reads very oddly to me, but how about...?
type family Id a = result | result -> a where
You mean that the :: result
syntax reads oddly? It is already supported by GHC and the nice thing about this proposal is that we can piggyback on the syntax we already have.
comment:71 Changed 3 years ago by
The difficulty is that when you say
type family F (a::*) :: *
you expect a
to be a type variable and *
to be a kind. So if you say
type family F (a::*) :: (b :: *)
it looks as though b
is a kind variable, and *
is a sort. That's why Reid wants to use =
rather than ::
, and he is right. It's reasonable to have either of these
type family F a :: * type family F a = (b :: *)
but not
type family F a :: (b :: *)
The merit of the "= blah" notation is that it allows us to give a name to the result, rather than use a magic name. I think it's probably just about worth the syntactic cost.
Simon
comment:72 Changed 3 years ago by
Ah, of course. I was thinking about this problem earlier but then I somehow convinced myself that this will not be an issue.
comment:73 follow-ups: 74 79 Changed 2 years ago by
Simon, Richard, others,
I'm making progress on this one but before going too far I want to get a green light on my design proposal (points 1-3). I've also bumped into some problems and have a couple of questions.
My design proposal is to implement only injectivity of type A (ie. plain injectivity in all arguments). The consequences of this are:
- A user is allowed to write at most one injectivity condition.
- That condition must contain only the result type variable on the LHS of
->
and all other type variables on the RHS of->
. In other words if I declaretype family P a b = r
then I'm only allowed to writer -> a b
orr -> b a
injectivity condition. Questions here: a) can I add a restriction that type variables on the RHS must be given in exactly the same order as they were given in the head declaration? This would make the implementation simpler; b) where should this check be done? In the renamer?
- Once we check 2) we must verify that the type family is indeed injective. Not sure where this should happen? My guess is that during typechecking of a type family equations. Algorithm is outlined here.
Does the implementation plan outlined so far look sensible? Are there any misconceptions?
- Once we pass these checks injectivity becomes a binary property, so we can discard injectivity conditions written by the user and replace it with a
Bool
. Richard, when we first spoke about injective type families you suggested that all the magic will go intoisDecomposableTyCon
intypes/TyCon.lhs
. I've added aBool
field toSynTyCon
data constructor ofTyCon
data type and madeisDecomposableTyCon
use that field to tell whether a type family is injective or not. (Checks described in 2 and 3 are not implemented. When a user writes injectivity declaration for a type family I just assume it is correct.) Sadly, this does not work. Here's an example test case:
type family F a = r | r -> a where F Int = Bool F Bool = Int F a = a foo :: F a -> F a foo = id
This fails with:
Couldn't match type ‘F a0’ with ‘F a’ NB: ‘F’ is a type function, and may not be injective The type variable ‘a0’ is ambiguous Expected type: F a -> F a Actual type: F a0 -> F a0 In the ambiguity check for: forall a. F a -> F a To defer the ambiguity check to use sites, enable AllowAmbiguousTypes In the type signature for ‘foo’: foo :: F a -> F a
I added traces to verify that isDecomposableTyCon
is called for F
type family and it correctly returns True
.
Out of curiosity I turned on AllowAmbiguousTypes
and added a definition like this:
bar :: Int -> Int bar = foo
That failed with:
Couldn't match type ‘F a0’ with ‘Int’ The type variable ‘a0’ is ambiguous Expected type: Int -> Int Actual type: F a0 -> F a0 In the expression: foo In an equation for ‘bar’: bar = foo
I imagine that getting this one to work definitely requires more changes than just isDecomposableTyCon
. Then again GHC can already deal with some cases of injectivity:
type family F a = r | r -> a where F Int = Int F Bool = Bool F a = a foo :: F a -> F a foo = id bar :: Int -> Int bar = foo
That works perfectly fine in GHC 7.8.
I tried to analyse what's going on with -ddump-tc-trace
but the dumps are 600-1000 lines long. Of course the source code itself is even longer, so I'd appreciate any directions where should I start looking.
One final question. Assuming that SynTyCon
really should have a new field, I believe this field should be stored in interface files. After all we want injectivity information to propagate from one module to another. Now I wonder how this interacts with open type families. I understand that these are typechecked progressively as we find more equations. I haven't looked at the implementation but my intuition from reading GHC papers is that in a given module we import open type family equations from other modules, collect equations from current module and proceed with type checking of an open type family. Is that intuition correct? If so then I believe that checking of injectivity (point 3 of my outline) should be done during this stage (I believe it would be best to combine it with checking equation overlapping).
comment:74 Changed 2 years ago by
Replying to jstolarek:
Simon, Richard, others,
I'm making progress on this one but before going too far I want to get a green light on my design proposal (points 1-3). I've also bumped into some problems and have a couple of questions.
Just a side question. I have a type level Map:
type family Map (f :: k -> l) (ks :: [k]) :: [l] where ...
When I declare this as injective (presumably) with
type family Map (f :: k -> l) (ks :: [k]) = result :: [l] | result -> ks where ...
then f
must be injective for this to work. Concretely, will this invocation be considered injective by GHC (:set -XDataKinds
) ?
Map IO '[Int, (), Bool]
Thanks,
Gabor
comment:75 follow-up: 80 Changed 2 years ago by
Good question. I have not considered such case earlier. Here are some thoughts:
Currently your Map
function will work only for type constructors because unsaturated type families are not allowed. These are certainly injective.
First step in my current outline of the algorithm that decides whether a type family is injective or not says: "If a RHS contains a call to a type family we conclude that the type family is not injective. I am not certain if this extends to self-recursion -- see discussion below." This would lead to conclusion that Map
is not injective because Map
calls itself.
Now let's say I manage to find a way to allow self-recursion (see discussion on the wiki page). I'm left with the problematic call to f
. As stated earlier currently this must be an injective type constructor. So it seems that declaring Map
injective should be safe because type constructors are generative and it seems possible to deduce f
, a
and b
from Map f [a, b] ~ [Maybe Int, Maybe Char]
. At the moment I have no idea how to formulate an algorithm for such a deduction. Could this be done by the current constraint solver?
What if we were allowed to have partially applied type families (say, because we've implemented ideas from singletons) and f
could be a type family? Would knowing that f
is injective allow us to declare Map
as injective? Injectivity says you can deduce LHS from RHS but I think that it would not be a good idea to try to solve Map f [a, b] ~ [Int, Char]
. And so the restriction on not having calls to type families on the RHS would rightly declare Map
as not injective. So if we had partially applied type families and thus f
was allowed to be either a type constructor a type family we should declare Map
as not injective. This contrasts with earlier paragraph, where knowing that f
must be a type constructor allowed us to declare Map
as injective.
Does that make sense?
comment:76 follow-ups: 77 81 90 Changed 2 years ago by
- I think it's very straightforward to allow injectivity of form
r -> <subset of args>
. That is, the rhs is unconstrained but the lhs must ber
.
- So injectivity isn't a binary property; rather it's a
Bool
per argument. Making this list a field ofSynTyCon
is probably right, with ordinary synonyms being all false.
- We are not (at least for now) changing System FC. So the only effect of injectivity is to add extra "Derived" unification constraints, very much like functional dependencies. If we see
[W] F t ~ F s
then we don't decopose it. Instead we add[D] t ~ s
That guides inference, but does not produce proof terms. So, no, nothing to do withisDecomposableTyCon
. Much more like thetry_improvement
code inTcInteract.doTopReactFunEq
.
- I don't understand the discussion of
Map
. Are you perhaps discussing how to verify some (not shown here) implementation of a closed type familyMap
is in fact injective?
- Verifying injectivity for open families: if the RHSs unify then the LHSs should be identical under that substitution. Eg
type instance F a Int = a -> Int type instance F Int a = Int -> a
These are consistent (and hence allowed) and are injective in both arguments.
- My brain is too small to think about verifying injectivity for closed families. Richard?
comment:77 Changed 2 years ago by
Replying to simonpj:
- We are not (at least for now) changing System FC. So the only effect of injectivity is to add extra "Derived" unification constraints, very much like functional dependencies. If we see
[W] F t ~ F sthen we don't decopose it. Instead we add[D] t ~ s
What is the meaning of [W]
and [D]
? "Wanted" and "derived"?
- I don't understand the discussion of
Map
. Are you perhaps discussing how to verify some (not shown here) implementation of a closed type familyMap
is in fact injective?
Yes, I was discussing how to verify whether this definition of Map
is injective:
type family Map (f :: k -> l) (ks :: [k]) :: [l] where Map f '[] = '[] Map f (x ': xs) = f x ': Map f xs
- Verifying injectivity for open families: if the RHSs unify then the LHSs should be identical under that substitution. Eg
type instance F a Int = a -> Int type instance F Int a = Int -> aThese are consistent (and hence allowed) and are injective in both arguments.
Ah, I haven't thought about allowing substitution when RHSs unify. Wiki updated.
- My brain is too small to think about verifying injectivity for closed families. Richard?
I only remind that I outlined an algorithm on the wiki :-) Richard, it would be great if you could take a look and say whether it makes sense.
In the meantime I have almost implemented checking correctness of injectivity condition (point 2 from my earlier comment). Renamer seems to be the right place for doing this.
comment:78 Changed 2 years ago by
Yes "Wanted" and "Derived". You'll see lots of this with -ddump-tc-trace
comment:79 follow-up: 82 Changed 2 years ago by
Replying to jstolarek:
- That condition must contain only the result type variable on the LHS of
->
and all other type variables on the RHS of->
. In other words if I declaretype family P a b = r
then I'm only allowed to writer -> a b
orr -> b a
injectivity condition. Questions here: a) can I add a restriction that type variables on the RHS must be given in exactly the same order as they were given in the head declaration? This would make the implementation simpler; b) where should this check be done? In the renamer?
I'm ambivalent on this design decision (that the variables have to be in the same order in the annotation). I suppose it simplifies the implementation. Yes, the renamer seems like a fine-enough place to do the check.
- Once we check 2) we must verify that the type family is indeed injective. Not sure where this should happen? My guess is that during typechecking of a type family equations. Algorithm is outlined here.
Please see updates to the wiki page, labeled RAE ... End RAE.
Does the implementation plan outlined so far look sensible? Are there any misconceptions?
- Once we pass these checks injectivity becomes a binary property, so we can discard injectivity conditions written by the user and replace it with a
Bool
. Richard, when we first spoke about injective type families you suggested that all the magic will go intoisDecomposableTyCon
intypes/TyCon.lhs
. I've added aBool
field toSynTyCon
data constructor ofTyCon
data type and madeisDecomposableTyCon
use that field to tell whether a type family is injective or not. (Checks described in 2 and 3 are not implemented. When a user writes injectivity declaration for a type family I just assume it is correct.) Sadly, this does not work. Here's an example test case:type family F a = r | r -> a where F Int = Bool F Bool = Int F a = a foo :: F a -> F a foo = idThis fails with:
Couldn't match type ‘F a0’ with ‘F a’ NB: ‘F’ is a type function, and may not be injective The type variable ‘a0’ is ambiguous Expected type: F a -> F a Actual type: F a0 -> F a0 In the ambiguity check for: forall a. F a -> F a To defer the ambiguity check to use sites, enable AllowAmbiguousTypes In the type signature for ‘foo’: foo :: F a -> F aI added traces to verify that
isDecomposableTyCon
is called forF
type family and it correctly returnsTrue
.Out of curiosity I turned on
AllowAmbiguousTypes
and added a definition like this:bar :: Int -> Int bar = fooThat failed with:
Couldn't match type ‘F a0’ with ‘Int’ The type variable ‘a0’ is ambiguous Expected type: Int -> Int Actual type: F a0 -> F a0 In the expression: foo In an equation for ‘bar’: bar = fooI imagine that getting this one to work definitely requires more changes than just
isDecomposableTyCon
.
Yes, Simon's comment above is correct -- I was just wrong about the isDecomposableTyCon
thing. Actually, injective type families still need to say "no" to isDecomposableTyCon
, because of the left and right coercion formers, which assume generativity.
Then again GHC can already deal with some cases of injectivity:
type family F a = r | r -> a where F Int = Int F Bool = Bool F a = a foo :: F a -> F a foo = id bar :: Int -> Int bar = fooThat works perfectly fine in GHC 7.8.
What works perfectly fine? Not the code you wrote above, because it contains a not-currently-parsed injectivity annotation. I tried it without the injectivity annotation, and it does indeed work. But not because of injectivity, at all: it's because GHC is clever enough to figure out that F
is just an identity function, and so the F a
s in foo
's type become a
. If you swap the Int
and Bool
RHSs in the definition (which preserves injectivity), the code fails to compile.
I tried to analyse what's going on with
-ddump-tc-trace
but the dumps are 600-1000 lines long. Of course the source code itself is even longer, so I'd appreciate any directions where should I start looking.
Fair warning: 1000 lines of -ddump-tc-trace
isn't long at all! :)
One final question. Assuming that
SynTyCon
really should have a new field, I believe this field should be stored in interface files. After all we want injectivity information to propagate from one module to another. Now I wonder how this interacts with open type families. I understand that these are typechecked progressively as we find more equations. I haven't looked at the implementation but my intuition from reading GHC papers is that in a given module we import open type family equations from other modules, collect equations from current module and proceed with type checking of an open type family. Is that intuition correct? If so then I believe that checking of injectivity (point 3 of my outline) should be done during this stage (I believe it would be best to combine it with checking equation overlapping).
Agreed.
comment:80 Changed 2 years ago by
Replying to jstolarek:
Good question. I have not considered such case earlier. Here are some thoughts:
Currently your
Map
function will work only for type constructors because unsaturated type families are not allowed. These are certainly injective.First step in my current outline of the algorithm that decides whether a type family is injective or not says: "If a RHS contains a call to a type family we conclude that the type family is not injective. I am not certain if this extends to self-recursion -- see discussion below." This would lead to conclusion that
Map
is not injective becauseMap
calls itself.Now let's say I manage to find a way to allow self-recursion (see discussion on the wiki page). I'm left with the problematic call to
f
. As stated earlier currently this must be an injective type constructor. So it seems that declaringMap
injective should be safe because type constructors are generative and it seems possible to deducef
,a
andb
fromMap f [a, b] ~ [Maybe Int, Maybe Char]
. At the moment I have no idea how to formulate an algorithm for such a deduction. Could this be done by the current constraint solver?
This should just work with the algorithm already stated. (Except for the self-recursion bit, which I responded to on the wiki page.) Unification should do the work for you.
What if we were allowed to have partially applied type families (say, because we've implemented ideas from singletons) and
f
could be a type family? Would knowing thatf
is injective allow us to declareMap
as injective? Injectivity says you can deduce LHS from RHS but I think that it would not be a good idea to try to solveMap f [a, b] ~ [Int, Char]
. And so the restriction on not having calls to type families on the RHS would rightly declareMap
as not injective. So if we had partially applied type families and thusf
was allowed to be either a type constructor a type family we should declareMap
as not injective. This contrasts with earlier paragraph, where knowing thatf
must be a type constructor allowed us to declareMap
as injective.Does that make sense?
No. Under the singletons work, we gave a different kind to non-injective functions, so Gabor's kind -- using ->
, not any other arrow -- requires an injective argument.
comment:81 Changed 2 years ago by
Replying to simonpj:
- My brain is too small to think about verifying injectivity for closed families. Richard?
I've put some comments on the wiki page. But the whole thing makes me a little nervous, as closed type families are really squirrelly. I would want something resembling a proof before feeling truly confident. As a first pass, we could always treat the equations of a closed family as independent (like an open family). I believe that such a treatment would be fully conservative.
comment:82 follow-up: 83 Changed 2 years ago by
Replying to goldfire:
I'm ambivalent on this design decision (that the variables have to be in the same order in the annotation). I suppose it simplifies the implementation.
I guess my argument about simpler implementation is not that important. I mean the code is quite simple in both cases. It's more like a choice between linear and quadratic algorithm. I'd went for linear, even if it adds an extra restriction on the user code.
Yes, Simon's comment above is correct -- I was just wrong about the
isDecomposableTyCon
thing.
OK. Reverting :-)
What works perfectly fine? Not the code you wrote above, because it contains a not-currently-parsed injectivity annotation. I tried it without the injectivity annotation, and it does indeed work.
Yes, that's what I meant.
But not because of injectivity, at all: it's because GHC is clever enough to figure out that
F
is just an identity function
Right. Still, I wonder how this works. I mean there are probably no special cases in the typechecker that try to guess whether a type family is an identity function?
Fair warning: 1000 lines of
-ddump-tc-trace
isn't long at all! :)
I imagine, given that my source file was 8 lines of code. Still, it's a bit steep learning curve.
Replying to goldfire:
Under the singletons work, we gave a different kind to non-injective functions
Ah, right.
Replying to wiki comments:
I believe this would also have to check that all variables mentioned in the LHS are mentioned in the RHS. Or, following along with Simon's idea in comment:76:ticket:6018, those variables that appear in the injectivity annotation must be mentioned in the RHS.
I haven't thought about that. I believe you're right.
We have to be careful with the word overlap here. I think we want "overlaps" to be "is subsumed by".
I don't see the difference between "overlaps" and "is subsumed by". :-(
Now, in 4c I wrote:
if unification [of the RHSs] succeeds and there are type variables involved we substitute unified type variables on the LHS and check whether this LHS overlaps with any of the previous equations. If it does we proceed to the next equation
You replied with this:
you want to know of the equation at hand is reachable, given the LHS substitution. Even if it is reachable, the (substituted) LHS may coincide with the LHS of the earlier equation whose RHS unified with the current RHS.
I don't see the difference between your version and mine.
Even subtler, it's possible that certain values for type variables are excluded if the current LHS is reachable (for example, some variable a couldn't be Int, because if a were Int, then a previous equation would have triggered). Perhaps these exclusions can also be taken into account.
Hm... can you give an example where this would be useful?
RAE: But it seems that, under this treatment, any self-recursion would automatically lead to a conclusion of "not injective", just like any other use of a type family. For example:
type family IdNat n where IdNat Z = Z IdNat (S n) = S (IdNat n)
IdNat
is injective. But, following the algorithm above, GHC would see the recursive use ofIdNat
, not know whetherIdNat
is injective, and then give up, concluding "not injective". Is there a case where the special treatment of self-recursion leads to a conclusion of "injective"? End RAE
This example is just like my NatToMaybe
. My idea here was that RHSs of these two equations won't unify - first returns Z
, second returns S something
. These are distinct constructors that have no chance of being unified. (I assumed that we are able to detect that.) There are no calls to type families other than self-recursion and so we can declare IdNat
to be injective. I admit I am not 100% certain that allowing self-recursion is safe. It's just that I was unable to come up with an example that would show that my algorithm: a:) declares injective function to be non-injective; b) declares a non-injective function to be injective.
comment:83 Changed 2 years ago by
Replying to jstolarek:
Right. Still, I wondering how this works. I mean there are probably no special cases in the typechecker that try to guess whether a type family is an identity function?
Of course, you're right -- GHC doesn't have a special case there. What it does do is so-called compatibility checking. Two equations of a type family (open or closed) are compatible if, whenever the LHSs unify with substitution s, applying s to the RHSs makes them the same. Two equations for an open type family do not have a malignant overlap if they are compatible. And, more relevant here, GHC does the apartness check in closed type families only among incompatible equations. In F
as originally given, all the equations are compatible with one another, and so GHC always skips the apartness check. Thus, when it sees F a
, the last equation triggers and reduces F a
to a
. This is all described in the closed type families paper, and in the manual.
We have to be careful with the word overlap here. I think we want "overlaps" to be "is subsumed by".
I don't see the difference between "overlaps" and "is subsumed by". :-(
I would say these equations overlap:
type instance G a Int = a type instance G Int a = a
but neither equation subsumes the other. On the other hand,
type instance H a a = a -- 1 type instance H Int Int = Int -- 2
equation 1 subsumes equation 2, because anything that matches 2 will also surely match 1.
Now, in 4c I wrote:
if unification [of the RHSs] succeeds and there are type variables involved we substitute unified type variables on the LHS and check whether this LHS overlaps with any of the previous equations. If it does we proceed to the next equation
You replied with this:
you want to know of the equation at hand is reachable, given the LHS substitution. Even if it is reachable, the (substituted) LHS may coincide with the LHS of the earlier equation whose RHS unified with the current RHS.
I don't see the difference between your version and mine.
See new example E1
on the wiki page.
Even subtler, it's possible that certain values for type variables are excluded if the current LHS is reachable (for example, some variable a couldn't be Int, because if a were Int, then a previous equation would have triggered). Perhaps these exclusions can also be taken into account.
Hm... can you give an example where this would be useful?
See new example E2
on the wiki page.
RAE: But it seems that, under this treatment, any self-recursion would automatically lead to a conclusion of "not injective", just like any other use of a type family. For example:
type family IdNat n where IdNat Z = Z IdNat (S n) = S (IdNat n)
IdNat
is injective. But, following the algorithm above, GHC would see the recursive use ofIdNat
, not know whetherIdNat
is injective, and then give up, concluding "not injective". Is there a case where the special treatment of self-recursion leads to a conclusion of "injective"? End RAEThis example is just like my
NatToMaybe
. My idea here was that RHSs of these two equations won't unify - first returnsZ
, second returnsS something
. These are distinct constructors that have no chance of being unified. (I assumed that we are able to detect that.) There are no calls to type families other than self-recursion and so we can declareIdNat
to be injective. I admit I am not 100% certain that allowing self-recursion is safe. It's just that I was unable to come up with an example that would show that my algorithm: a:) declares injective function to be non-injective; b) declares a non-injective function to be injective.
What are the steps your algorithm is taking? I don't see how IdNat
can be considered injective while Ban
is considered non-injective. A critical step in banning Ban
is noting that we don't yet know that Ban
is injective when checking it, so we conclude "not injective". Maybe it's because Ban
's recursion is at the top-level? In IdNat
and NatToMaybe
, unification fails before seeing the recursive use of the type family. Hmm... maybe. There seems to be something going on here that, if unification succeeds up until it sees the recursive use of the type family, then the family either must be non-injective or non-terminating... because this situation can only arise when the function "collapses" two inputs into the same output through recursion. But I'd want a proof first.
comment:84 follow-up: 85 Changed 2 years ago by
Richard, either I'm not getting the subtleties of your arguments or you're misunderstanding what I meant to say on the wiki page. Or both :-)
I now see the theoretical difference between "overlaps" and "subsumes" but I don't yet see how that is important in my algorithm outline. I re-read parts of the closed type families paper and now I'm not even sure whether I meant "overlaps" or "unifies". Anyway, I believe that my thinking is correct and in my algorithm I have addressed problems that you raised. It looks that I need help with sorting out the terminology so things are clear.
Now, looking at your examples:
type family E1 a = r | r -> a where E1 Int = Int -- 1 E1 a = a -- 2
When my algorithm reaches (2) it will attempt to unify a
with RHS of equation (1). It will succeed with substitution [ a -> Int ]
. It will apply that substitution to the LHS of (2) and proceed with checking whether after substitution this equation is subsumed (?) by any of the earlier ones. Clearly, it is by the first equation and since this is the last equation we declare E1
to be injective. In other words I know that (2) will never produce an Int
since this will equation will never be reached for a
equal to Int
.
Now E2
:
type family E2 (a:: Bool) = r | r -> a where E2 False = True E2 True = False E2 a = False
I consider almost identical example on the wiki:
type family Bak a = r | r -> a where Bak Int = Char Bak Char = Int Bak a = a
The difference is the RHS of the last equation: a concrete type in your example, a type variable in mine. Yet I believe that both examples are intended to demonstrate the same problem - see discussion below that example.
In
IdNat
andNatToMaybe
, unification fails before seeing the recursive use of the type family.
Right. I assumed that noticing that allows us to conlcude that equations of IdNat
and NatToMaybe
produce distinct types and thus these type families are injective. Your reasoning in that whole paragraph completely agrees with mine.
But I'd want a proof first.
Agreed. As I've said, I've been unable to produce a counter-example but that doesn't mean that one does not exist.
I have a question about closed type families paper. Paragraph just after Candidate Rule 2 (section 3.2, page 3) says: "As a notational convention, apart(ρ, τ) considers the lists ρ and τ as tuples of types; the apartness check does not go element-by-element.", where "not" is emphasized. Why is this important? Seems like the choice of representing patterns and types as tuples or lists is just an implementation detail.
comment:85 Changed 2 years ago by
Replying to jstolarek:
Richard, either I'm not getting the subtleties of your arguments or you're misunderstanding what I meant to say on the wiki page. Or both :-)
I now see the theoretical difference between "overlaps" and "subsumes" but I don't yet see how that is important in my algorithm outline. I re-read parts of the closed type families paper and now I'm not even sure whether I meant "overlaps" or "unifies".
To be clear, I am considering the expressions "two types overlap" and "two types unify" to be synonymous. Subsumption is a finer, directed relation between types than overlap/unify.
type family E1 a = r | r -> a where ...
Yes yes yes. You're right here. I was wrong. The wiki page is updated.
Now
E2
:type family E2 (a:: Bool) = r | r -> a where E2 False = True E2 True = False E2 a = FalseI consider almost identical example on the wiki:
type family Bak a = r | r -> a where Bak Int = Char Bak Char = Int Bak a = aThe difference is the RHS of the last equation: a concrete type in your example, a type variable in mine. Yet I believe that both examples are intended to demonstrate the same problem - see discussion below that example.
I think these cases are different. In yours, the RHS unification (a
unifies with Char
) refines the LHS (a |-> Char
) so that it is subsumed by an earlier equation. This action does not happen in my example, and different reasoning (noting that False
and True
are both impossible) is required.
I have a question about closed type families paper. Paragraph just after Candidate Rule 2 (section 3.2, page 3) says: "As a notational convention, apart(ρ, τ) considers the lists ρ and τ as tuples of types; the apartness check does not go element-by-element.", where "not" is emphasized. Why is this important? Seems like the choice of representing patterns and types as tuples or lists is just an implementation detail.
Agreed that choosing a list vs. a tuple is irrelevant, but choosing element-by-element vs. as-a-whole is quite important. For example (a, a)
is quite apart from (Int, Bool)
, when considered as a whole, even though a
is neither apart from Int
nor Bool
looking at individual elements.
Does that clarify?
comment:86 follow-up: 87 Changed 2 years ago by
Does that clarify?
Yes.
I don't yet see how to deal with your E2
example. It looks like we would need to have pattern-exhaustiveness checking for type families. And this would only be possible for closed kinds I think. Well, for open kinds type family would simply be non-injective.
I have a few more questions about implementation. These most likely go to Simon. I've been trying to wrap my head around implementation of doTopReactFunEq
and I have these questions:
- What does the term "react" mean? As in the name
doTopReact
and the comment below its type signature: "The work item does not react with the inert set..." (TcInteract, L1412).
- What is "interaction"? As in the module name
TcInteract
or the comment mentioned above: "... so try interaction with top-level instances".
- What does
EvTerm
data type represent?
- What are branches in axioms?
- I don't understand how canonicalization works. Is there an overview somewhere? Other than
Note [Canonicalization]
, which unfortunately is too vague for me :-(
comment:87 Changed 2 years ago by
Replying to jstolarek:
I don't yet see how to deal with your
E2
example. It looks like we would need to have pattern-exhaustiveness checking for type families. And this would only be possible for closed kinds I think. Well, for open kinds type family would simply be non-injective.
I don't know how to do this either. Luckily, we can reject E2
on a first pass. It's possible that the only examples like this are silly in the same way that E2
is silly, in that the last equation is utterly unreachable.
I have a few more questions about implementation. These most likely go to Simon. I've been trying to wrap my head around implementation of
doTopReactFunEq
and I have these questions:
- What does the term "react" mean? As in the name
doTopReact
and the comment below its type signature: "The work item does not react with the inert set..." (TcInteract, L1412).
- What is "interaction"? As in the module name
TcInteract
or the comment mentioned above: "... so try interaction with top-level instances".
Simon may be able to answer better than I, but I believe these refer to the actions taken in section 7 of the OutsideIn paper.
- What does
EvTerm
data type represent?
These are values of constraint types, including class dictionaries, type equalities, and implicit parameters. EvTerm
s are desugared in DsBinds. Write back if that's not enough of an explanation.
- What are branches in axioms?
A closed type family with multiple equations leads to a so-called branched axiom. The way overlap works in a closed type family means that we can't have separate axioms for each equation. So, a branched axiom contains potentially many branches; each branch asserts an equality, but later branches can be used only when no previous branch supersedes. Does that help? It's all in the closed type families paper.
- I don't understand how canonicalization works. Is there an overview somewhere? Other than
Note [Canonicalization]
, which unfortunately is too vague for me :-(
I think this is also in OutsideIn.
comment:88 Changed 2 years ago by
I don't know how to do this either. Luckily, we can reject E2 on a first pass.
Yes, I believe that's what we should do. If we ever have analysis of unreachable equations for type families then we can use it to deal with E2
but until then we should simply reject it.
As for answers to 1-5 I'll read some more and get back if that doesn't help.
comment:89 Changed 2 years ago by
I've read section 7.4 of OutsideIn(X) paper. Code makes more sense now but still I have some questions. Most importantly I'm trying to understand FINST top-level reaction rule (OutsideIn paper, fig. 24 page 65), which according to my understanding is implemented by doTopReactFunEq
. Since I don't have LaTeX in the wiki I'll be using following notation:
\Q - set of top-level axiom schemes \g - little gamma \d - little delta \th - little theta \t - little tau \e - little epsilon _0 - 0 in a subscript \vec{...} - a vector of values, in paper denoted using overline \in - set membership realtion
Now the rule:
forall \vec{a}. F \vec{Xi_0} ~ Xi_0 \in \Q -- (1) \vec{b} = ftv(\vec{Xi_0}) -- (2) \vec{c} = \vec{a} - \vec{b} -- (3) \g fresh -- (4) \th = [\vec{b |-> \t_a},\vec{c |-> \g}] -- (5) \th \vec{Xi_0} = \vec{Xi} -- (6) if (l = w) then \vec{\d} = \vec{\g} else \vec{\d} = \e -- (7) ------------------------------------------------------------- topreact[l](\Q, F \vec{Xi}~Xi) = {\d, \th Xi_0 ~ Xi} -- (8)
Here's how I understand (or misunderstand that rule):
Premise (1) represents a type family equation written by the user (either from a closed or open type family). \vec{Xi_0}
represents the LHS of F
ie. patterns in that equation. So for example if the equation is F [b] Bool c
then \vec{Xi_0}
is ([b],Bool, c)
. I'm not sure about [b]
- I think that's not canonical? Are axioms canonicalized?
I'm also not sure about the meaning of \vec{a}
in (1)'s forall. My intuition would be that these are type variables in the LHS (b
and c
in above example). But then I don't know how to understand premise (2). If \vec{b}
represents free variables if \vec{Xi_0}
then wouldn't this be identical to \vec{a}
? This does not make sense because in that situation \vec{c}
in premise (3) would always be empty.
Premise (5) is a substitution used in premise (6) to turn the LHS of the original axiom into LHS of a constraint we are trying to solve. But what does \t_a
represent? Instantiation of variables in the original axiom to some particular values present in \vec{Xi}
?
Premise (7) says that if a constraint is wanted then we introduce new variables. I'm not sure why. Maybe this will become clear once I understand the rest.
I'm not sure if I understand the conclusion (8). Reading the rule I see that if we have F \vec{Xi} ~ Xi
constraint we use it to figure out the \th
substitution and then use that substitution to create a new constraint that tries to equate the RHS of our current constraint with the RHS of original axiom. But why do we want this? I could use an example with more comments than in the paper (and the example could be more complicated too). Before proceeding with the implementation I'd like to understand how injectivity affects FINST rule.
What is not clear to me is whether the new constraint we are producing is wanted or given? My guess is that we produce a given if the original constraint was given and wanted if the original was wanted.
I also wonder whether injectivity gives us a new interaction rule:
INJFEQFEQ interact[l] (F \vec{Xi_1} ~ Xi, F \vec{Xi_2} ~ Xi) = (F \vec{Xi_1} ~ Xi, \vec{ Xi_1 ~ Xi_2 })
where \vec{ Xi_1 ~ Xi_2 }
represents vector of pairwise constraints.
A few more questions unrelated to the paper:
- Are axioms for associated types treated differently than for open type families?
- What does
BuiltInSynFamTyCon
constructor represent?
Note [Basic Simplifier Plan]
inTcInteract
mentions "inert reactions" and "spontaneous reactions". What's the difference between the two?
comment:90 Changed 2 years ago by
Simon, I'm trying to understand your earlier comment:
- We are not (at least for now) changing System FC. So the only effect of injectivity is to add extra "Derived" unification constraints, very much like functional dependencies. If we see
[W] F t ~ F sthen we don't decopose it. Instead we add[D] t ~ sThat guides inference, but does not produce proof terms. So, no, nothing to do withisDecomposableTyCon
. Much more like thetry_improvement
code inTcInteract.doTopReactFunEq
.
First of all, do we ever get a chance of actually seeing F t ~ F s
? I think this is not canonical and if such constraint ever appears it would be turned into F t ~ a, F s ~ a
. I'm not sure which of the canonicalization rules in Fig. 21 in OutsideIn paper would be responsible for that conversion because I don't know the meaning of these stylized D, F and T used in the mentioned figure. My guess would be FFLAT[W/G]R rules. Anyway, once we have F t ~ a, F s ~ a
we could apply my suggested INJFEQFEQ interaction rule to generate [D] s ~ t
. If my thinking here is right then shouldn't the implementation go into TcInteract.interactFunEq
(guessing from its name)?
I also wonder whether [W]
is important in your [W] F t ~ F s
example? I believe this rule is valid for both given and wanted constraints.
I don't understand what you mean by "we don't decopose it". What is this "decomposition" that you mention?
I believe that the only moment when injectivity interacts with top-level axioms is when we're deducing LHS from RHS. I suspect this might require a new reaction rule, separate from FINST.
Are my ideas here correct?
comment:91 Changed 2 years ago by
Re comment:89 and comment:90 I'm struggling with limited personal bandwidth, and daunted by the challenge of giving a tutorial about the type inference engine, which is one of the most complex parts of GHC.
Let me make this offer. If you do the rest, I'll do the bit in the type inference engine that exploits injectivity. It won't take long, and then you can see.
Decomposition in brief. If you want Maybe a ~ Maybe b
then it's enough to prove a ~ b
. From a proof of the latter we can get a proof of the former. No more than that.
Simon
comment:92 Changed 2 years ago by
Can I get some more time before I throw in the towel? Say, until next Thursday? EDIT: Next Thursday = 30th October
Simon, if you were to start working on this what parts of my implementation need to be finished? Everything up to SynTyCon
storing a list of Bool
s? In what form would you like my patch? A differential revision? A branch on github? Do you want my work rebased against latest HEAD?
Decomposition in brief. If you want Maybe a ~ Maybe b then it's enough to prove a ~ b. From a proof of the latter we can get a proof of the former. No more than that.
Obviously, this is what we currently have for data types and what we want for injective type families. But from the OutsideIn paper my understanding was that *current* treatment of *type families* is different.
comment:93 follow-up: 94 Changed 2 years ago by
Sure. Take as long as you need.
I was offering to do the type inference piece, which seemed to be the bit you are stalled on. So if you do parser, renamer, and typechecking of the TyCon
declaration, so we have the injectivity info in the TyCon
, that would do it I think.
In practical terms, a branch in a github repo that I can pull and push to would work well.
All I plan to do is add a rule along the lines of
[W] F ty1 ~ rhs [W] F ty2 ~ rhs ===> add a new constraint (if F is injective) [D] ty1 ~ ty2
very like functional dependencies for type classes.
I'm not sure that there is time to get this implemented, tested etc before the 7.10 freeze, but we'll see.
Simon
comment:94 Changed 2 years ago by
If we're not aiming to get this into 7.10 then I guess there's no rush.
Replying to simonpj:
All I plan to do is add a rule along the lines of
[W] F ty1 ~ rhs [W] F ty2 ~ rhs ===> add a new constraint (if F is injective) [D] ty1 ~ ty2very like functional dependencies for type classes.
Yes, that's what I meant with my INJFEQFEQ interaction rule in comment 89:
INJFEQFEQ interact[l] (F \vec{Xi_1} ~ Xi, F \vec{Xi_2} ~ Xi) = (F \vec{Xi_1} ~ Xi, \vec{ Xi_1 ~ Xi_2 })
Sorry if that wasn't clear. (Now I realized that checking whether F
is injective might be consider a reaction with top-level axiom and thus this should be a reaction rule.)
But it seems to me there should be one more rule. Consider this example:
type family F a = r | r -> a where F Char = Bool F Bool = Char F a = a idChar :: (F a ~ Bool) => a -> Char idChar a = a
Here we have constraint F a ~ Bool
. Knowing that F
is injective and it's RHS is Bool
we want to deduce that a
must be Char
. This one seems more difficult because we have to look through all the equations. My guess would be that we have to do this before constraint solving (in TcM
monad?).
comment:95 Changed 2 years ago by
Yes, we need to look through all the equations. Very like functional dependencies.
Before we forget it, the big un-implemented piece is checking that the type-family equations are indeed injective. Or is that done now? I can't see it in Phab. That's the main reason I'm doubtful we'll make 7.10, unless perhaps you have it ready to go.
comment:96 Changed 2 years ago by
Before we forget it, the big un-implemented piece is checking that the type-family equations are indeed injective. Or is that done now?
No, that's not implemented yet - I'm distracted between this and two other GHC projects so I didn't yet have the time :-(
comment:98 Changed 2 years ago by
I'm implementing injectivity check for open type families and got stuck on some nonsense problem (described on Phab) so I diverted to #8326 this week. Will resume work on this next week.
comment:99 Changed 2 years ago by
Oh, and while we're talking about open type families, there's a design decision to make. Verifying user-supplied injectivity condition requires comparing the same equations that are compared by compatibility check. Should we:
a) verify injectivity at the same time we verify compatibility? b) verify injectivity in a separate pass?
a) might seem like an obvious answer because it allows us to avoid traversing the equation pairs twice. However, that makes compatibility checking code in FamInstEnv.lookup_fam_inst_env'
a bit more complicated. b) on the other hand can allow us to skip the injectivity checking altogether if the user did not supply any injectivity declarations. This would be a big win since compilation performance would not be degraded for already existing code (since there are no injectivity declarations). I am in favour of option b).
comment:100 Changed 2 years ago by
My instinct is (a) because finding all pairs is potentially quite expensive, and repeating all that code seems a shame
comment:101 Changed 2 years ago by
Finding pairs would be done once. Only the traversal would be done twice - see my inline comment on Phab.
comment:102 Changed 2 years ago by
Milestone: | 7.10.1 → 7.12.1 |
---|
comment:103 follow-up: 104 Changed 2 years ago by
I'm very late to the discussion, so this is probably not helpful at all, but I wanted to mention my interest. I've only had use for better type inference with (conceptually) closed, recursive type families. What I think I want is something like proposal C from the wiki, or the head-injectivity proposal where "possibly-partial information about the result might allow us to deduce possibly-partial information about the arguments."
And as a naive user I want injectivity to be inferred for closed type families, and don't find the arguments against it in the wiki convincing. Intuitively there doesn't seem to be much difference between the level of logic required to infer things about types with closed type synonym families, vs the sort of inference the typechecker does elsewhere (again, as a naive user who doesn't care about the trickiness of implementing such a scheme). My impression is that this is what many people assumed would be the point of closed type families; for what I've had in mind closed type families make only a cosmetic improvement.
Anyway most of my interest came from my work on https://github.com/jberryman/shapely-data which embarrassingly seems to dependent quite a lot on #1241, so it (and below) needs GHC 7.6. You can ignore the example that follows or consider it if it's helpful:
{-# LANGUAGE TypeOperators #-} {-# LANGUAGE TypeFamilies #-} {-# LANGUAGE MultiParamTypeClasses #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE FlexibleContexts #-} {-# LANGUAGE UndecidableInstances #-} -- for nested Type family: Reversed (Tail t) :> Head t {-# LANGUAGE FunctionalDependencies #-} -- | A class for the exponent laws with the 'Normal' form @abc@ in the exponent -- place. See the instance documentation for concrete types and examples. class Exponent abc where fanin :: (abc :=>-> x) -> (abc -> x) unfanin :: (abc -> x) -> (abc :=>-> x) -- | The algebraic normal form 'Exponent' @abc@ distributed over the single -- base variable @x@. type family abc :=>-> x -- | x¹ = x type instance () :=>-> x = x -- | x⁽ᵃᵇ⁾ = (xᵇ)ᵃ type instance (a,bs) :=>-> x = a -> bs :=>-> x instance Exponent () where fanin = const unfanin f = f () instance (Exponent bs)=> Exponent (a,bs) where fanin f = uncurry (fanin . f) unfanin f = unfanin . curry f class Homogeneous a as | as -> a where -- An n-ary @codiag@. genericRepeat :: a -> as instance Homogeneous a () where genericRepeat _ = () instance (Homogeneous a as)=> Homogeneous a (a,as) where genericRepeat a = (a,genericRepeat a)
So the following two type-check fine:
a = (3 ==) $ (\(x,(y,(z,())))-> x+y+z) $ genericRepeat 1 b = (3 ==) $ fanin (\x y z-> x+y+z) (genericRepeat 1 :: (Int,(Int,(Int,()))))
But this doesn't:
c = (3 ==) $ fanin (\x y z-> x+y+z) $ genericRepeat 1
...failing with
Couldn't match expected type `a1 -> a1 -> a1 -> a1' with actual type `abc0 :=>-> a0' The type variables `a0', `abc0', `a1' are ambiguous Possible fix: add a type signature that fixes these type variable(s) The lambda expression `\ x y z -> x + y + z' has three arguments, but its type `abc0 :=>-> a0' has none In the first argument of `fanin', namely `(\ x y z -> x + y + z)' In the expression: fanin (\ x y z -> x + y + z)
That library is probably going in the garbage bin at this point, but in general this is the sort of thing I think I'd like to be able to do with type families eventually. And I haven't gotten to dig into y'all's singletons work yet, so it might be that that's the sort of thing I'm looking for and my use of type families and classes in shapely-data is ill-considered.
Thanks for your work on this!
comment:104 Changed 2 years ago by
Replying to jberryman:
I've only had use for better type inference with (conceptually) closed, recursive type families
Calling type families from an injective type family will not be allowed (including recursion). The reason is that I don't see a way of verifying injectivity when type families are involved.
What I think I want is something like proposal C from the wiki
At the moment I am implementing proposals A and B. C is something to think about in the future, as it is not trivial.
And as a naive user I want injectivity to be inferred for closed type families, and don't find the arguments against it in the wiki convincing.
Yes, I suppose that focus of the wiki discussion is a bit off. I think we could infer injectivity of type A for closed type families. But I don't think we should try to infer injectivity of type B because the algorithm is exponential - see comment 64. That's the major reason for doing this. (Numbers in that comment were computed for type C injectivity, but type B is also exponential.)
-- | The algebraic normal form 'Exponent' @abc@ distributed over the single -- base variable @x@. type family abc :=>-> x -- | x¹ = x type instance () :=>-> x = x -- | x⁽ᵃᵇ⁾ = (xᵇ)ᵃ type instance (a,bs) :=>-> x = a -> bs :=>-> x
I don't think :=>->
type family is injective: both () :=>-> (Int -> Int)
and (Int,()) :=>-> Int
reduce to Int -> Int
.
comment:105 follow-up: 108 Changed 2 years ago by
Hi,
Sorry to throw in a late thought, but what about adding a kind of injectives >->
? e.g.* >-> *
is the kind of injective type functions (as proved by the compiler, and perhaps as explicitly annotated by the user). This would provide a way to propagate injectivity information within the type checker, and could also be an alternative (replacement?) for using the functional dependency syntax here (although, I think that syntax makes a lot of sense). I suppose it is not currently possible to specify this for a type family, unless kind signatures on type family names were possible, e.g.
type family (Id :: * >-> *) a where Id a = a
but this wouldn't be hard to add I don't think.
Note that data families, and even data types, could then be given the injective kind too. This might provide some simplification in the type checker? I don't know for sure though; just a thought, but one I thought worth mentioning. Great success was had in the past by adding a new kind constructor rather than new syntax (cf. constraint kinds/constraint families!)
comment:106 Changed 2 years ago by
I don't see the problem with letting injective families call other type families as long as there is no recursion—can't GHC follow all paths to the end when necessary, and use injectivity "lemmas" when possible for efficiency? Or does this end up not being useful/efficient/possible for some reason?
comment:107 Changed 2 years ago by
Dominic, do throw in thoughts at any time. My patch will be ready Real Soon Now but after the merge there will be room for further development of this feature.
I will not comment on the proposal of having a separate kind for injective type families. I don't have sufficient knowledge of typechecker to judge whether this would make things easier. Perhaps Simon or Richard can offer some insight here. Regarding your proposed new syntax I believe it does not offer anything new compared to the syntax we decided to use (unless I missed something?). These pairs of definitions would be equivalent:
type family Id a = result | result -> a where {...} type family (Id :: * >-> *) a where {...} type family F a b c = d | d -> a b c type family (F :: * >-> * >->) a b c type family G a b c = foo | foo -> a b where type family (G :: * >-> * -> *) a b c
Your syntax however does not extend to injectivity where knowing the result and some of the arguments allows to determine some other arguments:
type family Plus a b = (sum :: Nat) | sum a -> b, sum b -> a where type family H a b c = xyz | xyz a -> b c, xyz b -> a c
This kind of injectivity is not implemented in my patch but we want to leave the door open for such a possibility in the future. I'm also not sure if I like how your proposed syntax interacts with PolyKinds: it seems that there would be two alternative places to specify kinds.
David, what do you mean by injectivity lemmas? At the moment I don't have a solid stance on the subject of calling type families in the RHS of injective type families. When I started this work it seemed to me that this is not possible. Now I have some more experience so I plan to revisit my earlier decisions. Incidentally, now it seems to me that self-recursion might actually be doable. Calling other type families in the RHS still seems very problematic. First of all called families would have to be injective to declare that a type family that calls them is also injective*. This implies that type families would have to be typechecked in dependency order, which sounds lika a non-trivial technical obstacle. Of course this would prevent type families that form cycles (including mutual recursion) from being injective. Another problem is issue of overlap:
type family Foo a = r | r -> a where Foo (Bar a) = X a Foo (Baz a b) = Y a b
To determine whether Foo
is injective or not we must know whether X a
and Y a b
can have identical values (well, types not values). Given that X
and Y
can also call other type families I don't see a simple way of checking this. Anyway, the ice here seems very thin and I don't feel confident proceeding further with calling type families by injective type families without formal proof.
*) This is a simplifying assumption. It is possible to write a type family that is injective despite calling non-injective type families.
comment:108 follow-up: 109 Changed 2 years ago by
Replying to dorchard:
what about adding a kind of injectives
>->
? e.g.* >-> *
is the kind of injective type functions
This is very much along the lines of what Jan and I proposed in our "Promoting Functions to Type Families" paper (here). But, there is some subtlety.
GHC assumes all function types (that is, types with kinds that are headed by ->
) are generative and injective. Generativity means that, given a b ~ c d
, you can conclude a ~ c
. Injectivity means that, given a b ~ a d
, you can conclude b ~ d
. Indeed, putting these together and assuming a b ~ c d
, you can conclude b ~ d
. Thus, the kind ->
denotes a generative, injective function.
We could imagine a new classifier ~>
that denotes an ordinary function, with no assumptions. GHC would then be careful not to decompose such things. This is exactly the situation today with type families. Note that, short of GHCi's :kind
command, there is no way to work with the kind of an unapplied type family. If I say type family F (a :: *) :: *
, we often colloquially say that F :: * -> *
, but there is no way to use this knowledge in a Haskell program, because F
can never appear unsaturated. Thus, it is more accurate to say that F
is just ill-formed, and F a :: *
as long as a :: *
.
All of this is to say that the new arrow would have to denote non-injective functions. And, I could easy say that GHC today supports such a thing, because there's no way to take advantage of it anyway.
What we really need in this direction is unsaturated type functions, but that's a story for another day. (A story I'd love to think more about... on that other day!)
So, taking all of this into account, dorchard's proposal is a different syntax for denoting injectivity, and I personally prefer the functional-dependency-like one more.
comment:109 follow-up: 110 Changed 2 years ago by
Replying to goldfire:
Replying to dorchard:
what about adding a kind of injectives
>->
? e.g.* >-> *
is the kind of injective type functionsThis is very much along the lines of what Jan and I proposed in our "Promoting Functions to Type Families" paper (here).
Not exactly, I think. At the moment ->
kind classifies term-level functions (non-injective, non-generative, can be partially applied), type constructors (injective, generative, can be partially applied) and type families (non-injective, non-generative, cannot be partially applied). In our paper we proposed a new kind to classify type functions (families) that are still non-injective and non-generative but can be partially applied just like term-level functions. We addressed the problem of partial application, not injectivity.
GHC assumes all function types (that is, types with kinds that are headed by
->
) are generative and injective. (...) Thus, the kind->
denotes a generative, injective function.
Are you sure? I would say this is true only for type constructors but perhaps I'm missing something. (I am assuming that also ->
classifies type families although, as you point out, this is not entirely accurate.)
Anyway, I can imagine introducing a new kind to distinguish between injective and non-injective applications but, given that we might want a new kind to distinguish between things that can and can't be partially applied at the type level, this does not seem like a good choice.
comment:110 Changed 2 years ago by
Replying to jstolarek:
(I am assuming that also
->
classifies type families although, as you point out, this is not entirely accurate.)
Right -- ->
does not classify type families in Haskell today. The only place where ->
describes a type family is in the output of GHCi's :kind
operator. And, to get that to work, GHC has a gross hack in that GHCi relaxes the "type families must always be saturated" requirement, just to appease users who want to know the shape of their type families. (This gross hack is a great idea from a user's point of view, and I'm not at all suggesting removing it. But it is gross from a type system point of view.)
Nowhere in a Haskell source text can ->
classify a type family. So, I think that fact describes the difference between our comments above.
comment:112 Changed 19 months ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
Test Case: | → T6018, T6018fail, T6018rnfail, T6018th, T6018failclosed1, T6018failclosed2, T6018failclosed3, T6018failclosed4, T6018failclosed5, T6018failclosed6, T6018failclosed7, T6018failclosed8, T6018failclosed9, T6018failclosed10, T6018failclosed11, T6018failclosed12, T6018ghci, T6018ghcifail, T6018ghcirnfail |
comment:113 Changed 19 months ago by
Related Tickets: | #4259 → #4259, #10832, #10833 |
---|
comment:117 Changed 19 months ago by
Test Case: | T6018, T6018fail, T6018rnfail, T6018th, T6018failclosed1, T6018failclosed2, T6018failclosed3, T6018failclosed4, T6018failclosed5, T6018failclosed6, T6018failclosed7, T6018failclosed8, T6018failclosed9, T6018failclosed10, T6018failclosed11, T6018failclosed12, T6018ghci, T6018ghcifail, T6018ghcirnfail → T6018, T6018fail, T6018rnfail, T6018th, T6018failclosed1, T6018failclosed2, T6018ghci, T6018ghcifail, T6018ghcirnfail |
---|
comment:119 Changed 6 months ago by
Use case of more complicated injectivity annotations from Type Families with Class, Type Classes with Family, simulating the type class:
class Collection c e | c -> e where empty :: c add :: e -> c -> c
with a type family
data Defined = Yes | No type family IsCollection t e = (result :: Defined) | result t -> e
Edit: Instances
instance Collection [a] a type instance IsCollection [a] a = Yes
Demonstration of injective families