Opened 3 years ago

Last modified 22 months ago

#9636 new bug

Function with type error accepted

Reported by: augustss Owned by:
Priority: normal Milestone:
Component: Compiler Version: 7.8.3
Keywords: Cc: adamgundry, sweirich@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s):
Wiki Page:

Description

The following program is accepted by ghc even though it is clearly wrong. The type family T is closed so T Bool cannot ever be reduced, so it must be a type error.

{-# LANGUAGE TypeFamilies #-}
module Err62 where

type family T a where
    T Int = ()

x :: T Bool
x = undefined

f :: T Bool -> Bool
f _ = True

y :: Bool
y = f x

Change History (29)

comment:1 Changed 3 years ago by goldfire

Simon and I chatted about this this morning and we thought that your program is a little suspect, perhaps, but not type-incorrect. After all, T Bool is inhabited, by undefined. To me, suggesting that T Bool is unequivocally an error is somewhat like saying Void is an error.

On the other hand, writing T Bool is probably a mistake. We thought a good solution here would be to report a warning in this case. Specifically, warn whenever a closed type family is called on arguments that are apart from all the family's left-hand sides. Would that be a good solution?

comment:2 Changed 3 years ago by augustss

I don't think a warning is right, I think it's an error. I view closed type families as functions on types, and when you use a function you expect it to reduce to a normal form. With a warning you're saying that T x is sometimes equal to some other type and sometimes it's a new type. That makes no sense to me. It has nothing to do with being inhabited, uninhabited types are perfectly fine.

It like saying that if I define a function 'f False = ()' and then do 'print (f True)' it would print "f True", i.e., elevating the function symbol f from a function to a constructor just because it's partial.

comment:3 in reply to:  2 Changed 3 years ago by dfeuer

Replying to augustss:

I don't think a warning is right, I think it's an error. I view closed type families as functions on types, and when you use a function you expect it to reduce to a normal form. With a warning you're saying that T x is sometimes equal to some other type and sometimes it's a new type. That makes no sense to me. It has nothing to do with being inhabited, uninhabited types are perfectly fine.

This is not a realm I know anything about, but I agree. T Bool is not an uninhabited type—it's simply not a type. The situation for data families is qualitatively different, and it would probably make sense to offer such a warning for those.

comment:4 Changed 3 years ago by dreixel

I don't see why T Bool is "not a type". If you really want something like that, you should index T over a closed kind:

{-# LANGUAGE DataKinds, TypeFamilies #-}

data Un = TInt

type family T (a :: Un) :: * where
  T TInt = ()

Now it's clear that T Bool is indeed not a type, and gives rise to a kind error.

comment:5 Changed 3 years ago by augustss

To me it's clear that T Bool is not a type. To me T is a type function which computes a type given a type. But it's a partial function, so only some arguments make sense. New types are created by data and newtype, not by type or type family.

comment:6 Changed 3 years ago by simonpj

For what it's worth, I don't have a strong opinion either way. I think we are agreed that:

  • In an user-written type signature,
  • if an application of a closed type family is "apart" from all equations of the family,
  • then we should signal either an error or a warning.

Lennart says "error", Richard says "warning", and I can't work up much excitement about which is chosen. One advantage of "error" is that you don't have to add a flag to suppress the warning!

Anyway, if someone wants to implement it, go ahead!

This all applies to user-written type signatures. I don't want to apply this to "types that appear during the type inference process" because I don't know exactly what that might mean.

Simon

comment:7 Changed 3 years ago by augustss

I would like it to apply to types that appear during type checking, because I think it's an error and errors should be reported. I presume that at some point during type checking the closed type family equations are used to simplify a type. If at that point you discover that none of the equations can possibly match, then that should be reported as an error.

If irreducible closed type family (type) expressions are allowed then these are some kind of "ghost types". If we agree that that only "data" and "newtype" generate types, then these ghosts are some new kind of types that have no meaning. They are not empty types, they simply malformed.

comment:8 Changed 3 years ago by simonpj

So, to be concrete, if I write:

foo :: Either (T a) a -> a
foo (Right x) = x

then you would like (foo (Right True)) to signal a type error.

Do you have a clear idea about how to write declarative typing rules that would define which programs are acceptable and which are not?

Simon

Last edited 3 years ago by simonpj (previous) (diff)

comment:9 Changed 3 years ago by augustss

Yes, I'd like that to be an error.

And no, I don't know how to write the typing rules. I've not even thought about it.

What I'd like is for the typing rules to capture the intuitively right semantics. If my intuition is wrong, or if that's impossible then I'll have to accept a compromise, of course.

Do we agree that T Bool is a weird beast that has no meaning? Or do you think it is a real type?

comment:10 Changed 3 years ago by simonpj

It is a weird beast. But I have learned, painfully, that not being able to write down typing judgements is a Bad Sign. It suggests that we don't know exactly which programs should be erroneous and which should be OK. And if we don't know, it's hard to explain to the user which programs are ok and which are not.

comment:11 Changed 3 years ago by goldfire

I don't feel strongly how we come down on this one, but I want to note that making T Bool an error is a breaking change.

It's easy enough to write a typing rule in Core that handles this:

F is a closed type family
t is not apart from all equations in F
F : [a:k1]. k2
G |- t : k1
-------------
G |- F[t] : k2[a |-> t]

BUT, that rule doesn't obey the substitution lemma! Specifically, T a could be well-typed, but T Bool wouldn't be. This is No Good.

I see this is as somewhat like the error for inaccessible code. Writing inaccessible code isn't actually erroneous -- we're just sure that users don't want to. Furthermore, with enough type-level trickery, users can write provably inaccessible code that GHC isn't smart enough to flag. So, we could similarly try to flag and error on "weird beasts" on a best-effort basis, but this would probably have to be restricted to user-written types, missing Simon's case in comment:8.

However, having just written that, I recall that, separately, I would love a way to disable the "inaccessible code" check. When I'm producing Haskell code programmatically (say, from Template Haskell), I sometimes produce inaccessible branches and have had to go to some lengths to avoid this. It would be nicer just to be able to ask GHC to accept these harmless chunks of code. In a similar vein, I can imagine some programatically-written Haskell that would contain weird beasts and wouldn't want failure. As I said above, though, I don't feel too strongly.

comment:12 Changed 3 years ago by augustss

To me saying that T Bool is OK because it's inaccessible is akin to saying that type incorrect expressions are OK as long as they are inaccessible. After all, if you don't use them, they can't cause any harm. But for expressions we have decided that this isn't acceptable.

I guess making T a behave would require something like kind classes. We don't say that the expression show x is unconditionally type correct. It depends on the type of x belonging to the Show class. In the same way, T a is not unconditionally type correct, it's only type correct if a is one of the types where T is well defined. Until we have something like that I think you'll have to accept that the substitution lemma doesn't work. You can pretend it works by saying T Bool is a type, if that makes you happier. I just wonder which type it is. :)

But I'm not asking for the moon. :) I'd just like the compiler to tell me when it finds something that is clearly not going to work, like T Bool. Exactly under what conditions and how it tells me, I don't care.

comment:13 in reply to:  12 ; Changed 3 years ago by rwbarton

Replying to augustss:

To me saying that T Bool is OK because it's inaccessible is akin to saying that type incorrect expressions are OK as long as they are inaccessible. After all, if you don't use them, they can't cause any harm. But for expressions we have decided that this isn't acceptable.

Isn't it analogous to saying that incomplete pattern matches are OK as long as they are inaccessible? (Which they are.) After all, like you said, a closed type family is like a function on types. The type expression T Bool has no normal form, for sure, but who says we have to evaluate it?

Anyways, failure of what I have learned is called "the substitution lemma" is far more unintuitive to me than GHC allowing T Bool even though it is "floating" (my made-up word for type family applications that cannot be reduced (yet)) and GHC knows that it can never be reduced. But a warning, sure.

I guess making T a behave would require something like kind classes. We don't say that the expression show x is unconditionally type correct. It depends on the type of x belonging to the Show class. In the same way, T a is not unconditionally type correct, it's only type correct if a is one of the types where T is well defined.

Something like this would be great for other uses, too (imagine restricting the kind of the argument to Set to only allow types which are instances of Ord). But it sounds more like a research project than a bug report!

Until we have something like that I think you'll have to accept that the substitution lemma doesn't work. You can pretend it works by saying T Bool is a type, if that makes you happier. I just wonder which type it is. :)

In the System FC translation it's an indeterminate type, sort of like the variable "x" in a ring of polynomials. I can do algebra on polynomials without ever asking which number "x" is.

comment:14 in reply to:  12 Changed 3 years ago by goldfire

Replying to augustss:

To me saying that T Bool is OK because it's inaccessible is akin to saying that type incorrect expressions are OK as long as they are inaccessible. After all, if you don't use them, they can't cause any harm. But for expressions we have decided that this isn't acceptable.

I actually would say that an inaccessible expression is always type-correct, because inaccessibility means that there is a proof of false in the context, and thus anything is possible. But that's perhaps a story for another day.

I guess making T a behave would require something like kind classes.

It's a little worse than that, I think.

Say x :: Z. In show x, we know Z must be in the Show class, and we also know that anything of type Z is a valid parameter to show. Thus, we have substitution, because substitution preserves types.

But, in closed type families, we have a stranger situation: T a where a :: * is acceptable, but T Bool is not. I think kind classes (which can be kludged today) don't solve this problem.

But I'm not asking for the moon. :) I'd just like the compiler to tell me when it finds something that is clearly not going to work, like T Bool. Exactly under what conditions and how it tells me, I don't care.

This is certainly possible. It's just that the behavior around this feature won't be complete and might not be predictable in corner cases.

comment:15 in reply to:  13 Changed 3 years ago by goldfire

Replying to rwbarton:

Anyways, failure of what I have learned is called "the substitution lemma" is far more unintuitive to me

If the substitution lemma doesn't hold, then (perhaps) FC, and by extension Haskell, is no longer type-safe. In other words, I don't believe that the type system is capable of ruling out T Bool, without major surgery. So, maybe we can report errors to the user if we see T Bool in source code, but if T Bool ends up sloshing around internally, GHC will respect it as a valid type.

comment:16 in reply to:  13 ; Changed 3 years ago by augustss

Replying to rwbarton:

Isn't it analogous to saying that incomplete pattern matches are OK as long as they are inaccessible? (Which they are.) After all, like you said, a closed type family is like a function on types. The type expression T Bool has no normal form, for sure, but who says we have to evaluate it?

Yes, it's a lot like a pattern match failure. So let us go with that analogy for a moment. Assume we have these definitions:

foo :: Int -> Int
foo 0 = 42

bar = foo 5 - foo 5

Since we know that x - x = 0 it is tempting to optimize the definition of bar to bar = 0. This is, of course, wrong since computing foo 5 will result in bottom.

Now compare this to what the type checker does in my original example. It happily assumes that T Bool is equal to T Bool, but that's wrong. T Bool is bottom at type checking time and should be reported as such.

I've decided I'm OK with having T Bool just floating around in the same way that I'm OK with having foo 5 floating around in a lazy language. Type families introduce partial functions on the type level, and I think this has to be dealt with. Type expressions can now blow up during type checking. So at the points when a (closed) type family expression has to be evaluated, e.g., for equality comparison, and it cannot be reduced then this should be an error. IMHO.

comment:17 in reply to:  16 ; Changed 3 years ago by goldfire

Replying to augustss:

So at the points when a (closed) type family expression has to be evaluated, e.g., for equality comparison, and it cannot be reduced then this should be an error. IMHO.

Unfortunately, this, too, violates the substitution lemma. We probably want to be able to solve (a ~ a) with reflexivity, for any a. But, you're suggesting that a particular value for a, namely T Bool, will not equal itself.

The difference between types and terms here is that we need to be able to reason about unevaluated types, such as a. We never need this ability on terms, so the existing term-level semantics works out just fine.

To be clear, I'm saying that issuing an error/warning in the case of a user-written strange beast seems quite easy to implement. Any suggestion beyond that made here seems like a stretch, though. Do keep suggesting, though -- perhaps we'll find something that works in here.

comment:18 in reply to:  17 Changed 3 years ago by augustss

Replying to goldfire:

Unfortunately, this, too, violates the substitution lemma. We probably want to be able to solve (a ~ a) with reflexivity, for any a. But, you're suggesting that a particular value for a, namely T Bool, will not equal itself.

I'm saying it's neither equal to itself, nor unequal to itself. T Bool is in some sense ill formed, so asking if it's equal to itself is nonsensical, because it's not a a type. Can you use reflexivity to conclude Int Bool ~ Int Bool? No, because it's an ill-formed typed. The case with T Bool is not as severely ill form, though.

If you want to keep T Bool you have to have a different explanation what makes a type. Types a made from data, newtype, and type family. But for type family certain of the the types are actually equal to other existing types. But if no equation holds then it's a new (empty) type, distinct from all other types. I can go along with that explanation, but it's not very satisfactory to me.

comment:19 Changed 3 years ago by augustss

Just to follow up on my last comment. Treating type family names as some kind of type constructors means that we still have logic programming on the type level, just some better syntax for relation symbols that act as functions. I was hoping for functional programming at the type level.

comment:20 Changed 3 years ago by simonpj

Lennart, I don't understand comment:19 at all. Can you give an example to illustrate what you mean?

comment:21 Changed 3 years ago by goldfire

There may be a very easy solution to this problem.

At the end of every closed type family, we could just automatically add an equation Foo x y z = Error "Pattern match failure in closed type family `Foo'", where Error is the type-level error function from #9637. I think that would work quite well here.

comment:22 Changed 3 years ago by adamgundry

Cc: adamgundry added

Type families as currently implemented really are axiomatically-defined functional relations: we don't have an operational semantics for them as functions. I agree that being able to write actual functional programs at the type level would be nice, but we can't yet.

comment:23 Changed 2 years ago by adamgundry

I like idea of using an Error type family, but I don't think this should be automatic for all closed type families, or at least there should be some way to switch it off (see Phab:D481).

comment:24 Changed 2 years ago by jstolarek

Adam, you meant Phab:D841, not Phab:D481.

comment:25 Changed 2 years ago by adamgundry

Thanks, Janek, I did. See also the related discussion on #9840.

comment:26 in reply to:  19 ; Changed 22 months ago by DerekElkins

This is a fun one. We can look at what some other systems do in similar situations.

comment:17 talks about handling unevaluated terms and the discussion has been talking about partial functions. One system that deals in this realm is Computational Type Theory (CTT), the type theory underlying NuPRL (and now JonPRL). In NuPRL you can literally write the equivalent of:

T Int = Bool
T   x = fix id

Obviously NuPRL doesn't loop forever when it encounters T Bool. It will do some computation to see if it can decide the proposition, but if it can't it gives up and makes it a proof obligation that the user must discharge. Since CTT is based on partial equivalence relations, T Bool ~ T Bool presents no problem. Reflexivity is not given, so that's a statement that needs to be proven. Indeed, typing Lennart's f from the bug report, requires, as usual, deriving that T Bool :: * and in CTT this is by definition T Bool ~ T Bool. Clearly GHC has evidence allowing it to automatically show the proof obligation is not achievable in this case. Thus the analogous situation in CTT would be a type error.

In comment:19 Lennart mentions that we still have logic programming at the type level. One way of interpreting this is to give (closed) type families Curry's semantics, i.e. narrowing, but restricted to functional relations, so no (truly) overlapping patterns. This would be a language where a function returns at most one result rather than exactly one result as in a functional language. Still, T Bool would narrow the result set to the empty set and thus be a type error.

Alternatively, Lennart may have just been referring to working with uninterpreted functions, something commonly done in logic programming, but in no way restricted to logic programming. Of course, T isn't an uninterpreted function, T Int ~ Bool. We could say it's a quotient of a type of uninterpreted functions modulo the relation T Int ~ Bool. We can push this idea a little. If someone wants to play with the current behavior in a more explicit manner all one needs to do is open up Agda and type:

import Level

data Unit : Set where U : Unit
data Bool : Set where True : Bool; False : Bool

data _==_ {l}{A : Set l} (a : A) : A -> Set l where
    refl :  a == a

subst : {l : Level.Level}{A B : Set l} -> (A == B) -> A -> B
subst refl x = x

postulate T : Set -> Set
postulate T_Bool_is_Unit : T Bool == Unit

f : T Unit -> Bool
f _ = True

g : T Bool -> Unit
g x = subst T_Bool_is_Unit x

h : T Bool == T Bool -> Unit
h refl = U

To handle overlapping, explicit inequality evidence needs to be provided. This suggests that like we can view a data declaration as adding a constructor to (codes for) *. A type family could be viewed into making that data type a higher inductive type (from Homotopy Type Theory).

comment:27 Changed 22 months ago by goldfire

Cc: sweirich@… added

I'm sure Stephanie would be interested to see that last comment.

comment:28 in reply to:  26 ; Changed 22 months ago by jonsterling

Replying to DerekElkins:

This is a fun one. We can look at what some other systems do in similar situations.

comment:17 talks about handling unevaluated terms and the discussion has been talking about partial functions. One system that deals in this realm is Computational Type Theory (CTT), the type theory underlying NuPRL (and now JonPRL). In NuPRL you can literally write the equivalent of:

T Int = Bool
T   x = fix id

thanks for the shoutout! I just thought I would clarify that, whilst in the past it was considered and perhaps experimented with, Nuprl does not currently have the ability to perform case analysis on types. (However, one of the principle reasons for types having an intensional equality in Nuprl rather than the standard extensional one is to not rule out the option of providing an eliminator to the universe in the future.)

Anyway—with regard to partial operations, you are correct that it is not really a problem in Nuprl or JonPRL if a definition is partial; reduction is guided by the user in Nuprl. (By the way, contrary to oft-repeated mythology, it *is* safe to reduce terms in any context in CTT/ETT—this is guaranteed by the fact that computational equivalence is a congruence, a well-known result that comes from Howe.)

It is *not* the case that for some function f and value m, f(m) is stuck (or worse, "canonical") if f is not defined at m; instead, it diverges. So viewing Haskell-style type families (whether open or closed) as functions or operations doesn't really work, though I believe that in many cases where a Haskell programmer reaches for a type family, they are really wanting a function/operation. I like your view of type families as generative in the same sense as data families, but quotiented by further axioms.

comment:29 in reply to:  28 Changed 22 months ago by DerekElkins

Replying to jonsterling:

Replying to DerekElkins:

comment:17 talks about handling unevaluated terms and the discussion has been talking about partial functions. One system that deals in this realm is Computational Type Theory (CTT), the type theory underlying NuPRL (and now JonPRL). In NuPRL you can literally write the equivalent of:

T Int = Bool
T   x = fix id

thanks for the shoutout! I just thought I would clarify that, whilst in the past it was considered and perhaps experimented with, Nuprl does not currently have the ability to perform case analysis on types. (However, one of the principle reasons for types having an intensional equality in Nuprl rather than the standard extensional one is to not rule out the option of providing an eliminator to the universe in the future.)

Yeah, in my reply on Richard Eisenberg's blog post,https://typesandkinds.wordpress.com/2015/09/09/what-are-type-families/, I explicitly introduce codes. I don't think this really changes the picture. (If this does, I'd like to know. Specifically if viewing (closed) type families as functions on codes of types that get interpreted into functions on types misses something important due to being limited to codes.)

It is *not* the case that for some function f and value m, f(m) is stuck (or worse, "canonical") if f is not defined at m; instead, it diverges. So viewing Haskell-style type families (whether open or closed) as functions or operations doesn't really work, though I believe that in many cases where a Haskell programmer reaches for a type family, they are really wanting a function/operation. I like your view of type families as generative in the same sense as data families, but quotiented by further axioms.

I agree, in CTT f(m) is definitely not canonical and nothing like the generative view I suggested. I mainly mentioned CTT as I think it provides an interesting perspective. Frankly, the behavior Lennart is complaining about is the behavior I expected a priori. My suggestion is essentially that codes for * are being quotiented. The mention of HITs was simply because I still really want codes to be "inductive". I go into this further on the above mentioned blog reply where I've also no doubt said at best imprecise things about CTT (though not they are not relevant to this approach.)

Note: See TracTickets for help on using tickets.