Undependable Dependencies

mentioned in issue #9627

changed milestone to %8.6.1

changed weight to 5

Interesting. I have no idea what to do about (overlap + FDs).

I would really welcome someone paying careful attention to FDs in GHC, reviewing tickets (including this one) and proposing what changes to make.

Here's my starter for 10, analysis and rationale below.

A) Apply the FunDep consistency check as per the published work.

That is, per Definition 6 of the FDs through CHRs paper. If the FunDep's argument positions of the instance head unify, their results must be strictly equal.

Except:

B) Relax to follow GHC's 'bogusconsistencycheck' providing all of

i) The FunDep is full;

ii) One of the instances is strictly more specific; and

iii) That instance's argument positions taken together are strictly more specific.

(i.e. the result positions are not less specific.)

(The bogus consistency check is that the result positions be unifiable rather than equal under the substitution induced by unifying the arguments.)

So the example on this ticket gets rejected: although the Thither instance is strictly more specific than the Yon, its argument position is not. Furthermore the Thither instance is in no substitution ordering wrt to the Hither instance. The Hither and Yon instances are happy together, and in a substitution ordering.

Note these rules allow nearly all of the dubious instances discussed in the various tickets (including 'dysfunctional dependencies'), but insist you write them in a convoluted way that would be a big hint you're probably doing something silly.

reviewing tickets (including this one)

I've added some more ticket links to this one; #10675 is the authority.

someone paying careful attention to FDs in GHC.

That's what this proposal was about. Search for the ticket numbers in the discussion.

I'll also add a comment to #10675 explaining how to apply the rules for it; and giving the types GHC infers (much more sensible) -- if we relax even the full FunDep requirement B) i).

Rationale

proposing what changes to make

What we ''don't* want is to go delving into constraints or some global search amongst instances in scope. IOW we do want rules that can be applied pairwise between instance *'heads''' and per-FunDep. That also makes it easier to explain to puzzled users why their instances are being rejected.

In general there might be multiple FunDeps; a parameter position might be an argument for one FunDep but a result for another. For non-full FunDeps (which is #10675) I don't think there's anything we could do other than apply the strict rule. But you can evade that rule (deliberately so) with a strictly more general/overlappable instance.

And I think that relaxation is justified because putting unification variables in result positions is a) what the published work gives as the semantics; and b) gives you overlapping instances in effect anyway.

Specifically, the above rules authorise what we've been doing for over a dozen years to get a type-level type equality predicate

class TypeEq a b (r :: Bool)  | a b -> r
instance TypeEq a a True
instance {-# OVERLAPPABLE #-} (f ~ False) => TypeEq a b f

The equal instance is equivalent to, and could happily be written as

instance {-# OVERLAPPING #-} (t ~ True) => TypeEq a a t

Trac metadata

Trac field	Value
Related	#10675 → #8634, #9210, #10675

Replying to [ticket:15632#comment:160109 AntC]:

I've implemented the suggestion, in a version of Hugs. So throw at me your awkward cases/counter-examples.

(Hugs latest release/Sep 2006 only allows FunDeps + Overlap if all the overlapping instances are per the strictly unify-to-equal rule. Hugs instance heads must overlap in a strict substitution order; none of GHC's allowing the potential for overlap.)

Seems to be working; I have a type-level type equality test; it handles the tricky cases I'm aware of, including this ticket. #10675 is rejected. Even handles overlap of instances for TRex records.

There is a structure of instances for which the suggested rules at ticket:15632#comment:160109 are over-restrictive: you can code round it under the rules, but awkwardly. I'd appreciate feedback on how common this is: a diamond overlap pattern. Consider

class Whuther a b  | a -> b  where ... 

instance Whuther (Int, a2, Bool) Hither  where ... 

instance (b ~ Thither) => Whuther (Int, a2, a3) b  where ...

instance (b ~ Thither) => Whuther (a1, a2, Bool) b  where ...

instance (b ~ Yon) Whuther (a1, a2, a3) b  where ...

The Hitherinstance is strictly more specific than any of the other three; the Yon instance is strictly more general. The two Thither instances overlap, but are in no substitution order, so don't meet conditions B ii), B iii) above. (For the example, it's not necessary the two instances produce the same result type for b, it just makes the example more poignant. Also it's not necessary for there to be a more general instance; in practice there usually is a 'catch-all'.)

Currently GHC accepts this example, and does the 'right thing'. (Saying GHC accepts it is not much of a claim: GHC will accept nearly any rubbishy set of instances; whether it does the 'right thing' in all cases needs a lot of testing.)

The example can be coded within the rules, via an auxiliary class:

class Whuther a b  | a -> b  where ... 

instance Whuther (Int, a2, Bool) Hither  where ... 

instance (b ~ Thither) => Whuther (Int, a2, a3) b  where ...

instance (Whuther2 (a1, a2, a3) b) => Whuther (a1, a2, a3) b  where ...

class Whuther2 a b  | a -> b  where ... 

instance (b ~ Thither) => Whuther2 (a1, a2, Bool) b  where ...

instance (b ~ Yon) Whuther2 (a1, a2, a3) b  where ...

This has taken a symmetric set of instances to asymmetric, arbitrarily moving one of the instances into the auxiliary class.

A very similar set of instances that GHC also accepts, but for which it's not clear what would be the 'right thing':

class Whuther a b  | a -> b  where ... 

instance Whuther (Int, Char, Bool) Hither  where ... 

instance (b ~ Thither) => Whuther (Int, a2, a3) b  where ...

instance (b ~ Thither) => Whuther (a1, a2, Bool) b  where ...

instance (b ~ Yon) Whuther (a1, a2, a3) b  where ...

These are accepted because of GHC's delayed/relaxed overlap check. then consider a wanted Whuther (Int, a2, Bool) b in which a2 is known to be apart from Char. This gives irresolvable overlap, because there's no reason to prefer either of the Thither instances.

I could amend the rules, still keeping within the idea of pairwise validation of instance heads. If A), B) fail:

C) Relax to follow GHC's bogus consistency check providing all of

i) The FunDep is full;

ii) The instances overlap but neither is strictly more specific;

there is a third instance strictly more specific than either; and

iii) that third instance's argument positions taken together are exactly at the unification of the two partially-overlapping instances' argument positions.

(i.e. the third instance's result positions are not less specific than the unification.)

C ii), iii) together need search amongst instances. Note there could be at most one instance that would meet the criteria:

Whuther (Int, a2, Bool) Hither meets C ii), its (Int, a2, Bool) exactly meets C iii).
Whuther (Int, Char, Bool) Hither meets C ii), but its (Int Char, Bool) is too specific for C iii).
Note that both of those instances could legitimately be in scope under the rules: they're in a strict substitution order; and their FunDep result position is the same (meets condition A)'s strict consistency condition).

To make the search a bit more deterministic, we could revise C ii) to require

C) ii') Neither of the instances is strictly more specific; and

there is a third instance at exactly the unification of the two instances.

Then C iii) is not needed. Under C ii'), the example at the top of this comment would be rejected, but the Hither instance could be amended to:

instance (b ~ Hither) => Whuther (Int, a2, Bool) b  where ...

And that would still allow another instance Whuther (Int, Char, Bool) Hither, which is strictly more specific yet.

I've tried coding either version of rule C). It needs **either** entirely restructuring the consistency check to delay until all the instances are visible (rather than validating instances one-by-one in order of textual appearance); **or** requiring the programmer write the overlapping instances in textual order most-specific first. Usual practice is to write most-specific first. Never the less this makes the validation rules brittle: programmers won't appreciate error messages suggesting re-ordering their instances might help (no guarantee).

Adding related ticket, particularly ticket:15135#comment:162593 and SPJ's reply at comment 6 "This is terribly unsatisfactory".

Changing the optimisation level or the per-instance pragmas can change which instance is selected, and thereby the meaning of the program -- even within a single module. (The O.P. for that ticket is the familiar 'orphan instances'/separate modules -- which is at least a devil we know.)

The linked manual for -fsolve-constant-dicts says

we can often produce much better code by directly solving for an available Num Int dictionary we might have at hand. This removes potentially many layers of indirection and crucially allows other optimisations to fire ...

This seems to be muddling up concerns:

selecting which instance satisfies a Wanted constraint

(in case of Overlapping instances, this means 'best'/most closely satisfies);

vs: having resolved which instance, and determined it's the same instance as for a dictionary available at hand,

then sharing the at-hand dictionary/removing layers of indirection, etc.

I can see that under separate compilation, we can't be sure whether the 'best' instance in scope in the current module might get gazumped by an instance in another module. Then we can only go by the OVERLAPPABLE pragma. But we should always search all instances in scope in the current module, irrespective of pragmas or optimisation(?) OK that search is adding a performance hit for the compiler and it only makes a difference in case of overlap; but it shouldn't impact object code, presuming it does resolve to an at-hand dictionary.

I didn't realise the pragmas had that dramatic of an effect on selecting instances. Then this applies from comment 1

I think it's arguable that an instance should only be overlappable if it says {-# OVERLAPPABLE #-}. But that's not our current spec.

Making that the rule (in some form) could detect/warn of problematic 'orphan instances'. Interpreting the rule for instances that are in no substitution order (the O.P. on this ticket #15632) still needs some head-scratching.

Trac metadata

Trac field	Value
Related	#8634, #9210, #10675 → #8634, #9210, #10675, #15135

mentioned in issue #15927

See #15927, where inconsistent constraints are put on the same function, and apparently do something useful(?)

Trac metadata

Trac field	Value
Related	#8634, #9210, #10675, #15135 → #8634, #9210, #10675, #15135, #15927

Adding a couple of related tickets. See also the paper linked from ticket:9627#comment:165601

Trac metadata

Trac field	Value
Related	#8634, #9210, #10675, #15135, #15927 → #8634, #9210, #9627, #10675, #15135, #15927, #16070

Linking #16241, which is a follow-on from #15135.

Under separate compilation, OVERLAPPABLE gives better coherence than OVERLAPPING. But it's fragile.

Trac metadata

Trac field	Value
Related	#8634, #9210, #9627, #10675, #15135, #15927, #16070 → #8634, #9210, #9627, #10675, #15135, #15927, #16070, #16241

added FunctionalDependencies OverlappingInstances labels

added Pnormal label

Here's an exploit of this undependability, which does something useful (if/when it works).

I'm not sure whether to count it as a bug or a feature.

The more I think about it, the more it seems it's the FunDep consistency check that's bogus. Contrast at the term level:

nonZero :: Int -> Bool
nonZero 0 = False
nonZero _ = True

Those two equations conform to a FunDep arg -> result. There's no expectation that 'unifying' the wildcard equation with the first equation should give the same result. The sequence of equations ensures that we don't explicitly have to guard nonZero _x | _x /= 0 .... With instances, the guarding mechanism is overlap rather than sequencing. And that distributed logic is ergonomic because we want classes to be open so for each datatype decl we put the instances in the same module. I don't see anything would go wrong if we allowed a more direct style (this isn't supported by my suggested rules)

class TypeEq a b (r :: Bool)  | a b -> r
instance                      TypeEq a a True
instance {-# OVERLAPPABLE #-} TypeEq a b False
-- argument positions overlap in strict substitution order; result position is apart
-- the OVERLAPPABLE means the arg positions, not the whole instance head

class AddNat a b c  | a b -> c, a c -> b, b c -> a
instance AddNat ZNat b b
instance {-# OVERLAPPABLE #-} (c ~ (SNat c'), Add a' b c')
         => AddNat (SNat a') b c
-- argument positions overlap in strict substitution order for b c -> a FunDep
-- result position is apart `ZNat /~ (SNat a')`

I guess the FunDep consistency rule is to block instances like:

class C3 a b c  | b -> c              -- non-full, compare Trac #10675 

instance C3 Int  (Maybe b') (Bool, b')
instance C3 Char  b         (Char, b)

Those would break FunDep consistency, under a traditional Relational theory interpretation. The b argument positions overlap in strict substitution order, the result position is apart; but the instances don't treat b ~ (Maybe b') consistently.

Haskell's FunDep arrow is doing two slightly different jobs:

Ensuring each instance represents an algorithm to get from the patterns in the argument positions to the pattern in the result position; binding tyvar b, in the C3 instances. Typically the algorithm goes via constraints in UndecidableInstances cases.
Identifying which positions 'influence' the result, even though the algorithm doesn't use the content of those positions. That is, C3's first position determines the fst of the result tuple; even though the algorithm doesn't take any tyvars from it.
So the FunDep consistency check rejects those C3 instances, and insists you put the FunDep as a b -> c -- which is correct. (Or if you genuinely intend a FunDep b -> c, put it in an auxiliary class via a superclass constraint.)

Non-full FunDeps are useful, for example

class Lens s t a b | s -> a, t -> b, s b -> t, t a -> s

So I'm not happy my suggested improvements above only apply for full FunDeps. (They do meet the objective of supporting common use cases.)

Wild thought: abandon the FunDep consistency check altogether. If you want inconsistency like those C3 instances, on your head be it/I've suggested two mechanisms to be more explicit what you intend. So the FunDep arrow is for the usage sites: it tells the compiler it'll be able to infer the target type. And for the instances: the compiler must check the instance gives an algorithm to get the target type (possibly via constraints in case of UndecidableInstances).

If there's to be case-by-case logic for the FunDep to give so-called inconsistent results, such as the TypeEq or AddNat examples, the only mechanism is overlapping instances. Change the meaning of the OVERLAPPABLE etc pragmas to mean overlap of the argument positions, not the whole instance head. So we'd like to be more explicit but can't write

instance {-# OVERLAPPABLE #-} (Add a' b c')
         => AddNat (SNat a') b (SNat c')

because on FunDep b c -> a that doesn't overlap instance AddNat ZNat b b in any substitution order.

If there's multiple FunDeps, their argument positions taken FunDep-by-FunDep must all overlap with the same polarity, so that one instance is always more specific.

mentioned in issue #18759

mentioned in issue #18851

Trac field	Value
Version	8.6.1-beta1
Type	Bug
TypeOfFailure	OtherFailure
Priority	normal
Resolution	Unresolved
Component	Compiler
Test case
Differential revisions
BlockedBy
Related	#10675
Blocking
CC
Operating system
Architecture

Undependable Dependencies

Child items 0

Activity

Undependable Dependencies

Relates to

Activity