Opened 6 years ago

Closed 2 months ago

Last modified 5 weeks ago

#2110 closed feature request (fixed)

Rules to eliminate casted id's

Reported by: igloo Owned by:
Priority: lowest Milestone: 7.10.1
Component: Compiler Version: 6.8.2
Keywords: Cc: rl, dimitris@…, ryani.spam@…, mail@…, sweirich@…, hackage.haskell.org@…, bgamari@…, eir@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Difficulty: Unknown
Test Case: tests/simplCore/should_run/T2110 Blocked By:
Blocking: Related Tickets: #8767

Description

Some people (e.g. jedbrown, who originally brought this up) have/use C libraries that deal with arrays of doubles (i.e. CDouble) that they would like to use from Haskell code, with the Double type (perhaps there are instances of some classes for Double, but not CDouble). In jedbrown's case he has a CArray datastructure that wraps the C array and is an instance of Haskell's IArray.

However, people doing this sort of thing tend to care a lot about performance, so don't want to spend O(n) time doing O(n) allocation. Thus the temptation is to make the assumption that CDouble == Double, and to create the CArray i Double directly.

The more correct way to do it would be to make a CArray i CDouble and then use amap realToFrac to convert it to a CArray i Double. However, I don't believe that GHC is currently smart enough to optimise away amap realToFrac.

Let's look at a simpler example:

cdoubles :: [CDouble]
cdoubles = read "foo"

doubles :: [Double]
doubles = map realToFrac cdoubles

(here map is playing the role of amap). The core for this looks like (very simplified)

Q.a1 = \ (ds_aMc :: Foreign.C.Types.CDouble) -> ds_aMc
Q.doubles = GHC.Base.map (Q.a1 `cast` <mumble>) Q.cdoubles

It looks to me that if we could produce

Q.doubles = GHC.Base.map (id `cast` <mumble>) Q.cdoubles

instead then we might be able to have a rule

forall mumble . map (id `cast` mumble) = id `cast` mumble'

The major problem, as far as I can see, is how to construct the mumble'.

Does that sound plausible?

Change History (49)

comment:1 Changed 6 years ago by simonmar

  • Architecture changed from Unknown to Unknown/Multiple

comment:2 Changed 6 years ago by simonmar

  • Operating System changed from Unknown to Unknown/Multiple

comment:3 Changed 5 years ago by igloo

  • Milestone changed from 6.10 branch to 6.12 branch

comment:4 Changed 5 years ago by ryani

  • Cc ryani.spam@… added

comment:5 follow-up: Changed 4 years ago by simonpj

  • Cc rl dimitris@… added
  • Type of failure set to None/Unknown

The issue here is how to write the rule in source syntax. The internal form is fine. Here's what it should look like in internal form, I think:

  RULE "map/cast" forall (a::*) (b::*) (c::*) 
                         (co :: (a->b ~ a->c)) 
                         (f::a->b)
                  map a c (f |> co) = map a b f |> [right co]

  RULE "map/id" forall (a::*) map a a id = id

I've split the rule into two. One lifts out the cast, and one spots map-of-identity. I think (but I have not tested) that these rules will work, if only we could write them in the source program. We can write the second but not the first.

This question is related to #2600, where we want to bind type variables in rules. Here we want to bind equality constraints too, and we might want to bind dictionaries too. How about this?

  RULE "map/cast" forall type a b c. (a~b) =>
                  forall (f :: a->b). 
                  map (f :: a->c) = map f :: [b]

Two things going on here:

  • I've used two foralls. The first has just the same syntax as a type-level forall, apart from the "type" keyword. It and binds type variables and constraints. The second is the one we have right now in RULES. Maybe we could omit the "type", but that might be hard to parse.
  • I've used type signatures to force the casts to appear in the "right" place. But this is a bit indirect.

Better suggestions on syntax welcome.

Simon

comment:6 Changed 4 years ago by igloo

  • Milestone changed from 6.12 branch to 6.12.3

comment:7 Changed 4 years ago by igloo

  • Milestone changed from 6.12.3 to 6.14.1
  • Priority changed from normal to low

comment:8 Changed 3 years ago by igloo

  • Milestone changed from 7.0.1 to 7.0.2

comment:9 Changed 3 years ago by igloo

  • Milestone changed from 7.0.2 to 7.2.1

comment:10 Changed 3 years ago by igloo

  • Milestone changed from 7.2.1 to 7.4.1

comment:11 Changed 2 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1
  • Priority changed from low to lowest

comment:12 Changed 2 years ago by nomeata

  • Cc mail@… added

comment:14 Changed 2 years ago by nomeata

I’m considering whether this might be a good project for a bachelor’s thesis for my students, but I’m not sure if I can estimate the required effort correctly. What do you think - would you welcome such a contribution, or rather not have to deal with a beginner’s contribution?

comment:15 Changed 2 years ago by simonpj

I'm not sure it makes sense for an undergrad thesis.

  • It's clear what FC (Core) code to generate. The trick is to find a nice source-language design that feels easy and intuitive, and from which we can generate the right FC. Undergrads aren't going to be able to do that.
  • Adding type-variable binders in rules is fiddly, but not very glamorous.

And to be fair I am a bit concerned about dealing with an undergrad patch to the typechecker, one of the more complicated bits of GHC.

There are lots of other open tickets at Status/SLPJ-Tickets if you are interested in a list of open questions. But most of them are tricky in some way, or I'd have done them :-)

comment:16 Changed 2 years ago by simonpj

See also #5974

comment:17 Changed 19 months ago by igloo

  • Milestone changed from 7.6.1 to 7.6.2

comment:18 Changed 19 months ago by nomeata

Today after the Haskell Symposium, I again shocked someone by telling him that "map unT" or "map T, where T is a newtype constructor, is not optimized away, so I was reminded of this bug. There really ought to happen something here.

If the rule is representable internall, but not easily written in user code, how about a compiler pass that detects map-like functions and then automatically adds appropriate rules to the source?

comment:19 follow-up: Changed 19 months ago by simonpj

  • Cc sweirich@… added

Shocking it may be, but the question remains about how to fix it. Here are the issues I see. (I'm assuming we don't want something that works only for lists.)

  • What (precisely) is a "map-like" function, for some arbitrary data type? I'm not keen on spotting such things by somehow matching on their particular form. Remember data types can be mutually recursive, and may have several type parameters.
  • We can reasonably lift casts for data types that don't have a map function. For example, if we have
    newtype Age = MkAge Int
    data T a b = ..blah..blah...
    
    Then the coercion T coAge Bool :: T Int Bool ~ T Age Bool, where coAge is the coercion arising from the Age newtype, even if T doesn't have a map function defined on its first type parameter. System FC is great.
  • But how should that be specified in the source program? Suppose e :: T Int Bool. Then I have wondered about syntax like this:
       newtype[T $MkAge Bool] e
    
    Here the newtype says "here comes a newtype coercion". The bit in square brackets specifies the coercion, in the form of a type with some spot(s) replaced by a newtype data constructor, signalled by $, in this case MkAge. So the degenerate case newtype[$MkAge] e would be identical to MkAge e.

I wonder if anyone has beter ideas for syntax?

comment:20 in reply to: ↑ 19 Changed 19 months ago by nomeata

Replying to simonpj:

  • What (precisely) is a "map-like" function, for some arbitrary data type? I'm not keen on spotting such things by somehow matching on their particular form. Remember data types can be mutually recursive, and may have several type parameters.

I guess what you are not keen on is something like that:

Look for a function f that returns a "T a b c", has one argument that is also a "T a b c", and all other arguments have function types. Check if f is defined by pattern match on the "T a b c" argument. Check for each constructor "C x y z", if the body for the pattern is "C (h x) (h' y) (f h h' h z)", where the h's are the other arguments of f, or recursive calls to f (with the same function arguments). If all this holds, generate a rule that matches on f if all the function arguments are coercions c, c', c and replaces it by the (existing, as I understand it) coercion "T c c' c" applied to the non-function argument.

This is not very elegant, but it would catch most hand-written maps, including Data.List.map and the fmap in the Functor instances of Maybe, Const, Either a, Id, (,) a, Digit, Node, Elem, ViewL, ViewR (judging from a quick grep over base and libraries).

But I guess the question of how we expose the feature to the user, and whether GHC should also generate rules automatically if he can on a best-effort basis are two independent and orthogonal issues.

comment:21 in reply to: ↑ 5 Changed 19 months ago by nomeata

Looking at

Replying to simonpj:

  RULE "map/cast" forall (a::*) (b::*) (c::*) 
                         (co :: (a->b ~ a->c)) 
                         (f::a->b)
                  map a c (f |> co) = map a b f |> [right co]

  RULE "map/id" forall (a::*) map a a id = id

I wonder whether it is realistic to have the types inferred here, e.g.

{-# RULES "map/cast" forall f co. map (f `cast` co) = map f `cast` right co
          "map/id" map id = id
  #-}

This way, the current issue can be solved independently from #2600, and the remaining question is how to expose the Core expressions cast and right to the RULES parser – it seems that they are currently not exported as Haskell values.

comment:22 Changed 19 months ago by nomeata

I wish I could edit my comments, at least for 10 minutes or so...

Anyways, I just noticed that I did not understand the right value correctly. Shoudln’t that be something of type "(α ~ β) -> ([α] ~ [β])"? Unless I am reading it wrongly, the typing rule Right in the TLDI'07 paper goes the other way around. I guess I am expecting something like

{-# RULES "map/cast" forall f co. map (f `cast` co) = map f `cast` [] co
          "map/id" map id = id
  #-}

where [] :: (α ~ β) -> ([α] ~ [β]). Not sure if that already exists, though, and again Syntax problems pop up.

Or what if we free the programmer from worrying about coercions at all:

{-# RULES "map/cast" forall f co. map (casted f) = casted (map f)

"map/id" map id = id

#-}

On the LHS, casted x matches any x cast co. On the right hand side, casted y introduces a new coercion axiom and uses it. This may be less safe, but then, rewrite rules are already very unsafe.

comment:23 Changed 19 months ago by simonpj

I'm sorry, but I can't follow your line of thought.

In any case, I'm convinced that rewrite rules are not the way to solve this problem. As a programmer I absolutely do not want to rely on some relatively-complicated optimisation machinery kicking in to avoid a complete traversal of my data structure, when all I'm doing is changing the type. And I want to be able to change the type even if there isn't a map function.

Really the only difficult thing is surface syntax. Say I have

newtype Age = MkAge { unAge :: Int }

Then if I have

   e :: Tree (Either Int Int)

I'd like to be able to get these things eaily:

   e1 :: Tree (Either Age Int)
   e2 :: Tree (Either Int Age)
   e3 :: Tree (Either Age Age)

One possible syntax might be this:

   e1 = e |> Tree (Either MkAge Int)
   e2 = e |> Tree (Either Int MkAge)
   e3 = e |> Tree (Either MkAge MkAge)

Here I am using the data constructor MkAge in the middle of a type to say "make the change here!". It's less convenient in the other direction

   e4 :: Tree (Either Int Age)
   e4 = e3 |> Tree (Either unAge Age)

THere's a record selector unAge in the middle there. Maybe we need a more explicit signal that we are dropping from the type world to the term world, something like

  e4 = e3 |> Tree (Either |unAge| Age)

Simon

comment:24 Changed 19 months ago by nomeata

I find that I disagree. Consider a library that uses newtypes to hide the implementation:

newtype Nat = Nat { natToInteger :: Integer }

safeNat x = if x >= 0 then Nat x else error "negative"
...more operations on nat...

now the user of the library has a value l :: [Nat] around, but wants to go back to Integers. The natural way of doing it is map natToInteger l. And I find it reasonable to have the compiler (with the help of the library writer) to optimize that to a cast.

The programmer cannot be the one to have to think about it, because he might not even know that Nat ist but a newtype. (Although it might be documented somewhere, so he might be expecting zero-overhead over Ints, and hence expecting the map to be free.)

With my proposition, the burdon of writing the rule is not even with the library author, but the author of the data structure, in this case the list. I find it reasonable to expect him to worry about such things, it is no less tricky than using RULES for list fusion.

If I read your suggestion in comment:23 correctly, you want to introduce a type cast operator to the Haskell syntax that checked whether the types really have the same representation? Might also be useful, but isn’t this risky? We would not want the user to be able to write

let x :: Map Int = ....
    y = x |> Map (Down Int)

would we? How would that be prevented?

comment:25 Changed 19 months ago by nomeata

Thanks for bearing with my spam. I found another reason why a programmer might already expect this to work via RULES, and an indication to a possible implementation. Consider this code:

newtype X = X Int
b :: [Int] -> [X]
b = map X
c :: [Int] -> [X]
c = map unsafeCoerce#

Both functions generate almost identical core code, they even share the same „identitiy“ function:

Test.b1 :: GHC.Types.Int -> GHC.Types.Int
[GblId,
 Arity=1,
 Caf=NoCafRefs,
 Str=DmdType S,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=1, Value=True,
         ConLike=True, Cheap=True, Expandable=True,
         Guidance=ALWAYS_IF(unsat_ok=True,boring_ok=True)}]
Test.b1 = \ (tpl_B1 :: GHC.Types.Int) -> tpl_B1

Test.b :: [GHC.Types.Int] -> [Test.X]
[GblId,
 Arity=1,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=True,
         ConLike=True, Cheap=True, Expandable=True,
         Guidance=IF_ARGS [] 20 60}]
Test.b =
  GHC.Base.map
    @ GHC.Types.Int
    @ Test.X
    (Test.b1
     `cast` (<GHC.Types.Int> -> Sym (Test.NTCo:X)
             :: (GHC.Types.Int -> GHC.Types.Int) ~# (GHC.Types.Int -> Test.X)))

Test.c :: [GHC.Types.Int] -> [Test.X]
[GblId,
 Arity=1,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=True,
         ConLike=True, Cheap=True, Expandable=True,
         Guidance=IF_ARGS [] 20 60}]
Test.c =
  GHC.Base.map
    @ GHC.Types.Int
    @ Test.X
    (Test.b1
     `cast` (<GHC.Types.Int> -> UnsafeCo GHC.Types.Int Test.X
             :: (GHC.Types.Int -> GHC.Types.Int) ~# (GHC.Types.Int -> Test.X)))

The only difference is how the coercion is being constructed. Now a the author of the list data type might have added this rule:

{-# RULES
"map/coerce" map unsafeCoerce# = unsafeCoerce#
  #-}

Then assuming the rule fires on c (and I could swear that it did just an hour ago, but I cannot reproduce it now, it seems that now the inlining of unsafeCoerce# happens too soon), then would it not make sense to have it also fire on b, giving us the desired result?

I tried to create a quick hack that would unfold unsafeCoerce# on the LHS of a rule so that the "map/coerce" rule would fire on both the inlined c as well as on b, but my GHC foo is not strong enough yet.

Nevertheless I think letting unsafeCoerce# in a RULE match also functions known to be just specializations of it (namely newtype constructors and deconstructors) seems to be a reasonably clean way to achieve this, without exposing any Core details (casts, equality types) to the surface syntax.

comment:26 follow-up: Changed 19 months ago by simonpj

Your previous comment ("I find I disagree"). But I don't see a good solution at the moment:

  • I really do not want anything relying on unsafeCorece. There is nothing ill-typed going on here, and it would be horrible to subert the type system to do something the (FC) type system can express perfectly well.
  • I really do not want to magically infer that a function is map-like. That seems terribly fragile to me (what about mutual recursion, types with multiple parameters, non-uniform recursion etc). At most one could make a special case for lists.

Generally the GHC story has been, lacking a good way forward, to wait until someone has a neat idea, rather than to implement a hack that we later regret. That's where I am on this one at the moment.

Simon

comment:27 in reply to: ↑ 26 Changed 19 months ago by nomeata

Hi,

Replying to simonpj:

Your previous comment ("I find I disagree").

it seems that this sentence was truncated...

But I don't see a good solution at the moment:

  • I really do not want anything relying on unsafeCorece. There is nothing ill-typed going on here, and it would be horrible to subert the type system to do something the (FC) type system can express perfectly well.

The problems seems to me that FC is able to express it nicely, but Haskell is not. But RULES are being written by Haskell programmers. Now there are two alternatives:

  • expose FC features (cast, coercions) to the Haskell programer, even if only when specifying RULES, or
  • find a reasonable and – for the Haskell programmer – intuitive approximation to what should be there that is still Haskell.

I was currently exploring the second path. For that, there needs to be a way to match newtype constructors and deconstructors, without having to talk about type casts. A LHS map unsafeCoerce would fit somewhat, but maybe this placeholder is misleading. One could introduce a new name for this purpose, say map (casted id) or map thisIsANewtypeConstructorOrDeconstructor or any other new name. This is probably the easier part.

But what to put on the RHS? You want that in the FC cast that will be there, the equality is constructed from the equality on the LHS. I feel that this is not possible without exposing FC features to the Haskell level – hence my suggestion to use unsafeCoerce on the RHS. One could also try a different name there, but it would not gain much.

I am uncomfortable with the other approach of exposing FC casts and stuff to the Haskell user. Where else has this been done before? One may argue that people who write RULES already have to know how to read core, otherwise they will never know why their rules work or not... but I don’t believe this is wanted.

  • I really do not want to magically infer that a function is map-like. That seems terribly fragile to me (what about mutual recursion, types with multiple parameters, non-uniform recursion etc). At most one could make a special case for lists.

I agree that this wasn’t a very good idea, and anyways such an automatic RULE generation would be independent of the issue at hand.

Generally the GHC story has been, lacking a good way forward, to wait until someone has a neat idea, rather than to implement a hack that we later regret. That's where I am on this one at the moment.

I agree. I will try to come up with ideas until I run out of them or a neat one appears, if you don’t mind :-)

comment:28 follow-up: Changed 19 months ago by simonpj

I meant "Your previous comment makes good points"!

comment:29 in reply to: ↑ 28 Changed 19 months ago by sweirich

Sorry to jump in late... but I agree with Simon. Exposing the FC cast operation to the user---just for newtypes---seems to be the simplest solution to the problem. If a client of the module doesn't know that Age is a newtype for Int, why would the client expect "map MkAge?" to be an identity function? Newtypes *already* have special semantics, so enhancing that seems better than introducing something that unpredictably applies.

Furthermore, what Simon is proposing is just convenient syntax for what is already possible with generalized newtype deriving. (And so should be viewed with the same suspicion.) For example, we can cast a list of types thus:

{-# LANGUAGE GeneralizedNewtypeDeriving, MultiParamTypeClasses #-}

class Castable a b where
    cast  :: c a -> c b
    cast' :: c b -> c a
instance Castable Int Int where
    cast  = id
    cast' = id
newtype Age = MkAge Int deriving (Castable Int)

x :: [Int]
x = [1 .. 10]

y :: [Age]
y = cast x

However, when the type to cast is not the last argument to the data structure, the process requires the definition of yet another newtype.

newtype E1 b a = E1 { unE1 :: Either a b }

w :: Either Int Int -> Either Age Int 
w = unE1 . cast . E1

Having this cast around may be convenient for library writers who want to freely coerce between abstract and concrete types (although it is no substitute for a real module system :-).

comment:30 Changed 19 months ago by nomeata

No new idea on the syntax, just a small note: If this gets resolved, the fmap that DerivingFunctor? generates should probably also automatically get such a RULE.

comment:31 Changed 15 months ago by liyang

  • Cc hackage.haskell.org@… added

comment:32 Changed 10 months ago by nomeata

Today I got the chance to talk to SPJ in person about this and we discussed a few variants to solve this. One promising variant, not much different from sweirich’s comment, is now summarized at ExposingNewtypeCoercions.

comment:33 Changed 10 months ago by nomeata

I just noticed that NewtypeWrappers is not linked from here, so I missed it at first.

comment:34 Changed 10 months ago by nomeata

Oh well, merged everything into NewtypeWrappers. Should have searched first before typing away.

comment:35 Changed 7 months ago by bgamari

  • Cc bgamari@… added

comment:36 Changed 7 months ago by nomeata

Please allow me to use this as a notepad; this way I (or someone else) can easily find the thoughts later.

With Coercible in GHC, one might want that

{-# RULES
"myMap/coerce"  myMap coerce = coerce
 #-}

will be a usable rule that would also fire in

foo :: [Int] -> [Age]
foo = myMap Age

At the moment, it does not do that yet. Here is the rule (as shown by -ddump-rules, with -O):

==================== Tidy Core rules ====================
"myMap/coerce" [ALWAYS]
    forall (@ t) (@ t1) ($dCoercible :: GHC.Types.Coercible t t1).
      Test.myMap @ t @ t1 (GHC.Prim.coerce @ t @ t1 $dCoercible)
      = GHC.Prim.coerce
          @ [t]
          @ [t1]
          (case $dCoercible of _ { GHC.Types.MkCoercible ds ->
           GHC.Types.MkCoercible @ [t] @ [t1] @~ [ds]_R
           })

and here is the Core code the rule needs to match (-ddump-simpl, with -O):

Test.foo1 :: GHC.Types.Int -> GHC.Types.Int
[GblId,
 Arity=1,
 Caf=NoCafRefs,
 Str=DmdType <S,1*U(U)>m,
 Unf=Unf{Src=InlineStable, TopLvl=True, Arity=1, Value=True,
         ConLike=True, WorkFree=True, Expandable=True,
         Guidance=ALWAYS_IF(unsat_ok=True,boring_ok=True)
         Tmpl= \ (tpl_B1 [Occ=Once] :: GHC.Types.Int) -> tpl_B1}]
Test.foo1 = \ (tpl_B1 :: GHC.Types.Int) -> tpl_B1

Test.foo :: [GHC.Types.Int] -> [Test.Age]
[GblId,
 Arity=1,
 Str=DmdType,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=True,
         ConLike=True, WorkFree=True, Expandable=True,
         Guidance=IF_ARGS [] 20 60}]
Test.foo =
  Test.foo_myMap
    @ GHC.Types.Int
    @ Test.Age
    (Test.foo1
     `cast` (<GHC.Types.Int>_R -> Sym Test.NTCo:Age[0]
             :: (GHC.Types.Int -> GHC.Types.Int)
                  ~#
                (GHC.Types.Int -> Test.Age)))

Note that in the rule’s LHS, coerce was not unfolded, while it is in the core. Also note that the rule talks about a Coercion value that we do not have.

Here is one plan that might work. If a rule has a binder $d of type Coercible a b, then the rule is changed to have a binder c of type a ~R# b, and every occurrence of $d is replaced by MkCoercible a b c. So the rule above, (after simplification of the RHS) would read

"myMap/coerce" [ALWAYS]
    forall (@ t) (@ t1) (c :: t ~R# t1).
      Test.myMap @ t @ t1 (GHC.Prim.coerce @ t @ t1 (GHC.Types.MkCoercible @ t @ t1 @~ c)
      = GHC.Prim.coerce
          @ [t]
          @ [t1]
          (GHC.Types.MkCoercible @ [t] @ [t1] @~ [c]_R)

The next step is to simplify occurrences of coerce @ a @ b (MkCoercible @ a @ b @~ c) and replace them by (id cast c):

"myMap/coerce" [ALWAYS]
    forall (@ t) (@ t1) (c :: t ~R# t1).
      Test.myMap @ t @ t1 (id `cast` c)
      = GHC.Prim.coerce
          @ [t]
          @ [t1]
          (GHC.Types.MkCoercible @ [t] @ [t1] @~ [c]_R)

which (possibly after inlining id) with some reasonable hope matches the actual code.

I especially like how the specializing of Coercible values to ~R# values allows to use the matched coercion (which could even come from a unsafeCoerce!) in the RHS of the rule, instead of completely throwing it away and firing only if the coercion on the RHS can be found by the type checker.

comment:37 Changed 7 months ago by nomeata

Alternative, similar idea: Leave the RULE as it is, but be smart when matching (GHC.Prim.coerce @ t @ t1 $dCoercible) and if the expression is of the form (\x -> x) `cast` c, succeed with a substitution $dCoercible = MkCoercion @ t1 @ t2 @~ c. The RHS would then look a bit convoluted, but the simplifier surely cleans that up.

comment:38 Changed 7 months ago by nomeata

I have a first working prototype, pushed to the wip/nomeata-T2110 branch. Comments how to do that properly and non-hacky are welcome.

I ended up implementing the first idea, and this is what works: In the file

module Test where

import GHC.Exts
import Unsafe.Coerce

{-# RULES
"map/coerce"  map coerce = coerce
 #-}

newtype Age = Age Int

foo :: [Int] -> [Age]
foo = map Age
bar :: [Int] -> [Age]
bar = map coerce
baz :: [Int] -> [Age]
baz = map unsafeCoerce

the rule will be desugared to

------ Local rules for imported ids --------
"map/coerce" [ALWAYS]
    forall (@ a_aET)
           (@ b_aEU)
           ($dCoercible_aEO :: a_aET GHC.Prim.~R# b_aEU).
      GHC.Base.map @ a_aET
                   @ b_aEU
                   ((\ (tpl_B1 :: a_aET) -> tpl_B1)
                    `cast` (<a_aET>_R -> ($dCoercible_aEO)
                            :: (a_aET -> a_aET) ~# (a_aET -> b_aEU)))
      = GHC.Prim.coerce
          @ [a_aET]
          @ [b_aEU]
          (GHC.Types.MkCoercible
             @ [a_aET] @ [b_aEU] @~ [($dCoercible_aEO)]_R)

and this indeed matches all three calls to map:

Test.bar1 :: [GHC.Types.Int] -> [GHC.Types.Int]
Test.bar1 = \ (tpl_B2 :: [GHC.Types.Int]) -> tpl_B2

Test.foo :: [GHC.Types.Int] -> [Test.Age]
Test.foo =
  Test.bar1
  `cast` (<[GHC.Types.Int]>_R -> [Sym Test.NTCo:Age[0]]_R
          :: ([GHC.Types.Int] -> [GHC.Types.Int])
               ~#
             ([GHC.Types.Int] -> [Test.Age]))

Test.bar :: [GHC.Types.Int] -> [Test.Age]
Test.bar =
  Test.bar1
  `cast` (<[GHC.Types.Int]>_R -> [Sym Test.NTCo:Age[0]]_R
          :: ([GHC.Types.Int] -> [GHC.Types.Int])
               ~#
             ([GHC.Types.Int] -> [Test.Age]))

It does not work yet for unsafeCoerce, as for some reason that function is not unfolded:

Test.baz =
  GHC.Base.map
    @ GHC.Types.Int
    @ Test.Age
    ((Unsafe.Coerce.unsafeCoerce1 @ GHC.Types.Int @ Test.Age)
     `cast` (<GHC.Types.Int>_R -> UnivCo R GHC.Types.Int Test.Age
             :: (GHC.Types.Int -> GHC.Types.Int)
                  ~#
                (GHC.Types.Int -> Test.Age)))

but if it were, it would work as well :-)

I find this is quite a nice result so far, the user-facing part is slick and simple, and it shows that the Coercible design is at least usable. Should this direction be further pursued?

(PS: Special-casing coerce in Rules.match is difficult, as Rules is a very basic module, imported for example in MkId which defines coerceId.)

comment:39 Changed 7 months ago by nomeata

Message by SPJ, who is writing mails on planes:

I’m not convinced about your solution. It seems a bit of a hack. Moreover, the very same issue arises for ordinary type classes, so I’d prefer to solve it for all of them at once. Finally, the very existence of Coercible is going to eliminate most of these (map Age) calls anyhow. Using them, and then a RULE to optimise them away, seems perverse.

Do paste this email into the ticket... I’m on a plane.

Suppose f :: (a->a->a) -> [a] -> [a], and fGr :: [a] -> [a]

Now consider

{-# RULE “foo”   f (>) = fGr    #-}

which would translate to the following CoreRule?:

          forall a, d:Ord a.
                    f a ((>) a d) = fGr

If we saw a call to f in a polymorphic context, the rule would fire

          g :: Ord a => [a] -> blah
          g xs = ...(f (>) xs)....

But if we specialise g with a type signature, suddenly the rule wouldn’t fire:

          g :: [Int] -> blah
          g xs = ...(f (>) xs)....

Why? Because the (>) would simplify to grInt, which isn’t what the RULE looks for. This is similar to your use of coerce simplifying too eagerly.

The right solution seems to be to give the RULE a chance to fire before inlining/simplifying calls to (>). One way to do that is to prevent the ClassOp rule (defined in MkId) from firing in the “gentle” phase, which is always the first simplifier phase. I’m not sure of the consequences of that; see Note [RULEs enabled in SimplGently] in CoreMonad.

Moreover currently BuiltinRules don’t have an Activation to say when they should be active. This would be easy to change though.

comment:40 Changed 3 months ago by nomeata

I’m not convinced about your solution. It seems a bit of a hack.

I’m just re-reading this thread, and it seems that my proposed solution (replace Coercible in a rule with ~R#, and inline coercoe) is precisely what we had concluded last week – is that right or was there a misunderstanding on my side?

comment:41 Changed 3 months ago by nomeata

I looked into this and further cleaned up the branch. But I just cannot get it to optimize map unsafeCoerce away, and the problem seems to be not enough eta-expansion in the simplifier.

So here is the rule as desugared; it has the shape that we want it to have

------ Local rules for imported ids --------
"map/coerce" [ALWAYS]
    forall (@ a) (@ b) ($r$dCoercible :: a GHC.Prim.~R# b).
      GHC.Base.map @ a
                   @ b
                   ((\ (tpl :: a) -> tpl)
                    `cast` (<a>_R -> ($r$dCoercible) :: (a -> a) ~# (a -> b)))
      = (\ (tpl :: [a]) -> tpl)
        `cast` (<[a]>_R -> [($r$dCoercible)]_R
                :: ([a] -> [a]) ~# ([a] -> [b]))

And here is the Core that we want it to match:

  GHC.Base.map
    @ GHC.Types.Int
    @ Test.Age
    ((Unsafe.Coerce.unsafeCoerce1 @ GHC.Types.Int @ Test.Age)
     `cast` (<GHC.Types.Int>_R
             -> UnivCo representational GHC.Types.Int Test.Age
             :: (GHC.Types.Int -> GHC.Types.Int)
                  ~#
                (GHC.Types.Int -> Test.Age)))

And for reference, here is the information on unsafeCoerce1:

  unsafeCoerce1 :: forall a b. a -> a
    {- Arity: 1, HasNoCafRefs, Strictness: <S,1*U>,
       Unfolding: (\ @ a @ b x :: a -> x) -}

When trying to match, the matcher successfully matches the casts and then tries to match

    \ (tpl{v} [lid] :: a{tv aMF} [tv]) -> tpl{v} [lid]

against

    base:Unsafe.Coerce.unsafeCoerce1{v rKi} [gid]
      @ ghc-prim:GHC.Types.Int{(w) tc 3J} @ main:Test.Age{tc rM6}

It eta-expands both sides to match

    tpl{v} [lid]

against

    base:Unsafe.Coerce.unsafeCoerce1{v rKi} [gid]
      @ ghc-prim:GHC.Types.Int{(w) tc 3J}
      @ main:Test.Age{tc rM6}
      tpl{v} [lid]

and there it fails.

Possible solutions: If GHC would inline unsafeCoerce1 even though it is undersaturated, we’d be matching

  GHC.Base.map
    @ GHC.Types.Int
    @ Test.Age
    ((\ x :: GHC.Types.Int -> x)
     `cast` (<GHC.Types.Int>_R
             -> UnivCo representational GHC.Types.Int Test.Age
             :: (GHC.Types.Int -> GHC.Types.Int)
                  ~#
                (GHC.Types.Int -> Test.Age)))

which would succeed (this is also the shape we get from map Age). But I did not manage to define unsafeCoerce in a way that this happens.

Alternatively, the matching code could, after eta-expanding, simplify the expression, so that the unfolding would happen here.

Otherwise, my code is ready for review, see branch wip/nomeata-T2110.

comment:42 Changed 3 months ago by nomeata

  • Status changed from new to patch

With master open for new stuff, I’d like to get this there, preferably within the next one or two weeks, to reduce the number of my patches that are floating around, so I’m marking this as please review.

Open tasks are

  • Fixing the matching against unsafeCoerce (needs thought)
  • Actually adding rules to the libraries.
  • Documenting this somewhere

I’ll open separate tickets for these once this one is in.

comment:43 Changed 3 months ago by goldfire

  • Cc eir@… added

comment:44 Changed 2 months ago by monoidal

  • Milestone changed from 7.6.2 to 7.10.1

comment:45 Changed 2 months ago by nomeata

I changed the definition of coerce back to the naive one, and aimed at making the rule matcher smarter. I did this by adding a function exprIsLambda_maybe which is used when a rule template contains a lambda.

Previously, the code would either use a Lambda, if its there, or eta-expand both sides otherwise. The latter seemed to be somewhat dangerous (could waste work, although I could not produce a example for that due to later CSE and other effects), and did not help in the case discussed here.

So instead I try to „make the expression“ into a lambda. Either it already is a lambda, or it is a (nested) application of something that has a currently active unfolding which is unsaturated: In that case, unfold it, simpleOptExpr and recursively look for further lambdas. Any any cast occurring while doing so will be pushed inside the lambda.

With that change, matching unsafeCoerce works as well!

comment:46 Changed 2 months ago by Joachim Breitner <mail@…>

In b4715d67ae90f4cc847daa94f0fc056a40057d65/ghc:

Replace forall'ed Coercible by ~R# in RULES

we want a rule "map coerce = coerce" to match the core generated for
"map Age" (this is #2110).

comment:47 Changed 2 months ago by Joachim Breitner <mail@…>

In 377672ae068f6dbfa0354dfab95f41bdd26b0df4/ghc:

Test case for RULE map coerce = coerce

(This tests #2110.)

comment:48 Changed 2 months ago by nomeata

  • Resolution set to fixed
  • Status changed from patch to closed

Ok, pushed in patches

git log b4715d67^...cde88e20

Patch f4fb94f3 has a small improvement of compiler performance as a side-effect (which I did not investigate further).

The ask of actually adding such rules is tracked in #8767.

comment:49 Changed 5 weeks ago by thoughtpolice

  • Test Case set to tests/simplCore/should_run/T2110
Note: See TracTickets for help on using tickets.