to make a MkT hold strict functions only. Anyone unpacking a MkT can assume the function is strict; and anyone building a MkT gets a strictness wrapper aound whatever function they supply, so even if they supply a lazy function, it's made strict.
Seems like a natural generalisation of the existing strictness and UNPACK mechanism for data types.
Lots of details in the thread above.
Simon
Edited
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items
...
Show closed items
Linked items
0
Link issues together to show that they're related or that one is blocking others.
Learn more.
seems tricky. I hope it can be done in a way that makes the language feel no more complicated. e.g.
type F = !Int -> Foodata Foo where C :: F
seems fine and similar to something I've wanted to do (but didn't try) (when there were multiple similar signatures in my GADT). The explicit signature on a thing (of course, explicit signatures are required for data declarations) is what can coerce it to be strict, I guess.
Strictness annotations in function results don't make sense (for the same reason id is strict -- function results are "already WHNF-strict" as much as it makes sense), and they're already lifted (in the sense that they can possibly produce |) even for unboxed result types (e.g. bot :: () -> Int#; bot x = bot x). But unboxing a result-tuple can result in a loss of sharing, so in
newtype State s a = State { runState' :: s -> (# a, s #) }
seems tricky. I hope it can be done in a way that makes the language feel no more complicated. e.g.
type F = !Int -> Foodata Foo where C :: F
This feature request is not about distinguishing strictness in types as the above would indicate. It's about changing the strictness and data representation of data type constructors that have higher order components.
This feature request is not about distinguishing strictness in types as the above would indicate. It's about changing the strictness and data representation of data type constructors that have higher order components.
While working on a builder monoid for Data.Text I ran in to a problem related to lack of unpacking for continuation arguments. Using these types:
data Buffer s = Buffer {-# UNPACK #-} !(A.MArray s Word16) {-# UNPACK #-} !Int -- offset {-# UNPACK #-} !Int -- used units {-# UNPACK #-} !Int -- length leftnewtype Builder = Builder { runBuilder :: forall s. (Buffer s -> ST s [S.Text]) -> Buffer s -> ST s [S.Text] }
the following module generates bad code with lots of boxing and closures:
module Loop (loop) whereimport Data.Monoid (mappend, mempty)import Data.Text.Lazy.Builderloop :: Int -> Builderloop n = go 0 mempty where go i acc | i < n = go (i+1) (acc `mappend` singleton 'a') | otherwise = acc
T1349.hs:7:41: Not in scope: type constructor or class `A.MArray'T1349.hs:7:52: Not in scope: type constructor or class `Word16'T1349.hs:13:43: Not in scope: type constructor or class `ST'T1349.hs:13:49: Not in scope: type constructor or class `S.Text'T1349.hs:15:20: Not in scope: type constructor or class `ST'T1349.hs:15:26: Not in scope: type constructor or class `S.Text'T1349.hs:22:45: Not in scope: `singleton'
I may have spoken too soon. Switching the loop to use foldr improves the generated core a lot. It still suffers from the extra boxing/unboxing mentioned in the ticket description.
Now I'm unsure what to do. Maybe the rero instructions you give above have been overtaken by your change to foldr? Maybe the extra boxing/unboxing that you are nw seeing is the same as in the description of the ticket, or maybe not?
I'm always keen to have well-characterised examples of missed optimisation opportunities. If you think you have one, is it possible for you to produce a reproducible standalone test case of the situation as you now see it?
It's a bit tricky to come up with a good example as inlining seems to remove the problem for simple cases. Here's an attempt. Given the module:
module Cont (run) wheretype Cont = (Int -> Int) -> Int -> Intg :: Contg k n = let n' = n + 1 in n' `seq` k n'loop :: Int -> Contloop 0 = idloop n = g . loop (n - 1)run = loop 10 id
we get the Core
Cont.$wloop :: Int# -> (Int -> Int) -> Int -> IntCont.$wloop = \ (ww_shX :: Int#) -> case ww_shX of ds_Xhe { __DEFAULT -> let { g_si5 :: (Int -> Int) -> Int -> Int g_si5 = Cont.$wloop (-# ds_Xhe 1) } in \ (x_ahg :: Int -> Int) -> let { k_si7 [ALWAYS Just L] :: Int -> Int k_si7 = g_si5 x_ahg } in \ (n_adm :: Int) -> case n_adm of _ { I# x1_ahv -> k_si7 (I# (+# x1_ahv 1)) }; 0 -> id @ (Int -> Int) }Cont.run :: Int -> IntCont.run = Cont.$wloop 10 (id @ Int)
However, manual unboxing
{-# LANGUAGE MagicHash #-}module Cont2 (run) whereimport GHC.Primimport GHC.Basetype Cont = (Int# -> Int) -> Int# -> Intg :: Contg k n = k (n +# 1#)loop :: Int -> Contloop 0 = idloop n = g . loop (n - 1)run = loop 10 (\i -> (I# i))
gives
Cont2.$wloop :: Int# -> (Int# -> Int) -> Int# -> IntCont2.$wloop = \ (ww_shs :: Int#) -> case ww_shs of ds_XgX { __DEFAULT -> let { g_shA :: (Int# -> Int) -> Int# -> Int g_shA = Cont2.$wloop (-# ds_XgX 1) } in \ (x_agX :: Int# -> Int) -> let { k_shC [ALWAYS Just L] :: Int# -> Int k_shC = g_shA x_agX } in \ (n_adn :: Int#) -> k_shC (+# n_adn 1); 0 -> id @ (Int# -> Int) }Cont2.run :: Int# -> IntCont2.run = Cont2.$wloop 10 I#
I'd like to be able to express that Cont's first parameter (of type Int -> Int) is strict in its first (and only) arguments and that I want it unboxed.
OK I get the idea. But there are a number of corners. For example:
If I have a function of type (!Int -> Int) can I pass it to a function expecting an argument of type (Int -> Int). Perhaps yes, with some automatic impedence matching?
If there is automatic impedence matching might I lose sharing? Example: suppose
f :: Int -> !Int -> Intg :: (Int -> Int -> Int) -> Int
If GHC sees (g f) in source code, should it transform to (g (\xy. case y of I# y# -> g x y#))? Tha might lose sharing if (f x) did a lot of computation before returning a function.
Does the type !a -> a make sense? It does for data constructors.
An implementation question is this: how should these bang types be represented? Is there a new type constructor for "!"? Is (!Int -> Int) different to (Int# -> Int) or not?
I thought I might share my story too, as I ran into this as well, and got aware of this ticket.
3 years after Duncan wrote his mail about the need for speed in binary I ran into the same trouble, again with the binary package.
Duncan wrote about the Put module, which creates lazy Bytestrings from an DSL.
The Get module reads such Bytestrings with a similar DSL.
I've rewritten the Get module to use CPS, which allows some useful tricks when parsing.
The Get data type from the non-CPS Get module looks like this:
-- | The parse statedata S = S {-# UNPACK #-} !B.ByteString -- current chunk L.ByteString -- the rest of the input {-# UNPACK #-} !Int64 -- bytes read-- | The Get monad is just a State monad carrying around the input ByteStringnewtype Get a = Get { unGet :: S -> (# a, S #) }
While the CPS version looks like this (heavily influenced by attoparsec):
data S = S { input :: !B.ByteString , next_input :: !B.ByteString , read_all :: !Bool } deriving Showdata Result a = Fail S [String] String | Partial (Maybe B.ByteString -> Result a) | Done S atype Failure r = S -> [String] -> String -> Result rtype Success a r = S -> a -> Result r-- unrolled codensity/state monadnewtype Get a = C { runCont :: forall r. S -> Failure r -> Success a r -> Result r }
Unfortunately, this yields terrible performance.
The old Get gives me about 26mb/s (Core 2 Duo, 1.6Ghz) when reading chunks of 1 byte at a time, while the CPS version only gives up to 1mb/s! Much time seems to be spent doing GC, as the bigger Bytestring I read the worse performance I get.
After reasoning a bit about what GHC actually does, I unpacked the state into the Get data type:
type Failure r = B.ByteString -> B.ByteString -> Bool -> [String] -> String -> Result rtype Success a r = B.ByteString -> B.ByteString -> Bool -> a -> Result r-- unrolled codensity/state monadnewtype Get a = C { runCont :: forall r. B.ByteString -> B.ByteString -> Bool -> Failure r -> Success a r -> Result r }
This at least gave me 12mb/s (I think I've even been able to get the same 26mb/s, when further trying to optimize). The price to pay is that the code is now not as maintainable, it truly looks horrible.
So it still seems this ticket is important in some cases.
Yes, this should be allowed, by analogy to data constructors: if I write a function \c x -> c x, we don't mind if the constructor c is strict or lazy in x. Now, if we want to be able to give assertions about strictness, this is less clear, but right now these sorts of bang-patterns change semantics, so that's a different tree to bark up.
I don't know how feasible this is, but one way to work around the sharing issue is to use the arity analyzer to figure out if we need to insert intermediate lambdas in the case that f x of f x y does work. It's kind of hacky though...
Yes, definitely. It doesn't make too much sense for a polymorphic variable, but !() -> () does in fact give more information than () -> ().
If we think the current design for bang patterns in data constructors is right, we use HsBangTy, erase it for any typechecking, and make sure it gets added to the right operational places. So they probably should be treated as different.
At least in the monomorphic case for data constructors I think I can see how to make this work. Let's review ordinary UNPACK pragmas as they are today, and let's just consider single-constructor types for now. Then when I have, say,
data T = C {-# UNPACK #-} !(Int,Int)
the UNPACK affects (a) construction of C and (b) pattern matching against C. I'll express it like this:
data T = Cw Int IntmakeC :: (Int,Int) -> TmakeC x = case x of (a,b) -> Cw a bmatchC :: T -> ((Int,Int) -> r) -> rmatchC x k = case x of { Cw a b -> k (a,b) }
Here
The data type T is ultimately represented by a data type with a data constructor Cw, with two fields.
A use of C to construct a value is tranformed to a call of makeC. The function makeC constructs a value of type T; its signature is what the programmer expects for his constructor C.
A use of C in pattern matching is transformed to a call of matchC. The function matchC matches against a value of type T, and applies its continuatoin k to an (Int,Int) pair, just as the programmer expects.
So the makeC/matchC pair completely express what it means to build and pattern match against constructor C. The constructor C is not really a constructor at all; it's a view on the underlying data constructor Cw.
Now for kolmodin's type
newtype Get a = C { runCont :: forall r. S -> Failure r -> Success a r -> Result r }
I think you would really like to say
newtype Get a = C { runCont :: forall r. {-# UNPACK #-} !S -> Failure r -> Success a r -> Result r }
And what might that mean? Presumably it means the function stored in the data type has its arguments unpacked. So the translation is
newtype Get a = Cw { runCont :: forall r. B.ByteString -> B.ByteString -> Bool -> Failure r -> Success a r -> Result r }makeC :: (forall r. S -> Failure r -> Success a r -> Result r) -> Get amakeC f = Get (\i ni ra -> f (S i ni ra))matchC :: Get a -> ((forall r. S -> Failure r -> Success a r -> Result r) -> r') -> r'matchC (C f) = k (\(S i ni ra) -> f i ni ra)
Or something like that. The point is that the UNPACK and ! pragmas control the generation of makeC and matchC, in a way that we can pin down quite precisely.
Things to notice
The UNPACK stuff might be recursive. For example, the ByteString arguments of S are themselves given UNPACK pragmas, so probably the real of C is more like this:
newtype Get a = Cw { runCont :: forall r. Addr# -> Int# -> Int# -> Addr# -> Int# -> Int# -> Bool -> Failure r -> Success a r -> Result r }
Now, whether you really want this I'm not sure, but that is the natural thing.
I have no idea what to do in the polymorphic case, when you can't "see" the type of the argument to be UNPACKed. But that is just as now.
As always with UNPACK, there's a danger that we'll do unnecessary unboxing and reboxing. For example, with the data type T above, every type we match against C we'll build a pair (a,b) (just look at matchC). If the body of the pattern match uses the pair entire, rather than deconstructing it, there's a danger that we'll build the pair lots of times (once for each match) rather than once (when we build the C value). And its no different for this function-UNPACK stuff.
All of this is for constructor arguments. I'm vastly less sure of the merits of this kind of stuff for ordinary functions.
Simon: if the UNPACKed thing was not the first argument e.g.:
newtype Get a = C { runCont :: forall r. Failure r -> {-# UNPACK #-} !S -> Success a r -> Result r }
We might naively expand to this:
makeC :: (forall r. Failure r -> S -> Success a r -> Result r) -> Get amakeC f = Get (\fail i ni ra -> f fail (S i ni ra))
Could we not avoid a loss of sharing by just inserting a lot of intermediate bindings:
newtype Get a = Cw { runCont :: forall r. Failure r -> B.ByteString -> B.ByteString -> Bool -> Success a r -> Result r }makeC :: (forall r. Failure r -> S -> Success a r -> Result r) -> Get amakeC f = Get (\i ni ra -> f (S i ni ra))makeC f = Get (\fail -> let ffail = f fail in \i ni ra -> ffail (S i ni ra))
The simplifier will inline the wrapper and clean up any intermediate bindings that can be proved to share no expensive work. This seems to solve the work sharing issue?