Changes between Version 12 and Version 13 of ViewPatterns


Ignore:
Timestamp:
Jan 24, 2007 2:02:06 PM (9 years ago)
Author:
simonpj@…
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • ViewPatterns

    v12 v13  
    1 [[PageOutline]]
    21= View patterns: lightweight views for Haskell =
    32
    4 This page describes a rather lightweight proposal for adding views to
    5 Haskell Prime.  I'm thinking of prototyping the idea in GHC, so I'm looking
    6 for feedback.
     3This page has [http://hackage.haskell.org/trac/ghc/wiki/ViewPatterns moved to the GHC wiki], so that anyone can edit it.
    74
    8 == The problem ==
    9 
    10 We are keen on abstraction, but pattern matching is so convenient that
    11 we break abstractions all the time.  It's our dirty little secret.
    12 Looked at this way, object-oriented folk are much more obsessive
    13 about abstraction than we are: everything (including field access
    14 these days) is a method.
    15 
    16 Views have, in one form or another, repeatedly been proposed as a
    17 solution for this problem.   (See the end for a comparison with related work.)
    18 
    19 == The proposal informally ==
    20 
    21 The proposal introduces a new form of pattern, called a '''view pattern'''
    22 Here are some function definitions using view patterns.
    23 To read these definitions, imagine that `sing` is
    24 a sort of constructor that matches singleton lists.
    25 {{{
    26   f :: [Int] -> Int
    27   f (sing -> n) = n+1   -- Equiv to: f [n] = ...
    28   f other     = 0
    29 
    30   g :: [Bool] -> Int
    31   g (sing -> True)  = 0         -- Equiv to: g [True] = ...
    32   g (sing -> False) = 1         -- Equiv to: g [False] = ...
    33   g other           = 2
    34 
    35   h :: [[Int]] -> Int   
    36   h (sing -> x : sing -> y : _) = x+y
    37                         -- Equiv to: h ([x]:[y]:_) = ...
    38   h other = 0
    39 }}}
    40 So what is `sing`?  It is just an ordinary Haskell function that
    41 returns a `Maybe` type:
    42 {{{
    43   sing :: [a] -> Maybe a
    44   sing [x]   = Just x
    45   sing other = Nothing
    46 }}}
    47 So `sing` simply identifies singleton lists, and returns the payload (that is,
    48 the singleton element; otherwise it returns `Nothing`.
    49 It is very important that '''there is nothing special about `sing`'''.  It is
    50 not declared to be a view; it can be called as a normal Haskell function; the author
    51 of `sing` might not have intended it to be used in pattern matching. 
    52 
    53 ---------------------------
    54 == The proposal more formally ==
    55 
    56 The only special stuff is in the pattern. 
    57 The sole change is this: add a single new sort of pattern, of the
    58 form
    59         (''expr'' `->` ''pat'')
    60 
    61 where ''expr'' is an arbitrary Haskell expression.   I'll call a pattern
    62 of this form a '''view pattern'''.
    63 
    64 From a '''scoping''' point of view, the variables bound by the pattern (''expr'' `->` ''pat'')
    65 are simply the variables bound by ``pat``.
    66 Any variables in ``expr`` are bound occurrences.
    67 
    68 The rule for '''pattern-matching''' is this:
    69 To match a value ''v'' against a pattern ''(expr -> p)'',
    70   * Evaluate ''(expr v)''
    71   * If the result is ''(`Just` w)'', match ''w'' against ''p''
    72   * If the result is `Nothing`, the match fails.
    73 
    74 The '''typing rule''' is similarly simple. 
    75 The expression ''expr'' must have type
    76 ''t1 `-> Maybe` t2''. Then the pattern ''pat'' must have type ''t2'', and the
    77 whole pattern (''expr'' `->` ''pat'') has type ''t1''.
    78 
    79 === The value input feature ===
    80 
    81 Note that the ''expr'' is an arbitrary Haskell expression. For example, suppose you wrote
    82 a regular expression matching function:
    83 {{{
    84   regexp :: String -> String -> Maybe (String, String)
    85   -- (regexp r s) parses a string matching regular expression r
    86   --    the front of s, returning the matched string and remainder of s
    87 }}}
    88 then you could use it in patterns thus:
    89 {{{
    90   f :: String -> String
    91   f (regexp "[a-z]*" -> (name, rest)) = ...
    92 }}}
    93 Of course, the argument does not need to be a constant as it is here.
    94 
    95 '''This ability to pass arguments to the view function, to guide its matching
    96 behaviour, is a key feature of this proposal,''' shared by some, but by no means
    97 all view proposals.  I'll call it the '''value input feature'''. 
    98 
    99 Indeed, in a sense, patterns become first class.  For example, one could pass a pattern
    100 as an argument to a function thus:
    101 {{{
    102   g :: (Int -> Maybe Int) -> Int -> ...
    103   g p (p -> x) = ...
    104 }}}
    105 Here the first argument `p` can be thought of as a pattern passed to `g`, which
    106 is used to match the second argument of `g`.
    107 
    108 === Possible extension 1: multi-argument view patterns ===
    109 
    110 It would be quite useful to allow more than one sub-pattern in a view
    111 pattern.  To do this we'd need a `Maybe` data type that returns more than
    112 one result, thus:
    113 {{{
    114   data Maybe2 a b   = Nothing2 | Just2 a b
    115   data Maybe3 a b c = Nothing3 | Just3 a b c
    116         -- ..etc..., up to 8 perhaps (sigh)
    117 }}}
    118 With this in hand we can extend the views story to have multiple sub-patterns.
    119 Example:
    120 {{{
    121   snoc :: [a] -> Maybe2 [a] a
    122   snoc [] = Nothing2
    123   snoc (x:xs) = case snoc xs of
    124                   Nothing2   -> Just2 [] x
    125                   Just2 ys y -> Just2 (x:ys) y
    126 
    127   last :: [Int] -> Int
    128   last (snoc -> xs x) = x
    129   last other = error "empty list"
    130 }}}
    131 It is tiresome that we need types `Maybe2`, `Maybe3` etc, but we already have
    132 that in Haskell; consider `zip3`, `zip4` and so on.
    133 We could always get away without it, by sticking to unary view patterns and
    134 using tuples, thus:
    135 {{{
    136   snoc :: [a] -> Maybe ([a], a)
    137   snoc [] = Nothing
    138   snoc (x:xs) = case snoc xs of
    139                   Nothing     -> Just ([], x)
    140                   Just (ys,y) -> Just (x:ys, y)
    141 
    142   last :: [Int] -> Int
    143   last (snoc -> (xs, x)) = x
    144   last other = error "empty list"
    145 }}}
    146 But the tuple looks a bit clumsy.
    147 
    148 Under this proposal, the number of sub-patterns in the view pattern determines
    149 which return type the view function should have.  E.g. in the pattern '(e -> p1 p2 p3)',
    150 'e' should return a `Maybe3`.
    151 
    152 If n=0, then we want `Maybe0`, which is called `Bool`.  Thus
    153 {{{
    154   even :: Int -> Bool
    155   even n = n `div` 2 == 0
    156 
    157   f (even ->) = ...     -- Matches even numbers
    158   f other     = ...
    159 }}}
    160 Here `even` is used as a nullary view pattern, with no sub-patterns
    161 following the `->`.
    162 
    163 === Possible extension 2: the implicit `Maybe`  ===
    164 
    165 Thus far, the view function is required to return a `Maybe` type, with `Nothing` to indicate match
    166 failure.  An alternative, presented in the Erwig paper on transformational patterns (see Related work below),
    167 this implicit matching is not performed; instead, the sub-pattern is matched against
    168 whatever the view function returns.  So you'd have to write:
    169 {{{
    170 f (snoc -> Just2 xs x) = ...
    171 }}}
    172 (Note the tiresome `Just2`.)
    173 The benefit of not having the implicit matching is that you can write functions that are, perhaps,
    174 more view-like.  Example:
    175 {{{
    176 data Product = ....some big data type...
    177 
    178 data Size = Small | Medium | Big        -- View type
    179 prodSize :: Product -> Size
    180 prodSize = ....
    181 
    182 f :: Product -> ...
    183 f (prodSize -> Small)  = ...
    184 f (prodSize -> Medium) = ...
    185 f (prodSize -> Big)    = ...
    186 }}}
    187 With the built-in `Maybe` proposal, you'd instead write something like this:
    188 {{{
    189 smallProd, medProd, bigProd :: Product -> Bool
    190 smallProd p = ...
    191 medProd   p = ...
    192 bigProd   p = ...
    193 
    194 f :: Product -> ...
    195 f (smallProd ->) = ...
    196 f (medProd   ->) = ...
    197 f (bigProd   ->) = ...
    198 }}}
    199 This is not obviously worse, except that the first version is more
    200 obviously exhaustive.  Incidentally, both should generate the same
    201 code.
    202 
    203 I can think of three alternatives:
    204  * The `Maybe` stuff is built-in. This is the main proposal, because I think it is often exactly what you want.
    205  * No built-in `Maybe` stuff.  Arguably this is more consistent with pattern-guards.
    206  * Both are available, with different syntax.  For example
    207     * ''(expr `->` pat)'' for the built-in `Maybe` story
    208     * ''(expr `=>` pat)'' with no bulit-in `Maybe`
    209  
    210 == Efficiency ==
    211 
    212 View patterns can do arbitrary computation, perhaps expensive.  So
    213 it's good to have a syntactically-distinct notation that reminds the
    214 programmer that some computation beyond ordinary pattern matching may
    215 be going on.
    216 
    217 It's reasonable to expect the compiler to avoid repeated computation when
    218 pattern line up in a column:
    219 {{{
    220   f (snoc -> x xs) True  = ...
    221   f (snoc -> x xs) False = ...
    222 }}}
    223 In pattern-guard form, common sub-expression should achieve the same
    224 effect, but it's quite a bit less obvious.  We should be able to give
    225 clear rules for when the avoidance of repeat computation is
    226 guaranteed.
    227 
    228 
    229 ---------------------------
    230 == More examples ==
    231 
    232 === Erlang-style parsing ===
    233 
    234 The expression to the left of the `->` can mention variables bound in earlier patterns.
    235 For example, Sagonas et al describe an extension to Erlang that supports pattern-matching on bit-strings ([http://user.it.uu.se/~kostis/Papers/index.html#Conference "Application, implementation and performance evaluation of bit-stream programming in Erlang", PADL'07]).  Suppose we had a parsing function thus:
    236 {{{
    237   bits :: Int -> ByteString -> Maybe2 Word ByteString
    238   -- (bits n bs) parses n bits from the front of bs, returning
    239   -- the n-bit Word, and the remainder of bs
    240 }}}
    241 Then we could write patterns like this:
    242 {{{
    243   parsePacket :: ByteString -> ...
    244   parsePacket (bits 3 -> n (bits n -> val bs)) = ...
    245 }}}
    246 This parses 3 bits to get the value of `n`, and then parses `n` bits to get
    247 the value of `val`. 
    248 
    249 === Sets as lists ===
    250 
    251 Here is a module implementing sets as lists:
    252 {{{
    253 module Set( Set, empty, insert, delete, has) where
    254 
    255   newtype Set a = S [a]
    256  
    257   has :: Eq a => a -> Set a -> Maybe (Set a)
    258   has x (S xs) | x `elem` xs = Just (xs \\ x)
    259                    | otherwise   = Nothing
    260  
    261   delete :: a -> Set a -> Set a
    262   delete x (has x -> s) = s
    263   delete x s            = s
    264  
    265   insert :: a -> Set a -> Set a
    266   insert x s@(has x -> _) = s
    267   insert x (S xs)         = S (x:xs)
    268 }}}
    269 Notice that in the left-hand side `delete x (has x -> s)`, the first `x` is a binding occurrence,
    270 but the second is merely an argument to `has` and is a bound occurrence.
    271 
    272 === N+k patterns ===
    273 
    274 You can test for values.  For example here's a view function that tests for values
    275 greater than or equal to n:
    276 {{{
    277    np :: Num a => a -> a -> Maybe a
    278    np k n | k <= n    = Just (n-k)
    279           | otherwise = Nothing
    280 
    281    f :: Num a => a -> Int
    282    f (np 10 -> n) = 0           -- Matches values >= 10
    283    f (np 4  -> n) = 1           -- Matches values >= 4
    284    f other        = 2
    285 }}}
    286 You will recognise these as (n+k) patterns, albeit with slightly different syntax.
    287 (Incidentally, this example shows that the view function can be overloaded, and
    288 that its use in a view pattern gives rise to a type-class constraint (in this case,
    289 that in turn makes `f` overloaded).
    290 
    291 === Naming constants in one place ===
    292 
    293 We are always taught to write magic numbers, or constants, in one place only.
    294 In C you can write
    295 {{{
    296   #define ERRVAL 4
    297 }}}
    298 and then use `ERRVAL` in `switch` labels.  You can't do that in Haskell, which is tiresome.
    299 But with view pattern you can:
    300 {{{
    301   errVal :: Int -> Bool
    302   errVal = (== 4)
    303 
    304   f (errVal ->) = ...
    305 }}}
    306 
    307 == Concrete syntax ==
    308 
    309 Here are some other possible syntax choices I've considered:
    310 {{{
    311   f ($snoc x xs) = ...          -- Use prefix "$"
    312   g ($(bits 3) x bs) = ...      -- Extra parens for the value input feature
    313 
    314   f (%snoc x xs) = ...          -- Or use prefix "%" instead
    315 
    316   f (.snoc x xs) = ...          -- Or use prefix "." instead
    317 
    318   f (snoc | x xs) = ..          -- Use "|" instead of "->"
    319   g (bits 3 | b bs) = ...
    320 }}}
    321 
    322 I also thought about infix view patterns, where the view function
    323 appears between its (pattern) arguments, but I could not think of a
    324 nice syntax for it, so it is not provided by this proposal.
    325 
    326 == Remarks ==
    327 
    328 The key feature of this proposal is its modesty, rather than its ambition:
    329   * There is no new form of declaration (e.g. 'view' or 'pattern synonym'). 
    330   * The functions used in view patterns are ordinary Haskell functions, and can be called from ordinary Haskell code.  They are not special view functions.
    331   * Since the view functions are ordinary Haskell functions, no changes are needed to import or export mechanisms.
    332   * Both static and dynamic semantics are extremely simple.
    333 It is essentially some simple syntactic sugar for patterns.
    334 However, sometimes modest syntactic sugar can have profound consequences.
    335 In this case, it's possible that people would start routinely hiding
    336 the data representation and exporting view functions instead, which might
    337 be an excellent thing.
    338 
    339 All this could be done with pattern guards.  For example `parsePacket` could be written
    340 {{{
    341   parsePacket bs | Just (n, bs')    <- bits 3 bs
    342                  | Just (val, bs'') <- bits n bs'
    343                  = ...
    344 }}}
    345 Indeed, one might ask whether the extra syntax for view patterns is worth
    346 it when they are so close to pattern guards. 
    347 That's a good question.  I'm hoping that support for view patterns
    348 might encourage people to export view functions (ones with `Maybe`
    349 return types, and encouage their use in patten matching).  That is,
    350 just lower the barrier for abstraction a bit.
    351 
    352 '''Completeness'''.  It is hard to check for completeness of pattern matching; and likewise for
    353 overlap.  But guards already make both of these hard; and GADTs make completness hard too.
    354 So matters are not much worse than before.
    355 
    356 
    357 -------------------------
    358 == Related work ==
    359 
    360 === Wadler's original paper (POPL 1987)] ===
    361 
    362 Wadler's paper was the first concrete proposal.  It proposed a top-level view
    363 declaration, together with functions ''in both directions'' between the view type
    364 and the original type, which are required to be mutually inverse. 
    365 This allows view constructors to be used in expressions
    366 as well as patterns, which seems cool. Unfortunately this dual role proved
    367 problematic for equational reasoning, and every subsequent proposal restricted
    368 view constructors to appear in patterns only.
    369 
    370 === [http://haskell.org/development/views.html Burton et al views (1996)] ===
    371 
    372 This proposal is substantially more complicated than the one above; in particular it
    373 rquires new form of top-level declaration for a view type. For example:
    374 {{{
    375   view Backwards a of [a] = [a] `Snoc` a | Nil
    376      where
    377      backwards [] = Nil
    378      backwards (x:[]) = [] `Snoc` x
    379      backwards (x1:(xs `Snoc` xn)) = (x1:xs) `Snoc` xn
    380 }}}
    381 Furthermore, it is in some ways less expressive than the proposal here;
    382 the (n+k) pattern, Erlang `bits` pattern, and `regexp` examples are not
    383 definable, because all rely on the value input feature.
    384 
    385 I think this proposal is substantially the same as "Pattern matching and
    386 abstract data types", Burton and Cameron, JFP 3(2), Apr 1993.
    387 
    388 === [http://citeseer.ist.psu.edu/okasaki98view.html Okasaki: views in Standard ML] ===
    389 
    390 Okasaki's design is very similar to Burton et al's, apart from differences due
    391 to the different host language.  Again, the value input feature is not supported.
    392 
    393 === [http://citeseer.ist.psu.edu/erwig96active.html Erwig: active patterns] ===
    394 
    395 Erwig's proposal for active patterns renders the Set example like this:
    396 {{{
    397 data Set a = Empty | Add a (Set a)
    398 
    399 pat Add' x _ =
    400   Add y s => if x==y then Add y s
    401              else let Add' x t = s
    402                   in Add x (Add y t)
    403 
    404 delete x (Add' x s) = s
    405 delete x x          = s
    406 }}}
    407 This requires a new top-leven declaration form `pat`; and I don't think it's nearly
    408 as easy to understand as the current proposal.  Notably, in the first equation for
    409 `delete` it's ahrd to see that the second `x` is a bound occurrence; this somehow
    410 follows from the `pat` declaration.
    411 
    412 Still the proposal does support the value input feature.
    413 
    414 === [http://portal.acm.org/citation.cfm?id=232641&coll=portal&dl=ACM Palao et al: active destructors (ICFP'96)] ===
    415 
    416 Active Desctructors (ADs) are defined by a new form of top-level declaration.  Our
    417 singleton example would look like this:
    418 {{{
    419   Sing x match [x]
    420 }}}
    421 Here '''match''' is the keyword, and `Sing` is the AD.  ADs are quite like view patterns:
    422 the can do computation, and can fail to match.  But they are definitely not normal
    423 Haskell functions, and need their own form of top-level declaration.  They even have
    424 a special form of type to describe them.
    425 
    426 The value-input feature is supported, but only via a sort of mode declaration (indicated by a down-arrow) on the
    427 new form of type.
    428 
    429 They also introduce a combining form for ADs, to make a kind of and-pattern.  For
    430 example, suppose we had
    431 {{{
    432   Head x match (x:_)
    433   Tail x match (_:xs)
    434 
    435   f :: [a] -> [a]
    436   f ((Head x)@(Tail ys)) = x:x:ys
    437 }}}
    438 Here `(Head x)@(Tail ys)` is a pattern that matches ''both'' `(Head x)` and `(Tail ys)`
    439 against the argument, binding `x` and `ys` respectively.  We can model that with view patterns,
    440 only a bit more clumsily:
    441 {{{
    442   headV (x:xs) = Just x
    443   headV []     = Nothing
    444 
    445   tailV (x:xs) = Just xs
    446   tailV []     = Nothing
    447 
    448   (@) :: (a -> Maybe b) -> (a -> Maybe c) -> a -> Maybe (b,c)
    449   (f @ g) x = do { b <- f x; c <- g x; return (b,c) }
    450 
    451   f :: [a] -> [a]
    452   f (headV @ tailV -> (x,ys)) = x:x:ys
    453 }}}
    454 The clumsiness is that the "`@`" combines functions, with a kind of positional
    455 binding; the pattern `(x,ys)` is separated from the combiner so that it's less clear
    456 that `headV` binds `x` and `tailV` binds `y`.
    457 
    458 In exchange, although view patterns are a bit less convenient here, they
    459 are a much, much smaller language change than ADs.
    460 
    461 === [http://citeseer.ist.psu.edu/erwig00pattern.html Erwig/Peyton Jones: transformational patterns] ===
    462 
    463 This paper describes pattern guards, but it also introduces '''transformational patterns'''.  (Although
    464 it is joint-authored, the transformational-pattern idea is Martin's.)  Transformational patterns
    465 are very close to what we propose here.  In particular, view functions are ordinary Haskell functions,
    466 so that the only changes are to patterns themselves.
    467 
    468 There are two main differences (apart from syntax).
    469 First, transformational patterns didn't have the value input feature, althought it'd be easy
    470 to add (indeed that's what we've done). Second, transformational patterns as described by
    471 Erwig do no stripping of the `Maybe` (see "Possible extension 2" above).
    472 
    473 === [http://lambda-the-ultimate.org/node/1960 Emir, Odersky, Williams: Matching objects with patterns] ===
    474 
    475 Scala is an OO language with lots of functional features.  It has algebraic data types and
    476 pattern matching.  It also has a form of view called '''extractors''', which are
    477 pretty similar to view patterns, albeit in OO clothing.  Notably, by packaging a constructor
    478 and an extractor in a class, they can use the same class name in both expressions and terms,
    479 implicitly meaning "use the constructor in expressions, and use the extractor in patterns".
    480 
    481 The paper does a comparative evaluation of various OO paradigms for matching, and
    482 concludes that case expressions and extractors work pretty well.
    483 
    484 === Pattern synonyms  ===
    485 
    486 [http://hackage.haskell.org/trac/haskell-prime/wiki/PatternSynonyms Pattern synonyms]
    487 are a requested Haskell Prime feature. John Reppy had the same idea years ago for Standard ML; see
    488 [http://people.cs.uchicago.edu/~jhr/papers/1992/tr-sml-const.pdf Abstract value constructors],
    489 Reppy & Aiken, TR 92-1290, Cornell, June 1992.
    490 
    491 The one way in which pattern synonyms are better than view patterns is that
    492 they define by-construction bi-directional maps.  Example
    493 {{{
    494   data Term = Var String | Term String [Term]
    495  
    496   -- 'const' introduces a pattern synonym
    497   const Plus a b = Term "+" [a,b]
    498 
    499   f :: Term -> Term
    500   f (Plus a b) = Plus (f a) (f b)
    501   f ... = ...
    502 }}}
    503 With pattern views, we'd have to write two functions for the "plus" view:
    504 {{{
    505   plus :: Term -> Term -> Term
    506   plus a b = Term "+" [a,b]
    507 
    508   isPlus :: Term -> Maybe2 Term Term
    509   isPlus (Term "+" [a,b]) = Just2 a b
    510   isPlus other            = Nothing
    511 
    512   f :: Term -> Term
    513   f (isPlus -> a b) = plus (f a) (f b)
    514 }}}
    515 But perhaps that is not so bad.  Pattern synonyms also require a new form of top level declaration;
    516 and are much more limited than view patterns (by design they cannot do computation).
    517 
    518 === First class abstractions ===
    519 
    520 Several proposals suggest first class ''abstractions'' rather that first-class ''patterns''.
    521 Here are the ones I know of
    522 
    523  * [http://hackage.haskell.org/trac/haskell-prime/ticket/114 Claus Reinke's lambda-match proposal]
    524  * [http://citeseer.ist.psu.edu/tullsen00first.html Tullsen: first class patterns]
    525  * [http://ttic.uchicago.edu/~blume/pub-cat.html Matthias Blume: Extensible programming with first-class cases] (ICFP06)
    526 
    527 All these proposals are more or less orthogonal to this one. For example, Reinke
    528 proposes a compositional syntax for lambda abstractions
    529 `(\p -> e)` where pattern matching failure on `p` can be caught and composed
    530 with a second abstraction. Thus
    531 {{{
    532    (| Just x -> x+1 ) +++ (| Nothing -> 0 )
    533 }}}
    534 combines two abstractions, with failure from the first falling through to the seoond. 
    535 
    536 Blume and Tullsen have a similar flavour.  None of these proposals say
    537 anything about the patterns themselves, which in turn is all this
    538 proposal deals with.  Hence orthgonal.
     5The Haskell Prime wiki is only committee-writable, for good reasons, but I'd like to open this proposal to improvement by anyone, since it is not (yet anyway) a serious contender for Haskell Prime.