Opened 10 years ago

Closed 7 years ago

Last modified 3 weeks ago

#667 closed feature request (wontfix)

Efficient Map <-> Set conversions

Reported by: jpbernardy Owned by: jpbernardy
Priority: normal Milestone: Not GHC
Component: libraries/base Version: 6.4.1
Keywords: Data Map Set collections Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Test Case: N/A
Blocked By: Blocking:
Related Tickets: Differential Revisions:

Description (last modified by jpbernardy)

We should be able to convert a Set into a Map without re-computation of the tree Shape. (initially proposed by John Meacham)

Change History (12)

comment:1 Changed 10 years ago by jpbernardy

  • Keywords collections added

comment:2 Changed 10 years ago by jpbernardy

  • Description modified (diff)

comment:3 Changed 10 years ago by jpbernardy

This requires to move the tree structures to another module, so Set and Map can both see the structure.

Providing, toTree and "unsafeFromTree" would allow the user to implement his own algorithms.

comment:4 Changed 10 years ago by jpbernardy

This requires moving the raw tree types to their own module so Map can see Set structure and vice-versa. Note that this will be given for free if we move to A. Hey's AVL trees.

comment:5 Changed 9 years ago by jpbernardy

This ticket is resolved for the new AVL-based types.

comment:6 Changed 9 years ago by jpbernardy

  • Owner set to jpbernardy
  • Status changed from new to assigned

comment:7 Changed 9 years ago by igloo

  • Milestone set to 6.8
  • Test Case set to N/A

comment:8 Changed 8 years ago by simonmar

  • Milestone changed from 6.8 branch to Not GHC

comment:9 Changed 7 years ago by igloo

  • Resolution set to wontfix
  • Status changed from assigned to closed

The conversion functions should be proposed with the library submissions process: http://www.haskell.org/haskellwiki/Library_submissions

comment:10 Changed 7 years ago by simonmar

  • Architecture changed from Unknown to Unknown/Multiple

comment:11 Changed 7 years ago by simonmar

  • Operating System changed from Unknown to Unknown/Multiple

comment:12 Changed 3 weeks ago by Thomas Miedema <thomasmiedema@…>

In bcfae08c/ghc:

Pretty: fix potential bad formatting of error message (#10735)

This is a backport of a bug fix by Benedikt Huber for the same problem
in the pretty library (#1337), from commit
8d8866a8379c2fe8108ef034893c59e06d5e752f. The original explanation for
the fix is attached below.

Ticket #1776 originally reported an infinite loop when printing error
message. This promptly got fixed in:

  commit 2d52ee06786e5caf0c2d65a4b4bb7c45c6493190
  Author: [email protected] <unknown>
  Date:   Thu Mar 1 11:45:13 2007 +0000

      Do not go into an infinite loop when pretty-printer finds a
      negative indent (Trac #1176)

SPJ reports in the ticket: "So infinite loop is fixed, but the bad
formatting remains. I've added a test, tcfail177."

tcfail177 however hasn't triggered the formatting problem for years (as
Ian reported in c9e0e6067a47c574d9ff3721afe58e30ca1be3e4).

This patch updates the test to a version that at least still failed with
ghc-7.0 (from #1776#comment:7).

-------------------

From https://mail.haskell.org/pipermail/libraries/2008-June/010013.html,
by Benedikt Huber:

    Concerning ticket #1337, we have to change the formal specification of
    fill (it doesn't match the implementation):

    -- Current Specification:
    --   fill []  = empty
    --   fill [p] = p
    --   fill (p1:p2:ps) = oneLiner p1 <#> nest (length p1)
    --                                          (fill (oneLiner p2 : ps))
    --                     `union`
    --                      p1 $$ fill ps

    Problem 1: We want to `unnest' the second argument of (p1 $$ fill ps),
    but not the first one

    In the definition above we have e.g.

    > getSecondLayout $
    >   fillDef False [text "a", text "b", text "a"]

    >> text "ab"; nilabove; nest -1; text "a"; empty
    >> |ab|
    >> |.a|

    Problem 2: The overlapping $$ should only be used for those layouts of
    p1 which aren't one liners (otherwise violating the invariant "Left
    union arg has shorter first line").

    I suggest the following specification (i believe it almost matches the
    current implementation, modulo [fillNB: fix bug #1337] (see below):

    -- Revised Specification:
    --   fill g docs = fill' 0 docs
    --   gap g       = if g then 1 else 0
    --   fill' n []  = []
    --   fill' n [p] = [p]
    --   fill' n (p1:p2:ps) =
    --      oneLiner p1 <g> (fill' (n+length p1+gap g) (oneLiner p2 : ps))
    --        `union`
    --     (p1 $*$ nest (-n) (fill' g ps))
    --
    -- $*$ is defined for layouts (One-Layout Documents) as
    --
    -- layout1 $*$ layout2 | isOneLiner layout1 = layout1 $+$ layout2
    --                     | otherwise          = layout1 $$ layout2

    I've also implemented the specification in HughesPJQuickCheck.hs,
    and checked them against the patched pretty printer.

    Concerning Bug #1337:
    ~~~~~~~~~~~~~~~~~~~~~

    If the above formal specification is fine, it is easy to fix:
    elide the nests of (oneLiner p2) [see attached patch, record bug #1337].

    > PrettyPrint(0) $ ./Bug1337
    > ....ab
    > ...c

    The (long) explanation follows below.

    <snip/>

    ===========================================================
    Explanation of Bug #1337:

    Consider

    > fcat [ nest 1 $ text "a", nest 2 $ text "b", text "c"]

    --> expected: (nest 1; text "a"; text "b"; nest -3; "c")
    --> actual  : (nest 1; text "a"; text "b"; nest -5; "c")

    Reduction:
    === (nest 1; text a) <> (fill (-2) (p2:ps))
    ==>                     (nest 2 (text "b") $+$ text "c")
    ==>                     (nest 2 (text "b")) `nilabove`
                            (nest (-3) (text "c"))
    ==> (nest 1; text a; text b; nest -5 c)

    The problem is that if we decide to layout (p1:p2:ps) as

    | p1 p2
    | ps

    (call it layout A), then we want to have

    > (p1 <> p2) $+$ ps.

    But following law <n6> this means that

    > fcat_A [p1:nest k p2:ps]

    is equivalent to

    > fcat_A [p1,p2,ps]

    so the nest of p2 has to be removed.

    This is somewhat similar to bug #667, but easier to fix
    from a semantic point of view:
    p1,p2 and ps are distinct layouts - we only have to preserve the
    individual layouts, and no combinations of them.
Note: See TracTickets for help on using tickets.