Compact API design improvements
I took a look at the compact API today and I realized that there are a lot of improvements we should make to it:
-
fmap getCompact . compact
is a pretty common thing to do, if you don't actually care about the compact pointer. We should make a helper function for this. - The
SerializedCompact
data structure needs a bunch of instances. Especially, a user who wants to serialize the compact region needs to save this metadata somewhere, but we don't offer any help for doing this. -
importCompact
will always print a message to stderr saying pointer fixup is happening. Need to be able to suppress this message. - The serialization API is really unsafe; we should make it more safe by default by including some sort of fingerprint, at least.
- There should be a convenience function for serializing to and from a file, and serializing to and from a handle. RDMA is always going to be delicate business (so keep the old API around) but for these cases we should pave the cowpaths.
Here is some sample code that might help:
{-# LANGUAGE ScopedTypeVariables #-}
import System.Environment (getArgs)
import qualified Data.Set as Set
import System.IO
import Data.Compact
import Data.Compact.Serialized
import Foreign.Ptr
import Foreign.Storable
import Foreign.Marshal.Alloc
import Control.Monad
main = do
[dict_file, out_file] <- getArgs
dict <- readFileLatin1 dict_file
c <- compact (Set.fromList (words dict))
withBinaryFile out_file WriteMode $ \h ->
withSerializedCompact c $ \sc -> do
-- Write out the metadata header
hPutStorable h (serializedCompactRoot sc)
forM_ (serializedCompactBlockList sc) $ \(ptr, l) -> do
hPutStorable h ptr
hPutStorable h l
hPutStorable h nullPtr
-- Write out the payload
forM_ (serializedCompactBlockList sc) $ \(ptr, l) ->
hPutBuf h ptr (fromIntegral l)
mb_r <- withBinaryFile out_file ReadMode $ \h -> do
-- Read out the metadata header
root <- hGetStorable h
let go h xs = do
ptr <- hGetStorable h
if ptr == nullPtr
then return (reverse xs)
else do l <- hGetStorable h
go h ((ptr, l):xs)
blocks <- go h []
let sc = SerializedCompact {
serializedCompactBlockList = blocks,
serializedCompactRoot = root
}
-- Read the payload into memory
importCompact sc $ \ptr l -> void $ hGetBuf h ptr (fromIntegral l)
print (fmap getCompact mb_r == Just (getCompact c))
hPutStorable :: forall a. Storable a => Handle -> a -> IO ()
hPutStorable h a =
alloca $ \ptr -> do
poke ptr a
hPutBuf h ptr (sizeOf (undefined :: a))
hGetStorable :: forall a. Storable a => Handle -> IO a
hGetStorable h =
alloca $ \ptr -> do
hGetBuf h ptr (sizeOf (undefined :: a))
peek ptr
readFileLatin1 f = do
h <- openFile f ReadMode
hSetEncoding h latin1
hGetContents h
I'm happy to do these but I want to make sure I'm not stepping on FB's toes, also I don't know if bgamari wants to take patches along these lines so late in the release cycle.
Trac metadata
Trac field | Value |
---|---|
Version | 8.0.1 |
Type | FeatureRequest |
TypeOfFailure | OtherFailure |
Priority | normal |
Resolution | Unresolved |
Component | libraries/compact |
Test case | |
Differential revisions | |
BlockedBy | |
Related | |
Blocking | |
CC | bgamari, simonmar |
Operating system | |
Architecture |