Changes between Version 4 and Version 5 of ErlangInHaskell

Feb 10, 2011 11:39:45 AM (7 years ago)

Added walkthrough


Given a ProcessId (from 'forkProcess' or 'spawnRemote') and a chunk of serializable data (implementing Haskell's 'Data.Binary.Binary' type class), we can send a message to the given process. The message will be transmitted across the network if necessary and placed in the process's message queue. Note that 'send' will accept any type of data, as long as it implements Binary. Initially, all basic Haskell types implement Binary, including tuples and arrays, and it's easy to implement Binary for user-defined types. How then does the receiving process know the type of message to extract from its queue? A process can receive messages by distinguishing their type using the 'receiveWait' function, which corresponds to Erlang's receive clause. The process can provide a distinct handler for each type of message that it knows how to deal with; unmatched messages remain on the queue, where they may be retrieved by later invocations of 'receiveWait'.
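To make the type-based matching described above more concrete, here is a minimal sketch that models a message queue as a list of dynamically typed values. It does not use the framework at all: `Mailbox`, `Handler`, `match`, `tryHandlers` and `receiveFirst` are hypothetical names invented for this illustration, and `Data.Dynamic` stands in for whatever the framework really does with serialized messages.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Typeable (Typeable)

-- A mailbox is modelled as a plain list of dynamically typed messages.
-- (Assumption: a stand-in for the real per-process message queue.)
type Mailbox = [Dynamic]

-- A handler accepts a message only if it has the type the handler expects.
newtype Handler r = Handler (Dynamic -> Maybe r)

-- Build a handler from an ordinary typed function.
match :: Typeable a => (a -> r) -> Handler r
match f = Handler (fmap f . fromDynamic)

-- Try every handler against a single message, in order.
tryHandlers :: [Handler r] -> Dynamic -> Maybe r
tryHandlers hs msg = foldr pick Nothing hs
  where
    pick (Handler h) rest = case h msg of
      Just r  -> Just r
      Nothing -> rest

-- Scan the mailbox front to back and return the first handled result
-- together with the remaining mailbox. Unmatched messages stay queued,
-- in their original order.
receiveFirst :: [Handler r] -> Mailbox -> Maybe (r, Mailbox)
receiveFirst _  []         = Nothing
receiveFirst hs (m : rest) = case tryHandlers hs m of
  Just r  -> Just (r, rest)
  Nothing -> fmap (\(r, rest') -> (r, m : rest')) (receiveFirst hs rest)

-- The Double has no handler, so it is skipped and left in the mailbox.
demo :: Maybe (String, Int)
demo = do
  let box      = [toDyn (3.5 :: Double), toDyn (42 :: Int), toDyn "hello"]
      handlers = [ match (\n -> "int " ++ show (n :: Int))
                 , match (\s -> "string " ++ (s :: String)) ]
  (r, rest) <- receiveFirst handlers box
  return (r, length rest)
```

Unlike this pure sketch, the real 'receiveWait' blocks until a matching message arrives; the sketch only shows the selection rule.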
= Channels =
A ''channel'' provides an alternative to message transmission with ''send'' and ''receiveWait''. While ''send'' and ''receiveWait'' allow sending messages of any type, channels require messages to be of uniform type. Channels must be explicitly created with a call to ''makeChannel'':
 makeChannel :: (Binary a) => ProcessM (SendChannel a, ReceiveChannel a)
The resulting ''SendChannel'' can be used with the ''sendChannel'' function to insert messages into the channel, and the ''ReceiveChannel'' can be used with ''receiveChannel''. The SendChannel can be serialized and sent as part of messages to other processes, which can then write to it; the ReceiveChannel, though, cannot be serialized, although it can be read from multiple threads on the same node by variable capture.
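The split between a sending end and a receiving end can be sketched locally with an ordinary `Chan` from `Control.Concurrent.Chan`. This is only an in-process analogy, not the framework's implementation: `SendEnd`, `ReceiveEnd`, `makeLocalChannel`, `sendTo` and `receiveFrom` are hypothetical names, and nothing here is serialized or sent over a network.

```haskell
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)

-- Hypothetical stand-ins for the two channel ends. Both wrap the same
-- in-process Chan; the separate types only restrict what each end can do.
newtype SendEnd a    = SendEnd (Chan a)
newtype ReceiveEnd a = ReceiveEnd (Chan a)

-- Create a channel and hand back its two ends as a pair.
makeLocalChannel :: IO (SendEnd a, ReceiveEnd a)
makeLocalChannel = do
  c <- newChan
  return (SendEnd c, ReceiveEnd c)

-- Write one message of the channel's single message type.
sendTo :: SendEnd a -> a -> IO ()
sendTo (SendEnd c) = writeChan c

-- Block until a message is available, then return it.
receiveFrom :: ReceiveEnd a -> IO a
receiveFrom (ReceiveEnd c) = readChan c

-- Send three Ints and read them back in the same order.
channelDemo :: IO [Int]
channelDemo = do
  (tx, rx) <- makeLocalChannel
  mapM_ (sendTo tx) [1, 2, 3]
  sequence (replicate 3 (receiveFrom rx))
```

Because the element type is fixed when the channel is created, the reader never has to inspect the type of an incoming message, which is the main difference from the 'send' and 'receiveWait' style.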
= Setup and walkthrough =
Here I'll provide a basic example of how to get started with your first project on this framework.
Here's the overall strategy: We'll be running a program that will estimate pi, making use of available computing resources potentially on remote systems. There will be an arbitrary number of nodes, one of which will be designated the master, and the remaining nodes will be slaves. The slaves will estimate pi in such a way that their results can be combined by the master, and an approximation will be output. The more nodes, and the longer they run, the more precise the output.
In more detail: the master will assign each slave a region of the Halton sequence, and the slaves will use elements of the sequence to estimate the ratio of points in a unit square that fall within a unit circle. The master will then sum these ratios.
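The estimation strategy can be sketched in plain Haskell without any distribution. This is not the code from `Pi6.hs`; it is a minimal illustration under a few assumptions of mine: points are two-dimensional Halton points in bases 2 and 3, the test region is the quarter circle of radius 1 inside the unit square (so the ratio approximates pi/4), and the per-range results are combined as hit counts rather than as ratios so that the combination is exact. The names `halton`, `countInCircle` and `estimatePi` are hypothetical.

```haskell
-- Radical inverse of index i in the given base: the i-th element of the
-- one-dimensional Halton sequence, a value in [0, 1).
halton :: Int -> Int -> Double
halton base = go 0 (1 / fromIntegral base)
  where
    go acc _ 0 = acc
    go acc f i = go (acc + f * fromIntegral (i `mod` base))
                    (f / fromIntegral base)
                    (i `div` base)

-- Count how many 2-D Halton points with index in [lo, hi) fall inside
-- the quarter circle of radius 1. This is the work one slave would do
-- for its assigned range.
countInCircle :: Int -> Int -> Int
countInCircle lo hi = length [ () | i <- [lo .. hi - 1], inside i ]
  where
    inside i = let x = halton 2 i
                   y = halton 3 i
               in x * x + y * y <= 1

-- Combine the per-range counts into one estimate: 4 * hits / points.
-- This is the step the master would perform.
estimatePi :: [(Int, Int)] -> Double
estimatePi ranges = 4 * fromIntegral hits / fromIntegral total
  where
    hits  = sum [ countInCircle lo hi | (lo, hi) <- ranges ]
    total = sum [ hi - lo | (lo, hi) <- ranges ]
```

For example, `estimatePi [(0, 5000), (5000, 10000)]` splits ten thousand points into two ranges and gives the same value as `estimatePi [(0, 10000)]`, which is what lets the work be divided among any number of slaves.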
Here's the procedure, step by step.
1. Compile `Pi6.hs`. If you have the framework installed correctly, it should be sufficient to run:
 ghc --make Pi6
2. Select the machines you want to run the program on, and select one of them to be the master. All hosts must be connected on a local area network. For the purposes of this explanation, we'll assume that you will run your master node on a machine named `masterhost` and two slave nodes on each of two machines named `slavehost1` and `slavehost2`.
3. Copy the compiled executable `Pi6` to some location on each of the three hosts.
4. For each node, we need to create a configuration file. This is a plain text file, usually named `config` and usually placed in the same directory as the executable. There are many possible settings that can be set in the configuration file, but only a few are necessary for this example; the rest have sensible defaults. On `masterhost`, create a file named `config` with the following content:
cfgRole MASTER
cfgHostName masterhost
cfgKnownHosts masterhost slavehost1 slavehost2
On `slavehost1`, create a file named `config` with the following content:
cfgRole SLAVE
cfgHostName slavehost1
cfgKnownHosts masterhost slavehost1 slavehost2
On `slavehost2`, create a file named `config` with the following content:
cfgRole SLAVE
cfgHostName slavehost2
cfgKnownHosts masterhost slavehost1 slavehost2
A brief discussion of these settings and what they mean:
The `cfgRole` setting determines the node's initial behavior. This is a string which is used to differentiate the two kinds of nodes in this example. More complex distributed systems might have more kinds of roles. In this case, SLAVE nodes do nothing on startup, but just wait for a command from a master, whereas MASTER nodes seek out slave nodes and issue them commands.
The `cfgHostName` setting indicates to each node the name of the host it's running on. If blank or unspecified, this value will be determined automatically, but to play it safe, we specify it explicitly here.
The `cfgKnownHosts` setting provides a list of hosts that form part of this distributed execution. This is necessary so that the master node can find its subservient slave nodes. Depending on your network configuration, it may be possible for the master to discover other hosts automatically.
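To show how little structure these files have, here is a small sketch that reads the three settings used above. It is not the framework's own configuration reader, whose rules (comments, defaults, other settings) I am not reproducing here; `parseConfig`, `Role`, `roleOf`, `knownHosts` and `sampleConfig` are hypothetical names, and the sketch assumes each line is a setting name followed by whitespace-separated values.

```haskell
-- Parse "name value..." lines into (name, values) pairs.
-- Blank lines produce no entry.
parseConfig :: String -> [(String, [String])]
parseConfig text = [ (k, vs) | l <- lines text, (k : vs) <- [words l] ]

-- The two roles used in this walkthrough.
data Role = Master | Slave deriving (Eq, Show)

-- Interpret the cfgRole setting; anything else yields Nothing.
roleOf :: [(String, [String])] -> Maybe Role
roleOf cfg = case lookup "cfgRole" cfg of
  Just ["MASTER"] -> Just Master
  Just ["SLAVE"]  -> Just Slave
  _               -> Nothing

-- The hosts listed under cfgKnownHosts, or an empty list if absent.
knownHosts :: [(String, [String])] -> [String]
knownHosts cfg = maybe [] id (lookup "cfgKnownHosts" cfg)

-- The master's config file from the walkthrough, as a string.
sampleConfig :: String
sampleConfig = unlines
  [ "cfgRole MASTER"
  , "cfgHostName masterhost"
  , "cfgKnownHosts masterhost slavehost1 slavehost2"
  ]
```

With this, `roleOf (parseConfig sampleConfig)` yields `Just Master`, and a program could branch on that value to choose between the master and slave behavior described above.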
5. Now, run the `Pi6` program twice on each of the slave hosts. There should now be four slave nodes awaiting instructions.
6. To start the execution, run `Pi6` on the master node. You should see output like this:
 2011-02-10 11:14:38.373856 UTC 0 pid://masterhost:48079/6/    SAY Starting...
 2011-02-10 11:14:38.374345 UTC 0 pid://masterhost:48079/6/    SAY Telling slave nid://slavehost1:33716/ to look at range 0..1000000
 2011-02-10 11:14:38.376479 UTC 0 pid://masterhost:48079/6/    SAY Telling slave nid://slavehost1:45343/ to look at range 1000000..2000000
 2011-02-10 11:14:38.382236 UTC 0 pid://masterhost:48079/6/    SAY Telling slave nid://slavehost2:51739/ to look at range 2000000..3000000
 2011-02-10 11:14:38.384613 UTC 0 pid://masterhost:48079/6/    SAY Telling slave nid://slavehost2:44756/ to look at range 3000000..4000000
 2011-02-10 11:14:56.720435 UTC 0 pid://masterhost:48079/6/    SAY Done: 3141606141606141606141606141606141606141606141606141606141606141606141606141606141606141606141606141
Let's talk about what's going on here.
This output is generated by the framework's logging facility. Each line of output has the following fields, left-to-right: the date and time that the log entry was generated; the importance of the message (in this case 0); the process ID of the generating process; the subsystem or component that generated this message (in this case, SAY indicates that these messages were output by a call to the ''say'' function); and the body of the message.

From these messages, we can see that the master node discovered four nodes running on two remote hosts; for each of them, the master emits a "Telling slave..." message. Note that although we had to specify the host names where the nodes were running in the config file, the master found all nodes running on each of those hosts. The log output also tells us which range of indices of the Halton sequence was assigned to each node. Each slave, having performed its calculation, sends its results back to the master, and when the master has received responses from all slaves, it prints out its estimate of pi and ends. The slave nodes continue running, waiting for another request. At this point, we could run the master again, or we can terminate the slaves manually with Ctrl-C or the kill command.
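If you want to post-process this output, the field layout described above is regular enough to split with `words`. The following is a rough sketch based only on the sample lines shown here, not on any documented log format: `LogEntry` and `parseLogLine` are hypothetical names, and the sketch assumes the timestamp always consists of exactly three whitespace-separated tokens (date, time, zone).

```haskell
-- One parsed log line, with the fields in the order they appear.
data LogEntry = LogEntry
  { logTime      :: String  -- date, time and zone, kept as text
  , logLevel     :: Int     -- importance of the message
  , logProcess   :: String  -- process ID of the generating process
  , logSubsystem :: String  -- e.g. "SAY"
  , logBody      :: String  -- the rest of the line
  } deriving (Eq, Show)

-- Split a line on whitespace and pick the fields apart. Lines with too
-- few fields or a non-numeric importance yield Nothing. Runs of spaces
-- inside the body are collapsed to single spaces.
parseLogLine :: String -> Maybe LogEntry
parseLogLine line = case words line of
  (d : t : z : lvl : pid : sub : body)
    | [(n, "")] <- reads lvl ->
        Just (LogEntry (unwords [d, t, z]) n pid sub (unwords body))
  _ -> Nothing
```

Applied to the first line of the sample output, this gives an entry whose `logSubsystem` is `"SAY"` and whose `logBody` is `"Starting..."`.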