Note on documentation: The programming API has changed, and some details of the installation procedure may have changed as well. The current documentation is still useful for getting a sense of what Marinda is, but until it is updated it is not a reliable guide to actually using Marinda.

Overview

This document covers the more advanced features and techniques available with Marinda. You can often improve the efficiency or sophistication of your programs by taking advantage of these features, but you can usually get by with just the material discussed in the more introductory Client Programming Guide.

Material is grouped by topic wherever possible, but feel free to jump around and read only the topics that interest you. As with the Client Programming Guide, you may want to work through the examples yourself with your own Marinda setup.

Obtaining node information

You can call methods on the Marinda connection object to obtain information about the node on which your program is running. This information is useful for customizing the execution of your program. For instance, you can bootstrap the execution of your program by using the local node name to retrieve the appropriate configuration values from tuples stored in the global tuple space.

The node_id method returns the numeric ID of the local node, and the node_name method returns the user-configured name; for example,

>> $ts.node_id
=> 2
>> $ts.node_name
=> "nibbler"
Note
You can set the node name with the node_name configuration line in local-config.yaml.

Here’s an example of using the node name to dynamically configure a client (you could just as easily use the node ID rather than the node name for this purpose):

$gc = $ts.global_commons
config = $gc.read ["CONFIG", $ts.node_name(), nil, nil]
method, rate_limit = config.values_at 2, 3

You would populate the global commons ahead of time with configuration tuples like ["CONFIG", "nibbler", "icmp-echo", 10].
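
For example, a hypothetical setup snippet (run once, ahead of time) might populate the global commons like this; the second node name and its rate limit are purely illustrative:

$gc = $ts.global_commons
$gc.write ["CONFIG", "nibbler", "icmp-echo", 10]
$gc.write ["CONFIG", "zoidberg", "icmp-echo", 25]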

A more advanced piece of node information is the run ID, obtained with the run_id method. This is an unsigned 48-bit integer randomly generated by the local tuple-space server at startup (this value is incorporated into the unique IDs generated by gen_id). The run ID is useful for detecting a restart of the local server between invocations of your program. For this scheme to work, you need to maintain the last known (node ID, run ID) pair in the global tuple space or on local disk. If you detect a restart, you might clear out stale system state.
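
For example, a minimal sketch of this scheme might look like the following; the "LAST-RUN" tuple layout is an illustrative assumption, and takep is assumed to return nil when no matching tuple exists:

$gc = $ts.global_commons
last = $gc.takep ["LAST-RUN", $ts.node_id, nil]
if last && last[2] != $ts.run_id
  # the local server restarted since our last invocation; clear out stale state here
end
$gc.write ["LAST-RUN", $ts.node_id, $ts.run_id]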

Streaming operations: monitor_stream, consume_stream

Because of the potential for high latency, special care is needed to communicate efficiently over long distances on the global Internet, as opposed to communicating on a local-area network. In the implementation of most tuple space operations, there must be a low-level message exchange for each matching tuple. Therefore, communication latency is often the primary factor limiting execution rate. For example, if the round-trip time (RTT) between a local server and the global server is 100ms, then the maximum achievable message exchange rate is 10 messages/sec, and thus a client can only execute 10 operations/sec in the global tuple space (however, this 100ms latency will not impact the execution rate of operations in the local tuple space, which is one of the motivations for a two-level hierarchy of tuple spaces). The write operation is an exception, since it does not require a response. Clients can execute write at a high rate, subject only to the execution speed of the endpoints and the bandwidth (but not the latency) of the path between the endpoints.

Marinda provides the streaming operations monitor_stream and consume_stream for increased throughput in high-latency deployments. These operations behave mostly like their non-streaming counterparts monitor and consume, but there are some subtle differences because streaming operations execute asynchronously. In contrast, monitor and consume execute synchronously; that is, these operations do not retrieve the next matching tuple until the client is ready for the next tuple. In terms of implementation, a client signals it is ready for the next tuple by executing the next iteration of the loop implementing the monitor or consume operation (for example, the loop $ts.monitor(...) do ... end). With the streaming operations, Marinda does not wait until the client is ready—all matching tuples (both existing and future tuples) are streamed asynchronously from the global server to a local server, where they are then buffered until the client is ready for them. As far as the client is concerned, it is still iterating over the tuples one by one in an apparently synchronous fashion, but the delay to retrieve each tuple is dramatically reduced—from a 100-millisecond RTT to a 100-microsecond interprocess-communication delay.

For the client, using these streaming operations is no different than using monitor or consume. For example, here is code to print out all matching tuples:

$ts.monitor_stream([]) do |tuple|
  p tuple
end

This might take 1 second to complete for 1000 matching tuples, whereas monitor might take 100 seconds to complete with an RTT of 100ms.

Why not use streaming operations all the time, if they increase throughput? Streaming operations are not the best, or even the correct, choice in every situation. In particular, streaming operations transfer all matching tuples asynchronously, and this may not be what you want. It is inefficient if you only want the first few matching tuples, or if you want to iterate over the tuples matching a loose template in order to find the first tuple that meets a more stringent condition on the values. In either case, a potentially large number of unneeded tuples may be transferred. This not only wastes bandwidth but can also increase the memory usage of the Marinda servers, since these tuples must be buffered in memory by the local server until the client gets to them (and they must be buffered specially by the global server in preparation for transmitting them).

Besides potential inefficiency, a streaming operation may not have the right semantics for a given situation. For example, a single consume_stream will immediately remove all matching tuples from the entire tuple space at once (as contrasted with consume, which only removes one tuple per iteration). If all the matching tuples were intended for that one client, then this is the desired result. However, there are useful coordination patterns in which tuples should be distributed fairly amongst a set of threads/processes that are all performing a consume or take on the same template; for example, the Bag-of-Tasks and the Master-Slave patterns are common ways of exploiting embarrassingly parallel tasks. Because consume_stream may lead to an unfair distribution of tuples, it is unsuitable for implementing these patterns.
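
For example, a minimal bag-of-tasks worker might look like the following sketch (the "TASK"/"DONE" tuple layout is an illustrative assumption). Running several copies of this worker distributes tasks fairly, because each copy removes only one task tuple per iteration:

$ts.consume(["TASK", nil]) do |tuple|
  task_id = tuple[1]
  # ... perform the task ...
  $ts.write ["DONE", task_id]
end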

Using wildcards in tuples (not just in templates)

Ordinarily, a nil value is used as a wildcard in templates. However, a nil value can also be used in tuples, to achieve a similar wildcard effect but with the roles of template and tuple reversed. So far, we have said that a fully-specified template (that is, a template free of wildcards) like [1, 2, 3] only matches tuples with exactly the same values. In truth, such a template also matches [1, 2, nil], [nil, 2, nil], and [nil, nil, nil], among others. If a tuple has a nil value in a given position, then the tuple will match any value in the corresponding position of the template (including, of course, the template wildcard nil).

Wildcards in tuples are less generally useful than wildcards in templates, but there are specific scenarios where they are valuable, as the following example shows. Suppose there are three servers located in the USA, Europe, and Asia (by server, we mean some user application providing a service that uses Marinda for coordination, not a Marinda local/global server), and suppose each server accepts requests over the tuple space with a loop like the following (in the case of the European server):

$ts.consume(["Europe", nil]) do |tuple|
  ...
end

As usual, if a client wants to issue a request to a server at a particular location, then it can include the location in its request tuple; for example, ["Europe", 123] would be picked up by the European server, and ["USA", 123] by the USA server. But what if the client doesn’t care about the location, and it just wants the next available server to handle the request? The client can’t know, in general, which server is available, so it can’t pick the right location value to use. The solution is to use nil for the location, since a tuple like [nil, 123] will match any of the templates used by the different servers. Such requests will be naturally load-balanced across available servers without any special effort by clients or servers. If multiple servers are available, then Marinda ensures that tuples are distributed on a first-come-first-served basis (that is, available servers receive tuples in the order in which they executed the take, consume, etc. operation to retrieve them), so no server is allowed to starve.
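
For example, a client that doesn’t care which server handles its request might write the following (using the request layout of the servers above):

$ts.write [nil, 123]        # any of the location-specific servers may pick this up
$ts.write ["Europe", 456]   # only the European server will pick this up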

Tip
If clients never need to issue requests to specific servers, then a simpler approach is to just leave out the location (or other identifying) component from request tuples when designing the coordination protocol between clients and servers. However, tuple wildcards are useful when you need to do both—sometimes issue requests to specific servers and sometimes to any available server.
Note
Because nil values are always allowed in tuples, you need to be vigilant about bugs. A common symptom of a bug is an unintended nil value, which can easily be transferred over the tuple space to another client that is unprepared to handle it. Sanity-checking the values retrieved from the tuple space can help mitigate such problems.

Writing event-based Marinda clients

The Client Programming Guide covers the synchronous blocking interface to Marinda provided by the Marinda::Client class. This straightforward interface is suitable for use by both single-threaded and multi-threaded programs (however, none of the Marinda::Client methods are threadsafe, so multi-threaded clients must take appropriate measures themselves).
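
One simple measure, sketched below, is to give each thread its own connection so that no Marinda::Client object is shared between threads; this sketch assumes duplicate may be called several times before the threads are started, and uses a hypothetical "TASK" tuple layout:

conns = Array.new(4) { $ts.duplicate }   # one connection per worker thread

threads = conns.map do |conn|
  Thread.new do
    conn.consume(["TASK", nil]) do |tuple|
      # ... each thread operates exclusively on its own connection ...
    end
  end
end
threads.each(&:join)   # workers typically run forever; join keeps the main thread alive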

Marinda::Client also provides an asynchronous event-based interface, which can often be a simpler and more scalable alternative to multi-threaded programming. To use this interface, you need an event loop to dispatch events generated by event sources like Marinda::Client. As a convenience, a compatible event loop is provided by the Marinda::ClientEventLoop class. An event-based client program will have the following general structure:

  1. Create an instance of the event loop.

  2. Open one or more connections to Marinda.

  3. Add Marinda connections to the event loop.

  4. Invoke asynchronous Marinda operations.

  5. Start the event loop.

Every Marinda operation has an asynchronous version, provided by a Marinda::Client method with an _async suffix. For example, Marinda::Client#read is the familiar blocking read operation, whereas Marinda::Client#read_async is the asynchronous version, used with an event loop. The following example shows a simple event-based echo server that reacts to two different kinds of tuples:

echo.rb
$ts = Marinda::Client.new(UNIXSocket.open("/tmp/localts.sock"))
$ts.hello
$ts2 = $ts.duplicate   # second connection: only one operation may be active per connection

eloop = Marinda::ClientEventLoop.new
eloop.add_source $ts
eloop.add_source $ts2

# Echo the payload of every ["ECHO", value] tuple; the block is a callback
# invoked once per result tuple.
$ts.consume_async(["ECHO", nil]) do |tuple|
  p tuple[1]
end

# Stop the event loop when a ["QUIT"] tuple arrives.
$ts2.take_async(["QUIT"]) do |tuple|
  puts "Exiting."
  eloop.suspend()
end

eloop.start()

You can try out this example with tut.rb (a convenience script described in the Client Programming Guide):

>> load 'tut.rb'
>> $ts.write ["ECHO", 1234]
>> $ts.write ["ECHO", "hey"]
>> $ts.write ["ECHO", [1, "foo", [52.32, "bar"]]]
>> $ts.write ["QUIT"]

which produces the following output from the example server:

$ ./echo.rb
1234
"hey"
[1, "foo", [52.32, "bar"]]
Exiting.
$

Let’s examine echo.rb more carefully. We start by creating an event loop and adding the Marinda client connections as event sources. Then we invoke the asynchronous Marinda::Client#consume_async method, which starts Marinda executing the consume operation. However, unlike the synchronous Marinda::Client#consume method, consume_async does not block waiting for results to arrive, but instead immediately returns control to the client. The Ruby block (that is, the code between do ... end) passed to consume_async is registered as a callback that will be invoked asynchronously for each result tuple. We next invoke take_async to handle quit requests. Note that we use a separate Marinda connection for take_async because we cannot execute more than one operation (synchronous or asynchronous) per connection. Note also that take_async takes a block that will be invoked with the result tuple, unlike take, which returns the result directly. In fact, all asynchronous methods require a block, even if their synchronous counterparts don’t.

Finally, we start the event loop to begin processing the results of asynchronous operations. The eloop.start() call blocks indefinitely until either the client calls eloop.suspend() (as done in the block of take_async), or all asynchronous operations registered with eloop complete. After eloop.start() returns, you can restart the event loop by calling eloop.start() again, but if all asynchronous operations have completed, then you need to invoke at least one new asynchronous operation before restarting the event loop, or the event loop will find nothing to do and will return immediately. In the above example, $ts.consume_async runs forever, so we can simply restart the event loop, but if we wish, we can invoke a new asynchronous operation on $ts2 before restarting the event loop, since take_async has completed on $ts2, freeing up $ts2 for reuse. We can manually cancel an asynchronous operation at any time by invoking cancel on the connection; for example, we could cancel $ts.consume_async by calling $ts.cancel.
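
For example, a hypothetical continuation of echo.rb might restart the event loop like this:

# eloop.start() has returned because the QUIT handler called eloop.suspend().
# $ts.consume_async is still active; register a fresh operation on $ts2,
# which is free again because its take_async completed.
$ts2.take_async(["QUIT"]) do |tuple|
  puts "Exiting again."
  eloop.suspend()
end

eloop.start()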

When an asynchronous operation completes or is cancelled, you don’t have to wait for the event loop to be suspended before invoking another operation on that connection. However, there is a limitation on how you can invoke a new operation—the following does not work:

$ts2.take_async(["ABC"]) do |tuple|
  puts "got ABC"
  $ts2.take_async(["DEF"]) do |tuple|
    puts "got DEF"
  end
end

Since, by definition, take_async receives only one result tuple, the operation has conceptually completed by the time the associated block executes. You might therefore expect to be able to invoke another asynchronous operation on $ts2 from within the callback, but Marinda does not allow this (you can, of course, invoke another operation on $ts2 from outside the callback, once the callback has finished executing). One workaround is to execute asynchronous operations on two alternating connections. For example, you can switch back and forth between two connections stored in an array, as in the following code (we use the reverse! method to interchange the connections before using the first connection in the array):

conns = [$ts2, $ts3]
conns.first.take_async(["ABC"]) do |tuple|
  puts "got ABC"
  conns.reverse!
  conns.first.take_async(["DEF"]) do |tuple|
    puts "got DEF"
  end
end
Creating custom event sources

You need only implement the following five methods to create a custom event source that works with Marinda::ClientEventLoop:

  • def io(): should return the IO object to perform select on,

  • def want_read(): should return true if you want to wait for the IO object to become readable,

  • def want_write(): should return true if you want to wait for the IO object to become writable,

  • def read_data(): called by the event loop when data is ready to be read, and

  • def write_data(): called by the event loop when data can be written.

None of these methods takes any arguments, and read_data and write_data do not return any value. Be careful implementing read_data and write_data, since these methods may be called when an IO object is not actually readable or writable (this cannot be avoided, due to limitations in the select call used to detect readability/writability). Also be sure to rescue Errno::EINTR, EOFError, and other relevant exceptions when implementing read_data and write_data. To prevent deadlock, be sure to put the IO object into nonblocking mode.
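
For concreteness, here is a minimal sketch of a custom event source; the class name, the idea of watching a TCP socket, and the buffering scheme are illustrative assumptions, not part of the Marinda API:

class SocketSource
  def initialize(socket)
    @sock = socket     # an IO object, already placed in nonblocking mode
    @outgoing = ""     # bytes queued for writing
  end

  def io
    @sock              # the IO object the event loop will select on
  end

  def want_read
    true               # always interested in incoming data
  end

  def want_write
    !@outgoing.empty?  # only wait for writability when data is queued
  end

  def read_data
    data = @sock.read_nonblock(4096)
    # ... process the incoming data ...
  rescue Errno::EINTR, Errno::EAGAIN
    # select can report readability spuriously; simply try again later
  rescue EOFError
    # the peer closed the connection
  end

  def write_data
    written = @sock.write_nonblock(@outgoing)
    @outgoing.slice!(0, written)
  rescue Errno::EINTR, Errno::EAGAIN
    # not actually writable; try again on the next loop iteration
  end
end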

Using private tuple space regions for one-to-one communication

The local and global tuple spaces are each divided into disjoint regions. Once a client opens a connection to a region, it can communicate freely with any other client connected to the same region. These public regions facilitate many-to-many communication. Marinda provides a second, more restrictive form of region, the private region, which is designed to facilitate one-to-one communication. A private region is like a private mailbox; only the owner can read or remove tuples from a private region, but one or more other clients can write tuples into it.

A public region stands alone independently of clients, and provides persistent storage even with clients constantly joining and leaving. In contrast, the existence of a private region is directly tied to a single client connection. Whenever a client connects to Marinda, a new unique private region is created and associated with the connection itself. When the client closes the connection, the associated private region is discarded and becomes inaccessible. Thus, a connection provides access to a unique pair of public and private regions.

In Marinda, a client does not explicitly identify the public or the private region it wishes to act on with a given operation. Instead, the connection itself, on which an operation is invoked, serves as an implicit identifier or handle for the target region. This is possible because each connection is permanently associated with a single public region and a single private region. Similarly, each operation implicitly acts on either the public or the private region of a connection. All operations described so far act (only) on the public region, and the following two additional operations act (only) on the private region: take_priv and takep_priv (and their asynchronous versions take_priv_async and takep_priv_async). These operations work exactly like take and takep, and because they retrieve tuples via template matching (like ordinary operations), they allow you to retrieve tuples from a private region in a different order than they were put there. This flexibility allows you to implement something like the share-nothing message-passing concurrency model of Erlang.
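
For example, selective retrieval from a private region might look like the following sketch; the tuple layouts are illustrative assumptions. Suppose a peer has written a ["LOG", ...] tuple and then a ["RESULT", ...] tuple into our private region; template matching lets us pull the result first:

result = $ts.take_priv ["RESULT", nil]
log    = $ts.take_priv ["LOG", nil]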

Marinda uses a capability-like security model for private regions. Only the client holding a connection can remove tuples from the private region of the connection, and only clients that have a reference to a private region can write to it. A reference is an internal object that cannot be constructed by a client—a reference can only be obtained from Marinda, and only the client holding a connection (that is, the owner) can pass out references to others. Every time a client writes a tuple into a public region, the client implicitly and automatically passes out a reference to the private region of the connection used. This reference is attached as metadata to the tuple, and functions like the sender address on an envelope. When other clients retrieve the tuple, they implicitly obtain the reference, which they can then use to reply directly to the private region of the original sender. Because references can only be passed in the public region associated with a connection, only the clients that have access to a given public region can access the private regions of other clients connected to that public region, and consequently, only the clients connected to the same public region can communicate directly with each other via private regions.

A client must use special operations to write into another client’s private region. The most basic operation is reply, which simply writes a tuple into the private region referenced by the last tuple retrieved (by any retrieval operation) on a given connection. This is tailored to the simple case of a server replying directly to a requester. To illustrate, we provide the code for a simple ping measurement server and its client. The client issues a request by writing a tuple into a public region, and the server writes the response directly into the client’s private region with reply:

ping client (1st version)
$ts.write ["PING", "192.168.0.5"]
result = $ts.take_priv ["RESULT", nil, nil]
printf "RTT = %f\n", result[2]
ping server (1st version)
loop do
  request = $ts.take ["PING", nil]
  addr = request[1]
  rtt = 123.456
  $ts.reply ["RESULT", addr, rtt]
end

We can conduct the entire request-response exchange over private regions by making use of a rendezvous tuple, a tuple written by the server that the client can retrieve to obtain a reference to the private region of the server:

ping client (2nd version)
$ts.read ["PING-SERVER", "EUROPE"]  # get ref to private region
$ts.reply ["PING", "192.168.0.5"]
result = $ts.take_priv ["RESULT", nil, nil]
printf "RTT = %f\n", result[2]
ping server (2nd version)
$ts.take_all ["PING-SERVER", "EUROPE"] do end  # clear stale tuples
$ts.write ["PING-SERVER", "EUROPE"]  # register server

loop do
  request = $ts.take_priv ["PING", nil]
  addr = request[1]
  rtt = 123.456
  $ts.reply ["RESULT", addr, rtt]
end

Some care is needed in using rendezvous tuples, since tuples will reference defunct private regions as soon as the connection they were written on closes. The second example server guards against this problem by clearing away prior rendezvous tuples at startup and then writing a fresh tuple. If a client tries to write into a defunct private region, then Marinda silently discards the tuple (but in the future, Marinda might return an error result to better indicate this situation).

Though convenient, the reply operation is inflexible, since it can only be applied to the private region of the last retrieved tuple. Marinda provides more flexible operations that take a parameter, a peer descriptor, that indicates the target private region. A peer descriptor is a small nonnegative integer (analogous to a file descriptor) that represents a remembered reference to a private region. You can think of the peer descriptor as an index into a table of remembered references; or taking the envelope analogy further, a peer descriptor is an index into your personal address book, which is filled by remembering the sender address of retrieved tuples. A peer descriptor is an indirect reference that only has meaning on the connection it was created on, and by design, knowing the value of a peer descriptor will not allow a different client to access the represented private region.

Call remember_peer after retrieving a tuple to obtain a peer descriptor for the private region associated with the tuple. Once you have a peer descriptor, you can write into a specific private region with write_to. Here is a revised ping client that uses these new operations (no change is required on the server side, although you do need to start up an ASIA server instance as well as the previous EUROPE instance):

ping client (3rd version)
$ts.read ["PING-SERVER", "EUROPE"]  # get ref to private region
eu_peer = $ts.remember_peer()

$ts.read ["PING-SERVER", "ASIA"]
asia_peer = $ts.remember_peer()

$ts.write_to eu_peer, ["PING", "192.168.0.5"]
$ts.write_to asia_peer, ["PING", "192.168.0.5"]

result1 = $ts.take_priv ["RESULT", nil, nil]   # responses may arrive in either order
result2 = $ts.take_priv ["RESULT", nil, nil]

Only a finite (but large) number of peers can be remembered per connection, so you should free an unneeded peer descriptor p with a call to forget_peer(p). It is safe to call forget_peer on an already free peer descriptor.
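
Continuing the previous example, freeing a descriptor that is no longer needed might look like this:

$ts.forget_peer(asia_peer)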

Implementation limits

This section summarizes the implementation limits on Marinda clients and servers:

  • max template/tuple size = 65,535 bytes in serialized form; a template/tuple is serialized into a special text format, and this text format can be larger than the raw data values (in binary form) of a template/tuple, especially if there are long strings; strings, regardless of content, expand by a factor of about 4/3 in the text encoding

  • max string length in template/tuple = 24,575 bytes; the tuple space abstraction isn’t designed for bulk transport of data, but you can send large amounts of data in smaller chunks, if necessary (see the chunking sketch at the end of this section)

  • max values in template/tuple = 1024 values

  • max nesting depth of subarrays in template/tuple = 255 levels

  • max remembered references to private regions per connection = 2^31 - 1 references

  • max public/private regions = 2^48 regions per public/private scope; limited more by main memory, since all regions and their tuples must be in memory; the global server hosts all global public/private regions; each local server only hosts its own local public/private regions

  • max tuples per region = 2^31 - 1 (on 32-bit CPUs) and 2^63 - 1 (on 64-bit CPUs); limited more by main memory, since all tuples must be in memory; also, currently, an internal index will overflow after this many tuples have been written into any given region, which is far more likely to happen in practice than storing this many tuples; operations execute in O(n) time, except when using the template [], so performance may degrade significantly beyond a few thousand tuples due to unoptimized template matching

  • max blocked operations per region = no imposed limit; however, performance may degrade significantly after a few hundred operations (like take and monitor) are blocked waiting for new tuples to arrive, since each tuple written into a region has to be checked against every blocked operation in the worst case, which is O(n)

  • max Marinda connections per client = no imposed limit; limited only by the max number of file descriptors a process can open

  • max client connections per local server = no imposed limit; limited only by the max number of file descriptors a process can open

  • max nodes in Marinda deployment = 32,768 nodes
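
As mentioned under the string-length limit above, large payloads can be sent in chunks. The following sketch illustrates one way to do so; the "BLOB" tuple layout and chunk size are illustrative assumptions:

CHUNK = 16 * 1024   # comfortably below the 24,575-byte string limit

def send_blob(ts, name, data)
  # Split the data into chunks and tag each chunk with its index and the
  # total chunk count so the receiver can reassemble the payload.
  chunks = data.scan(/.{1,#{CHUNK}}/m)
  chunks.each_with_index do |chunk, i|
    ts.write ["BLOB", name, i, chunks.length, chunk]
  end
end

def receive_blob(ts, name)
  first = ts.take ["BLOB", name, 0, nil, nil]
  count = first[3]
  pieces = [first[4]]
  (1...count).each do |i|
    piece = ts.take ["BLOB", name, i, nil, nil]
    pieces << piece[4]
  end
  pieces.join
end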