Overview
This document covers the more advanced features and techniques available with Marinda. You can often improve the efficiency or sophistication of your programs by taking advantage of these features, but you can usually get by with just the material discussed in the more introductory Client Programming Guide.
Material is grouped by topic wherever possible, but feel free to jump around and read only the topics that interest you. As with the Client Programming Guide, you may want to work through the examples yourself with your own Marinda setup.
Obtaining node information
You can call methods on the Marinda connection object to obtain information about the node on which your program is running. This information is useful for customizing the execution of your program. For instance, you can bootstrap the execution of your program by using the local node name to retrieve the appropriate configuration values from tuples stored in the global tuple space.
The node_id method returns the numeric ID of the local node, and the node_name method returns the user-configured name; for example,
>> $ts.node_id
=> 2
>> $ts.node_name
=> "nibbler"
Note: You can set the node name with the node_name configuration line in local-config.yaml.
Here’s an example of using the node name to dynamically configure a client (you could just as easily use the node ID rather than the node name for this purpose):
$gc = $ts.global_commons
config = $gc.read ["CONFIG", $ts.node_name(), nil, nil]
method, rate_limit = config.values_at 2, 3
You would populate the global commons ahead of time with configuration tuples like ["CONFIG", "nibbler", "icmp-echo", 10].
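For example, a one-time setup step could write such a tuple (a sketch; it assumes write is available on the global commons handle, just as read is used above):
$gc = $ts.global_commons
$gc.write ["CONFIG", "nibbler", "icmp-echo", 10]   # configuration for node "nibbler"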
A more advanced piece of node information is the run ID, obtained with the run_id method. This is an unsigned 48-bit integer randomly generated by the local tuple-space server at startup (this value is incorporated into the unique IDs generated by gen_id). The run ID is useful for detecting a restart of the local server between invocations of your program. For this scheme to work, you would need to maintain the last known values (node ID, run ID) in the global tuple space or on local disk. If you detect a restart, you might clear out stale system state.
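Here is a minimal sketch of such a restart check that persists the last known values to a local file (the file name, format, and recovery step are illustrative, not part of Marinda):
# Sketch: detect a tuple-space server restart between invocations.
# The state file path and format are our own choice, not part of Marinda.
STATE_FILE = "marinda_run_id.txt"

current = [$ts.node_id, $ts.run_id]
previous = File.exist?(STATE_FILE) ? File.read(STATE_FILE).split.map(&:to_i) : nil

if previous && previous != current
  # The local tuple-space server restarted (or the node ID changed) since
  # our last invocation; clear out any stale system state here.
end

File.open(STATE_FILE, "w") { |f| f.puts current.join(" ") }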
Streaming operations: monitor_stream, consume_stream
Because of the potential for high latency, special care is needed to communicate efficiently over long distances on the global Internet, as opposed to communicating on a local-area network. In the implementation of most tuple space operations, there must be a low-level message exchange for each matching tuple. Therefore, communication latency is often the primary factor limiting execution rate. For example, if the round-trip time (RTT) between a local server and the global server is 100ms, then the maximum achievable message exchange rate is 10 messages/sec, and thus a client can only execute 10 operations/sec in the global tuple space (however, this 100ms latency will not impact the execution rate of operations in the local tuple space, which is one of the motivations for a two-level hierarchy of tuple spaces). The write operation is an exception, since it does not require a response. Clients can execute write at a high rate, subject only to the execution speed of the endpoints and the bandwidth (but not the latency) of the path between the endpoints.
Marinda provides the streaming operations monitor_stream and consume_stream for increased throughput in high-latency deployments. These operations behave mostly like their non-streaming counterparts monitor and consume, but there are some subtle differences because streaming operations execute asynchronously. In contrast, monitor and consume execute synchronously; that is, these operations do not retrieve the next matching tuple until the client is ready for the next tuple. In terms of implementation, a client signals it is ready for the next tuple by executing the next iteration of the loop implementing the monitor or consume operation (for example, the loop $ts.monitor(...) do ... end). With the streaming operations, Marinda does not wait until the client is ready: all matching tuples (both existing and future tuples) are streamed asynchronously from the global server to a local server, where they are then buffered until the client is ready for them. As far as the client is concerned, it is still iterating over the tuples one by one in an apparently synchronous fashion, but the delay to retrieve each tuple is dramatically reduced, from a 100-millisecond RTT to a 100-microsecond interprocess-communication delay.
For the client, using these streaming operations is no different than using monitor or consume. For example, here is code to print out all matching tuples:
$ts.monitor_stream([]) do |tuple|
  p tuple
end
This might take 1 second to complete on 1000 matching tuples, whereas monitor might take 100 seconds to complete with an RTT of 100ms.
Why not use streaming operations all the time, if they increase throughput? Streaming operations are not necessarily the best nor the correct choice in all situations. In particular, streaming operations transfer all matching tuples asynchronously, and this may not be what you want. This is inefficient if you only want the first few matching tuples, or you want to iterate over the tuples that match a loose template in order to find the first tuple that meets a more stringent condition on the values. In either case, a potentially large number of unneeded tuples may be transferred. This not only wastes bandwidth, but can also increase the memory usage of Marinda servers, as these tuples will need to be buffered in memory by the local server until the client can get to them (and they need to be buffered specially by the global server in preparation for transmitting them).
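For instance, if you only want the first tuple that satisfies a stricter condition on the values, a plain monitor with an early break avoids transferring the remaining matches. This is only a sketch: the tuple layout and threshold are illustrative, and it assumes break terminates the operation just as it terminates ordinary Ruby block iteration.
# Sketch: stop at the first RESULT tuple whose RTT exceeds a threshold,
# rather than streaming every matching tuple from the global server.
$ts.monitor(["RESULT", nil, nil]) do |tuple|
  if tuple[2] > 500.0
    p tuple
    break   # stop iterating; later matches are never transferred
  end
end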
Besides potential inefficiency, a streaming operation may not have the right semantics for a given situation. For example, a single consume_stream will immediately remove all matching tuples from the entire tuple space at once (as contrasted with consume, which only removes one tuple per iteration). If all the matching tuples were intended for that one client, then this is the desired result. However, there are useful coordination patterns in which tuples should be distributed fairly amongst a set of threads/processes that are all performing a consume or take on the same template; for example, the Bag-of-Tasks and the Master-Slave patterns are common ways of exploiting embarrassingly parallel tasks. Because consume_stream may lead to an unfair distribution of tuples, it is unsuitable for implementing these patterns.
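For these patterns, each worker keeps the one-tuple-at-a-time semantics of consume. A minimal Bag-of-Tasks worker might look like the following sketch; the TASK/DONE tuple layout and the perform_task helper are illustrative, not part of Marinda.
# Sketch of a Bag-of-Tasks worker: each iteration removes exactly one
# task tuple, so tasks are handed out fairly across competing workers.
$ts.consume(["TASK", nil]) do |tuple|
  task_id = tuple[1]
  result = perform_task(task_id)      # hypothetical application-level work
  $ts.write ["DONE", task_id, result]
end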
Using wildcards in tuples (not just in templates)
Ordinarily, a nil value is used as a wildcard in templates. However, a nil value can also be used in tuples, to achieve a similar wildcard effect but with the roles of template and tuple reversed. So far, we have said that a fully-specified template (that is, a template free of wildcards) like [1, 2, 3] only matches tuples with exactly the same values. In truth, such a template also matches [1, 2, nil], [nil, 2, nil], and [nil, nil, nil], among others. If a tuple has a nil value in a given position, then the tuple will match any value in the corresponding position of the template (including, of course, the template wildcard nil).
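For example, with the following tuples in a region, a read with the fully-specified template [1, 2, 3] can return any of them:
$ts.write [1, 2, 3]
$ts.write [1, 2, nil]
$ts.write [nil, nil, nil]

$ts.read [1, 2, 3]   # any of the three tuples above is a match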
Wildcards in tuples are less generally useful than wildcards in templates, but there are specific scenarios where they are valuable, as the following example shows. Suppose there are three servers located in the USA, Europe, and Asia (by server, we mean some user application providing a service that uses Marinda for coordination, not a Marinda local/global server), and suppose each server accepts requests over the tuple space with a loop like the following (in the case of the European server):
$ts.consume(["Europe", nil]) do |tuple| ... end
As usual, if a client wants to issue a request to a server at a particular location, then it can include the location in its request tuple; for example, ["Europe", 123] would be picked up by the European server, and ["USA", 123] by the USA server. But what if the client doesn’t care about the location, and it just wants the next available server to handle the request? The client can’t know, in general, which server is available, so it can’t pick the right location value to use. The solution is to use nil for the location, since a tuple like [nil, 123] will match any of the templates used by the different servers. Such requests will be naturally load-balanced across available servers without any special effort by clients or servers. If multiple servers are available, then Marinda ensures that tuples are distributed on a first-come-first-served basis (that is, available servers will receive tuples in the order in which they executed the take, consume, etc. operation to retrieve the tuple), so no server is allowed to starve.
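On the client side, a location-agnostic request is then just a write with nil in the location position:
# The nil location matches whichever server template (["USA", nil],
# ["Europe", nil], ["Asia", nil]) consumes the request first.
$ts.write [nil, 123]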
Tip: If clients never need to issue requests to specific servers, then a simpler approach is to just leave out the location (or other identifying) component from request tuples when designing the coordination protocol between clients and servers. However, tuple wildcards are useful when you need to do both: sometimes issue requests to specific servers and sometimes to any available server.
Note: Because nil values are always allowed in tuples, you need to be vigilant about bugs. A common result of bugs is to produce a nil value, which can now be easily transferred over the tuple space to another client that may be unprepared to handle it. Sanity checking the values retrieved from the tuple space can help mitigate such problems.
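One lightweight way to do this is to validate each field immediately after retrieval, so a stray nil is caught near its source (a sketch; the expected tuple layout is just an example):
request = $ts.take ["PING", nil]
addr = request[1]
raise "malformed PING request: #{request.inspect}" unless addr.is_a?(String)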
Writing event-based Marinda clients
The Client Programming Guide covers the synchronous blocking interface to Marinda provided by the Marinda::Client class. This straightforward interface is suitable for use by both single-threaded and multi-threaded programs (however, none of the Marinda::Client methods are threadsafe, so multi-threaded clients must take appropriate measures themselves).
Marinda::Client also provides an asynchronous event-based interface, which can often be a simpler and more scalable alternative to multi-threaded programming. To use this interface, you need an event loop to dispatch events generated by event sources like Marinda::Client. As a convenience, a compatible event loop is provided by the Marinda::ClientEventLoop class. An event-based client program will have the following general structure:
- Create an instance of the event loop.
- Open one or more connections to Marinda.
- Add Marinda connections to the event loop.
- Invoke asynchronous Marinda operations.
- Start the event loop.
Marinda::Client methods with an _async suffix provide the asynchronous version of every Marinda operation. For example, Marinda::Client#read is the familiar blocking read operation, whereas Marinda::Client#read_async is the asynchronous version, which is used with an event loop. The following example shows a simple event-based echo server that reacts to two different kinds of tuples:
echo.rb
$ts = Marinda::Client.new(UNIXSocket.open("/tmp/localts.sock"))
$ts.hello
$ts2 = $ts.duplicate

eloop = Marinda::ClientEventLoop.new
eloop.add_source $ts
eloop.add_source $ts2

$ts.consume_async(["ECHO", nil]) do |tuple|
  p tuple[1]
end

$ts2.take_async(["QUIT"]) do |tuple|
  puts "Exiting."
  eloop.suspend()
end

eloop.start()
You can try out this example with tut.rb (a convenience script described in the Client Programming Guide):
>> load 'tut.rb'
>> $ts.write ["ECHO", 1234]
>> $ts.write ["ECHO", "hey"]
>> $ts.write ["ECHO", [1, "foo", [52.32, "bar"]]]
>> $ts.write ["QUIT"]
which produces the following output from the example server:
$ ./echo.rb
1234
"hey"
[1, "foo", [52.32, "bar"]]
Exiting.
$
Let’s examine echo.rb more carefully. We start by creating an event loop and adding the Marinda client connections as event sources. Then we invoke the asynchronous Marinda::Client#consume_async method, which starts Marinda executing the consume operation. However, unlike with the synchronous Marinda::Client#consume method, consume_async does not block waiting for results to arrive, but instead immediately returns control to the client. The Ruby block (that is, the code between do ... end) passed to consume_async is registered as a callback that will be invoked asynchronously for each result tuple. We next invoke take_async to handle quit requests. Note that we use a separate Marinda connection for take_async because we cannot execute more than one operation (synchronous or asynchronous) per connection. Note also that take_async takes a block that will be invoked with the result tuple, unlike take, which directly returns the result. In fact, all asynchronous methods require a block, even if their synchronous counterpart doesn’t take a block.
Finally, we start the event loop to begin processing the results of asynchronous operations. The eloop.start() call blocks indefinitely until either the client calls eloop.suspend() (as done in the block of take_async), or all asynchronous operations registered with eloop complete. After eloop.start() returns, you can restart the event loop by calling eloop.start() again, but if all asynchronous operations have completed, then you need to invoke at least one new asynchronous operation before restarting the event loop, or the event loop will find nothing to do and will return immediately. In the above example, $ts.consume_async runs forever, so we can simply restart the event loop, but if we wish, we can invoke a new asynchronous operation on $ts2 before restarting the event loop, since take_async has completed on $ts2, freeing up $ts2 for reuse. We can manually cancel an asynchronous operation at any time by invoking cancel on the connection; for example, we could cancel $ts.consume_async by calling $ts.cancel.
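For example, continuing echo.rb after eloop.start() has returned, we might cancel the long-running consume, issue a fresh asynchronous operation, and restart the loop (a sketch based on the example above):
$ts.cancel                            # stop the long-running consume_async

$ts2.take_async(["QUIT"]) do |tuple|  # $ts2 is free again, so reuse it
  puts "Exiting again."
  eloop.suspend()
end
eloop.start()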
When an asynchronous operation completes or is cancelled, you don’t have to wait for the event loop to be suspended before invoking another operation on that connection. However, there is a limitation on how you can invoke a new operation—the following does not work:
$ts2.take_async(["ABC"]) do |tuple| puts "got ABC" $ts2.take_async(["DEF"]) do |tuple| puts "got DEF" end end
Since, by definition, take_async only receives one result tuple, the operation has conceptually completed by the time the associated block executes, so you may think you can invoke another asynchronous operation on $ts2 from within the callback, but Marinda does not allow this (of course, you can invoke another operation on $ts2 from outside the callback, once the callback has finished executing). One workaround is to execute asynchronous operations on two alternating connections. For example, you can switch back and forth between two connections stored in an array, as in the following code (we use the reverse! method to interchange the connections before using the first connection in the array):
conns = [$ts2, $ts3]
conns.first.take_async(["ABC"]) do |tuple|
  puts "got ABC"
  conns.reverse!
  conns.first.take_async(["DEF"]) do |tuple|
    puts "got DEF"
  end
end
Using private tuple space regions for one-to-one communication
The local and global tuple spaces are each divided into disjoint regions. Once a client opens a connection to a region, it can communicate freely with any other client connected to the same region. These public regions facilitate many-to-many communication. Marinda provides a second, more restrictive form of region, the private region, that is designed to facilitate one-to-one communication. A private region is like a private mailbox; only the owner can read or remove tuples from a private region, but one or more other clients can write tuples into it.
A public region stands alone independently of clients, and provides persistent storage even with clients constantly joining and leaving. In contrast, the existence of a private region is directly tied to a single client connection. Whenever a client connects to Marinda, a new unique private region is created and associated with the connection itself. When the client closes the connection, the associated private region is discarded and becomes inaccessible. Thus, a connection provides access to a unique pair of public and private regions.
In Marinda, a client does not explicitly identify the public or the private region it wishes to act on with a given operation. Instead, the connection itself, on which an operation is invoked, serves as an implicit identifier or handle for the target region. This is possible because each connection is permanently associated with a single public region and a single private region. Similarly, each operation implicitly acts on either the public or the private region of a connection. All operations described so far act (only) on the public region, and the following two additional operations act (only) on the private region: take_priv and takep_priv (and their asynchronous versions take_priv_async and takep_priv_async). These operations work exactly like take and takep, and because they retrieve tuples via template matching (like ordinary operations), they allow you to retrieve tuples from a private region in a different order than they were put there. This flexibility allows you to implement something like the share-nothing message-passing concurrency model of Erlang.
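For instance, a client can pull replies out of its private region selectively by value, regardless of arrival order, much like selective receive in Erlang (a sketch; the tuple layout is illustrative):
# Retrieve the reply tagged "urgent" first, even if it was written
# into the private region after the "normal" reply.
urgent = $ts.take_priv ["RESULT", "urgent", nil]
normal = $ts.take_priv ["RESULT", "normal", nil]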
Marinda uses a capability-like security model for private regions. Only the client holding a connection can remove tuples from the private region of the connection, and only clients that have a reference to a private region can write to it. A reference is an internal object that cannot be constructed by a client—a reference can only be obtained from Marinda, and only the client holding a connection (that is, the owner) can pass out references to others. Every time a client writes a tuple into a public region, the client implicitly and automatically passes out a reference to the private region of the connection used. This reference is attached as metadata to the tuple, and functions like the sender address on an envelope. When other clients retrieve the tuple, they implicitly obtain the reference, which they can then use to reply directly to the private region of the original sender. Because references can only be passed in the public region associated with a connection, only the clients that have access to a given public region can access the private regions of other clients connected to that public region, and consequently, only the clients connected to the same public region can communicate directly with each other via private regions.
A client must use special operations to write into another client’s private region. The most basic operation is reply, which simply writes a tuple into the private region of the last retrieved tuple (using any retrieval operation) on a given connection. This is tailored for the simple case of a server replying directly to a requester. To illustrate, we provide the code for a simple ping measurement server and its client. The client issues a request by writing a tuple into a public region, and the server writes the response directly into the client’s private region with reply:
$ts.write ["PING", "192.168.0.5"] result = $ts.take_priv ["RESULT", nil, nil] printf "RTT = %f\n", result[2]
loop do
  request = $ts.take ["PING", nil]
  addr = request[1]
  rtt = 123.456
  $ts.reply ["RESULT", addr, rtt]
end
We can conduct the entire request-response exchange over private regions by making use of a rendezvous tuple, a tuple written by the server that the client can retrieve to obtain a reference to the private region of the server:
$ts.read ["PING-SERVER", "EUROPE"] # get ref to private region $ts.reply ["PING", "192.168.0.5"] result = $ts.take_priv ["RESULT", nil, nil] printf "RTT = %f\n", result[2]
$ts.take_all ["PING-SERVER", "EUROPE"] do end   # clear stale tuples
$ts.write ["PING-SERVER", "EUROPE"]             # register server
loop do
  request = $ts.take_priv ["PING", nil]
  addr = request[1]
  rtt = 123.456
  $ts.reply ["RESULT", addr, rtt]
end
Some care is needed in using rendezvous tuples, since tuples will reference defunct private regions as soon as the connection they were written on closes. The second example server guards against this problem by clearing away prior rendezvous tuples at startup and then writing a fresh tuple. If a client tries to write into a defunct private region, then Marinda silently discards the tuple (but in the future, Marinda might return an error result to better indicate this situation).
Though convenient, the reply operation is inflexible, since it can only be applied to the private region of the last retrieved tuple. Marinda provides more flexible operations that take a parameter, a peer descriptor, indicating the target private region. A peer descriptor is a small nonnegative integer (analogous to a file descriptor) that represents a remembered reference to a private region. You can think of the peer descriptor as an index into a table of remembered references; or, taking the envelope analogy further, a peer descriptor is an index into your personal address book, which is filled by remembering the sender address of retrieved tuples. A peer descriptor is an indirect reference that only has meaning on the connection it was created on, and by design, knowing the value of a peer descriptor will not allow a different client to access the represented private region.
Call remember_peer after retrieving a tuple to obtain a peer descriptor for the private region associated with the tuple. Once you have a peer descriptor, you can write into a specific private region with write_to. Here is a revised ping client that uses these new operations (no change is required on the server side, although you do need to start up an ASIA server instance as well as the previous EUROPE instance):
$ts.read ["PING-SERVER", "EUROPE"] # get ref to private region eu_peer = $ts.remember_peer() $ts.read ["PING-SERVER", "ASIA"] asia_peer = $ts.remember_peer() $ts.write_to eu_peer, ["PING", "192.168.0.5"] $ts.write_to asia_peer, ["PING", "192.168.0.5"] result = $ts.take_priv ["RESULT", nil, nil] result = $ts.take_priv ["RESULT", nil, nil]
Only a finite (but large) number of peers can be remembered per connection, so you should free an unneeded peer descriptor p with a call to forget_peer(p). It is safe to call forget_peer on an already free peer descriptor.
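For example, the revised ping client above could release its descriptors once it no longer needs to contact the servers:
$ts.forget_peer(eu_peer)
$ts.forget_peer(asia_peer)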
Implementation limits
This section summarizes the implementation limits on Marinda clients and servers:
- max template/tuple size = 65,535 bytes in serialized form; a template/tuple is serialized into a special text format, and this text format can be larger than the raw data values (in binary form) of a template/tuple, especially if there are long strings; strings, regardless of content, expand by a factor of about 4/3 in the text encoding
- max string length in template/tuple = 24,575 bytes; the tuple space abstraction isn’t designed for bulk transport of data, but you can send large amounts of data in smaller chunks, if necessary
- max values in template/tuple = 1024 values
- max nesting depth of subarrays in template/tuple = 255 levels
- max remembered references to private regions per connection = 2^31 - 1 references
- max public/private regions = 2^48 regions per public/private scope; limited more by main memory, since all regions and their tuples must be in memory; the global server hosts all global public/private regions; each local server only hosts its own local public/private regions
- max tuples per region = 2^31 - 1 (on 32-bit CPUs) and 2^63 - 1 (on 64-bit CPUs); limited more by main memory, since all tuples must be in memory; also, currently, an internal index will overflow after this many tuples have been written into any given region, which is a lot more likely to happen in practice than storing this many tuples; operations execute in O(n), except when using the template [], so performance may degrade significantly after a few thousand tuples due to unoptimized template matching
- max blocked operations per region = no imposed limit; however, performance may degrade significantly after a few hundred operations (like take and monitor) are blocked waiting for new tuples to arrive, since each tuple written into a region has to be checked against every blocked operation in the worst case, which is O(n)
- max Marinda connections per client = no imposed limit; limited only by the max number of file descriptors a process can open
- max client connections per local server = no imposed limit; limited only by the max number of file descriptors a process can open
- max nodes in Marinda deployment = 32,768 nodes