CometG: A Decentralized
Computational Infrastructure for Grid-based Parallel Asynchronous
Iterative Applications
Overview:
Comet is a scalable content-based
coordination space for wide-area P2P environments. It provides a global virtual
shared-space that can be associatively accessed by all peers in the system, and
access is independent of the physical location of the tuples or identifiers of
the host. The Comet coordination model is based on a global virtual shared-space
constructed from a semantic information space. The information space is
deterministically mapped, using a locality preserving mapping, to a dynamic set
of peer nodes in the system. The resulting peer-to-peer information lookup
system maintains content locality and guarantees that content-based information
queries, using flexible content descriptors in the form of keywords, partial
keywords and wildcards, are delivered with bounded cost. Using this substrate,
the space can be associatively accessed by all system peers without requiring
the location information of tuples and host identifiers. Comet also provides
transient spaces that enable applications to explicitly exploit context
locality. The current prototype of Comet builds on the
JXTA peer-to-peer framework and is deployed
on PlanetLab.
Architecture:
Comet is composed of layered abstractions
prompted by a fundamental separation of communication and coordination concerns.
A schematic overview of the system architecture is shown below. The
communication layer provides scalable content-based messaging and manages
system heterogeneity and dynamism. This layer essentially maps the virtual
information space in a deterministic way to the dynamic set of currently
available peer nodes in the system while maintaining content locality. The
coordination layer provides Linda-like associative primitives and supports a
shared-space coordination model. This layer can be enhanced with notifications
and reactivity.

Figure: A schematic overview of the Comet system architecture.
Tuple and tuple distribution:
In Comet, a tuple is a simple XML string.
This lightweight format is flexible enough to represent the information for all
kinds of applications and has rich matching relationships. Comet employs the
Hilbert Space-Filling Curve (SFC) to map tuples from a semantic information
space to the linear node index. Each tuple is associated with k keywords
selected from its tag and names. They are defined as the keys of the tuple in
the k-dimensional (kD) information space. If the keys of a tuple only
include complete keywords, the tuple is mapped as a point in the information
space and located on at most one node. If its keys consist of partial keywords,
wildcards, or ranges, the tuple identifies a region in the information space,
corresponding to a set of points in the index space. Each node stores the keys
that map to the segment of the curve between itself and the predecessor node.
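The mapping described above can be sketched in a few lines. This is a minimal illustration, not Comet's or Squid's actual code: it assumes k = 2 keywords per tuple, a small grid, and a hypothetical base-27 keyword encoding (`keyword_coord`) chosen only because it preserves lexicographic order. The names `hilbert_index`, `coord_range`, and `index_segments` are likewise invented for this example. A tuple with complete keywords yields a single point on the curve, while a trailing-wildcard descriptor such as `comp*` yields a region that breaks into one or more contiguous index segments.

```python
from itertools import product

KEY_CHARS = 8  # assumption: only the first 8 characters of a keyword are encoded


def hilbert_index(order: int, x: int, y: int) -> int:
    """Position of cell (x, y) along a 2-D Hilbert curve over a 2**order grid."""
    n = 1 << order
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def keyword_coord(word: str, order: int) -> int:
    """Hypothetical order-preserving encoding of a keyword into [0, 2**order)."""
    value, scale = 0.0, 1.0
    for ch in word.lower()[:KEY_CHARS]:
        scale /= 27
        letter = min(max(ord(ch) - ord("a"), 0), 25)  # clamp non-letters to a..z
        value += (letter + 1) * scale
    return int(value * (1 << order))


def coord_range(descriptor: str, order: int) -> tuple[int, int]:
    """Coordinate interval covered by a complete keyword or a 'prefix*' descriptor."""
    if descriptor.endswith("*"):
        prefix = descriptor[:-1]
        return (keyword_coord(prefix, order),
                keyword_coord(prefix.ljust(KEY_CHARS, "z"), order))
    c = keyword_coord(descriptor, order)
    return c, c


def index_segments(descriptors: tuple[str, str], order: int) -> list[tuple[int, int]]:
    """Contiguous curve segments covered by a two-keyword descriptor."""
    (x0, x1), (y0, y1) = (coord_range(d, order) for d in descriptors)
    cells = product(range(x0, x1 + 1), range(y0, y1 + 1))
    indices = sorted(hilbert_index(order, x, y) for x, y in cells)
    segments: list[list[int]] = []
    for i in indices:
        if segments and i == segments[-1][1] + 1:
            segments[-1][1] = i          # extend the current run
        else:
            segments.append([i, i])      # start a new run
    return [(lo, hi) for lo, hi in segments]


if __name__ == "__main__":
    print(index_segments(("grid", "task"), 4))   # complete keywords: one point
    print(index_segments(("comp*", "net"), 4))   # partial keyword: a region
```

Each segment returned by `index_segments` would then be resolved to the node or nodes responsible for that stretch of the curve.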
The communication layer:
The communication layer provides an associative communication service and
guarantees that content-based messages, specified using flexible content
descriptors, are served with bounded cost. This layer essentially maps the
virtual information space in a deterministic way to the dynamic set of
currently available peer nodes in the system while maintaining content
locality. This layer includes a content-based routing engine, Squid, and a
structured self-organizing overlay. Squid provides a decentralized information
discovery and associative messaging service. It implements the
Hilbert SFC mapping to effectively map a multi-dimensional
information space to a peer index space and to the current
system peer nodes, which form a structured overlay. The overlay
is composed of peer nodes, which may be any node in the system
(e.g., gateways, access points, message relay nodes, servers, or
end-user computers). The peer nodes can join or leave the
network at any time. While the Comet architecture is based on a
structured overlay, it is not tied to any specific overlay
topology.
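To make the effect of peers joining and leaving concrete, here is a small sketch that assumes a one-dimensional ring overlay in which each node owns the index segment between its predecessor and itself. The ring topology, the `RingOverlay` class, and its method names are assumptions made for illustration only, since the architecture is not tied to a particular overlay. The sketch shows how stored keys would be handed over when membership changes.

```python
import bisect


class RingOverlay:
    """Toy ring: a node owns every index in (predecessor_id, node_id]."""

    def __init__(self, index_space_size: int):
        self.size = index_space_size
        self.node_ids: list[int] = []             # kept sorted
        self.store: dict[int, dict[int, str]] = {}  # node_id -> {index: value}

    def owner(self, index: int) -> int:
        """First node at or after the index, wrapping around the ring."""
        if not self.node_ids:
            raise LookupError("overlay has no nodes")
        pos = bisect.bisect_left(self.node_ids, index % self.size)
        return self.node_ids[pos % len(self.node_ids)]

    def put(self, index: int, value: str) -> None:
        self.store[self.owner(index)][index] = value

    def join(self, node_id: int) -> None:
        if node_id in self.store:
            raise ValueError("node id already present")
        bisect.insort(self.node_ids, node_id)
        self.store[node_id] = {}
        if len(self.node_ids) > 1:
            # The new node takes over part of its successor's segment.
            pos = self.node_ids.index(node_id)
            succ = self.node_ids[(pos + 1) % len(self.node_ids)]
            moved = [i for i in self.store[succ] if self.owner(i) == node_id]
            for index in moved:
                self.store[node_id][index] = self.store[succ].pop(index)

    def leave(self, node_id: int) -> None:
        orphaned = self.store.pop(node_id)
        self.node_ids.remove(node_id)
        if self.node_ids:
            # Remaining nodes absorb the departed node's keys.
            for index, value in orphaned.items():
                self.put(index, value)


if __name__ == "__main__":
    ring = RingOverlay(64)
    ring.join(16)
    ring.join(48)
    ring.put(20, "tuple-b")
    ring.join(32)            # index 20 now falls in node 32's segment
    print(ring.store)
```

In this sketch a join or leave only moves keys between neighbouring nodes, which is the property a structured overlay relies on to keep membership changes cheap.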

Figure: Tuple distribution and retrieval using the Out/In/Rd operators.
The coordination layer:
The coordination layer provides tuple operation primitives to support a
shared-space-based coordination model and can be enhanced with
notifications. This layer defines the following coordination primitives, which retain the
Linda semantics: if multiple matching tuples are found, one of them is
arbitrarily returned (and, for In, removed); if there is no matching tuple, a
blocking operation waits for one to appear.
(i) Out(ts, t): a non-blocking operation that inserts the tuple t into the space ts.
(ii) In(ts, t): a blocking operation that removes a tuple t' matching the template t from the space ts and returns it.
(iii) Rd(ts, t): a blocking operation that returns a tuple t' matching the template t from the space ts. The tuple is not removed from the space.
The tuple distribution and exact retrieval processes are illustrated in the
figure on tuple distribution and retrieval.
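The semantics of the three primitives can be illustrated with a single-process sketch. It is not the Comet API: the `TupleSpace` class, the method names `out`, `in_` and `rd` (`in` is a reserved word in Python), the optional `timeout` parameter, and the matching rule (same XML tag, with every template attribute matched by shell-style wildcard) are all assumptions chosen to keep the example small.

```python
import threading
import xml.etree.ElementTree as ET
from fnmatch import fnmatchcase


def matches(template: str, candidate: str) -> bool:
    """Assumed rule: equal tags, and each template attribute matches by wildcard."""
    t, c = ET.fromstring(template), ET.fromstring(candidate)
    if t.tag != c.tag:
        return False
    return all(
        name in c.attrib and fnmatchcase(c.attrib[name], pattern)
        for name, pattern in t.attrib.items()
    )


class TupleSpace:
    """In-memory stand-in for a shared space holding XML-string tuples."""

    def __init__(self):
        self._tuples: list[str] = []
        self._cond = threading.Condition()

    def out(self, t: str) -> None:
        """Non-blocking insert; wakes any waiting In/Rd callers."""
        with self._cond:
            self._tuples.append(t)
            self._cond.notify_all()

    def _find(self, template: str):
        for t in self._tuples:
            if matches(template, t):
                return t
        return None

    def _get(self, template: str, remove: bool, timeout):
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(template) is not None, timeout
            )
            if not found:
                raise TimeoutError("no matching tuple appeared")
            t = self._find(template)
            if remove:
                self._tuples.remove(t)
            return t

    def in_(self, template: str, timeout=None) -> str:
        """Blocking read that removes the matched tuple."""
        return self._get(template, remove=True, timeout=timeout)

    def rd(self, template: str, timeout=None) -> str:
        """Blocking read that leaves the matched tuple in place."""
        return self._get(template, remove=False, timeout=timeout)


if __name__ == "__main__":
    ts = TupleSpace()
    ts.out('<task id="1" status="ready"/>')
    print(ts.rd('<task status="ready"/>'))   # tuple stays in the space
    print(ts.in_('<task id="*"/>'))          # tuple is removed
```

In a distributed setting the template would first be routed to the responsible nodes; here the list scan simply stands in for that lookup.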
Transient spaces:
Comet is naturally suitable for
context-transparent applications. To address context-aware applications which
require that context locality be maintained in addition to content locality,
Comet supports dynamically constructed transient spaces that have a specific
scope definition (e.g., within the same geographical region or the same physical
subnet). The structure of a transient space is exactly the same as that of the global
space, which is accessible to all peer nodes and acts as the default
coordination platform. An application can switch between spaces at runtime and
can simultaneously use multiple spaces.