CometG: A Decentralized
Computational Infrastructure for Grid-based Parallel Asynchronous
Iterative Applications
Overview:
CometG is a
decentralized (peer-to-peer) computational infrastructure that
extends Desktop Grid environments to support Grid-based parallel
asynchronous iterative applications.
CometG provides a decentralized and scalable tuple space,
efficient communication and coordination support, and
application-level abstractions that can be used to implement
Desktop Grid applications based on parallel asynchronous
iterative algorithms using the master-worker/BOT (bag-of-tasks)
paradigm. The current prototype of CometG builds on the
JXTA peer-to-peer framework and is deployed
on PlanetLab.
Architecture:
The CometG computational infrastructure builds on Comet, a scalable,
decentralized tuple space that spans the nodes of
the Desktop Grid. The tuple space is essentially a global
virtual shared-space constructed from the semantic information
space used by entities for coordination and communication. This
information space is deterministically mapped, using a locality
preserving mapping, onto the dynamic set of peer nodes in the
Grid system. The resulting structure is a locality preserving
semantic distributed hash table (DHT) built on top of a
self-organizing structured overlay.
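
The mapping described above can be illustrated with a minimal sketch. It is not the CometG implementation: the 32-bit index space, the prefix-based `key_to_index` function, and the successor-ownership rule in the hypothetical `Overlay` class are all assumptions chosen for brevity. The sketch only shows the two properties the text names: the mapping is deterministic, and keywords that are close to each other land on the same or neighboring peers.

```python
from bisect import bisect_left


def key_to_index(keyword: str) -> int:
    """Map a keyword to a 32-bit index (assumed scheme, for illustration).

    The first four bytes of the keyword are read as a big-endian number, so
    the mapping is non-decreasing in lexicographic order and keywords that
    share a prefix land on the same or nearby indices.
    """
    data = keyword.encode("utf-8")[:4].ljust(4, b"\x00")
    return int.from_bytes(data, "big")


class Overlay:
    """A ring of peer identifiers; each index is owned by its successor."""

    def __init__(self, peer_ids):
        self.peer_ids = sorted(peer_ids)

    def owner(self, index: int) -> int:
        """Return the first peer whose identifier is >= index."""
        pos = bisect_left(self.peer_ids, index)
        # Wrap around to the first peer when the index is past the last one.
        return self.peer_ids[pos % len(self.peer_ids)]

    def join(self, peer_id: int) -> None:
        """A joining peer takes over part of its successor's index range."""
        self.peer_ids.insert(bisect_left(self.peer_ids, peer_id), peer_id)
```

Because ownership depends only on the sorted peer identifiers, a peer that joins or leaves changes the owner of a single contiguous index range, which is one way a mapping can stay deterministic over a dynamic set of nodes.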

Figure: A schematic overview of the CometG system architecture.
The communication layer:
This layer provides an associative communication service
and guarantees that content-based messages, specified using
flexible content descriptors, are served with bounded cost. This
layer also provides a direct communication channel to
efficiently support large volume data transfers between peer
nodes. The communication channel is implemented using a thread
pool mechanism and TCP/IP sockets.
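
As a rough illustration of a direct channel built from a thread pool and TCP sockets, the sketch below accepts connections on one thread and hands each transfer to a pool worker. The `DirectChannel` class, the 8-byte length prefix, and the one-byte acknowledgement are assumptions made for this example and are not taken from the CometG prototype.

```python
import socket
import threading
from concurrent.futures import ThreadPoolExecutor


def _recv_exact(conn, n):
    """Read exactly n bytes from a socket, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf.extend(chunk)
    return bytes(buf)


class DirectChannel:
    """Receives length-prefixed payloads; each transfer runs on a pool thread."""

    def __init__(self, on_data, host="127.0.0.1", workers=4):
        self.on_data = on_data
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.server = socket.create_server((host, 0))  # port 0: any free port
        self.address = self.server.getsockname()
        threading.Thread(target=self._accept_loop, daemon=True).start()

    def _accept_loop(self):
        while True:
            try:
                conn, _ = self.server.accept()
            except OSError:
                return  # listening socket was closed
            self.pool.submit(self._handle, conn)

    def _handle(self, conn):
        with conn:
            size = int.from_bytes(_recv_exact(conn, 8), "big")
            self.on_data(_recv_exact(conn, size))
            conn.sendall(b"\x01")  # acknowledge once the payload is delivered

    def close(self):
        self.server.close()
        self.pool.shutdown(wait=True)


def send(address, payload: bytes) -> None:
    """Send one payload to a DirectChannel and wait for its acknowledgement."""
    with socket.create_connection(address) as conn:
        conn.sendall(len(payload).to_bytes(8, "big") + payload)
        _recv_exact(conn, 1)
```

Handing each accepted connection to the pool keeps a slow, large transfer from blocking other peers, while the fixed pool size bounds the number of threads a node spends on transfers.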
The coordination layer:
This layer provides the Linda-like shared-space coordination
interfaces:
(i) Out(ts, t): a non-blocking operation that inserts tuple t
into space ts.
(ii) In(ts, t, timeout): a blocking operation that removes a tuple
matching template t from the space ts and returns it. If no
matching tuple is found, the calling process blocks until a
matching tuple is inserted or the specified timeout expires. In
the latter case, null is returned.
(iii) Rd(ts, t, timeout): a blocking operation that returns a tuple
matching template t from the space ts. It behaves exactly like
the In operation, including the blocking and timeout behavior,
except that the tuple is not removed from the space.
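
The semantics of the three operations can be sketched with a small in-memory space. This is a single-process illustration only, not the distributed CometG implementation. The operations are written as methods on a hypothetical `TupleSpace` object rather than taking `ts` as an argument, `in_` is used because `in` is a Python keyword, and treating `None` as a wildcard field in a template is an assumption.

```python
import threading
import time


class TupleSpace:
    """In-memory, thread-safe space with Linda-like out / in / rd operations."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    @staticmethod
    def _matches(template, tup):
        # A template matches a tuple of the same length when every
        # non-wildcard field is equal (None is the assumed wildcard).
        return len(template) == len(tup) and all(
            want is None or want == have for want, have in zip(template, tup)
        )

    def out(self, tup):
        """Non-blocking insert; wakes any callers blocked in in_ or rd."""
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    def _find(self, template, timeout, remove):
        deadline = time.monotonic() + timeout
        with self._cond:
            while True:
                for i, tup in enumerate(self._tuples):
                    if self._matches(template, tup):
                        return self._tuples.pop(i) if remove else tup
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return None  # timeout expired without a match
                self._cond.wait(remaining)

    def in_(self, template, timeout):
        """Blocking read that removes the matching tuple from the space."""
        return self._find(template, timeout, remove=True)

    def rd(self, template, timeout):
        """Blocking read that leaves the matching tuple in the space."""
        return self._find(template, timeout, remove=False)
```

For example, after `ts.out(("task", 1, "payload"))`, a call to `ts.rd(("task", None, None), 0.1)` returns the tuple and leaves it in place, whereas `ts.in_` with the same template returns it once and a second `in_` call returns `None` after the timeout.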
The application layer:
The CometG application layer provides coordination space abstractions and
programming modules to support master-worker/BOT parallel formulations of
asynchronous iterative computations. Specifically, two
customized coordination spaces, TaskSpace and BorderSpace,
are defined and implemented separately. TaskSpace stores task tuples representing application tasks
and specifying the masters that are responsible for the tasks. This space
implements First-In-First-Out (FIFO) semantics for tuple and template
operations, and provides a queue abstraction for task distribution and
management. The programming modules include masters and workers. A worker module
contains an application-specific computational component that
can locally compute a retrieved task. The worker uses the tuple
space abstractions to retrieve tasks and exchange borders.
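
A minimal sketch of the TaskSpace queue abstraction and a worker loop follows. It assumes a single process and uses Python's standard `queue.Queue` for the FIFO behavior. The tuple layout `(master_id, task_id, data)`, the `results` queue used to hand results back, and the rule that a worker stops when no task arrives within the timeout are all illustrative assumptions. Border exchange through BorderSpace is left out.

```python
import queue
import threading


class TaskSpace:
    """FIFO task queue; each task tuple names the master responsible for it."""

    def __init__(self):
        self._queue = queue.Queue()

    def put(self, master_id, task_id, data):
        """Insert a task tuple at the tail of the queue."""
        self._queue.put((master_id, task_id, data))

    def take(self, timeout):
        """Remove and return the oldest task, or None if the timeout expires."""
        try:
            return self._queue.get(timeout=timeout)
        except queue.Empty:
            return None


def worker(task_space, compute, results, timeout=0.1):
    """Retrieve tasks until none arrives within `timeout`, computing each locally.

    `compute` stands in for the application-specific computational component.
    """
    while True:
        task = task_space.take(timeout)
        if task is None:
            return
        master_id, task_id, data = task
        # Tag the result with the responsible master so it can be routed back.
        results.put((master_id, task_id, compute(data)))
```

Because workers pull tasks rather than having tasks pushed to them, faster nodes naturally take more tasks, which suits the heterogeneous and volatile nodes of a Desktop Grid.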
Supporting large application/system scales:
CometG supports large application/system scales using multiple coordination
groups. A coordination group includes one TaskSpace, one
BorderSpace, and a group of masters and workers. A group can support
multiple applications with logically separate semantic spaces. An application
can also span multiple groups, each of which handles a part of the application.
The application is hierarchically partitioned, first across coordination
groups, and then across masters within each coordination group. Tasks with
communication dependencies should be mapped to the same coordination group
where possible, because communication across groups can be expensive. Workers
within a coordination group communicate using the shared BorderSpace. Masters
within and across coordination groups communicate using direct communication
channels.
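
The two-level partitioning can be sketched for a simple case. The example assumes the application is a one-dimensional chain of integer-numbered blocks in which block i depends only on blocks i-1 and i+1. The function names and the contiguous-split strategy are illustrative and are not taken from CometG. Contiguous splits keep neighboring blocks in the same coordination group wherever possible, so only the blocks at group boundaries need cross-group communication.

```python
def split_contiguous(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def partition(blocks, num_groups, masters_per_group):
    """Partition blocks across groups, then across masters within each group.

    Returns a dict mapping (group, master) to the list of blocks it owns.
    """
    plan = {}
    for g, group_blocks in enumerate(split_contiguous(blocks, num_groups)):
        for m, master_blocks in enumerate(
            split_contiguous(group_blocks, masters_per_group)
        ):
            plan[(g, m)] = master_blocks
    return plan


def cross_group_edges(plan):
    """Count neighboring block pairs (i, i+1) assigned to different groups."""
    group_of = {b: g for (g, _), owned in plan.items() for b in owned}
    return sum(
        1 for b in group_of
        if b + 1 in group_of and group_of[b] != group_of[b + 1]
    )
```

With ten blocks, two groups, and two masters per group, `partition(list(range(10)), 2, 2)` places blocks 0 to 4 in the first group and 5 to 9 in the second, so `cross_group_edges` reports a single dependency that crosses a group boundary.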