ZHEN LI
Ph.D. Candidate
 
     
  Home > Research > Comet
 

CometG: A Decentralized Computational Infrastructure for Grid-based Parallel Asynchronous Iterative Applications

Overview:

CometG is a decentralized (peer-to-peer) computational infrastructure that extends Desktop Grid environments to support these applications. CometG provides a decentralized and scalable tuple space, efficient communication and coordination support, and application-level abstractions that can be used to implement Desktop Grid applications based on parallel asynchronous iterative algorithms using the master-worker/BOT paradigm. The current prototype of CometG builds on the JXTA peer-to-peer framework and is deployed on PlanetLab.

Architecture:

The CometG computational infrastructure builds on a scalable, decentralized tuple space Comet that spans the nodes of the Desktop Grid. The tuple space is essentially a global virtual shared-space constructed from the semantic information space used by entities for coordination and communication. This information space is deterministically mapped, using a locality preserving mapping, onto the dynamic set of peer nodes in the Grid system. The resulting structure is a locality preserving semantic distributed hash table (DHT) built on top of a self-organizing structured overlay.

 

Figure: A schematic overview of the CometG system architecture.

The communication layer:

This layer provides an associative communication service and guarantees that content-based messages, specified using flexible content descriptors, are served with bounded cost. This layer also provides a direct communication channel to efficiently support large volume data transfers between peer nodes. The communication channel is implemented using a thread pool mechanism and TCP/IP sockets.

The coordination layer:

This layer provides the Linda-like shared-space coordination interfaces: (i) Out(ts, t): a non-blocking operation that inserts tuple t into space ts. (ii) In(ts,t, timeout):a blocking operation that removes a tuple t matching template t from the space ts and returns it. If no matching tuple is found, the calling process blocks until a matching tuple is inserted or the specified timeout expires. In the latter case, null is returned. (ii) Rd(ts, t, timeout): a blocking operation that returns a tuple t matching template t from the space ts. If no matching tuple is found, the calling process blocks until a matching tuple is inserted or the specified timeout expires. In the latter case, null is returned. This method performs exactly like the In operation except that the tuple is not removed from the space.

The application layer:

The CometG application layer provides coordination space abstractions and programming modules to support master-worker/BOT parallel formulations of asynchronous iterative computations. Specifically, two customized coordination spaces, TaskSpace and BorderSpace, are defined and implemented separately. TaskSpace stores task tuples representing application tasks and specifying the masters that are responsible for the tasks. This space implements First-In-First-Out (FIFO) semantics for tuple and template operations, and provides a queue abstraction for task distribution and management. The programming modules include masters and workers. A worker module contains an application-specific computational component that can locally compute a retrieved task. The worker uses the tuple space abstractions to retrieve tasks and exchange borders.

Supporting large application/system scales:

CometG supports large application/system scales using multiple coordination groups. A coordination group includes one TaskSpace, one BorderSpace, and a group of masters and workers. A group can support multiple applications with logically separate semantic spaces. An application can also span multiple groups, each of which handles a part of the application. The application is hierarchically partitioned, first across coordination groups, and then across masters within each coordination group. Task with communication dependencies should be mapped to the same coordination group if possible as communications across groups can be expensive. Workers within a coordination group communicate using the shared BorderSpace. Masters within and across coordination group communicate using direct communication channels.

Grid-based parallel asynchronous iterative applications:

CometG supports large application/system scales using multiple coordination groups. A coordination group includes one TaskSpace, one BorderSpace, and a group of masters and workers. A group can support multiple applications with logically separate semantic spaces. An application can also span multiple groups, each of which handles a part of the application. The application is hierarchically partitioned, first across coordination groups, and then across masters within each coordination group. Task with communication dependencies should be mapped to the same coordination group if possible as communications across groups can be expensive. Workers within a coordination group communicate using the shared BorderSpace. Masters within and across coordination group communicate using direct communication channels.