The University of Florida, the University of Arizona, and Rutgers, the State University of New Jersey, have established a national research center, the Center for Autonomic Computing (CAC).

This center is funded by the Industry/University Cooperative Research Center program of the National Science Foundation, CAC members from industry and government, and university matching funds.

Autonomic Data Streaming and In-transit Processing

Principal researchers: Viraj Bhat, Ciprian Docan, Manish Parashar (Rutgers University)

Current collaborators: Scott Klasky, Oak Ridge National Laboratory

Status: Ongoing

Summary:

Emerging enterprise/Grid applications consist of complex workflows composed of interacting components/services that are separated in space and time and execute in widely distributed environments. Couplings and interactions between components/services in these applications are varied, data intensive, and time critical. As a result, high-throughput, low-latency data acquisition, data streaming, and in-transit data manipulation are critical.

The goal of this project is to develop and deploy an autonomic data management service that supports high-throughput, low-latency data streaming and in-transit data manipulation. Key attributes/requirements of the service include: (1) support for high-throughput, low-latency data transfers to enable near real-time access to the data, (2) the ability to stream data over wide area networks with shared resources and varying loads while maintaining the desired QoS, (3) minimal performance overhead on the application, (4) adaptations to address dynamic application, system, and network states, (5) proactive control to prevent loss of data, and (6) effective management of in-transit processing while satisfying the above requirements.

Application Workflow

The autonomic service addresses end-to-end QoS requirements at two levels, the application end-points and the in-transit nodes, which cooperate to satisfy overall application constraints and QoS requirements. The QoS management strategy at the application end-points combines model-based limited look-ahead controllers (LLC) and policy-based managers with adaptive multi-threaded buffer management. The application-level data streaming service consists of a service manager and an LLC controller. The QoS manager monitors state and execution context, collects and reports runtime information, and enforces adaptation actions determined by its controller.

In-transit data processing is achieved using a dynamic overlay of available resources in the data path between the source and the destination (e.g., workstations or small to medium clusters) with heterogeneous capabilities and loads. Note that these nodes may be shared across multiple application flows. The goal of in-transit processing is to opportunistically process as much data as possible before the data reaches the sink, while ensuring that end-to-end timing constraints are satisfied. The combined constraints are captured using a slack metric, which bounds the time available for data processing and transmission such that the data reaches the sink in a timely manner. The in-transit nodes then use this slack metric to select resources from the dynamic overlay so as to maximize the data that is processed in-transit and, consequently, the quality of the data reaching the destination.
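The slack-based decision at an in-transit node can be illustrated with a small sketch. This is not the project's implementation: the `Block` fields, the function names, and the simple cost model (transfer time = size / bandwidth + latency, and processing that leaves the block size unchanged) are assumptions made only for this example.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A data block in flight (hypothetical structure for illustration)."""
    size_mb: float     # block size in megabytes
    deadline_s: float  # absolute time by which the block must reach the sink


def slack_seconds(block: Block, now_s: float,
                  bandwidth_mb_per_s: float, latency_s: float) -> float:
    """Time left for processing after reserving the time needed to
    transmit the block from this node to the sink (assumed cost model)."""
    transfer_s = block.size_mb / bandwidth_mb_per_s + latency_s
    return block.deadline_s - now_s - transfer_s


def fraction_to_process(block: Block, now_s: float,
                        bandwidth_mb_per_s: float, latency_s: float,
                        process_rate_mb_per_s: float) -> float:
    """Fraction of the block this node can process without missing the
    deadline. Zero means: forward the block immediately, unprocessed."""
    slack = slack_seconds(block, now_s, bandwidth_mb_per_s, latency_s)
    if slack <= 0:
        return 0.0
    processable_mb = slack * process_rate_mb_per_s
    return min(1.0, processable_mb / block.size_mb)


if __name__ == "__main__":
    block = Block(size_mb=100.0, deadline_s=20.0)
    # Uncongested link: some slack remains, so part of the block is processed.
    print(fraction_to_process(block, 0.0, 10.0, 2.0, 5.0))
    # Congested link: no slack remains, so the block is forwarded as-is.
    print(fraction_to_process(block, 0.0, 5.0, 2.0, 5.0))
```

Under these assumptions, a node processes opportunistically when the path to the sink is fast and backs off to pure forwarding when congestion consumes the slack, which mirrors the goal of maximizing in-transit processing without violating end-to-end timing constraints.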

Experiments with end-to-end cooperative data streaming demonstrated that adaptive processing using the autonomic in-transit data processing service during congestion decreases the average idle time per data block from 25% to 1%, thereby increasing utilization at critical times. Furthermore, coupling end-point and in-transit management during congestion reduces average buffer occupancy at in-transit nodes from 80% to 60.8%, thereby reducing load and potential data loss and increasing data quality at the destination. Ongoing work focuses on incorporating learning models for proactive management and on virtualization at in-transit nodes to improve utilization.

Selected References:

  1. "A Self-Managing Wide-Area Data Streaming Service", V. Bhat, M. Parashar, H. Liu, M. Khandekar, N. Kandasamy, S. Klasky, and S. Abdelwahed, Cluster Computing: The Journal of Networks, Software Tools, and Applications, Special Issue on Autonomic Computing, Kluwer, Volume 10, Issue 7, pp. 365-383, December 2007.
  2. "Experiments with In-Transit Processing for Data Intensive Grid workflows", V. Bhat, M. Parashar, and S. Klasky, Proceedings of the 8th IEEE Intl. Conf. on Grid Computing, Austin, TX, September 2007.
  3. "Enabling Self-Managing Applications using Model-based Online Control Strategies", V. Bhat, M. Parashar, M. Khandekar, N. Kandasamy, and S. Abdelwahed, Proceedings of the 3rd IEEE Intl. Conf. on Autonomic Computing, Dublin, Ireland, pp. 15-24, June 2006.