GridARM Autonomic Runtime Framework

The overall goal of the GridARM autonomic runtime framework is to reactively and proactively manage and optimize SAMR application execution using the current system and application state, online predictive models of system behavior and application performance, and an agent-based control network. It builds on the concept of vGrid proposed by M. Parashar and S. Hariri. The GridARM architecture provides application developers with a convenient abstraction of a virtual Grid that may be significantly larger and more reliable than the currently available resources. The autonomic runtime framework manages physical Grid resources, allocates them ``on-demand'', and spatially and temporally maps the virtual resources to these physical nodes. The mapping exploits the spatial, temporal, and functional heterogeneity of the simulations and the underlying numerical methods to define application ``working-sets''. GridARM infrastructure services are responsible for collecting and characterizing the operational, functional, and control aspects of the application; using this information to define autonomic components; decomposing the application into natural regions (NRs) and the NRs into virtual computational units (VCUs); and applying innovative allocation and scheduling strategies to map VCUs to physical Grid resources. Together, these solutions allow application developers to concentrate on the science and its formulation without having to explicitly address the number, limitations, and availability of resources, or to target and tune their implementations to specific architectures and machines.

The conceptual GridARM architecture is shown in the figure above. The framework has three components: (1) services for monitoring Grid resource capabilities and application dynamics and for characterizing the monitored state into natural regions; (2) a deduction engine and objective function that define the appropriate optimization strategy based on runtime state and policies; and (3) an autonomic runtime manager that is responsible for hierarchically partitioning, scheduling, and mapping VCUs onto virtual resource units (VRUs), and for tuning application execution within the Grid environment.

GridARM: Monitoring and Characterization

The monitoring and characterization mechanisms in the GridARM framework consist of embedded application-level and system-level sensors/actuators and are illustrated in the figure below. The application is characterized into ``natural regions'' (NRs), which are regions of relatively homogeneous activity in the application domain and can span multiple levels of the SAMR grid hierarchy. Application sensors monitor the structure and state of the SAMR grid hierarchy and the nature of the refined regions. One way to track such natural regions in SAMR applications is to use refinement patterns based on local truncation errors. The application state is abstracted using these natural regions and is characterized in terms of application-level metrics such as computation/communication requirements, storage requirements, activity dynamics, and the nature of adaptations.
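The idea of tracking natural regions from refinement patterns can be sketched as follows. This is an illustrative simplification, not the GridARM implementation: cells whose local truncation error exceeds a threshold are flagged for refinement, and contiguous flagged runs (here in 1-D) form candidate regions. The names `NaturalRegion` and `find_regions` are invented for the example.

```python
# Hypothetical sketch: cells with truncation error above a threshold are
# grouped into contiguous "natural regions" of homogeneous refinement activity.
from dataclasses import dataclass

@dataclass
class NaturalRegion:
    start: int          # first cell index in the region
    end: int            # last cell index (inclusive)
    mean_error: float   # average truncation error over the region

def find_regions(errors, threshold):
    """Group contiguous cells with error > threshold into natural regions."""
    regions, run = [], []
    for i, e in enumerate(errors):
        if e > threshold:
            run.append((i, e))
        elif run:
            idx, errs = zip(*run)
            regions.append(NaturalRegion(idx[0], idx[-1], sum(errs) / len(errs)))
            run = []
    if run:
        idx, errs = zip(*run)
        regions.append(NaturalRegion(idx[0], idx[-1], sum(errs) / len(errs)))
    return regions

# Two high-error runs yield two natural regions.
errors = [0.01, 0.9, 0.8, 0.02, 0.03, 0.7, 0.75, 0.8, 0.01]
print(find_regions(errors, threshold=0.5))
```

In the real framework, regions are multi-dimensional and span levels of the grid hierarchy, but the same idea applies: the refinement pattern, not the raw grid, drives the characterization.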

Similarly, system sensors, built on existing infrastructures such as NWS (Network Weather Service) and MDS (Metacomputing Directory Service), sense the current state of underlying computational resources in terms of CPU, memory, bandwidth, availability, and access capabilities. These are fed into the system state synthesizer along with history information (current state stored over time in the history module) and performance estimates (obtained using performance functions from the prediction module) to determine the overall system runtime state. The current application and system state are provided as inputs to the deduction engine and are used to define the autonomic runtime objective function.
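The state synthesizer's blending of sensed values, history, and predictions might look like the sketch below. The weighting scheme, window size, and class name are assumptions made for illustration; GridARM's actual synthesis draws on NWS measurements, the history module, and performance functions from the prediction module.

```python
# Hypothetical sketch of a system state synthesizer: blend the latest sensed
# value with a short history window and a predicted value to estimate the
# effective state of a resource. Weights are illustrative, not from GridARM.
from collections import deque

class StateSynthesizer:
    def __init__(self, window=5):
        self.history = deque(maxlen=window)   # stands in for the history module

    def synthesize(self, sensed, predicted, w_now=0.5, w_hist=0.3, w_pred=0.2):
        """Weighted blend of current, historical, and predicted state."""
        self.history.append(sensed)
        hist_avg = sum(self.history) / len(self.history)
        return w_now * sensed + w_hist * hist_avg + w_pred * predicted

syn = StateSynthesizer()
for load in (0.2, 0.4, 0.6):              # e.g. CPU availability sensed via NWS
    state = syn.synthesize(load, predicted=0.5)
print(round(state, 3))                    # 0.5*0.6 + 0.3*0.4 + 0.2*0.5 = 0.52
```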

GridARM: Deduction and Objective Function

The deduction engine and the autonomic runtime manager provide the primary decision-making capabilities within the GridARM framework. As shown in the figure below, the current application and system state and the overall ``decision space'' are the inputs to the deduction engine. The decision space comprises the adaptation policies, rules, and constraints defined in terms of application metrics, and enables autonomic configuration, adaptation, and optimization. Application metrics include application locality, communication mechanism, data migration, load balancing, memory requirements/constraints, adaptive partitioning, adaptation overheads, and granularity control. Based on the current runtime state and the policies/constraints within the decision space, the deduction engine formulates prescriptions for algorithms, configurations, and parameters that are used to define the objective function for adapting the behavior of the SAMR application. The deduction engine may be capable of self-learning by augmenting its decision space with new rules and constraints. The prescriptions provided by the deduction engine, together with the objective function, yield two metrics: the normalized work metric (NWM) and the normalized resource metric (NRM), which characterize the current application state and the current system state, respectively. These metrics are defined dynamically based on the current application/system context and enable autonomic runtime management by helping to configure the SAMR application with appropriate parameters so that it executes optimally within the heterogeneous Grid environment.
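The rule evaluation at the heart of such a deduction engine can be sketched as a mapping from (application state, system state) predicates to prescriptions. The specific rules, thresholds, and prescription keys below are invented for illustration; the real decision space holds the framework's policies and constraints.

```python
# Hypothetical sketch of deduction-engine rule evaluation: each rule in the
# decision space pairs a predicate over application/system state with a
# prescription (algorithm choice, parameters). All matching rules contribute.
def deduce(app_state, sys_state, rules):
    """Collect prescriptions from every rule whose predicate matches."""
    prescriptions = {}
    for predicate, prescription in rules:
        if predicate(app_state, sys_state):
            prescriptions.update(prescription)
    return prescriptions

# Illustrative decision space: highly adaptive applications get the
# hierarchical partitioner; low-bandwidth systems get coarser granularity.
rules = [
    (lambda a, s: a["adaptation_rate"] > 0.5,
     {"partitioner": "adaptive-hierarchical"}),
    (lambda a, s: s["bandwidth"] < 10.0,
     {"granularity": "coarse", "migration": False}),
]

print(deduce({"adaptation_rate": 0.7}, {"bandwidth": 5.0}, rules))
```

Self-learning would correspond to appending new (predicate, prescription) pairs to the rule list at runtime.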

GridARM: Autonomic Runtime Manager

The normalized metrics, NWM and NRM, form the inputs to the autonomic runtime manager (ARM). Using these inputs, ARM defines a hierarchical distribution mechanism, configures and deploys appropriate partitioners at each level of the hierarchy, and maps the application domain onto virtual computational units. A virtual computational unit (VCU) is the basic application work unit scheduled by the GridARM framework and may consist of computational patches on a single refinement level of the SAMR grid hierarchy or composite patches that span multiple refinement levels. VCUs are dynamically defined at runtime to match the natural regions (NRs) in the application. Using natural regions to define VCUs can significantly reduce coupling and synchronization costs.
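A normalized work metric for a set of VCUs might be derived as in the sketch below. The weighting of computation against communication load is an assumption chosen for illustration; as noted above, GridARM defines these metrics from the current application/system context and ARM objectives.

```python
# Hypothetical sketch: map each VCU's raw computation/communication load to a
# normalized work value in [0, 1], relative to the heaviest VCU. The weights
# w_comp and w_comm are illustrative stand-ins for the ARM objectives.
def normalized_work(vcus, w_comp=1.0, w_comm=0.5):
    """Return one NWM-style value per VCU, normalized to the peak load."""
    raw = [w_comp * v["compute"] + w_comm * v["comm"] for v in vcus]
    peak = max(raw)
    return [r / peak for r in raw]

vcus = [{"compute": 100, "comm": 20},   # composite patch, moderate coupling
        {"compute": 40, "comm": 80},    # communication-heavy boundary region
        {"compute": 110, "comm": 0}]    # isolated compute-bound region
print(normalized_work(vcus))
```

Changing the weights models a change in ARM objectives: with communication minimization as the goal, the second VCU's relative work rises.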

Subsequent to partitioning, scheduling operations on the virtual Grid are performed first across VRUs (Global-Grid Scheduling (GGS)) and then within a VRU (Local-Grid Scheduling (LGS)). During GGS, VCUs are hierarchically assigned to sets of VRUs, whereas LGS is used to schedule one or more VCUs within a single VRU. The entire process is first spatial and then temporal, and combines a range of partitioning techniques (domain-based, patch-based, tree-based, etc.) and scheduling techniques (gang scheduling, backfilling, migration, etc.). A virtual resource unit (VRU) may be an individual resource (compute, storage, instrument, etc.) or a collection (cluster, supercomputer, etc.) of physical Grid resources. A VRU is characterized by its computational, memory, and communication capacities as well as by its availability and access policy. Finally, the VRUs are dynamically mapped onto physical system resources at runtime and the SAMR application is tuned for execution within the dynamic Grid environment.
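The spatial GGS step can be illustrated with a simple capacity-aware greedy assignment, sketched below under stated assumptions: VCUs are ranked by normalized work (NWM) and placed on the VRU with the most remaining normalized capacity (NRM). A real scheduler would follow this with temporal LGS (gang scheduling, backfilling, etc.) inside each VRU; that step is omitted here.

```python
# Hypothetical sketch of Global-Grid Scheduling (GGS): heaviest VCUs first,
# each assigned to the VRU with the most remaining capacity. This is a toy
# greedy policy, not GridARM's actual hierarchical assignment.
def global_grid_schedule(vcu_work, vru_capacity):
    """Return {vcu index: vru index} via capacity-aware greedy assignment."""
    remaining = list(vru_capacity)
    assignment = {}
    for vcu in sorted(range(len(vcu_work)), key=lambda i: -vcu_work[i]):
        vru = max(range(len(remaining)), key=lambda j: remaining[j])
        assignment[vcu] = vru
        remaining[vru] -= vcu_work[vcu]
    return assignment

# Three VCUs (NWM values) scheduled across two VRUs (NRM capacities):
# the heaviest VCU claims the larger VRU, the rest fill in around it.
print(global_grid_schedule([0.9, 0.4, 0.6], [1.0, 0.8]))
```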

Note that the work associated with a VCU depends on the state of the computation, the configuration of the components (algorithms, parameters), and the current ARM objectives (optimize performance, minimize resource requirements, etc.). Similarly, the capability of a VRU depends on its current state as well as on the ARM objectives (e.g., when the objective is to minimize communication overheads, a VRU with high bandwidth and low latency has higher capability). The normalized metrics NWM and NRM are used to characterize VCUs and VRUs based on the current ARM objectives.

Current Status of GridARM Framework

The GridARM framework is currently under development at The Applied Software Systems Laboratory (TASSL) at Rutgers University, with current efforts focused on the design, implementation, and evaluation of the core building blocks of the framework. The application used in the experimental evaluation of the GridARM prototype components is RM3D, a 3-D Richtmyer-Meshkov instability solver encountered in compressible fluid dynamics. RM3D has been developed by Ravi Samtaney as part of the virtual test facility at the Caltech ASCI/ASAP Center. Experimental evaluations of individual prototype components of the framework have yielded improvements in overall SAMR application execution time and other application runtime parameters as compared to the results obtained using non-adaptive schemes.

Application-aware partitioning uses the current runtime state to characterize the SAMR application in terms of computation/communication requirements, application dynamics, and the nature of adaptations. This adaptive strategy selects and configures the partitioner that best matches current application requirements, improving overall execution time by 5-30% as compared to non-adaptive partitioning schemes. The adaptive hierarchical partitioning scheme dynamically creates a group topology based on SAMR natural regions and helps to reduce the synchronization costs needed to maintain the global hierarchy state, reducing application communication time by up to 70% as compared to non-hierarchical schemes. System-sensitive partitioning uses the current system state, obtained using NWS, to select and tune distribution parameters by dynamically partitioning and load balancing the SAMR grid hierarchy based on the relative capacity of each processor. In contrast to the non-heterogeneous scheme, the system-sensitive approach improves overall execution time by 10-40%.
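The core of system-sensitive partitioning, splitting the workload in proportion to relative processor capacities, can be sketched as below. Modeling each processor's capacity as a single scalar is a simplification; in practice the capacities would be synthesized from NWS measurements of CPU, memory, and bandwidth.

```python
# Hypothetical sketch of system-sensitive partitioning: divide the total SAMR
# workload among processors in proportion to their relative capacities.
def capacity_weighted_split(total_work, capacities):
    """Return each processor's share of total_work, weighted by capacity."""
    total_cap = sum(capacities)
    return [total_work * c / total_cap for c in capacities]

# Four heterogeneous processors; the fastest receives the largest share,
# in contrast to a non-heterogeneous scheme that would assign 250 each.
print(capacity_weighted_split(1000, [4.0, 2.0, 2.0, 2.0]))
```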

In addition, various optimizations have been incorporated within the GridARM framework that aim to improve SAMR application runtime parameters. Architecture-sensitive communication mechanisms select messaging schemes suited to the underlying hardware architecture and help to improve application communication time by up to 50%. The workload-sensitive load balancing strategy uses bin-packing-based partitioning to distribute the SAMR workload among available processors while satisfying application constraints such as minimum patch size and aspect ratio. This approach reduces application load imbalance to 2-15%, as compared to default schemes that employ greedy algorithms. Furthermore, performance prediction using performance functions can be used to estimate the application execution time based on current loads, available communication bandwidth, current latencies, and available memory. This approach helps to determine when the benefits of dynamic load redistribution outweigh the costs of repartitioning and data movement, and can result in a 25% improvement in application recompose time.
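A minimal sketch of the bin-packing-based distribution, under the assumption of a first-fit-decreasing policy with scalar patch loads (the real strategy also enforces minimum patch size and aspect-ratio constraints, omitted here):

```python
# Hypothetical sketch of workload-sensitive load balancing via first-fit
# decreasing bin packing: each patch, heaviest first, is placed on the
# currently least-loaded processor.
def bin_pack(patch_loads, n_procs):
    """Return (patch -> processor placement, resulting load imbalance)."""
    loads = [0.0] * n_procs
    placement = {}
    for patch in sorted(range(len(patch_loads)), key=lambda i: -patch_loads[i]):
        proc = min(range(n_procs), key=lambda p: loads[p])
        placement[patch] = proc
        loads[proc] += patch_loads[patch]
    imbalance = (max(loads) - min(loads)) / max(loads)
    return placement, imbalance

# Eight patches across two processors: first-fit decreasing balances the
# 36 units of work evenly, where a naive in-order greedy split may not.
placement, imbalance = bin_pack([8, 7, 6, 5, 4, 3, 2, 1], 2)
print(placement, round(imbalance, 3))
```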