System Sensitive Runtime Management of Adaptive Applications

Goal : Design and evaluation of an adaptive, system sensitive distribution/load balancing framework for distributed adaptive grid hierarchies that underlie parallel adaptive mesh-refinement (AMR) techniques for the solution of partial differential equations. The framework uses system capabilities and current system state to select and tune the appropriate partitioning parameters (e.g. partitioning granularity, load per processor) to maximize overall application performance.

Approach :

1) Monitor resources: System characteristics and current state are determined at run-time using an external resource monitoring tool. The resource monitoring tool gathers information about the CPU availability, memory usage and link-capacity of each processor.

2) Compute Capacities: The information obtained from the monitoring tool is used to compute a capacity metric for each processor in the heterogeneous network. We use a linear model to calculate the capacity of each processor.

 Capacity = a*CPU + b*Mem + c*Link

a, b and c are the weights associated with CPU availability, memory and link capacity respectively. These weights are specified by the application.
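
As an illustration of the linear model above, the following Python sketch computes relative capacities for a set of processors. It assumes that each measurement is first expressed as a fraction of the cluster-wide total so that the three terms are comparable, and that the result is normalized to sum to 1. The function name, the default weights and the sample values are hypothetical and are not taken from the GrACE implementation.

```python
def relative_capacities(cpu, mem, link, a=1/3, b=1/3, c=1/3):
    """Return one relative capacity per processor; the values sum to 1.

    cpu, mem, link: per-processor measurements (available CPU, free
    memory, link capacity). a, b, c: application-specified weights.
    Normalizing each resource by its cluster-wide total is an assumption
    made for this sketch.
    """
    def fractions(values):
        total = sum(values)
        return [v / total for v in values]

    cpu_f, mem_f, link_f = fractions(cpu), fractions(mem), fractions(link)

    # Linear model: Capacity = a*CPU + b*Mem + c*Link
    raw = [a * p + b * m + c * l for p, m, l in zip(cpu_f, mem_f, link_f)]

    # Normalize so the capacities can be read as shares of the total work.
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical two-processor cluster: the second has twice the free CPU.
caps = relative_capacities(cpu=[0.5, 1.0], mem=[256, 256], link=[100, 100])
```

With equal memory and link capacity, the processor with more available CPU receives the larger relative capacity; changing a, b and c shifts how strongly each resource influences the result.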

3) Partition based on capacities: The system sensitive partitioner, called ACEHeterogeneous, has been integrated into the GrACE runtime and provides adaptive partitioning and load-balancing support for AMR applications.
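
The idea of capacity-based partitioning can be sketched in a few lines of Python. This is not the ACEHeterogeneous algorithm, which operates on grid boxes and must respect additional constraints; it only shows how a total amount of work could be split in proportion to relative capacities. The function name and the total of 1000 work units are hypothetical.

```python
def assign_work(total_work, capacities):
    """Split an integer amount of work in proportion to capacities.

    capacities are relative capacities that sum to 1. Leftover units
    produced by rounding down are given to the processors with the
    largest fractional remainders.
    """
    shares = [total_work * c for c in capacities]
    assigned = [int(s) for s in shares]  # round each share down

    leftover = total_work - sum(assigned)
    by_remainder = sorted(range(len(shares)),
                          key=lambda k: shares[k] - assigned[k],
                          reverse=True)
    for k in by_remainder[:leftover]:
        assigned[k] += 1
    return assigned


# Hypothetical example: 1000 work units over four processors.
work = assign_work(1000, [0.16, 0.19, 0.31, 0.34])
```

A capacity-unaware partitioner would instead give each of the four processors the same share, regardless of how loaded it is.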

Current System Model:

[Figure: Block diagram of the system architecture]

Experimental Setup:

i) Synthetic Load Generation: To compare the two partitioning schemes, both must run under an identical experimental setup. The experiments were therefore performed in a controlled environment so that the dynamics of the system state were the same in both cases. This was achieved using a synthetic load generator that loads processors with artificial work. The load generator decreased the available memory and increased the CPU load on a processor, thus lowering its capacity to do work.

ii) Resource Monitoring: We used the Network Weather Service (NWS) resource monitoring tool to provide runtime information about system characteristics and current system state in our experiments. The NWS website can be found at http://nws.npaci.edu/NWS.

iii) Dynamic Load Sensing: The system sensitive partitioner queries NWS at runtime to sense system load, computes the current relative capacities of the processors and distributes the workload based on these capacities. The sensing frequency has to balance two factors: the dynamics of the cluster, which favor frequent sensing, and the overheads of querying NWS and computing relative capacities, which favor infrequent sensing.
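
The sensing loop described above can be sketched as follows. The sketch assumes that capacities are refreshed only at regrid steps, every sense_every regrids; query_capacities is a hypothetical stand-in for querying NWS and computing relative capacities, and fake_monitor is an invented load pattern used only to make the example runnable. No actual NWS interface is shown.

```python
def run(iterations, regrid_every, sense_every, query_capacities, total_work):
    """Simulate repartitioning with periodic capacity sensing.

    Returns the number of monitor queries and a list of
    (iteration, per-processor work) pairs, one per regrid.
    """
    capacities = query_capacities(0)  # sense once before the run starts
    queries = 1
    history = []
    for it in range(1, iterations + 1):
        if it % regrid_every != 0:
            continue  # work is only redistributed at regrid steps
        regrid_count = it // regrid_every
        if sense_every and regrid_count % sense_every == 0:
            capacities = query_capacities(it)  # refresh system state
            queries += 1
        history.append((it, [total_work * c for c in capacities]))
    return queries, history


def fake_monitor(iteration):
    """Hypothetical monitor: processor 0 becomes loaded at iteration 20."""
    return [0.5, 0.5] if iteration < 20 else [0.25, 0.75]


queries, history = run(iterations=40, regrid_every=5, sense_every=4,
                       query_capacities=fake_monitor, total_work=100)
```

A smaller sense_every tracks load changes more closely at the cost of more monitor queries; setting it to 0 senses only once, before the run starts.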

Results:

i) Application performance improvement: The total application execution time using the system sensitive partitioner and the default non-system sensitive partitioner is plotted in the figure below. In this experiment, the application was run under similar load conditions using the two partitioners. The relative capacities of the processors were calculated once before the start of the simulation. System sensitive partitioning reduced execution time by about 18% in the case of 32 nodes. We believe the improvement will be more significant for larger clusters and for clusters with greater heterogeneity and load dynamics. Furthermore, increasing the sensing frequency also improves performance, as shown in a later experiment.

ii) Load balance achieved: This experiment investigates the load assignments and the effective load balance achieved using the two partitioners. In this experiment, the relative capacities of the four processors were fixed at approximately 16%, 19%, 31% and 34%, and the application regrids every 5 iterations. The load assignments for the GrACE default (ACEComposite) and the system sensitive (ACEHeterogeneous) partitioners are plotted in the figures below. As expected, the GrACE default partitioner attempts to assign equal work to each processor irrespective of its capacity. The system sensitive partitioner, however, assigns work based on each processor's relative capacity.

The percentage of load imbalance for the GrACE default and the system sensitive partitioning schemes is plotted in the figure below. For the kth processor, the load imbalance I_k is defined as

 I_k = (|W_k - L_k| / L_k) * 100%

where W_k is the ideal work load that should have been assigned to the processor according to its capacity and L_k is the work that is actually assigned.

As expected, the GrACE default partitioner generates large load imbalances as it does not consider relative capacities. The system sensitive partitioner produces smaller imbalances. Note that the load imbalances in the case of the system sensitive partitioner are due to the constraints (minimum box size and aspect ratio) that have to be satisfied while breaking boxes.
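
The imbalance metric defined above can be computed directly from the ideal and the actual assignments. The sketch below uses the relative capacities from this experiment (16%, 19%, 31% and 34%) together with a hypothetical total of 1000 work units and an idealized equal split standing in for a capacity-unaware partitioner; these numbers are for illustration only and are not measured results.

```python
def load_imbalance(ideal, assigned):
    """Per-processor load imbalance I_k in percent.

    ideal:    W_k, the work each processor should receive for its capacity.
    assigned: L_k, the work each processor actually received (non-zero).
    """
    return [abs(w - l) / l * 100.0 for w, l in zip(ideal, assigned)]


capacities = [0.16, 0.19, 0.31, 0.34]   # relative capacities from the text
total_work = 1000                       # hypothetical amount of work

ideal = [total_work * c for c in capacities]
equal_split = [total_work / len(capacities)] * len(capacities)

imbalance_equal = load_imbalance(ideal, equal_split)
imbalance_ideal = load_imbalance(ideal, ideal)
```

Under the equal split, the imbalance is largest for the processors whose capacities lie furthest from an equal share. An assignment that matched the capacities exactly would give zero imbalance, which the box constraints mentioned above prevent in practice.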

iii) Adaptivity to load dynamics: This experiment evaluates the ability of the system sensitive partitioner to adapt to the load dynamics in the cluster, and the overheads involved in sensing the current state. In this experiment, the synthetic load generator was used on two of the processors to dynamically vary the system load. The load assignments at each processor were computed for different sensing frequencies. The figure below shows the load assignment in the case where NWS was queried once before the start of the application and twice during the application run. The figure also shows the relative capacities of the processors at each sampling. It can be seen that as the load (and hence the relative capacities) of the processors changes, the partitioning routine adapts to this change by distributing the work load accordingly. Also note that as the application adapts, the total work load to be distributed varies from one iteration to the next. As a result, the work load assigned to a particular processor differs between iterations even though the relative capacity of the processor does not change.