Personal and handheld computers are widely available, but they are far less powerful than supercomputers, which is a challenge for scientists and researchers who run complex calculations on them. Supercomputers, on the other hand, are much more powerful but are not widely available because of factors such as size and cost. Even when a supercomputer is available, it is complicated to manage and use. An ideal solution would therefore harness the power of a supercomputer while providing the ease of use and availability of a personal or handheld computer.
The iCode system combines the ease of use and accessibility of a personal computer with the computing power of a supercomputer. Using MATLAB as an easy and accessible front end and a supercomputer as a powerful back-end engine, the system lets users perform complex computations from their personal computers. Through the DISCOVER [1] infrastructure, the system also provides interactive monitoring and dynamic steering of a submitted job from a web browser. The iCode system manages supercomputer resources efficiently by using the Comet [2] infrastructure, which allocates resources to jobs under a demand-based policy and serves jobs on a first-come, first-served basis. Comet also splits a given job into smaller jobs that can run in parallel on the supercomputer.
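The scheduling behavior described above can be illustrated with a minimal Python sketch. It is not Comet's implementation: the names `split_job` and `run_fcfs` are hypothetical, a local thread pool stands in for the parallel machine, and the demand-based allocation policy is not modeled. The sketch only shows jobs being served in arrival order, with each job split into smaller pieces that run in parallel before their results are merged.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def split_job(inputs, n_parts):
    """Split a list of inputs into at most n_parts contiguous chunks."""
    n_parts = max(1, min(n_parts, len(inputs)))
    size, extra = divmod(len(inputs), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks receive one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(inputs[start:end])
        start = end
    return chunks


def run_fcfs(jobs, n_workers=4):
    """Serve (name, func, inputs) jobs in arrival order.

    Each job is split into chunks that are evaluated in parallel,
    and the partial results are merged back in input order.
    """
    pending = deque(jobs)  # first-come, first-served
    results = []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        while pending:
            name, func, inputs = pending.popleft()
            chunks = split_job(inputs, n_workers)
            partials = pool.map(lambda chunk, f=func: [f(x) for x in chunk], chunks)
            merged = [y for part in partials for y in part]
            results.append((name, merged))
    return results
```

Calling `run_fcfs([("squares", lambda x: x * x, [1, 2, 3, 4, 5])], n_workers=2)` would process the single job as two chunks and return the squared values in their original order.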
The iCode system uses event-driven control. In this control scheme, the main loop waits for an external event to occur. Whenever an event becomes available, it is dispatched to the appropriate object based on the information associated with the event. The user enters code comprising a function name, the required input, and the kind of output needed, and submits it to the MATLAB subsystem. CometServer is initially in a listening state. The MATLAB subsystem sends the job request to CometServer. On receipt of a request, CometServer creates a CometMaster process and submits the task to it. CometMaster in turn creates processes called CometWorkers and assigns jobs to each of them. The CometWorkers trigger the creation of ComputeNode processes for job execution. The ComputeNodes then send the result to JobMonitor, which is in a listening state, and the result is forwarded to the user's MonitoringPanel. If the user wants to change some job parameters or correct an error in real time, the user sends the request to the MonitoringPanel. The MonitoringPanel submits this request to JobMonitor, which forwards it to the ComputeNodes. When the ComputeNodes finish executing the job, they send the result back to their CometWorker and terminate. The CometWorkers submit their job results to CometMaster. CometMaster integrates all the results it receives, kills all the workers it created, and returns the result to CometServer. CometServer then forwards the result to MATLAB and terminates the connection with CometMaster.
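The event-driven flow described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the iCode code: the `EventLoop` class, the handler names, and the `scale` parameter are all hypothetical, everything runs in a single process, and the ComputeNode layer is folded into the worker handler for brevity. The sketch shows a main loop dispatching events to registered components, a master splitting a job among workers and integrating their partial results, and a steering request changing a job parameter while the job is in flight.

```python
import queue


class EventLoop:
    """Minimal event loop: dispatches queued events to handlers by target name."""

    def __init__(self):
        self.events = queue.Queue()
        self.handlers = {}

    def register(self, target, handler):
        self.handlers[target] = handler

    def post(self, target, payload):
        self.events.put((target, payload))

    def run(self):
        # Drain the queue; a real server would block and wait for new events.
        while not self.events.empty():
            target, payload = self.events.get()
            self.handlers[target](payload)


def build_system(loop, n_workers=2):
    """Register hypothetical stand-ins for the components named in the text."""
    state = {"params": {"scale": 1}, "partials": [], "expected": 0, "result": None}

    def comet_server(inputs):
        # Accept a job request and hand it to a master.
        loop.post("master", inputs)

    def comet_master(inputs):
        # Split the job and assign one piece to each worker.
        chunks = [inputs[i::n_workers] for i in range(n_workers)]
        state["expected"] = len(chunks)
        for chunk in chunks:
            loop.post("worker", chunk)

    def comet_worker(chunk):
        # Compute a partial result using the current (steerable) parameters.
        scale = state["params"]["scale"]
        loop.post("collect", sum(scale * x for x in chunk))

    def collect(partial):
        # Master-side integration of the partial results.
        state["partials"].append(partial)
        if len(state["partials"]) == state["expected"]:
            state["result"] = sum(state["partials"])

    def job_monitor(update):
        # Steering request: change a job parameter before the workers run.
        state["params"].update(update)

    for name, handler in [("server", comet_server), ("master", comet_master),
                          ("worker", comet_worker), ("collect", collect),
                          ("steer", job_monitor)]:
        loop.register(name, handler)
    return state


def demo():
    loop = EventLoop()
    state = build_system(loop)
    loop.post("server", [1, 2, 3, 4])   # job submission
    loop.post("steer", {"scale": 10})   # steering request arrives mid-flight
    loop.run()
    return state["result"]
```

In `demo()`, the steering event is dispatched before the workers start, so the partial sums are computed with the updated `scale` value rather than the initial one.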
By combining an accessible, simple, and flexible interface, a supercomputer, an optimized solution for running parallel applications, and the ability to monitor and steer calculations in real time, we give users a fast way to solve complex, compute-intensive problems. The system we introduce is a framework that integrates software packages developed at the National Science Foundation Cloud and Autonomic Computing Center at Rutgers University with a MATLAB front end to run efficiently on a supercomputer. The system allows users to run MATLAB functions and stand-alone applications on a supercomputer, and it optimizes each run to use the supercomputer's resources efficiently. The system also gives multiple users, connecting from different locations, the ability to collaboratively monitor and steer the calculation in real time using a web browser.
Future work on this project involves integrating Amazon EC2, mobile devices, and other desktop software into the current infrastructure. The resulting infrastructure would have cloud and supercomputing resources as the back-end engine and desktop and mobile platforms as the front end. Using both cloud and supercomputing resources as the back end would combine the advantages of the two infrastructures and eliminate some of the disadvantages of relying on only one of them. The desktop platform brings ease of use, while the mobile platform brings accessibility. Future work also involves optimizing DISCOVER. DISCOVER is built on the DIOS library, which was built using MPI. With the introduction of Blue Gene/P (BG/P), IBM introduced a lower-level messaging library, DCMF. We believe that using DCMF one-sided communication will improve the functionality of DIOS and make interactive steering and monitoring of scientific applications faster.