The University of Florida, the University of Arizona and Rutgers, the State University of New Jersey, have established a national research center for autonomic computing (CAC).

This center is funded by the Industry/University Cooperative Research Center program of the National Science Foundation, CAC members from industry and government, and university matching funds.

A Computational Engine for Financial Modeling on the Cell Broadband Engine

Principal researchers: Manish Parashar and Ciprian Docan (Rutgers University)
Current collaborators: Chris Marty (Bloomberg)
Status: Ongoing

Summary:

Computational finance addresses problems such as risk estimation and management, volatility, and option pricing from a numerical point of view. Because the results carry real weight, they call for accurate and precise mathematical models, which are usually difficult to solve analytically. A tractable alternative is numerical simulation, e.g., Monte Carlo or recursive binomial algorithms. However, reaching the desired accuracy requires a large number of simulations, which take a significant amount of time. Existing solutions reduce the execution time by parallelizing the numerical algorithms and running them on clusters or grids of commodity computers. The hardware and maintenance costs, the failure rate of commodity clusters, and the heat dissipation and cooling costs motivate the search for alternative solutions.
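To make the cost of simulation concrete, here is a minimal Monte Carlo sketch for one of the problems named above, pricing a European call option. It is an illustration only, not the project's code: the function name `mc_european_call`, the geometric Brownian motion model, and all parameter values are assumptions chosen for the example. The point it shows is that the standard error shrinks only with the square root of the number of paths, so each extra digit of accuracy costs roughly 100 times more simulations.

```python
import math
import random


def mc_european_call(s0, strike, rate, sigma, maturity, n_paths, seed=0):
    """Estimate a European call price by Monte Carlo simulation.

    Assumes the underlying follows geometric Brownian motion (an assumption
    made for this sketch). Returns (price_estimate, standard_error).
    """
    rng = random.Random(seed)
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    discount = math.exp(-rate * maturity)

    total = 0.0
    total_sq = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)                      # one standard normal draw
        terminal = s0 * math.exp(drift + vol * z)    # simulated price at maturity
        payoff = discount * max(terminal - strike, 0.0)
        total += payoff
        total_sq += payoff * payoff

    mean = total / n_paths
    variance = max(total_sq / n_paths - mean * mean, 0.0)
    return mean, math.sqrt(variance / n_paths)


if __name__ == "__main__":
    # Hypothetical inputs: spot 100, strike 100, 5% rate, 20% volatility, 1 year.
    for n in (1_000, 100_000):
        price, err = mc_european_call(100.0, 100.0, 0.05, 0.2, 1.0, n)
        print(f"{n:>7} paths: price ~ {price:.3f}, standard error ~ {err:.3f}")
```

Every path is independent of the others, which is why this kind of workload parallelizes well across many cores.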

Heterogeneous multi-core architectures such as the IBM Cell Broadband Engine (CBE) represent an attractive hardware solution at a lower price and in a more compact size. The CBE has a dual-threaded main PowerPC unit and eight additional Synergistic Processing Elements (SPEs) that are used as accelerators. The PPC architecture consumes less power, produces less heat and has better virtualization support. The SPEs of the CBE have vector units that can operate on up to four single-precision values at once. A CBE can deliver 204.8 Gflops in single precision mode, and 14.6 Gflops in double precision mode. The SPEs and the main unit are interconnected by a fast bus that alleviates the inter-processor communication problem. However, this novel architecture also brings new challenges, such as a new programming model, efficient data and computation scheduling, and vector unit programming.
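The single-precision peak figure can be reproduced with simple arithmetic. The sketch below rests on assumptions that are not stated in the text: a 3.2 GHz clock, four single-precision lanes per SPE vector unit, and a fused multiply-add counted as two floating-point operations per lane. The function name `peak_gflops` is hypothetical.

```python
def peak_gflops(n_spes, clock_ghz, simd_lanes, flops_per_lane):
    """Theoretical peak throughput in Gflops for a set of identical vector units."""
    return n_spes * clock_ghz * simd_lanes * flops_per_lane


# Assumed values: 8 SPEs, 3.2 GHz clock, 4 single-precision lanes,
# 2 floating-point operations per lane per cycle (fused multiply-add).
single_precision_peak = peak_gflops(8, 3.2, 4, 2)
print(f"single-precision peak ~ {single_precision_peak:.1f} Gflops")
```

This is a theoretical ceiling; it is reached only if every SPE issues a vector multiply-add on every cycle, which is why keeping the SPEs fed with data matters so much.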

In the current work, we have designed an autonomic computational engine to support financial modeling, such as the Value at Risk method, on the CBE architecture. To utilize the CBE features efficiently, we have to keep the SPE accelerators busy at all times by feeding them continuously with data and instructions to execute. To achieve optimal resource usage, the engine specifically addresses data pre-fetching and multi-buffering techniques. We have developed a pipelined computational model that uses multi-buffering and asynchronous Direct Memory Access (DMA) data transfers to overlap computations with data transfers. This model allows us to hide the data transfer latencies and to keep the computational pipeline full. The parallel algorithm divides the input data space into smaller blocks of fixed size, i.e., pages, and shares the workload between the available SPEs. In addition to the data parallel approach, we implemented an optimized parallel version that uses instruction-level parallelism, i.e., uses the vector units to perform multiple operations per CPU cycle.
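The paging and multi-buffering scheme described above can be sketched in a few lines. This is a structural illustration under stated assumptions, not the engine itself: Python threads stand in for the SPEs, a background `fetch` call stands in for an asynchronous DMA transfer, the kernel in `compute` is a placeholder, and the round-robin assignment of pages to workers is an assumption (the text says only that the workload is shared). All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_pages(data, page_size):
    """Divide the input into fixed-size blocks ("pages"); the last may be shorter."""
    return [data[i:i + page_size] for i in range(0, len(data), page_size)]


def fetch(page):
    """Stand-in for an asynchronous DMA transfer into a worker's local buffer."""
    return list(page)


def compute(buffer):
    """Placeholder kernel: sum of squares over one page."""
    return sum(x * x for x in buffer)


def worker(pages):
    """Process pages with two buffers: compute on one while the next is fetched."""
    if not pages:
        return 0
    total = 0
    with ThreadPoolExecutor(max_workers=1) as dma:
        pending = dma.submit(fetch, pages[0])      # prefetch the first page
        for next_page in pages[1:]:
            current = pending.result()             # wait for the transfer to land
            pending = dma.submit(fetch, next_page) # start the next transfer...
            total += compute(current)              # ...while computing on this page
        total += compute(pending.result())         # drain the last buffer
    return total


def run_engine(data, page_size, n_workers):
    """Share pages between workers and combine their partial results."""
    pages = split_into_pages(data, page_size)
    shares = [pages[w::n_workers] for w in range(n_workers)]  # round-robin (assumed)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(worker, shares))


if __name__ == "__main__":
    print(run_engine(list(range(1000)), page_size=64, n_workers=8))
```

Python threads will not show a real speedup here; the sketch only shows the ordering that lets a transfer for page k+1 proceed while page k is being computed, which is the property that hides transfer latency on the real hardware.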

We tested the prototype computational engine and the models for multi-buffering and risk measurement on two platforms powered by CBE processors: an IBM QS22 blade and a PlayStation 3 gaming console. The results show good scalability with the number of available SPEs. Current efforts focus on designing parallel versions of pricing algorithms, and part of the future work is to extend this approach into an end-to-end solution.