Overview and Motivation:

 

The growing demands on the Internet, driven by its exploding popularity and usage, will be met in part by effective use of caching. Internet caches serve a dual purpose. First, they reduce access latency for objects that are shared by multiple clients or accessed repeatedly by a single client. Second, they improve performance for all users by reducing load on the network backbone and on servers. Given the ever-increasing number of users and the highly parallel nature of the workloads, cluster-based Internet services offer two primary benefits: incremental scalability and high availability. By trading consistency for availability and scalability, clusters of commodity PCs can easily dwarf single large machines such as SMPs. Clusters of workstations also have fundamental properties that can be exploited to meet the demands of Internet services: their inherent redundancy can be used to mask transient failures, and the combined memories of the cluster nodes can be used as a single large cache, which can prove extremely profitable.

 

Objective

 

The web-caching project is aimed at studying trade-offs between different schemes for cache integration and management to maximize throughput, hit rate, byte hit ratio, availability, and scalability.
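
Hit rate and byte hit ratio can diverge noticeably, because one counts requests and the other counts bytes. The following is a minimal sketch of how the two metrics might be computed from a request trace. The `Request` record, the function names, and the sample trace are hypothetical and chosen only for illustration; they do not come from the project's code or from any benchmark.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """One entry of a (hypothetical) proxy request trace."""
    url: str
    size_bytes: int
    hit: bool  # True if the object was served from the cache


def hit_rate(requests: List[Request]) -> float:
    """Fraction of requests that were served from the cache."""
    if not requests:
        return 0.0
    return sum(1 for r in requests if r.hit) / len(requests)


def byte_hit_ratio(requests: List[Request]) -> float:
    """Fraction of requested bytes that were served from the cache."""
    total = sum(r.size_bytes for r in requests)
    if total == 0:
        return 0.0
    return sum(r.size_bytes for r in requests if r.hit) / total


# Illustrative trace: three small hits and one large miss.
trace = [
    Request("/index.html", 1_000, True),
    Request("/logo.gif", 1_000, True),
    Request("/video.mpg", 8_000, False),
    Request("/index.html", 1_000, True),
]

print(f"hit rate:       {hit_rate(trace):.2f}")
print(f"byte hit ratio: {byte_hit_ratio(trace):.2f}")
```

In this made-up trace three of four requests hit, yet the single large miss accounts for most of the bytes, so a scheme tuned for hit rate alone need not reduce backbone traffic proportionally.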

 

Status

A Windows NT cluster with

·        2 Xeon servers (4 x 450 MHz processors, 4 GB RAM, 2 x 4 GB SCSI U/W disks, NT 4.0 Enterprise Edition)

·        4 HP Vectra desktop PCs running NT 4.0 Workstation (500 MHz processor, 256 MB RAM, 8.4 GB hard disk)

·        2 x 20 GB DLT Tape Drive

·        4 OSM3000 RAID units with 7 x 18 GB disks each (a total of 28 18 GB drives)

is configured as a cluster-based proxy (caching) server. Experiments with community benchmark workloads such as WebBench are being carried out for different cache sizes on different nodes.

 

Planned Course of Action:

As part of this ongoing activity, we plan to evaluate the following on community benchmark workloads:

·        A strict and a loose implementation of the cache integration scheme. A strict implementation avoids replication of files, whereas a loose implementation allows replication to some extent, for better load balancing and a more even distribution of files.

·        Mean access latencies when the caching scheme is implemented at the disk (RAID) level and at the memory level. Memory-level caching uses the memories of remote peer clients and thereby avoids disk accesses.

·        The overhead of implementing a strong consistency mechanism at the proxy level in local area environments.

·        The performance of the caching scheme with a single caching server and with distributed caching servers.

The overall goal of this paper is to provide insight into optimal ways of configuring proxies in local area environments.
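
The difference between the strict and loose cache integration schemes listed above can be illustrated with a small placement sketch. This is only one possible reading, under stated assumptions: each URL is assigned a home node by hashing, the strict scheme keeps a single copy on that node, and the loose scheme allows copies on up to `max_replicas` consecutive nodes and routes each request to the least-loaded one. The function names, the hashing choice, and the `load` table are hypothetical and are not taken from the project's implementation.

```python
import hashlib
from typing import Dict, List


def stable_hash(key: str) -> int:
    """Deterministic hash of a URL (assumed placement function)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def strict_home(url: str, nodes: List[str]) -> str:
    """Strict scheme: exactly one node may cache the object."""
    return nodes[stable_hash(url) % len(nodes)]


def loose_candidates(url: str, nodes: List[str], max_replicas: int = 2) -> List[str]:
    """Loose scheme: the home node plus its successors may hold copies."""
    start = stable_hash(url) % len(nodes)
    count = min(max_replicas, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]


def pick_loose(url: str, nodes: List[str], load: Dict[str, int],
               max_replicas: int = 2) -> str:
    """Route the request to the least-loaded node allowed to hold a copy."""
    candidates = loose_candidates(url, nodes, max_replicas)
    return min(candidates, key=lambda n: load.get(n, 0))


# Illustrative four-node cluster with one overloaded home node.
nodes = ["node0", "node1", "node2", "node3"]
url = "http://example.com/index.html"
home = strict_home(url, nodes)
load = {n: 0 for n in nodes}
load[home] = 10  # pretend the home node is busy

print("strict:", home)
print("loose: ", pick_loose(url, nodes, load))
```

Under these assumptions the strict scheme uses the aggregate cache space most efficiently, since no object is stored twice, while the loose scheme spends some space on replicas in exchange for the ability to steer requests away from a busy node.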

 

Publications

 

Book Contribution

·        N. Shaha and M. Parashar, "Shared Memory Multiprocessors," in Encyclopedia of Electrical and Electronics Engineering, Editor: J.G. Webster, John Wiley and Sons Inc. (April 2000, estimated) (PDF)

Survey Papers

·        "Cache Coherence in Multiprocessors", prepared as course work for Computer Architecture I, Fall 1999. (PDF)

·        "Distributed Shared Memory Systems," prepared for special problems course in Distributed Computing, Fall 1999. (PDF)