Main memory caching of web documents

The widespread use of the World Wide Web and the growing amount of information available through it
place a heavy load on web servers, much of it in the form of disk accesses. The most popular files
can be cached in the server's main memory, reducing the number of disk accesses. Traditional file
system management methods perform poorly when used to manage such main memory caches, for reasons
that include differences in granularity, intervention by the operating system, and the access mode
of documents. The paper discusses mechanisms for caching a World Wide Web server's documents in the
server's main memory.
Frequently accessed documents are kept in a main memory cache inside the web server's address space.
When a requested document is in the server's address space, the server accesses it without help
from the file system. The time to receive a document is dominated by the server's latency. The
experimental analysis used a number of traces consisting of requests gathered from different
supercomputing centers. Both the performance of a caching policy and the choice of the best policy
were affected by the average size of the requested documents. Cache hit rate is measured in two
ways: document hit rate and byte hit rate. Document hit rate is the ratio of the number of documents
found in the cache to the number of documents requested. Byte hit rate is the ratio of the number of
bytes served from the cache to the number of bytes requested by clients. The higher the document hit
rate, the more client requests see low latency. Under the threshold caching policy, a document is
cached only if its size is smaller than a given threshold. To eliminate the need for careful
off-line tuning and to respond to varying access patterns, an adaptive caching policy that estimates
the best threshold at run time is used.
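
The two hit rates and the threshold policy described above can be illustrated with a short Python
sketch. It is only a sketch under stated assumptions: the names HitStats and
simulate_threshold_cache, the (url, size) trace format, and the unbounded cache capacity are
chosen for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class HitStats:
    """Counters needed to compute document hit rate and byte hit rate."""
    doc_hits: int = 0
    doc_requests: int = 0
    byte_hits: int = 0
    byte_requests: int = 0

    @property
    def document_hit_rate(self) -> float:
        # documents found in the cache / documents requested
        return self.doc_hits / self.doc_requests if self.doc_requests else 0.0

    @property
    def byte_hit_rate(self) -> float:
        # bytes served from the cache / bytes requested
        return self.byte_hits / self.byte_requests if self.byte_requests else 0.0


def simulate_threshold_cache(trace, threshold):
    """Replay a trace of (url, size_in_bytes) requests through a threshold cache.

    Assumption for this sketch: the cache has unbounded capacity, so the
    only admission rule is "cache the document if size < threshold".
    """
    cached = set()
    stats = HitStats()
    for url, size in trace:
        stats.doc_requests += 1
        stats.byte_requests += size
        if url in cached:
            stats.doc_hits += 1
            stats.byte_hits += size
        elif size < threshold:
            cached.add(url)  # a miss on a small document admits it to the cache
    return stats
```

For a trace that alternates a small, popular document with a large one, a threshold between the
two sizes would give a high document hit rate but a low byte hit rate, which is why the two
measures are reported separately.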
The adaptive policy performs close to the best, and it should also be explored for caching large
documents. Because it has no static threshold, it adapts to dynamically changing access patterns and
can choose a suitable threshold in each case. Since the documents that are frequently accessed are
small, caching smaller documents is much more useful. The paper thus shows that a dynamic caching
policy should be preferred over static ones. Future work should try implementing the technique on a
larger scale of the web.
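
The summary does not say how the adaptive policy estimates its threshold, so the following Python
sketch shows only one hypothetical way to do it: replay a window of recent requests against a few
candidate thresholds and keep the one with the most document hits. The function names, the LRU
eviction rule, the bounded capacity, and the candidate list are all assumptions made for this
example, not the paper's algorithm.

```python
from collections import OrderedDict


def replay_hits(requests, threshold, capacity):
    """Count document hits for (url, size) requests replayed through an
    LRU cache of `capacity` bytes that admits only documents with
    size < threshold. (LRU eviction is an assumption of this sketch.)"""
    cache = OrderedDict()  # url -> size, least recently used first
    used = 0
    hits = 0
    for url, size in requests:
        if url in cache:
            hits += 1
            cache.move_to_end(url)  # mark as most recently used
        elif size < threshold and size <= capacity:
            # Evict least recently used documents until the new one fits.
            while used + size > capacity:
                _, evicted_size = cache.popitem(last=False)
                used -= evicted_size
            cache[url] = size
            used += size
    return hits


def pick_threshold(recent_requests, candidates, capacity):
    """Return the candidate threshold with the most hits on the recent
    window. On a tie, the candidate listed first is returned."""
    return max(candidates,
               key=lambda t: replay_hits(recent_requests, t, capacity))
```

A server using this idea might call pick_threshold periodically on its most recent requests and
apply the returned value until the next re-estimation, so that no threshold has to be tuned
off-line.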