An Efficient Cache Strategy in Structured Peer-to-Peer Networks

2011 International Conference on Software and Computer Applications
IPCSIT vol.9 (2011) © (2011) IACSIT Press, Singapore
Shin-Yi Chou and Yu-Wei Chen +
Graduate Institute of Information and Logistics Management
National Taipei University of Technology
Taipei, Taiwan
Abstract. In this paper, we propose a cache strategy that is suitable for decentralized structured peer-to-peer (P2P) networks. In our cache strategy, each peer maintains a list of its previous search
results. By exploring its list, each peer can reduce most search times and enhance query efficiency.
Keywords: Peer-to-Peer, decentralized structure, cache
1. Introduction
The use of P2P applications is growing dramatically, particularly for sharing files and software. With the
recent widespread deployment of P2P technologies, P2P computing is attracting increasing attention. Many
P2P systems have emerged recently as platforms for users to search and share information over the Internet.
P2P systems can be classified into three categories: centralized, decentralized-unstructured, and
decentralized-structured systems. A centralized P2P system has a central server which maintains a directory
containing content information for the whole P2P system. A popular example of a centralized system is
Napster [1]. However, centralized systems are prone to the single point of failure problem. A decentralized P2P
system has no central server. Instead, queries are distributed among the nodes in the system, and the nodes are
connected in an ad hoc network topology. Each node can send messages to other nodes. Generally speaking,
decentralized P2P systems can be divided into two types: unstructured and structured systems. In an
unstructured system such as Gnutella [2], the search mechanism floods query messages to neighboring
computers within a time-to-live (TTL) limit, so it is not scalable. This problem has been studied
extensively, for example in [3], where Sripanidkulchai et al. proposed a content location solution in which peers are loosely
organized to form an interest-based structure on top of the Gnutella network. Chen et al. [4] developed
optimal all-to-all broadcast schemes for the case of one-port communication that not only require the
minimal number of communication steps, but also incur the minimal number of messages. Structured
systems use a Distributed Hash Table (DHT) strategy. Each node maintains part of the information and shares
it with the other nodes, so fault tolerance and performance scalability are increased. Chord [5], Tapestry
[6], Pastry [7], and CAN [8] are well-known structured P2P networks. These works focus on providing one
fundamental lookup service. In addition, the explicit network topology can limit the logical distance between
two arbitrary computers to an upper bound.
This paper proposes a strategy to improve the search efficiency of structured P2P systems. We propose a
cache strategy which is suitable for decentralized structured peer-to-peer networks. In our cache strategy, each
peer maintains a list of its previous search results. Each peer can reduce most search times by
exploring its list. On the system side, the whole system reduces its bandwidth load; on the user side, the strategy
enhances query efficiency.
+ Corresponding author. Tel.: +(886-2) 2771-2171 #2364; fax: +(886-2) 8772-6946.
E-mail address: t8938012@ntut.edu.tw (Shin-Yi Chou), ywchen@ntut.edu.tw (Yu-Wei Chen).
The remainder of this paper is organized as follows: Some related works are introduced in Section 2. The
design of cache list is presented in Section 3. In Section 4, conclusions are presented.
2. Related Work
Recently, many researchers have designed strategies on top of existing P2P networks such as
Chord [5], Tapestry [6], and Pastry [7] to enhance the efficiency of search. A popular strategy is to distribute
part of the information to each node to increase scalability and fault tolerance and to search more efficiently.
The Cooperative File System (CFS) [9] is a P2P storage system that provides provable
guarantees for the efficiency and load balance of file storage. CFS servers provide a distributed hash table
(DHash) for block storage. DHash distributes and caches blocks at a fine granularity to achieve load balance
and decreases latency with server selection. DHash finds blocks using the Chord location protocol, which
operates in time logarithmic in the number of servers. In PAST [10], a large-scale P2P persistent storage
utility, Rowstron and Druschel present and evaluate storage management and caching. In the PAST system,
storage nodes and files are each assigned uniformly distributed identifiers, and replicas of a file are stored at
the nodes whose identifiers match the file’s identifier most closely.
Cache technology for P2P traffic has been extensively studied. Several measurement studies [11, 12] examine user
download and upload behavior. Gummadi et al. [11] probe deeply into modern P2P file sharing
systems. They analyzed P2P file sharing traffic in order to dig deeper into the nature of the file sharing workload.
Their results show that user behavior causes the popularity distribution in P2P systems to deviate substantially from
Zipf curves. Sen and Wang [12] characterize P2P traffic. They observe a highly skewed distribution in the
traffic across P2P file sharing systems.
3. Proposed Cache Strategy
Cache strategies have been widely used to reduce search latency and network traffic. In this
section, we present a client-side caching strategy. When a client issues a request, it connects directly to a
frequently contacted node according to the caching strategy, which reduces the search time.
3.1. Node Definition
We define the following node types to clarify the discussion in the next sections.
(Local) node: A general user that joins the P2P environment is called a local node.
Cache node: A node that is recorded in the cache list of a local node is called a cache node.
Fig. 1: The Cache List in the System.
Figure 1 shows an example of these node definitions. Taking N2 as an example, N2 is a local node, and N8,
N20, and N24 are cache nodes since they are listed in the cache of N2.
3.2. Cache List
We design a cache list in each node. The cache list contains two types of information:
• key value of the cache node
• cache content list
We need to ensure the correctness of the cache path and determine whether a cache node matches the request or not.
When a node finds a cache node in the list, it uses the key value to locate that cache node directly.
This can effectively reduce the routing time.
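The cache list described above can be sketched as a small data structure. This is only an illustrative sketch, not the authors' implementation: the names `CacheEntry`, `CacheList`, `record`, `find`, and `remove` are hypothetical, and content is assumed to be identified by a plain string. The node keys 8 and 20 follow the example of N2 in Figure 1.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class CacheEntry:
    """One cache node as seen by the local node (field names are assumptions)."""
    node_key: int                                     # key value of the cache node
    contents: Set[str] = field(default_factory=set)   # cache content list


class CacheList:
    """Per-node cache list, indexed by the key value of each cache node."""

    def __init__(self) -> None:
        self._entries: Dict[int, CacheEntry] = {}

    def record(self, node_key: int, content_id: str) -> None:
        """Remember that a previous search found content_id at node_key."""
        entry = self._entries.setdefault(node_key, CacheEntry(node_key))
        entry.contents.add(content_id)

    def find(self, content_id: str) -> Optional[int]:
        """Return the key of a cache node holding content_id, or None."""
        for entry in self._entries.values():
            if content_id in entry.contents:
                return entry.node_key
        return None

    def remove(self, node_key: int) -> None:
        """Forget a cache node, for example after it leaves the system."""
        self._entries.pop(node_key, None)


# Hypothetical usage for local node N2 with cache nodes N8 and N20.
cache = CacheList()
cache.record(8, "song.mp3")
cache.record(20, "paper.pdf")
```

A linear scan in `find` keeps the sketch short; a real node would more likely keep a reverse index from content to node key if the list grows large.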
3.3. Join and Leave
In a P2P file system, node joins and leaves are highly variable. When a node joins a structured P2P file
system, it must first find out the system distribution. When leaving the system, the node should
update the node information to ensure that the system continues to operate. In the cache strategy, the join process is
similar to that of the general structured P2P system. Each node dynamically maintains its DHT information.
In addition, the local node builds a buffer space to store the cache and creates a cache list and a timer.
When a node leaves the system, the structured network is not affected, since the cache list is stored in the local node.
In this case, the node just sends a “leave message” to its cache nodes so that the cache nodes
know that it has left. Sometimes the “leave message” fails because of incorrect user behavior or a network
bandwidth problem. We call this an “accident leave”. In this case, the nodes in the
cache list are not notified, so they do not receive up-to-date information about the node. Therefore, a node
periodically sends a “confirm message” to the cache nodes in its cache list in order to confirm the
state of the cache nodes. The node immediately updates its cache information when a cache node joins
or leaves.
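The maintenance behavior above can be illustrated with a short sketch. It is written under several assumptions that the paper does not specify: the class `LocalNode`, its methods, the 30-second default interval, and the `is_alive` callback (which stands in for sending a “confirm message” and waiting for a reply) are all hypothetical. The sketch also assumes that a node receiving a “leave message” simply drops the sender from its own cache list.

```python
from typing import Callable, Dict, List, Set


class LocalNode:
    """Local node that keeps its cache list fresh (all names are assumptions)."""

    def __init__(self, key: int, confirm_interval: float = 30.0) -> None:
        self.key = key
        self.cache_list: Dict[int, Set[str]] = {}   # cache node key -> content ids
        self.confirm_interval = confirm_interval    # timer period in seconds (assumed)
        self.last_confirm = 0.0

    def on_leave_message(self, node_key: int) -> None:
        """A node announced a normal leave: drop it from the cache list."""
        self.cache_list.pop(node_key, None)

    def confirm_due(self, now: float) -> bool:
        """True when the timer says it is time to send confirm messages."""
        return now - self.last_confirm >= self.confirm_interval

    def confirm_cache_nodes(self, is_alive: Callable[[int], bool],
                            now: float) -> List[int]:
        """Probe every cache node and evict those that do not answer.

        This covers the "accident leave" case, where no leave message arrived.
        """
        dead = [k for k in self.cache_list if not is_alive(k)]
        for k in dead:
            del self.cache_list[k]
        self.last_confirm = now
        return dead


# Hypothetical usage: N2 caches N8, N20, and N24; N20 has left by accident.
node = LocalNode(2)
node.cache_list = {8: {"a"}, 20: {"b"}, 24: {"c"}}
if node.confirm_due(now=100.0):
    evicted = node.confirm_cache_nodes(lambda k: k != 20, now=100.0)
```

Passing the current time in as a parameter keeps the sketch deterministic; a deployed node would read a clock and send real network messages instead.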
3.4. Search Protocol
Figure 2 shows the cache strategy in the P2P file system. Each node in the system is configured with a cache list,
and we define the routing protocol as follows:
Step 1: Search the cache list and confirm whether the desired contents exist in the cache list or not.
Step 2: If the desired contents exist in the cache, the node connects directly to the cache node.
Step 3: If the desired contents do not exist in the cache, the original search protocol is executed.
Fig. 2: The Cache Strategy in the System.
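The three steps of the search protocol can be sketched as a single function. This is a minimal sketch under stated assumptions: the function name `search`, the dictionary representation of the cache list, and the `dht_lookup` callback (which stands in for the original lookup of the underlying structured network, for example a Chord lookup) are hypothetical. Recording the result of a fallback lookup follows the statement that each peer keeps a list of its previous search results.

```python
from typing import Callable, Dict, Optional, Set, Tuple


def search(content_id: str,
           cache_list: Dict[int, Set[str]],
           dht_lookup: Callable[[str], Optional[int]]) -> Tuple[Optional[int], bool]:
    """Return (node key holding the content, whether the cache answered)."""
    # Step 1: search the cache list for the desired content.
    for node_key, contents in cache_list.items():
        if content_id in contents:
            # Step 2: cache hit, so connect directly to the cache node.
            return node_key, True

    # Step 3: cache miss, so fall back to the original search protocol.
    node_key = dht_lookup(content_id)
    if node_key is not None:
        # Remember the result so that the next query for it is a cache hit.
        cache_list.setdefault(node_key, set()).add(content_id)
    return node_key, False


# Hypothetical usage: a stand-in lookup that always resolves to node 24.
lookups = []


def fake_dht_lookup(content_id: str) -> Optional[int]:
    lookups.append(content_id)
    return 24


n2_cache = {8: {"a.txt"}}
first = search("a.txt", n2_cache, fake_dht_lookup)    # answered from the cache
second = search("b.txt", n2_cache, fake_dht_lookup)   # falls back to the lookup
third = search("b.txt", n2_cache, fake_dht_lookup)    # now answered from the cache
```

Only the second call reaches the underlying lookup, which is the source of the reduced routing time that the strategy aims for.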
4. Conclusion
In this paper, we proposed a new cache strategy for decentralized P2P systems. The cache strategy can
be embedded in any of the decentralized P2P structures. In our cache strategy, each peer maintains a list
of its previous search results. User behavior, e.g., locality, increases the cache hit ratio. Each peer
can reduce search times and enhance query efficiency by exploring its list. In future work, we will design
cache strategies on top of different structures.
5. References
[1] The Napster homepage, http://www.napster.com
[2] Clip2.com, The Gnutella Protocol Specification V0.4, http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf, Mar. 2001.
[3] K. Sripanidkulchai, B. Maggs, and H. Zhang, Efficient Content Location Using Interest-Based Locality in Peer-to-Peer Systems, Proc. IEEE INFOCOM ’03, 2003.
[4] M. S. Chen, P. S. Yu, and K. L. Wu, Optimal NODUP All-to-All Broadcasting Schemes in Distributed Computing Systems, IEEE Trans. Parallel and Distributed Systems, vol. 5, pp. 1275-1285, 1994.
[5] I. Stoica, R. Morris, D. R. Karger, M. F. Kaashoek, and H. Balakrishnan, Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications, Proc. ACM SIGCOMM, 2001.
[6] B. Y. Zhao, J. Kubiatowicz, and A. D. Joseph, Tapestry: A Fault-Tolerant Wide-Area Application Infrastructure, Volume 32, 2002.
[7] A. I. T. Rowstron and P. Druschel, Pastry: Scalable decentralized object location, and routing for large-scale peer-to-peer storage utility, In SOSP, 2001.
[8] S. Ratnasamy, P. Francis, M. Handley, R. M. Karp, and S. Shenker, A Scalable Content-Addressable Network, In SIGCOMM, 2001.
[9] F. Dabek, M. F. Kaashoek, D. R. Karger, R. Morris, and I. Stoica, Wide-Area Cooperative Storage with CFS, In SOSP, 2001.
[10] A. I. T. Rowstron and P. Druschel, Storage Management and Caching in PAST, a Large-Scale, Persistent Peer-to-Peer Storage Utility, In SOSP, 2001.
[11] K. Gummadi, R. Dunn, S. Saroiu, S. Gribble, H. Levy, and J. Zahorjan, Measurement, Modeling, and Analysis of a Peer-to-Peer File-Sharing Workload, Proc. ACM Symp. Operating Systems Principles (SOSP ’03), pp. 314-329, Oct. 2003.
[12] S. Sen and J. Wang, Analyzing Peer-to-Peer Traffic across Large Networks, IEEE/ACM Trans. Networking, vol. 12, no. 2, pp. 219-232, Apr. 2004.