2. current software architecture

Bilkent University
Department of Computer Engineering
Senior Project
AD HOC PEER-TO-PEER FILE SHARING SYSTEM
FOR POCKET PC
Meltem ÇELEBİ
E. Büşra ÇELİKKAYA
Hayrettin GÜRKÖK
Fatma SÜTCÜ
Supervisor: Asst. Prof. Dr. İbrahim KÖRPEOĞLU
High-Level Design Report
December 30th, 2005
This report is submitted to the Department of Computer Engineering of Bilkent University in
partial fulfillment of the requirements of the Senior Projects course CS491.
TABLE OF CONTENTS
1. INTRODUCTION
1.1 Purpose of the System
1.2 Design Goals
1.3 Definitions, Acronyms, and Abbreviations
1.4 References
2. CURRENT SOFTWARE ARCHITECTURE
3. PROPOSED SOFTWARE ARCHITECTURE
3.1 Overview
3.2 Subsystem decomposition
3.2.1 Link monitoring subsystem
3.2.2 Data transfer subsystem
3.2.3 Application
3.3 Hardware/Software mapping
3.3.1 Pocket PC specifications
3.3.2 Windows Mobile 2002 Pocket PC specifications
3.3.3 D-Link Specifications
3.3.4 .NET Platform Microsoft Visual Studio .NET 2003
3.4 Persistent data management
3.5 Access control and security
3.6 Global software control
3.7 Boundary conditions
4. SUBSYSTEM SERVICES
4.1 Application subsystem services
4.2 Link monitoring subsystem services
4.3 Data transfer subsystem services
5. APPENDIX
1. INTRODUCTION
1.1 Purpose of the System
The purpose of this project is to provide a portable file and message sharing system to
people with Pocket PCs. There exist systems that provide instant messaging and file
transfer, but most of them require users to register with a network beforehand, and this
procedure is time-consuming. We aim to provide a system that connects devices without
requiring an infrastructure.
1.2 Design Goals
Reliability: Since TCP will be used for major packet transfers, loss of information will be
prevented. When the connection is lost between two users, the active file transfers are
paused and they can be resumed, even from another user, if the same files exist.
Scalability: The system should handle excessive connection and transfer requests by a
queuing method. For large data transfers, the system may either refuse the transfer or
allow it with the maximum possible reliability.
Maintainability: The system should be designed so as to allow future modifications such
as implementation of multi-hop communication.
Functionality: The system should function correctly on PDAs running the Pocket PC OS.
It should be robust, preventing invalid actions and warning the user about abnormal,
unexpected and malicious inputs. In addition, the system should continue to work when
nodes join and leave the system.
Usability: The system should have an interface whose menu elements users can recognize
rather than having to recall them. The interface design should be consistent with most of
the current P2P systems and grouped into logical subsections. Help and shortcuts
should be available for users. All tasks should be reachable within a reasonable access
time, which can be achieved with visible and easy-to-use menus. The system should give
the user feedback about its status on important actions.
Operability: The system will be easy to install on PDAs. In case of a crash, the user
should be able to recover without any problem.
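As an illustration only, the queuing method named under Scalability could cap the number of simultaneously active transfers and hold further requests in first-come, first-served order. The sketch below is in Python for brevity (the project itself is to be implemented in C#); the class TransferQueue, its method names and the max_active limit are assumptions made for this example, not part of the design.

```python
from collections import deque


class TransferQueue:
    """Hypothetical admission queue: at most max_active transfers run at once."""

    def __init__(self, max_active):
        self.max_active = max_active
        self.active = set()      # identifiers of running transfers
        self.waiting = deque()   # identifiers waiting for a free slot (FIFO)

    def request(self, transfer_id):
        """Start the transfer if a slot is free, otherwise queue it."""
        if len(self.active) < self.max_active:
            self.active.add(transfer_id)
            return True
        self.waiting.append(transfer_id)
        return False

    def finish(self, transfer_id):
        """Release a slot and start the oldest waiting transfer, if any."""
        self.active.discard(transfer_id)
        if self.waiting and len(self.active) < self.max_active:
            next_id = self.waiting.popleft()
            self.active.add(next_id)
            return next_id
        return None
```

With max_active set to 2, a third request is queued and starts only when one of the first two finishes.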
1.3 Definitions, Acronyms, and Abbreviations
Ad-hoc network: A mobile ad-hoc network (MANET) is a self-configuring network of
mobile routers (and associated hosts) connected by wireless links—the union of which
forms an arbitrary topology. The routers are free to move randomly and organize
themselves arbitrarily; thus, the network's wireless topology may change rapidly and
unpredictably. Such a network may operate in a standalone fashion, or may be connected
to the larger network.
Data packet: A unit of data sent over a network. Usually includes: a header, destination
address and the data itself.
IEEE 802.11a/b/g: A family of wireless RF communication standards or 'languages' used in
the PC industry.
IP Address (Internet Protocol Address): Every machine that is on a network (a local
network, or the network of the Internet) has a unique IP number. If a machine does not
have an IP address it cannot be on a network.
Layer: In networking, layers refer to software protocols. Each layer performs services for
the layer above it.
Node: Any device connected to network. PCs, servers, and printers are all nodes on the
network.
Peer: In networking, any functional unit in the same layer as another entity.
Ping: Ping is a basic Internet program that lets you verify that a particular Internet address
exists and can accept requests.
Pocket PC: A small portable computer with a 320x240 resolution screen running the
Windows CE 3.0 operating system. Typical models are Compaq iPAQ, HP Jornada, and
Casio Cassiopeia.
Pong: This is the reply of a node that receives a “ping” request from newly connected
computers on the network. The pong lists the host’s IP address, network port, and the
number of files available for sharing and their combined size.
Port: A system or network access point for data entry or exit.
Single-hop access: Each node can reach only the nodes within its coverage area.
TCP: The Transmission Control Protocol (TCP) is a transport layer protocol that reliably
moves multi-packet data between applications.
UDP: User Datagram Protocol transports data as a connectionless protocol, using packet
switching.
XML (Extensible Markup Language): A W3C initiative that allows information and
services to be encoded with meaningful structure and semantics that computers and
humans can understand. XML is great for information exchange, and can easily be
extended to include user-specified and industry-specified tags.
1.4 References
[1] Project Bridge; Requirements Analysis Report;
http://www.ug.bcc.bilkent.edu.tr/~gurkok/senior/index.htm.
[2] Stephanos Androutsellis-Theotokis; A survey of peer-to-peer file sharing technologies;
December 2004.
[3] Bernd Bruegge, Allen H. Dutoit; Object-Oriented Software Engineering; Prentice Hall; 2000.
[4] D-Link Corporation; D-Link DCF-660W Compact Flash Adapter Quick Installation
Guide; 2002.
2. CURRENT SOFTWARE ARCHITECTURE
The current P2P file sharing architectures can be classified by their “degree of centralization”.
There are three categories, according to the extent to which they rely on one or more servers to
facilitate the interaction between peers:
• Purely decentralized P2P architectures (such as the original Gnutella Architecture and
Freenet).
All nodes in the network perform exactly the same tasks, acting both as servers and
clients, and there is no central coordination of their activities. The nodes of such networks
are termed “servents” (SERVers + cliENTS).
• Partially centralized systems (such as Kazaa, Morpheus and more recently Gnutella).
The basis is the same as with purely decentralized systems. However, some of the nodes
assume a more “important” role than the rest of the nodes, acting as local central indexes
for files shared by local peers. These nodes are called “supernodes”, and the way in which
they are selected for these special tasks varies from system to system. It is important to
note that these supernodes do not constitute single points of failure for a P2P network,
since they are dynamically assigned, and if they fail or come under malicious attack the
network will take action to replace them with others.
• Hybrid decentralized architectures (such as Napster).
There is a central server facilitating the interaction between peers by maintaining
directories of the shared files stored on the respective PCs of users registered to the
network, in the form of meta-data. The end-to-end interaction is between two peer clients;
however, the central server facilitates this interaction by performing the lookups and
identifying the nodes of the network (i.e. the computers) where the files are located. The
terms “peer-through-peer” or “broker mediated” are sometimes used for such systems.
A classification of peer-to-peer file-sharing systems is shown in Figure 1. Structured and
loosely structured systems are inherently purely decentralized, while unstructured systems can
be either pure or hybrid decentralized systems or partially centralized.
Unstructured Systems
Gnutella
In unstructured networks (such as Gnutella), the placement of data (files) is completely
unrelated to the overlay topology. Since there is no information about which nodes are likely
to have the relevant files, searching essentially amounts to random search, in which various
nodes are probed and asked if they have any files matching the query.
Unstructured P2P systems differ in the way in which they construct the overlay topology, and
the way in which they distribute queries from node to node. The advantage of such systems
is that they can easily accommodate a highly transient node population. The disadvantage is
that it is hard to find the desired files without distributing queries widely. For this reason
unstructured P2P systems are considered to be unscalable. However, work is being done towards
increasing the scalability of unstructured systems.
Partially centralized unstructured systems
Kazaa, Morpheus
Kazaa and Morpheus are two similar partially centralized systems which use the concept of
“supernodes”: nodes that are dynamically assigned the task of servicing a small
subpart of the peer network by indexing and caching files contained in the part of the network
they are assigned to. Both Kazaa and Morpheus are proprietary and there is no detailed
documentation on how they operate.
Peers with sufficient bandwidth and processing power are automatically elected to become
supernodes by proprietary algorithms.
In Morpheus a central server provides new peers with a list of one or more supernodes with
which they can connect. Supernodes index the files shared by peers connected to them, and
proxy search requests on behalf of these peers. Queries are therefore sent to supernodes, not to
other peers.
The advantage of partially centralized systems is that discovery time is reduced in comparison
with purely decentralized systems, while there is still no unique point of failure such as one
single central server. If one or more supernodes go down, the nodes connected to them can
open new connections with other supernodes, and the network will continue to operate. In the
event that a very large number or even all supernodes go down, the existing peers can become
supernodes themselves.
Gnutella (more recent architecture)
The concept of supernodes has also been proposed in a more recent version of the Gnutella
protocol. A mechanism for dynamically selecting supernodes organizes the
Gnutella network into an interconnection of SuperPeers (as they are referred to) and client
nodes.
As a node with enough CPU power joins the network, it immediately becomes a SuperPeer
and establishes connections with other SuperPeers, forming a flat unstructured network of
SuperPeers. It also sets the number of clients required for it to remain a SuperPeer. If it
receives at least the required number of connections to client nodes within a specified time, it
remains a SuperPeer. Otherwise it turns into a regular client node. If no other SuperPeer is
available, it tries to become a SuperPeer again for another probation period.
There are not many peer-to-peer file-sharing systems for handheld devices. One of them is
“tunA” which enables audio streaming between handheld devices.
tunA
The overall architecture of the tunA platform is illustrated in two diagrams. The first, Figure 2,
shows how individual devices communicate with other peers on the same Ad-Hoc network in
two ways: a UDP multicast channel shared between all of them, and separate each-to-each
TCP/IP connections between all peers.
The second, Figure 3, shows an expanded view of a single tunA peer, and the interaction
between each of the major subsystems. Broadly speaking, tunA peers discover each other by
periodically multicasting UDP packets announcing their presence to all nearby devices, and
maintaining a list of those peers from whom they have detected similar packets within a specified
time. When a user selects a local audio track, the system begins to multicast packets
consisting of some timing info, and frames of MP3 data to all interested peers (itself
included.) A separate ‘listening’ process marshals this data into a buffer from which the MP3
decoder reads. The timing info is used to regulate the contents of the buffer, and the requests
of the decoder to provide a synchronized audio experience between peers. The IM component
exchanges profile data, text messages, and graphical avatars over separate TCP/IP
connections. A database maintains a record of all peers, events, audio, and messages
encountered by the system.
3. PROPOSED SOFTWARE ARCHITECTURE
3.1 Overview
The main characteristics of the proposed system are as follows:

• Ad-hoc peer-to-peer network structure allows all nodes to act as both servers and
clients. In a P2P network, a node is a service requester from other nodes and also a
service provider to other nodes simultaneously (Figure 3.1).
[Figure: a Peer offering service1() … serviceN() is linked to other Peers both as requester (Client) and as provider (Server).]
Figure 3.1 – A peer is both a server and a client requesting and providing a series of
services simultaneously.

• One of the main features of the system will be to discover the neighbors of each
node and maintain this information for the sake of consistency. Further, by
combining the neighbor information of each node, the system can discover the
network topology.

• Because the devices are mobile and the medium is wireless, direct links between
devices are subject to frequent change. That is why the system should not only
maintain the neighbor information but also provide reliability during
transmission. The direct links between the devices can be broken and the
nodes can move to other parts of the network.
3.2 Subsystem decomposition
The application level consists of three subsystems: application, data transfer and
link monitoring. The kernel level includes the standard layers: network layer (IP), transport
layer (TCP and UDP) and hardware level (IEEE 802.11).
Application level: Application, Data transfer, Link monitoring
Kernel level: TCP, UDP, IP, IEEE 802.11
Figure 3.2 - The layers of the system
3.2.1 Link monitoring subsystem
This subsystem forms the foundation of the program. For the other
subsystems to function properly, this subsystem must first detect the peers
within range and maintain the topology. The main services provided by this subsystem
are neighbor detection and link maintenance.
Neighbor detection
The initial connection between the peers will be created with HELLO messages
broadcast by newly arriving nodes to all the existing peers in the network. The
message basically contains the IP Address and port of the newly joined node. All
the peers receiving the broadcast message then reply with a message containing their
IPs. The new node is now connected to the network and can reach the other peers
whose IP Addresses are stored.
[Figure: node ‘A’ sends HELLO (IPA) to peers 1, 2, …, n; each peer adds IPA to its list.]
Figure 3.3 – New node ‘A’ broadcasts HELLO message to peers and all the peers
update their lists
[Figure: peers 1, 2, …, n reply to node ‘A’ with HELLO (IP1), HELLO (IP2), …, HELLO (IPn); node ‘A’ adds IP1..n to its list.]
Figure 3.4 – All peers reply to node ‘A’ with their IPs and node ‘A’ updates its list
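As an illustration only, the exchange in Figures 3.3 and 3.4 can be simulated in memory as follows. The sketch is in Python for brevity (the project itself is to be implemented in C#) and uses no real sockets; the names Peer, Node, on_hello and join, and the rule that a reply HELLO is not answered again, are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Peer:
    """Address of a node as carried in a HELLO message."""
    ip: str
    port: int


@dataclass
class Node:
    """Hypothetical in-memory model of one device."""
    me: Peer
    neighbors: set = field(default_factory=set)

    def on_hello(self, sender, is_reply=False):
        """Store the sender; answer a broadcast HELLO with our own address."""
        if sender == self.me:
            return None              # ignore our own broadcast
        self.neighbors.add(sender)
        # A reply is only recorded, so the exchange ends after one round trip.
        return None if is_reply else self.me


def join(new_node, existing_nodes):
    """Deliver the new node's HELLO to every peer and feed the replies back."""
    for node in existing_nodes:
        reply = node.on_hello(new_node.me)
        if reply is not None:
            new_node.on_hello(reply, is_reply=True)
```

After join(a, [n1, n2]), node a holds the addresses of n1 and n2, and each of them holds the address of a.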
Link maintenance
The proper way of leaving the network is broadcasting a GOODBYE message so that
all the peers can remove its IP from their lists. This way there will not be unnecessary
REQUEST messages sent to non-existing nodes.
[Figure: node ‘A’ sends GOODBYE (IPA) to peers 1, 2, …, n; each peer removes IPA from its list.]
Figure 3.5 – Leaving node ‘A’ broadcasts a GOODBYE message to all peers so that
they can remove its IP from the list
However, there will be cases where nodes leave the network without a
GOODBYE message due to a crash or an abrupt close of the program. To compensate for
this, the list of peers in the network will be maintained by periodically sending PING
messages to the nodes whose IP Addresses are stored in the list. If a PING packet is dropped
or receives an unsuccessful response, the node is assumed to have disconnected from the
network without a GOODBYE message. This maintenance is essential for consistency
and reliability.
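[Figure: node ‘A’, holding IP1..N, pings peers 1, 2, …, n; every peer except node ‘2’ answers with a Pong, so node ‘A’ removes IP2 from its list.]
Figure 3.6 – Node ‘A’ periodically pings the others. Node ‘2’ does not pong back, so node ‘A’
removes the IP Address of node ‘2’ from the list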
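A minimal sketch of this list maintenance is given below, in Python for brevity (the project itself is to be implemented in C#). The class PeerList and its methods are hypothetical, and the send_ping callback stands in for the real network call. The max_missed parameter is an assumption added for the example; its default of 1 matches the behaviour described above, where a single unanswered PING removes the peer.

```python
class PeerList:
    """Hypothetical peer list maintained by GOODBYE messages and PING rounds."""

    def __init__(self, max_missed=1):
        self.max_missed = max_missed
        self._missed = {}            # ip -> consecutive unanswered PINGs

    def add(self, ip):
        self._missed[ip] = 0

    def on_goodbye(self, ip):
        """A node left properly: drop it immediately."""
        self._missed.pop(ip, None)

    def ping_round(self, send_ping):
        """Ping every stored peer once.

        send_ping(ip) must return True if a PONG arrived in time.
        Returns the IPs removed in this round.
        """
        removed = []
        for ip in list(self._missed):
            if send_ping(ip):
                self._missed[ip] = 0
            else:
                self._missed[ip] += 1
                if self._missed[ip] >= self.max_missed:
                    del self._missed[ip]
                    removed.append(ip)
        return removed

    def peers(self):
        return sorted(self._missed)
```

Raising max_missed above 1 would tolerate occasional packet loss on the wireless link, at the cost of noticing a vanished node later.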
3.2.2 Data transfer subsystem
The main function of this subsystem is to provide reliable data transfer between the
nodes. Since the network structure is dynamic, nodes can join and leave during a
file transfer. Hence, the system should handle the loss of connection between the two
nodes involved, and the file being transferred should be protected.
The data transfer protocol has the following features. Each data packet has a unique
message ID, which includes the sender and receiver information, the packet sequence number
and the packet resend count. The receiving node sends an acknowledgement message to the
sender. If the sender does not receive the acknowledgement within a given time,
it retransmits the message.
During the data transfer, if the receiver detects that a packet is missing, it
requests that packet by sending its sequence number
to the sender. The sender then retransmits the missing packet to the receiver.
Therefore, the data can be transferred completely and reliably.
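The retransmission scheme can be sketched as follows, in Python for brevity (the project itself is to be implemented in C#). This is an illustration under stated assumptions, not the protocol's definition: the Packet fields mirror the message ID described above, but the names split_file, Receiver and transfer, the channel callback that simulates packet loss, and the max_resends limit are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    sender: str
    receiver: str
    seq: int             # packet sequence number
    resend_count: int    # how many times this packet has been resent
    payload: bytes


def split_file(data, sender, receiver, chunk_size):
    """Cut the data into numbered packets."""
    offsets = range(0, len(data), chunk_size)
    return [Packet(sender, receiver, seq, 0, data[off:off + chunk_size])
            for seq, off in enumerate(offsets)]


class Receiver:
    def __init__(self, total):
        self.total = total
        self.chunks = {}                 # seq -> payload

    def on_packet(self, packet):
        self.chunks[packet.seq] = packet.payload

    def missing(self):
        """Sequence numbers the receiver still has to request."""
        return [s for s in range(self.total) if s not in self.chunks]

    def assemble(self):
        return b"".join(self.chunks[s] for s in range(self.total))


def transfer(packets, receiver, channel, max_resends=3):
    """Send packets; resend whatever the receiver reports as missing.

    channel(packet) returns True if the packet was delivered.
    """
    pending = list(packets)
    while pending:
        for packet in pending:
            if channel(packet):
                receiver.on_packet(packet)
        by_seq = {p.seq: p for p in pending}
        pending = []
        for seq in receiver.missing():
            lost = by_seq[seq]
            if lost.resend_count >= max_resends:
                raise ConnectionError(f"packet {seq} could not be delivered")
            pending.append(replace(lost, resend_count=lost.resend_count + 1))
    return receiver.assemble()
```

The sketch folds the acknowledgement timeout and the receiver's explicit request into one round-based loop; a real implementation would run them concurrently over the network.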
3.2.3 Application
The actual user application (user interface) is going to be implemented after the
successful installation of the core layers on the wireless devices. The application will
run on top of the layers, using their services. Thus the application is considered a separate
subsystem.
Given the results of the data requests and transfers, the user interface provides visualization
and accessibility tools. It is the module of the system that interacts with the user directly
and provides an abstraction for accessing the results of the operations. The design of
the UI subsystem provides simple access to complex data.
Figures 5, 6 and 7 are screenshots of the proposed UI, showing the main
windows for each sub-menu.
3.3 Hardware/Software mapping
The system model is mapped onto the hardware and software by mapping objects with respect to
processor, memory and input/output issues, and mapping associations with respect to
connectivity issues. This system is a wireless peer-to-peer ad hoc system; therefore, each part
of the system has different hardware and software requirements. However, since the system
consists of individual nodes only, the tasks are not distributed across different locations.
The system requirements can be analyzed in terms of Pocket PC hardware specifications, Pocket
PC operating system specifications, performance issues, connection tool specifications
and development environment issues.
Figure 3.7 – Deployment diagram for the system
3.3.1 Pocket PC specifications
We are using Asus A620 Pocket PCs during the development of our project. The
system is expected to run on devices with at least these specifications. The
device has the following hardware specifications:
Processor:            400 MHz Intel® PXA255 Processor
Memory:               64 MB RAM
Operating System:     Microsoft® Pocket PC 2002
Display:              3.5” Trans-reflective TFT LCD Display (65k colors)
                      255 levels of brightness
                      320x240 resolution
Expansion Slot:       Compact Flash Type II slot
Notification System:  Event Notification
                      Charge status
Audio:                Integrated Microphone and Speaker
                      3.5mm stereo headphone jack
IrDA:                 FIR/SIR (infrared)
Battery:              1300mAH Lithium Rechargeable Battery
AC Power:             AC Input: 100-240VAC, 50/60Hz
                      Output: 5VDC, 2V (Typical)
Size:                 125mm x 76.8mm x 13.3mm (L x W x H)
Weight:               141grams
3.3.2 Windows Mobile 2002 Pocket PC specifications
Asus A620 Pocket PCs use the Windows Mobile 2002 Pocket PC operating system, which
provides multimedia applications and network connection services. Windows Pocket
PC 2002 has features that will be used by the system, such as 802.11 support and the .NET
Compact Framework.
3.3.3 D-Link Specifications
The wireless Compact Flash card used in this project is the D-Link DCF-660W,
which is IEEE 802.11b compliant. This card can connect to an existing
wireless network in Infrastructure mode, and it can also create connections in Ad-Hoc mode.
Our system is an Ad-Hoc system that does not need an Access
Point. The card can reach a maximum signal rate of 11 Mbps. Moreover, it
provides an Auto Fall-Back mechanism that adjusts the speed of the adapter
automatically, according to the distance. The DCF-660W requires a Compact Flash
type I or II interface, which is available in the Pocket PC we are using. In addition, the
current consumption of the DCF-660W is 80mA in power save mode and less than
350mA in transmission mode. [4]
3.3.4 .NET Platform Microsoft Visual Studio .NET 2003
The development platform of the project is Microsoft Visual Studio .NET 2003 and
the implementation language is C#. Visual Studio .NET 2003 supports the .NET
Compact Framework, so development is possible for devices such as the Pocket
PC, as well as other devices powered by the Microsoft Windows CE .NET operating
system. The .NET Compact Framework brings the programming model of the .NET platform
to such devices and provides machine-independent code development. In addition,
the implementation language, C#, supports object-oriented programming in
an efficient way.
3.4 Persistent data management
When a new node connects to the network, it receives the IP addresses and port numbers
of the existing nodes in the network so that it can send requests to them. To support link
maintenance, this information must be stored. Since the nodes in range are volatile and
can easily join or leave the system, a text file or database would not be very suitable for
storing their information. Instead, a dynamic data structure such as a vector is better suited
to easy search, update, add and remove operations.
Moreover, each node will keep a list of its shared files and their attributes to send to other
nodes when requested. A text file would not be efficient for transferring and handling
various attributes. However, an XML file is easy to maintain, as it can be structured
according to the requirements.
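As an illustration only, a shared-file list could be written to and read back from XML as sketched below. The sketch is in Python for brevity (the project itself is to be implemented in C#); the element name sharedFiles, the file element and its name and size attributes are assumptions made for this example, since the report does not fix a schema.

```python
import xml.etree.ElementTree as ET


def build_share_list(files):
    """Serialize a list of {"name": ..., "size": ...} records to XML text."""
    root = ET.Element("sharedFiles")
    for record in files:
        ET.SubElement(root, "file",
                      name=record["name"], size=str(record["size"]))
    return ET.tostring(root, encoding="unicode")


def parse_share_list(xml_text):
    """Read the records back from the XML text sent by another node."""
    root = ET.fromstring(xml_text)
    return [{"name": element.get("name"), "size": int(element.get("size"))}
            for element in root.findall("file")]
```

Further attributes, such as a file type or a modification date, could be added per file element without changing how existing attributes are read.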
3.5 Access control and security
Providing security in ad-hoc networks is difficult, because access to the system cannot
be restricted by a characteristic such as IP address, since everything is dynamic. All users
can access the system. However, users can see only the shared folders of the peers
through the program.
3.6 Global software control
In the software, the application and the core layers work independently. However, the
application should use the services of the layers, and the layers can send data to the
application. Therefore, layers and application will run as different threads and use a shared
global data structure.
The sequencing of actions in the layers is controlled by an event-driven main loop. The
loop waits for an event to occur such as a request from the application or a received
message from the environment. When an event becomes available it is dispatched to a
thread according to the type of the message, in order not to block other events.
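A minimal sketch of such an event-driven loop is shown below, in Python for brevity (the project itself is to be implemented in C#). The queue of (type, payload) events, the handlers table, the STOP sentinel and the choice of one new thread per event are assumptions made for this example.

```python
import queue
import threading

STOP = object()   # hypothetical sentinel that ends the loop


def run_event_loop(events, handlers):
    """Wait for events and hand each one to its own worker thread.

    events   -- queue.Queue of (event_type, payload) tuples, or STOP
    handlers -- dict mapping an event type to a one-argument function
    """
    workers = []
    while True:
        event = events.get()          # blocks until an event is available
        if event is STOP:
            break
        event_type, payload = event
        handler = handlers.get(event_type)
        if handler is None:
            continue                  # unknown message types are ignored
        worker = threading.Thread(target=handler, args=(payload,))
        worker.start()                # the loop is free to take the next event
        workers.append(worker)
    for worker in workers:
        worker.join()                 # let running handlers finish
```

Both the application and the network layers would put events on the same queue; a thread pool could replace the per-event threads if thread creation proves too costly on the device.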
3.7 Boundary conditions
Initialization: The program is brought to a steady state when it is run. To perform use-case
operations, the user has to “Connect” to the network. There is no need for an existing
network of other devices; the device can be the first node. The only condition is to have a
Pocket PC with an 802.11 wireless adapter.
Termination: The preferred method of terminating the program is clicking on the
“Disconnect” and “Exit” buttons, in that order. In this way, other users will be informed
that the user has left and the temporary files will be removed.
Failure: The program may quit due to a crash, bug or external error (e.g. power supply).
We should implement the system so as to avoid internal errors. The user may also exit the
program improperly, without disconnecting from the network. This is compensated for by
periodic PING messages.
4. SUBSYSTEM SERVICES
4.1 Application subsystem services
When the graphical user interface is started, the Pocket PC becomes a node of the system.
The connection is established if there are existing nodes in range, and the user interface
displays the available services that the user can control. If there are no existing Pocket PCs
running this program, a new connection is created and the user can see the responses.
4.2 Link monitoring subsystem services
To create the links to other nodes, UDP sockets are created first.
They specify the type of the communication. The new node connects to the other
nodes by sending its connection data to them. Once the node has introduced itself to the
other nodes, the neighboring nodes are connected to each other. These connection services are
maintained as nodes join or leave the system. The links are preserved until the
connected node leaves or the connection fails.
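As an illustration only, creating a broadcast-capable UDP socket and sending the connection data could look as follows. The sketch is in Python for brevity (the project itself is to be implemented in C#). The text wire format "KIND ip port", the function names and the limited-broadcast address 255.255.255.255 are assumptions made for this example; the report does not define a message encoding.

```python
import socket


def encode(kind, ip, port):
    """Build a datagram such as b"HELLO 10.0.0.1 5000" (hypothetical format)."""
    return f"{kind} {ip} {port}".encode("ascii")


def decode(datagram):
    """Split a received datagram back into (kind, ip, port)."""
    kind, ip, port = datagram.decode("ascii").split()
    return kind, ip, int(port)


def open_broadcast_socket():
    """Create a UDP socket that is allowed to send broadcast datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def announce(sock, kind, ip, port, broadcast_addr="255.255.255.255"):
    """Send a HELLO or GOODBYE carrying this node's address to all peers."""
    sock.sendto(encode(kind, ip, port), (broadcast_addr, port))
```

A joining node would call announce(sock, "HELLO", own_ip, port), and a leaving node would call it with "GOODBYE".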
4.3 Data transfer subsystem services
Data transfer services use TCP connections in order to provide reliability. These services
are required during file and message transfers between the nodes. The system provides
lossless transfer of the data and handles the storing procedures of this data.
5. APPENDIX
Figure 1 – Table classifying the commonly used P2P programs
Figure 2 – How individual devices communicate with other peers
Figure 3 – Expanded view of a peer and the interaction between each major subsystem
Figure 4 – Single-hop access example. Yellow node searches for a file and only the blue node
has the file.
Figure 5 – Screenshot of main window with main menu options displayed
Figure 6 – Screenshot of main users window and submenu options
Figure 7 – Screenshot of a file collection window and file options