Algorithm and Code - Networked Software Systems Laboratory

The Technion - Israel Institute of Technology
Electrical Engineering
Computer Networks Laboratory
Project Report
Subject:
Double FTP Client
Author:
Jonathan Charbit
Project Supervisor:
Ilan Hazan
Summer 5759
Abstract
The project specification, as defined by the Technion’s Network &
Communication lab, was to design an FTP client that can retrieve a file
from two different servers, or from the same server using two different paths.
The idea is to fetch different parts of the same file using the FTP
command “REST”, as specified in RFC 959.
We implemented a very simple version in the C programming language for a
UNIX environment: two processes run in parallel, one retrieving the first
part of the selected file and the other retrieving the second part.
Index
Abstract
Index
Introduction
Theoretical background
1.The File Transfer Protocol
2.Terminology
User guide
1.First mode of operation
2.Second mode of operation
Program design
A. Routine description
1.Main routines
2.Other routines
B. Data types
Conclusion and suggestions
Introduction
The basic idea is to establish two FTP connections and retrieve the
two parts of the same file in parallel.
In our implementation, all the information about the second server is
given when the program is launched: before the program starts running, the
user has to specify where to retrieve the second part of the file from,
together with the details of the second connection.
At the beginning of the program, a struct variable is set up that will be
used to run the second process (when using the 2give command, see
Program design). The second process is independent: it has its own data and
control connections.
Theoretical background
1.The File Transfer Protocol
The File Transfer Protocol, FTP, is the primary Internet standard for
file transfer, either from server to client (retrieve) or from client to
server (put). FTP was written specifically for computers running TCP/IP, and
the protocol uses two TCP connections at the same time:
_ a control connection, for FTP commands (from client to server) and
FTP replies (from server to client)
_ a data connection, for the transfer of data (a file, or the list of files
in the current directory)
The objectives of FTP are to promote sharing of files (computer programs
or data), to encourage indirect or implicit use of remote computers, to shield
a user from variations in file storage systems among hosts, and to transfer
data reliably and efficiently.
Though usable directly by a user at a terminal, FTP was designed
mainly to be used by programs. It supports several commands that allow
bidirectional transfer of both binary and text files between computers,
where the requesting computer acts as the client and the other one as the
server.
A user account is required on the remote machine. Some servers allow
anonymous connections.
The user protocol interpreter (PI) initiates the control connection,
which follows the Telnet protocol. Once the connection is established, FTP
commands are sent by the user-PI to the server process over the control
connection, and FTP replies are sent back by the server process to the
user-PI. The communication channel from the user-PI to the server-PI is
established as a TCP connection from the user to the standard server FTP
port. The user protocol interpreter is responsible for sending FTP commands
and interpreting the replies of the server.
When the user wants to retrieve a file, the client has to tell the server
on which free port the data connection should be opened. This is done with
the FTP command “PORT”, whose argument carries the client address and the
port number (the command is sent on the control connection, like all other
FTP commands). The client then listens on the specified port and waits for
the server to start transferring data. When the transfer ends, the user-PI
has to close the data connection. If the user wants to close the data
connection before the whole file has been sent (as the first process in our
program does, because it retrieves only the first part of the file), the
user-PI has to close the data connection and send the FTP command “ABOR”
(which stands for abort) on the control connection.
Unlike the control connection, the data connection is not permanent:
for each transfer of data (a file or the file list of the current directory),
a new data connection has to be opened.
At the end of the FTP session, it is the responsibility of the user to
request the closing of the control connection (via the FTP command “QUIT”),
while it is the server that actually closes it.
2.Terminology
ASCII
In FTP, ASCII characters are defined to be the lower half of an eight-bit
code set (i.e. the most significant bit is zero).
Control Connection
The communication path between the USER-PI and SERVER-PI for
the exchange of FTP commands and replies. This connection follows the
Telnet protocol.
Data Connection
A full duplex connection over which data is transferred, in a specified
mode and type. The data transferred may be a part of a file, an entire file or a
number of files (also list of files in current directory is sent over data
connection). The path may be between a server-DTP and a user-DTP, or
between two server-DTP’s.
Data Port
The passive data transfer process “listens” on the data port for a
connection from the active transfer process in order to open data connection.
DTP
The data transfer process establishes and manages the data
connection. The DTP can be passive or active.
EOF
The end-of-file condition that defines the end of a file being
transferred.
FTP commands
A set of commands that comprise the control information flowing from
the user-FTP process to the server-FTP process.
File
An ordered set of computer data (including programs), of arbitrary
length, uniquely identified by a pathname.
Pathname
Pathname is defined to be the character string, which must be input to
a file system by a user in order to identify a file. Pathname normally contains
device and/or directory names, and file name specification. FTP does not
yet specify a standard pathname convention. Each user must follow the file
naming conventions of the file systems involved in the transfer.
PI
The protocol-interpreter; the user and server sides of the protocol
have distinct roles implemented in a user-PI and a server-PI.
Reply
A reply is an acknowledgment (positive or negative) sent from server
to user in response to FTP command. The general form of a reply is a
completion code (including error codes) followed by a text string. The codes
are for use by programs and the text is usually intended for human users.
Server-DTP
The data transfer process, in its normal “active” state, establishes the
data connection with the “listening” data port. It sets up parameters for
transfer and storage and transfers data on command from its PI. The DTP
can be placed in a “passive” state to listen for, rather than initiate a
connection on the data port.
Server-FTP process
A process or set of processes which perform the function of file
transfer in cooperation with a user-FTP process and possibly another server.
The functions consist of a protocol interpreter (PI) and a data transfer
process (DTP).
Server-PI
The server protocol interpreter “listens” on the FTP port for a connection
from a user-PI and establishes a control connection. It receives standard FTP
commands from the user-PI, sends replies, and governs the server-DTP.
Type
The data representation type used for data transfer and storage. Type
implies certain transformations between the time of data storage and data
transfer.
User
A person or a process on behalf of a person wishing to obtain file
transfer service. The human user may interact directly with a server-FTP
process, but use of a user-FTP process is preferred since the protocol design
is weighted towards automata.
User-DTP
The data transfer process “listens” on the data port for a connection
from a server-FTP process. If two servers are transferring data between
them, the user-DTP is inactive.
User-FTP process
A set of functions including a protocol interpreter, a data transfer
process and a user interface which together perform the function of file
transfer in cooperation with one or more server-FTP processes. The user
interface allows a local language to be used in the command-reply dialogue
with the user.
User-PI
The user protocol interpreter initiates the control connection from its
port to the server-FTP process, initiates FTP commands, and governs
the user-DTP if that process is part of the file transfer.
User guide
Two modes of operation are available:
_ simple FTP
_ multiple FTP
1.First mode of operation
In the first mode of operation, a simple FTP client runs with the following
commands implemented:
_ curdir : prints the remote current directory
_ dir : prints the list of files in the remote current directory
_ giveme file_name : retrieves the file file_name from the server
_ lcd directory : changes the local current directory to directory
_ ldir : prints the list of files in the local current directory
_ mkdir new_dir : creates a directory named new_dir in the local file
system
_ quit : stops the program
_ rhelp : prints the list of FTP commands supported by the server
_ type data_type : changes the data representation type to data_type
To run in first mode, just type:
client server_name
All the commands are case sensitive and should be typed in lowercase.
2.Second mode of operation
In the second mode, parallel transfer from two servers is available.
To run in second mode, one has to give all the information about the
second server on the command line:
client first_server_name second_server_name username password path file_name file_size
When you get the “mftp>” prompt, just type 2give, and the two processes
start running.
Program design
A. Routine description
1.Main routines:
The program begins in the file my_client.c, inside the main() function.
If a second server is specified by the user, the struct variable file_location
(see data types) is set up.
Then, ConnectToServer() is called.
ConnectToServer (server_name):
Server_name : string argument, name of the server to connect to.
_ creates a socket (only the control connection is created)
_ connects to the server (and gets a reply)
_ calls Login() (with NULL as login and pass arguments)
_ on success, calls GetAndInterpret() in a loop until the quit command is
typed by the user
_ closes the control connection
GetandInterpret(sock):
Sock: struct my_socket * argument, see data types
This is the principal function of the program, where the command typed by
the user is processed.
By calling GetCmdandParam(), the command line is split into the
command name and its arguments.
Here is the list of all the commands; the principal ones are described in
detail below:
_ curdir : prints the remote current directory
_ dir : prints the list of files in the remote current directory
_ giveme file_name : retrieves the file file_name from the server
_ lcd directory : changes the local current directory to directory
_ ldir : prints the list of files in the local current directory
_ mkdir new_dir : creates a directory named new_dir in the local file
system
_ quit : stops the program
_ rhelp : prints the list of FTP commands supported by the server
_ type data_type : changes the data representation type to data_type
_ 2give : retrieves two parts of the file in two different processes
curdir:
sends the FTP command “PWD” (Print Working Directory) on the control
connection and prints the reply on screen (by calling ReadAndPrint()).
dir:
_ opens the data connection by calling OpenDataConnection() with the
struct sock
_ sends, on the control connection, the port number for the data
connection by calling SendPort()
_ sends, on the control connection, the FTP command “LIST”
_ reads from the data connection and prints the file list, by calling
WriteList()
giveme file_name:
calls GetFile() with the file name taken from the command line as second
parameter, 0 as third parameter, meaning that the transfer begins at the first
byte of the file, and ILLIMITED as last parameter, meaning that the transfer
ends only at the end of the file.
lcd directory:
Executes the system call chdir with the parameter directory.
ldir:
Executes the shell command ls -l.
mkdir new_dir:
Uses system calls to create new_dir in the local working directory.
quit:
_ sends the FTP command “QUIT” to the server
_ prints the reply
_ returns the QUIT value
rhelp:
_ sends the FTP command “HELP” to the server
_ prints the reply, which contains the command list (the command list is
sent by the server as a reply on the control connection)
type data_type:
_ sends the FTP command “TYPE” with the appropriate parameter to the server
_ prints the reply
2give:
_ splits into two processes
_ father process : calls GetFile() with the appropriate parameters: the file
name and size are taken from the struct file_location * my_file (see data
types) set up at the beginning of the program. The third parameter is 0,
meaning that the transfer begins at the first byte of the file; the fourth
parameter is the number of blocks to transfer. It is calculated so that
no “hole” is left between the two processes (at the cost of sometimes
copying some bytes twice):
Blocks_to_transfer = file_size/(2*FILE_BLOCK_SIZE) + 2
The “+2” is there to prevent “holes”: Blocks_to_transfer is an integer
variable, so the division truncates and the first part could otherwise stop
short of the point where the second part begins.
_ son process : calls ConnectToNewServer() to create new sockets
for the control and data connections and retrieve the second part of the file.
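The process split performed by 2give can be sketched with fork(). This is only an illustration under assumptions, not the code of the project: run_two_halves(), the transfer_job type and the two demo jobs are hypothetical names. In the real program the father would call GetFile() and the son would call ConnectToNewServer().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical job type: returns 0 on success, non-zero on failure. */
typedef int (*transfer_job)(void);

/* Runs first_half in the father process and second_half in the son.
 * Returns 0 only if both jobs report success, -1 otherwise. */
int run_two_halves(transfer_job first_half, transfer_job second_half)
{
    assert(first_half != NULL && second_half != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;                       /* fork failed */

    if (pid == 0)                        /* son process */
        _exit(second_half() == 0 ? 0 : 1);

    /* father process: do its own part, then wait for the son */
    int father_result = first_half();

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    int son_ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;

    return (father_result == 0 && son_ok) ? 0 : -1;
}

/* Placeholder jobs standing in for the two transfers. */
int demo_job_ok(void)   { return 0; }
int demo_job_fail(void) { return 1; }
```

Calling run_two_halves(demo_job_ok, demo_job_ok) returns 0. Waiting for the son before returning is a design choice of this sketch, so that the caller knows when both parts are on disk.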
Now we continue the main routine description.
GetFile(sock, file, beginning, blocks_to_transfer):
Sock: my_socket * argument, see data types
File: string argument, file name to transfer
Beginning: number of bytes to restart from (set to zero if transfer from the
beginning of the file)
blocks_to_transfer: number of blocks to transfer, the size of each block is
FILE_BLOCK_SIZE bytes (set to ILLIMITED to transfer until the end of the
file)
_ calls OpenDataConnection()
_ creates (or opens, if it exists) the file on the local file system
_ sends the FTP command “REST” (which stands for restart) with the value of
beginning
_ moves the file pointer of the local file to the right place
_ reads from the data connection and writes to the file in a loop (each
iteration transfers one block) until blocks_to_transfer blocks are
transferred or the end of the file is reached
_ prints the total number of transferred bytes
_ if it stops before EOF, sends the FTP command “ABOR” (for abort)
_ closes the file pointer and the data connection socket
_ returns the number of transferred bytes
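The block loop at the heart of GetFile() can be sketched as follows. This is an illustration, not the code of the project: copy_blocks() and open_local_file() are hypothetical helpers, the value 1024 for FILE_BLOCK_SIZE and the value -1 for ILLIMITED are assumptions, and plain file descriptors stand in for the data connection socket and the local file.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define FILE_BLOCK_SIZE 1024   /* assumed value */
#define ILLIMITED (-1L)        /* assumed encoding of "no block limit" */

/* Creates the local file, or opens it if it already exists. */
int open_local_file(const char *name)
{
    return open(name, O_RDWR | O_CREAT, 0644);
}

/* Copies data from in_fd to out_fd, writing from byte offset `beginning`
 * of out_fd, until max_blocks blocks are copied or in_fd reaches EOF.
 * Returns the number of bytes copied, or -1 on error. */
long copy_blocks(int in_fd, int out_fd, off_t beginning, long max_blocks)
{
    char block[FILE_BLOCK_SIZE];
    long total = 0;
    long blocks_done = 0;

    assert(in_fd >= 0 && out_fd >= 0);

    /* Move the local file pointer to where this part belongs. */
    if (lseek(out_fd, beginning, SEEK_SET) == (off_t)-1)
        return -1;

    while (max_blocks == ILLIMITED || blocks_done < max_blocks) {
        /* Fill one block; a read may return fewer bytes than requested. */
        size_t filled = 0;
        while (filled < sizeof block) {
            ssize_t n = read(in_fd, block + filled, sizeof block - filled);
            if (n < 0)
                return -1;
            if (n == 0)
                break;                   /* end of input */
            filled += (size_t)n;
        }
        if (filled == 0)
            break;                       /* nothing left to copy */

        size_t written = 0;
        while (written < filled) {
            ssize_t w = write(out_fd, block + written, filled - written);
            if (w < 0)
                return -1;
            written += (size_t)w;
        }

        total += (long)filled;
        blocks_done++;
        if (filled < sizeof block)
            break;                       /* a short block means EOF */
    }
    return total;
}
```

In this sketch the caller would still have to send “REST” before the transfer and “ABOR” when the loop stops because of the block limit rather than EOF.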
ConnectToNewServer(my_file):
My_file: file_location * argument, see data types
_ creates a socket for the new control connection
_ connects to the new server, using the username and password from my_file
(set up at the beginning of the program)
_ calls GetFile() with the appropriate parameters: the file name and size
are taken from the struct file_location * my_file (see data types) set up at
the beginning of the program. The third argument is
(size+FILE_BLOCK_SIZE)/2. The fourth argument is ILLIMITED.
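The two formulas, the number of blocks requested by the father and the restart offset used by the son, can be checked numerically. The sketch below assumes that FILE_BLOCK_SIZE is 1024 bytes (the actual value is not given in this report) and uses hypothetical helper names.

```c
#include <assert.h>

#define FILE_BLOCK_SIZE 1024L   /* assumed value */

/* Number of blocks requested by the father process. */
long father_blocks(long file_size)
{
    assert(file_size >= 0);
    return file_size / (2 * FILE_BLOCK_SIZE) + 2;
}

/* First byte offset that the father does NOT request. */
long father_end(long file_size)
{
    return father_blocks(file_size) * FILE_BLOCK_SIZE;
}

/* Byte offset at which the son restarts the transfer. */
long son_start(long file_size)
{
    return (file_size + FILE_BLOCK_SIZE) / 2;
}

/* Number of bytes requested by both processes.
 * A positive value means an overlap, i.e. no hole. */
long overlap(long file_size)
{
    return father_end(file_size) - son_start(file_size);
}
```

For a 10000-byte file, the father requests 6 blocks (bytes 0 to 6143) and the son restarts at byte 5512, so 632 bytes are fetched twice. For small files father_end() exceeds the file size, which simply means that the father reaches EOF first.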
2.Other routines:
ReadAndPrint(sd, gen_code):
Sd: integer argument, number of the control connection socket
Gen_code: integer argument, set to FIRST_ITERATION on the first call; needed
for the recursive calls
ReadAndPrint() is a recursive function. Every call gets one reply from the
server and prints it. The number of replies the server is going to send is
not known at the time the first one arrives. The FTP protocol specifies that
the server puts a “marker” at the beginning of the last one: in RFC 959, the
last line of a multi-line reply begins with the three-digit reply code
followed by a space, while the first line has a hyphen after the code.
ReadAndPrint() calls itself until this marker is seen at the beginning of
the reply.
To identify the first call, gen_code is set to FIRST_ITERATION.
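The test for the end-of-reply marker can be sketched as below. It relies on the RFC 959 convention that the last (or only) line of a reply starts with three digits followed by a space; is_last_reply_line() is a hypothetical name, not a routine of the project.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if `line` begins with three digits followed by a space,
 * i.e. it is the last (or only) line of an FTP reply, and 0 otherwise. */
int is_last_reply_line(const char *line)
{
    assert(line != NULL);

    /* The loop stops at the first non-digit, so it never reads past
     * the terminating '\0' of a short line. */
    for (int i = 0; i < 3; i++) {
        if (!isdigit((unsigned char)line[i]))
            return 0;
    }
    return line[3] == ' ';
}
```

A line such as "214-The following commands are recognized." is therefore not the last one, while "214 Help OK." is.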
GetCmdandParam(cmd, param, cmd_line):
Cmd: string that contains, at the end of the routine, the name of the
command typed by the user
Param: array of strings that contains, at the end of the routine, the
parameters of the command line
Cmd_line: command line typed by the user
GetCmdandParam() splits cmd_line into words: the first word goes into cmd,
and the following words into param.
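A minimal sketch of this kind of splitting, using strtok() from the C library, is given below. The name split_command(), its signature and the MAX_PARAMS limit are assumptions made for the illustration; the actual GetCmdandParam() may be written differently.

```c
#include <assert.h>
#include <string.h>

#define MAX_PARAMS 8   /* assumed upper bound on the number of parameters */

/* Splits cmd_line in place on spaces, tabs and line ends.
 * *cmd is set to the first word (NULL if the line is empty) and
 * params[] receives the following words.
 * Returns the number of parameters stored in params[]. */
int split_command(char *cmd_line, char **cmd, char *params[MAX_PARAMS])
{
    const char *delims = " \t\r\n";
    int count = 0;
    char *word;

    assert(cmd_line != NULL && cmd != NULL && params != NULL);

    *cmd = strtok(cmd_line, delims);
    if (*cmd == NULL)
        return 0;

    while (count < MAX_PARAMS && (word = strtok(NULL, delims)) != NULL)
        params[count++] = word;

    return count;
}
```

Given the line "giveme report.txt", the sketch sets cmd to "giveme" and stores "report.txt" in params[0]. Note that strtok() modifies the buffer it is given, so the caller must pass a writable copy of the command line.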
Login(sd, login, pass):
Sd: integer argument, number of the control connection socket
Login: string argument, username for the connection (useful for the second
connection)
Pass: string argument, password for the connection (useful for the second
connection)
_ gets the username from the user and sends the USER command on the control
connection
_ gets the password from the user and sends the PASS command on the control
connection
_ if login and pass are not NULL, they are sent to the server instead of
being asked from the user
OpenDataConnection(sock):
Sock: my_socket * argument, see data types
_ creates a socket for the data connection
_ chooses a port for the data connection at random
_ listens on this port
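The sketch below shows one way to obtain a listening socket for the data connection. It deliberately uses a different technique from the one described above: instead of picking a port number at random, it binds to port 0 so that the operating system chooses a free port, and then reads the chosen port back with getsockname(). The name open_listening_socket() is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Creates a TCP socket listening on a free port chosen by the system.
 * On success, returns the socket descriptor and stores the port number
 * (host byte order) in *port_out. On failure, returns -1. */
int open_listening_socket(unsigned short *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    assert(port_out != NULL);

    int sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);            /* 0 = let the system choose */

    if (bind(sd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(sd, 1) < 0 ||
        getsockname(sd, (struct sockaddr *)&addr, &len) < 0) {
        close(sd);
        return -1;
    }

    *port_out = ntohs(addr.sin_port);
    return sd;
}
```

Letting the system choose avoids retrying when a randomly chosen port happens to be in use; the returned port is the value that would then be announced to the server.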
SendPort(sock):
Sock: my_socket * argument, see data types
_ gets the data connection port number
_ sends the FTP command “PORT” with the appropriate number
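In RFC 959 the argument of “PORT” is written as six decimal numbers, h1,h2,h3,h4,p1,p2: the four bytes of the client IPv4 address followed by the high and low bytes of the port. A minimal sketch of building that command line is shown below; format_port_command() and its parameters are hypothetical names, not part of the project.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "PORT h1,h2,h3,h4,p1,p2\r\n" into buf.
 * ip holds the four bytes of the IPv4 address; port is in host byte order.
 * Returns the length of the command, or -1 if buf is too small. */
int format_port_command(const unsigned char ip[4], unsigned short port,
                        char *buf, size_t buf_size)
{
    assert(ip != NULL && buf != NULL);

    int n = snprintf(buf, buf_size, "PORT %u,%u,%u,%u,%u,%u\r\n",
                     (unsigned)ip[0], (unsigned)ip[1],
                     (unsigned)ip[2], (unsigned)ip[3],
                     (unsigned)(port >> 8),      /* p1: high byte */
                     (unsigned)(port & 0xFF));   /* p2: low byte  */

    if (n < 0 || (size_t)n >= buf_size)
        return -1;
    return n;
}
```

For the address 192.168.1.2 and port 5001, the command is "PORT 192,168,1,2,19,137" followed by CRLF, because 5001 = 19*256 + 137.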
SendRest(sock, num):
Sock: my_socket * argument, see data types
Num: number of bytes to restart from
_ sends the FTP command “REST” with the appropriate number
_ prints the reply
WriteList(sock):
Sock: my_socket * argument, see data types
_ reads from the data connection and prints to the screen in a loop until
no more data is sent by the server
_ prints the reply (from the server, on the control connection)
B. Data types:
For socket ids, a struct variable has been defined.
struct my_socket {
    int ctl;   /* socket id for the control connection */
    int data;  /* socket id for the data connection */
};
A second struct has been defined to hold the information about the file to
retrieve with the command “2give” (with two processes running in parallel).
struct file_location {
    char *server_name;  /* server name for the second connection */
    char *file_name;    /* name of the file to retrieve */
    char *login;        /* username for the second connection */
    char *password;     /* password for the second connection */
    char *path;         /* path of the file to retrieve */
    int size;           /* size of the file to retrieve */
};
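As an illustration of how this struct could be filled from the second-mode command line (client first_server_name second_server_name username password path file_name file_size), here is a minimal sketch. The function fill_file_location() is hypothetical and the validation of file_size is an addition of this sketch; the real main() may proceed differently.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

struct file_location {
    char *server_name;  /* server name for the second connection */
    char *file_name;    /* name of the file to retrieve */
    char *login;        /* username for the second connection */
    char *password;     /* password for the second connection */
    char *path;         /* path of the file to retrieve */
    int size;           /* size of the file to retrieve */
};

/* Fills *loc from the second-mode arguments. The strings are not
 * copied: the fields point into argv. Returns 0 on success, -1 if the
 * argument count is wrong or file_size is not a valid number. */
int fill_file_location(int argc, char *argv[], struct file_location *loc)
{
    char *end = NULL;
    long size;

    assert(argv != NULL && loc != NULL);

    if (argc != 8)               /* program name + 7 arguments */
        return -1;

    size = strtol(argv[7], &end, 10);
    if (end == argv[7] || *end != '\0' || size < 0 || size > INT_MAX)
        return -1;

    loc->server_name = argv[2];  /* argv[1] is the first server */
    loc->login       = argv[3];
    loc->password    = argv[4];
    loc->path        = argv[5];
    loc->file_name   = argv[6];
    loc->size        = (int)size;
    return 0;
}
```

The host names used when trying this out are placeholders only.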
Conclusion and Suggestions
This project is only a first step in the implementation of a much
wider idea: the parallel transfer of different parts of the same file.
Many more advanced implementations could be designed in the future:
_ increasing the number of parallel connections
_ an implementation using threads instead of separate processes
_ adding time measurements to compare performance
_ instead of assigning a part of the file to each connection before run
time, one could design a program in which each connection takes care of
one “File block” and, when that whole “File block” has been transferred,
moves on to the next one
Example with N connections and a File block size of 1000 Kbytes (ranges in
Kbytes):
Connection #1: 1-1000
Connection #2: 1001-2000
...
Connection #N: (N-1)*1000+1 - N*1000
The first of the N connections to complete its “File block” will transfer the
(N+1)th “File block”, N*1000+1 - (N+1)*1000, and so on.
_ the File block size could be defined differently and dynamically for each
connection, according to the performance of the connection
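The block-by-block scheme suggested above can be sketched as follows. This is only an illustration of the idea, with hypothetical names; unlike the example, which counts from 1, it uses 0-based byte offsets and 0-based block indices.

```c
#include <assert.h>
#include <stddef.h>

/* Computes the byte range of block `index` (0-based) of a file.
 * Stores the first and last byte offsets (0-based, inclusive).
 * Returns 1 if the block exists, 0 if it lies past the end of the file. */
int block_range(long index, long block_size, long file_size,
                long *first, long *last)
{
    assert(index >= 0 && block_size > 0 && file_size >= 0);
    assert(first != NULL && last != NULL);

    long start = index * block_size;
    if (start >= file_size)
        return 0;

    long end = start + block_size - 1;
    if (end > file_size - 1)
        end = file_size - 1;     /* the last block may be shorter */

    *first = start;
    *last = end;
    return 1;
}

/* Hands out block indices in order: a connection that has finished
 * its block calls this to learn which block to fetch next. */
long next_block(long *counter)
{
    assert(counter != NULL);
    return (*counter)++;
}
```

With N = 3 connections, the first three calls to next_block() hand out blocks 0, 1 and 2, and the first connection to finish receives block 3. If the connections run as separate processes or threads, the counter would have to be shared and protected, which this sketch does not attempt.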