Chapter 6 – Distributed
Processing and File Systems
Aims:
Contrast distributed processing with centralised processing.
Outline methods used to synchronise events between
processes.
Outline typical implementations of remote processing, especially
using RPC.
Define the strengths of distributed file systems.
Outline different implementation methods for distributed file
systems.
Distributed processing
• Using specialized resources, which would not normally be
accessible from a local computer, such as enhanced processing or
increased amount of memory storage.
• Using parallel processing, where a problem is split into a number
of parallel tasks, which are distributed over the network.
• Reducing the loading on the local computer, as tasks can be
processed on remote computers.
Remote processing
The client requests a remote process and passes the process parameters over the network. The server runs the process and returns the results to the client.
Interprocess communications
• Pipes. Pipes allow data to flow from one process to another, and have
a common process origin.
• Named pipes. A named pipe is a pipe which is given a specific name, so that it can be opened by processes which do not have a common origin.
• Message queuing. Message queues allow processes to pass
messages between themselves, using either a single message queue or
several message queues.
• Semaphores. These are used to synchronize events between
processes.
• Shared memory. Shared memory allows processes to interchange
data through a defined area of memory.
• Sockets. These are typically used to communicate over a network,
between a client and a server (although peer-to-peer connections are
also possible).
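As a minimal sketch of the first method in the list, the C code below passes a message from a child process to its parent through a pipe. It assumes a POSIX system (`pipe`, `fork`, `read`, `write`); the function name `pipe_roundtrip` is made up for illustration and is not part of any library.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send `msg` from a child process to its parent through a pipe.
 * Returns the number of bytes the parent read, or -1 on error.
 * (Hypothetical helper, for illustration only.) */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t out_size)
{
    int fds[2];                       /* fds[0] = read end, fds[1] = write end */
    assert(out_size > 0);
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();               /* parent and child now share the pipe */
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                   /* child: the writer */
        close(fds[0]);
        size_t len = strlen(msg);
        ssize_t w = write(fds[1], msg, len);
        close(fds[1]);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    /* parent: the reader */
    close(fds[1]);                    /* so that read() can see end-of-file */
    size_t total = 0;
    ssize_t n = 0;
    while (total < out_size - 1 &&
           (n = read(fds[0], out + total, out_size - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);            /* reap the child */
    return (ssize_t)total;
}
```

The pipe is created before `fork`, which is what gives the two processes their common origin; a named pipe would be needed if the processes were started independently.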
Figure: interprocess communication between Process A and Process B.
• Pipe. Connects Process A to Process B.
• Socket. Connects Process A to Process B over a network.
• Shared memory. Process A and Process B both access the shared memory.
• Semaphores. Process A gets access to the resource and performs a wait on the semaphore; Process B sleeps until the resource is ready.
Semaphores
There are only two operations on the semaphore:
• UP (signal). Increments the semaphore value, and, if necessary, wakes
up a process which is waiting on the semaphore. This is achieved in a
single operation, to avoid conflicts.
• DOWN (wait). Decrements the semaphore value. If the counter is
zero there is no decrement. Processes are blocked until the counter is
greater than zero.
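The two operations can be sketched in C as follows. This is only an illustration, assuming POSIX threads: it synchronizes threads inside one process with a mutex and a condition variable, whereas processes would normally use a semaphore provided by the operating system. The type `semaphore` and the `semaphore_*` function names are made up for this sketch.

```c
#include <assert.h>
#include <pthread.h>

/* Counting semaphore built from a mutex and a condition variable
 * (hypothetical type, for illustration only). */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;   /* signalled when value becomes > 0 */
    int             value;     /* never negative */
} semaphore;

void semaphore_init(semaphore *s, int initial)
{
    assert(initial >= 0);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->value = initial;
}

/* DOWN (wait): block while the value is zero, then decrement it. */
void semaphore_down(semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->value == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->value--;
    pthread_mutex_unlock(&s->lock);
}

/* UP (signal): increment the value and wake one waiter, if there is one. */
void semaphore_up(semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    s->value++;
    pthread_cond_signal(&s->nonzero);
    pthread_mutex_unlock(&s->lock);
}

/* Read the current value (useful only for checking the sketch). */
int semaphore_value(semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    int v = s->value;
    pthread_mutex_unlock(&s->lock);
    return v;
}
```

Holding the mutex for the whole of each operation is what makes the increment and the wake-up behave as a single operation.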
Figure: mutual exclusion with a semaphore which has an initial value of 1.
• Process A calls wait(), which decrements the semaphore from 1 to 0. It then runs the code that must be mutually exclusive, and calls signal(), which increments the semaphore back to 1.
• Process B calls wait() while the semaphore has a zero value, so it goes to sleep. It wakes up when the semaphore value becomes non-zero, runs its own mutually exclusive code, and then calls signal().
Semaphore
$V = I - W + S$
where
I is the initial value of the semaphore.
W is the number of completed wait operations performed on the
semaphore.
S is the number of signal operations performed on it.
V is the current value of the semaphore (which must be greater than or
equal to zero).
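As a worked example with assumed numbers (the value is the initial value, minus the completed waits, plus the signals): a semaphore which starts at 1, and has seen three completed wait operations and two signal operations, has a current value of zero, so the next wait will block.

```latex
V = I - W + S = 1 - 3 + 2 = 0
```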
Example of deadlock
#define MAX_BUFF 100    /* maximum items in buffer */
#define TRUE     1

int buffer_count = 0;   /* current number of items in buffer */

/* The routines below are assumed to be provided elsewhere */
void put_item(void);
void enter_item(void);
void get_item(void);
void consume_item(void);
void sleep(void);
void wakeup(void (*process)(void));

void producer_buffer(void);
void consumer_buffer(void);

int main(void)
{
    /* producer_buffer();   on the producer */
    /* consumer_buffer();   on the consumer */
    return 0;
}

void producer_buffer(void)
{
    while (TRUE) {                                  /* Infinite loop */
        put_item();                                 /* Put item */
        if (buffer_count == MAX_BUFF) sleep();      /* Sleep, if buffer full */
        enter_item();                               /* Add item to buffer */
        buffer_count = buffer_count + 1;            /* Increment number of items in the buffer */
        if (buffer_count == 1)
            wakeup(consumer_buffer);                /* Was buffer empty? */
    }
}

void consumer_buffer(void)
{
    while (TRUE) {                                  /* Infinite loop */
        /* The test and the sleep are not a single operation: a wakeup which
           arrives between them is lost, and both processes can then sleep forever */
        if (buffer_count == 0) sleep();             /* Sleep, if buffer empty */
        get_item();                                 /* Get item */
        buffer_count = buffer_count - 1;            /* Decrement number of items in the buffer */
        if (buffer_count == MAX_BUFF - 1)
            wakeup(producer_buffer);                /* If buffer not full anymore, wake up producer */
        consume_item();                             /* Remove item */
    }
}
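One way to avoid the lost wake-up is to make the test of `buffer_count` and the sleep a single operation. The sketch below does this with a mutex and two condition variables, assuming POSIX threads inside one process rather than separate processes. It reuses the names `MAX_BUFF`, `buffer_count`, `enter_item` and `get_item` from the listing but gives the two functions new signatures; `buffer`, `head`, `tail`, `not_full` and `not_empty` are assumptions added for this illustration.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_BUFF 100              /* maximum items in buffer */

static int buffer[MAX_BUFF];
static int buffer_count = 0;      /* current number of items in buffer */
static int head = 0, tail = 0;    /* ring-buffer read and write positions */

static pthread_mutex_t lock      = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;

/* Producer side: sleeps while the buffer is full. */
void enter_item(int item)
{
    pthread_mutex_lock(&lock);
    while (buffer_count == MAX_BUFF)
        pthread_cond_wait(&not_full, &lock);   /* releases the lock while asleep */
    assert(buffer_count < MAX_BUFF);
    buffer[tail] = item;
    tail = (tail + 1) % MAX_BUFF;
    buffer_count++;
    pthread_cond_signal(&not_empty);           /* wake a sleeping consumer, if any */
    pthread_mutex_unlock(&lock);
}

/* Consumer side: sleeps while the buffer is empty. */
int get_item(void)
{
    pthread_mutex_lock(&lock);
    while (buffer_count == 0)
        pthread_cond_wait(&not_empty, &lock);
    int item = buffer[head];
    head = (head + 1) % MAX_BUFF;
    buffer_count--;
    pthread_cond_signal(&not_full);            /* wake a sleeping producer, if any */
    pthread_mutex_unlock(&lock);
    return item;
}
```

Because the counter is only examined while the lock is held, and `pthread_cond_wait` releases the lock and goes to sleep atomically, a signal cannot slip in between the test and the sleep.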
Deadlock
• Resource locking. This is where a process is waiting for a resource
which will never become available. Some resources are preemptive,
where processes can release their access on them, and give other
processes a chance to access them.
• Starvation. This is where other processes are run, and the
deadlocked process is not given enough time to catch the required
event.
Deadlock example
Figure: deadlock example, with elements labelled A to F.
Four conditions for deadlock
• Mutual exclusion condition. This is where processes get exclusive control of required resources, and will not yield the resource to any other process.
• Wait for condition. This is where processes keep exclusive control of acquired resources while waiting for additional resources.
• No preemption condition. This is where resources cannot be removed from the processes which have gained them, until they have completed their access on them.
• Circular wait condition. This is a circular chain of processes in which each process holds one or more resources that are requested by the next process in the chain.
Deadlock avoidance – Banker's algorithm
• Each process has exclusive access to the resources that have been granted to it.
• An allocation is only granted if there is enough allocation left for at least one process to complete, and release its allocated resources.
• A process which has a request rejected must wait until some resources have been released, and any allocation which is granted must keep the system in the safe state.
Its problems include:
• Requires processes to define their maximum resource requirement.
• Requires the system to define the maximum amount of a resource.
• Requires a maximum number of processes.
• Requires that processes return their resources in a finite time.
• Processes must wait for allocations to become available. A slow process may stop many other processes from running as it hogs the allocation.
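The safety test behind the algorithm can be sketched in C as below. It is a minimal illustration rather than a full resource manager: `NPROC`, `NRES` and the function name `is_safe` are assumptions, and the matrices hold each process's declared maximum and its current allocation per resource type.

```c
#include <assert.h>
#include <stdbool.h>

#define NPROC 5   /* number of processes (illustrative value) */
#define NRES  3   /* number of resource types (illustrative value) */

/* Returns true if there is some order in which every process can obtain
 * its maximum requirement, complete, and release what it holds. */
bool is_safe(const int available[NRES],
             int max[NPROC][NRES],
             int alloc[NPROC][NRES])
{
    int  work[NRES];
    bool finished[NPROC] = { false };

    for (int r = 0; r < NRES; r++) {
        assert(available[r] >= 0);
        work[r] = available[r];
    }

    for (int done = 0; done < NPROC; ) {
        bool progressed = false;
        for (int p = 0; p < NPROC; p++) {
            if (finished[p])
                continue;
            /* Can the remaining need of p be met from what is free now? */
            bool can_finish = true;
            for (int r = 0; r < NRES; r++)
                if (max[p][r] - alloc[p][r] > work[r])
                    can_finish = false;
            if (can_finish) {
                /* p completes and releases everything it holds */
                for (int r = 0; r < NRES; r++)
                    work[r] += alloc[p][r];
                finished[p] = true;
                progressed = true;
                done++;
            }
        }
        if (!progressed)
            return false;   /* no remaining process can complete */
    }
    return true;
}
```

A request would be granted only if the state that results from granting it still passes this test; otherwise the requesting process waits.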
RPC
• Servers. This is software which implements the network services.
• Services. This is a collection of one or more remote programs.
• Programs. These implement one or more remote procedures.
• Procedures. These define the procedures, the parameters and the results of the RPC operation.
• Clients. This is the software that initiates remote procedure calls to
services.
• Versions. This allows servers to implement different versions of the
RPC software, in order to support previous versions.
Protocol stack
An application program reaches a remote process through the following layers:
• Application and Presentation. The application program.
• Session (RPC). Supports the running of remote processes and the passing of run parameters and results.
• Transport (TCP/IP or UDP/IP). Sets up a virtual connection, and streams data.
• Network. Responsible for routing data over the network and delivering it at the destination.
• Data link and Physical. Ethernet/ISDN/FDDI/ATM/etc.
RPC operation
1. The server process waits for a call.
2. The caller process (the client) sends a call message, with all the procedure's parameters.
3. The server reads the parameters and runs the process, while the caller process waits for a response.
4. The server sends the results to the client.
5. The server process waits for the next call.
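The exchange above can be sketched in C with the network removed, so that the client stub and the server dispatcher run in one program. This is an illustration only: the message layout (three identifying words followed by two parameters, each stored as a 4-byte big-endian value) is a simplification and not the real RPC wire format, and `PROG_CALC`, `VERS_CALC`, `PROC_ADD` and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROG_CALC  0x20000001u  /* hypothetical program number */
#define VERS_CALC  1u           /* hypothetical version number */
#define PROC_ADD   1u           /* hypothetical procedure number */
#define CALL_WORDS 5            /* program, version, procedure, a, b */

/* Store a 32-bit value in network (big-endian) byte order. */
static void put_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
}

static uint32_t get_u32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Client stub: build the call message with all the parameters. */
size_t marshal_add_call(uint8_t *msg, size_t size, int32_t a, int32_t b)
{
    assert(size >= 4 * CALL_WORDS);
    put_u32(msg,      PROG_CALC);
    put_u32(msg + 4,  VERS_CALC);
    put_u32(msg + 8,  PROC_ADD);
    put_u32(msg + 12, (uint32_t)a);
    put_u32(msg + 16, (uint32_t)b);
    return 4 * CALL_WORDS;
}

/* Server: read the parameters, run the procedure, build the reply.
 * Returns the reply length in bytes, or 0 if the call is not recognised. */
size_t server_dispatch(const uint8_t *msg, size_t len, uint8_t *reply)
{
    if (len != 4 * CALL_WORDS ||
        get_u32(msg) != PROG_CALC ||
        get_u32(msg + 4) != VERS_CALC ||
        get_u32(msg + 8) != PROC_ADD)
        return 0;
    int32_t a = (int32_t)get_u32(msg + 12);
    int32_t b = (int32_t)get_u32(msg + 16);
    put_u32(reply, (uint32_t)(a + b));
    return 4;
}

/* Client-side wrapper: to the caller it looks like a local procedure. */
int32_t remote_add(int32_t a, int32_t b)
{
    uint8_t call[4 * CALL_WORDS], reply[4];
    size_t n = marshal_add_call(call, sizeof call, a, b);
    /* A real client would send `call` over the network and wait here. */
    size_t r = server_dispatch(call, n, reply);
    assert(r == 4);
    return (int32_t)get_u32(reply);
}
```

Checking the program, version and procedure numbers before running anything is what lets one server support several versions of a procedure side by side.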
Distributed file systems
Figure: facilities which can be provided over the network.
• Administration services.
• Remote file systems mounted as a local drive.
• Localized file storage (rather than accessing a remote file).
• Distributed databases.
• Networked file system (NFS).
• Centralized configuration (passwords, user IDs, and so on).
Distributed file system
• File system mirrors the corporate structure. File systems can
be distributed over a corporate network, which might span cities,
countries or even continents.
• Easier to protect the access rights on file systems. In a
distributed file system it is typical to have a strong security policy on
the file system, and each file will have an owner who can define the
privileges on this file.
• Increased access to single sources of information. Many users
can have access to a single source of information.
• Automated updates. Several copies of the same information can be
stored, and when any one of them is updated they are synchronized to
keep each of them up-to-date.
• Improved backup facilities. A user’s computer can be switched off, but their files can still be backed up from the distributed file system.
Distributed file systems (cont.)
• Increased reliability. The distributed file system can have a backbone which is constructed from reliable and robust hardware, which is virtually 100% reliable, even when there is a power failure or a hardware fault.
• Larger file systems. In some types of distributed file systems it is possible to build up large file systems from a network of connected disk drives.
• Easier to administer. Administrators can easily view the complete file system.
• Interlinking of databases. Small databases can be linked together to create large databases, which can be configured for a given application. The future may also bring the concept of data mining, where agent programs will search for information with a given profile by interrogating databases on the Internet.
• Limiting file access. Organizations can set up an organization file structure, in which users can have a limited view of the complete file system.
Traditional v. corporate structure
Figure: a file structure with directories such as users, config and progs (with users fred and bert), alongside an organisation structure under \\orgname with sales, production, research, UK Office and US Office.
Distributed file system
• Single tree (global file system). Drives are mounted over the network to create a single tree, such as /etc, /progs, /user and /sys.
• Forest of drives. Drives are mounted over the network to a forest of drives, such as C:, D:, E: and F:.
Protocol stack for NFS and NIS:
Application          NFS, NIS
Presentation         XDR
Session              RPC
Transport            TCP
Network              IP
Data link/Physical   Ethernet/Token Ring
RPC procedures and responses
• RPC procedures. The NFS client sends these over the network to the NFS server, which holds the remotely accessed file system: getattr, setattr, read, write, create, remove, rename, link, symlink, mkdir, rmdir, readdir.
• RPC response. The NFS server returns the requested data, parameters or a status flag (such as NFS_OK and NFSERR_PERM).
• On the NFS client, the file system is either mounted onto a single tree or as a forest of drives.
NIS domains
The master NIS server maintains the following files for the clients in the NIS domain:
/etc/passwd      Domain passwords
/etc/groups      Domain groups
/etc/hosts       IP addresses and host names
/etc/rpc         RPC processes
/etc/networks    Used to map IP addresses to networks
/etc/protocols   Known network layer protocols
/etc/services    Known transport layer protocols

#/etc/protocols
ip      0   IP
icmp    1   ICMP
ggp     3   GGP
tcp     6   TCP

#/etc/groups
root::0:root
other::1:root,hpdb
bin::2:root,bin
sys::3:root,uucp
freds_grp::4:fred,fred2,fred3

#/etc/rpc
portmapper   100000   portmap sunrpc
rstatd       100001   rstat rstat_svc
rusersd      100002   rusers
nfs          100003   nfsprog
ypserv       100004   ypprog

#/etc/hosts
138.38.32.45   bath
198.4.6.3      compuserve
193.63.76.2    niss
148.88.8.84    hensa
146.176.2.3    janet

#/etc/passwd
root:FDEc6.32:1:0:Super user:/user:/bin/csh
fred:jt.06hLdiSDaA:2:4:Fred Blogs:/user/fred:/bin/csh
fred2:jtY067SdiSFaA:3:4:Fred Smith:/user/fred2:/bin/csh

#/etc/services
ftp      21/tcp
telnet   23/tcp
smtp     25/tcp
pop3     110/tcp

#/etc/networks
loopback     127.0.0.0
localnet     146.176.151.0
Production   146.176.142.0
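To show how a program might read one of these files, the sketch below parses a single line in the "name port/protocol" layout of the /etc/services sample. It is only an illustration of the file layout: the type `service_entry` and the function `parse_service_line` are made up here, and aliases or trailing comments on a line are not handled.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed entry (hypothetical type, for illustration only). */
typedef struct {
    char name[32];    /* service name, such as "ftp" */
    int  port;        /* port number, such as 21 */
    char proto[8];    /* protocol, such as "tcp" */
} service_entry;

/* Parse one line in the "name port/protocol" layout.
 * Returns 1 on success, 0 for comment lines or malformed lines. */
int parse_service_line(const char *line, service_entry *out)
{
    assert(line != NULL && out != NULL);
    if (line[0] == '#')                 /* comment line */
        return 0;
    /* Field widths keep sscanf within the bounds of the arrays. */
    if (sscanf(line, "%31s %d/%7s", out->name, &out->port, out->proto) != 3)
        return 0;
    return out->port > 0 && out->port <= 65535;
}
```

For example, the line "pop3 110/tcp" gives the name pop3, the port 110 and the protocol tcp.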
NIS domains
The master NIS server maintains /etc/passwd, /etc/groups, /etc/hosts, /etc/rpc, /etc/networks, /etc/protocols, /etc/services, and so on. The master sends updates to the slave NIS servers in the NIS domain. An NIS client binds to a server as follows:
1. Client is started.
2. Client broadcasts an NIS request to the domain.
3. The client then binds to the first server which responds.