Outline

• Review of Quiz #1

• Distributed File Systems

4/10/2020 COP5611 1

Distributed File Systems

• A distributed file system is a resource management component of a distributed operating system

– It implements a common file system shared by all the computers in the system

• Two important goals

– Network transparency

– High availability

Architecture

• Normally, for performance reasons, distributed file systems are organized in a client-server architecture

– File servers store files and perform storage and retrieval at clients' requests

– The two most important components are

• Name server

• Cache manager

Mounting

• Mounting binds different file systems together to form a single, hierarchically structured name space

– It is widely used on both local and distributed UNIX machines

– In distributed file systems, file systems maintained by remote servers are mounted at the clients
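
The binding described above can be pictured as a client-side mount table that is searched by longest matching prefix. The following is a minimal sketch only: the table contents, the server names (`fileserver1`, `fileserver2`), the export paths, and the `resolve` helper are all hypothetical and are not taken from any particular file system.

```python
from typing import Dict, Tuple

# Hypothetical mount table: local mount point -> (server, exported directory).
MOUNT_TABLE: Dict[str, Tuple[str, str]] = {
    "/": ("localhost", "/"),
    "/home": ("fileserver1", "/export/home"),
    "/home/projects": ("fileserver2", "/export/projects"),
}


def resolve(path: str) -> Tuple[str, str]:
    """Map a client path to (server, path on that server)."""
    best = "/"
    for mount_point in MOUNT_TABLE:
        # A path lies under a mount point if it equals it or continues with "/".
        inside = path == mount_point or path.startswith(mount_point.rstrip("/") + "/")
        # Prefer the deepest (longest) mount point that contains the path.
        if inside and len(mount_point) > len(best):
            best = mount_point
    server, export = MOUNT_TABLE[best]
    remainder = path[len(best):].lstrip("/")
    remote = export.rstrip("/") + "/" + remainder if remainder else export
    return server, remote
```

With this table, `resolve("/home/alice/notes.txt")` yields `("fileserver1", "/export/home/alice/notes.txt")`, while a path outside every remote mount point stays on the local machine.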

Automounting

Caching

• Caching is commonly used in distributed file systems to reduce delays in accessing the data

– In file caching, a copy of the data stored at a remote file server is brought to the client, reducing access delays due to network latency

– The effectiveness of caching is based on the temporal locality in programs

– Files can also be cached at the server side
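
To illustrate why temporal locality matters, here is a small sketch of a client-side block cache with least-recently-used replacement. The class name `BlockCache`, the `fetch` callback standing in for a remote read, and the choice of LRU are assumptions made for illustration; the slides do not prescribe a replacement policy.

```python
from collections import OrderedDict


class BlockCache:
    """Client-side cache of remote blocks with LRU replacement (illustrative)."""

    def __init__(self, fetch, capacity):
        self.fetch = fetch            # hypothetical function that reads a block from the server
        self.capacity = capacity      # maximum number of cached blocks
        self.blocks = OrderedDict()   # key -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.blocks:
            # Recently used blocks are likely to be used again: keep them longest.
            self.blocks.move_to_end(key)
            self.hits += 1
            return self.blocks[key]
        # Miss: pay the network delay once and keep a local copy.
        self.misses += 1
        data = self.fetch(key)
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data
```

Every hit avoids one round trip to the server, so the benefit grows with how often a program re-reads the same blocks.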

Client Caching

Cache Consistency

Hints

• An alternative approach to caching

– The cached data is treated as hints

– The cached data is not guaranteed to be completely accurate

• The cache consistency issue is ignored in this approach

– This is useful for applications which can recover from invalid cached data
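
One way to picture a hint is a cached file location that is tried first and corrected on failure. The sketch below assumes a hypothetical `HintedLocator` class, an authoritative `name_server` mapping, and in-memory dictionaries standing in for file servers; none of these names come from the slides.

```python
class HintedLocator:
    """Uses cached file locations as hints and recovers when a hint is wrong."""

    def __init__(self, name_server, servers):
        self.name_server = name_server  # authoritative map: file name -> server name
        self.servers = servers          # server name -> {file name: contents}
        self.hints = {}                 # possibly stale map: file name -> server name
        self.lookups = 0                # number of authoritative lookups performed

    def read(self, name):
        server = self.hints.get(name)
        if server is not None and name in self.servers[server]:
            return self.servers[server][name]   # the hint was correct
        # Hint missing or invalid: recover by asking the authoritative source.
        self.lookups += 1
        server = self.name_server[name]
        self.hints[name] = server
        return self.servers[server][name]
```

No effort is spent keeping the hints consistent; a wrong hint only costs one extra lookup, which is why the application must be able to detect and recover from it.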

Bulk Data Transfer

• Bulk data transfer means transferring multiple data blocks instead of just the block referenced by the client

– It is motivated by spatial locality and by the fact that most files are accessed in their entirety

– It reduces network communication overhead by spreading the cost of executing communication protocols over more data
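
The saving can be estimated with a simple cost model: each request pays a fixed protocol overhead, and each block pays a transfer cost. The function below and the numbers used with it are illustrative assumptions, not measurements.

```python
import math


def transfer_time_ms(n_blocks, blocks_per_request, overhead_ms, per_block_ms):
    """Total time to fetch n_blocks when each request carries blocks_per_request blocks."""
    requests = math.ceil(n_blocks / blocks_per_request)
    # Protocol overhead is paid once per request; transfer cost once per block.
    return requests * overhead_ms + n_blocks * per_block_ms
```

Under the assumed costs of 2 ms per request and 0.5 ms per block, fetching a 64-block file one block at a time takes `transfer_time_ms(64, 1, 2.0, 0.5)` = 160 ms, while fetching 8 blocks per request takes `transfer_time_ms(64, 8, 2.0, 0.5)` = 48 ms.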

Security

Naming in Distributed File Systems

• A name in file systems is a way to reference a file or a directory

• Name resolution refers to the process of mapping a name to an object (or in the case of replication, to multiple objects)

• A name space is a collection of names

Naming in a Local File System

Naming in Distributed File Systems – cont.

• Three approaches to naming in distributed file systems

– The simplest scheme is to concatenate the host name to the names of files

• Not network transparent

• Not location-independent

– Mounting remote directories to local directories

• Location transparent but not network transparent

– A single global directory

• Limited to a few cooperating computers


• Context

– Contexts can be used to partition a file name space

– Here a filename consists of a context and a name local to the context

– Name resolution involves interpreting the name within a context, which may invoke other contexts recursively
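
Context-based resolution can be sketched as a lookup that may hand the name over to another context. The contexts `cs` and `lib`, the `Link` entry type, the object identifiers, and the depth limit below are all hypothetical, chosen only to show the recursion.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Entry that redirects resolution to a name in another context."""
    context: str
    name: str


# Hypothetical name space partitioned into two contexts.
CONTEXTS = {
    "cs": {"os-notes": "file#17", "shared": Link("lib", "common")},
    "lib": {"common": "file#42"},
}


def resolve(context, name, depth=0):
    """Interpret name within context, following links into other contexts."""
    if depth > 8:
        raise RuntimeError("too many nested contexts")  # guard against cycles
    entry = CONTEXTS[context][name]
    if isinstance(entry, Link):
        # The name is defined in another context: resolve it there recursively.
        return resolve(entry.context, entry.name, depth + 1)
    return entry
```

Here the full file name is the pair (context, local name), so `resolve("cs", "shared")` ends up in the `lib` context and returns `"file#42"`.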


• Name Servers are responsible for name resolution in distributed file systems

– A name server is a process that maps names specified by clients to stored objects such as files and directories

– A single name server vs. multiple name servers

Caches on Disk or Memory

• Cache in main memory vs. cache on a local disk

– Cache in main memory

• Advantages

• Disadvantages

– Cache on a local disk

• Advantages

• Disadvantages

Writing Policy

• The writing policy is closely related to cache consistency

– It decides what to do when a cache block at the client is modified

– Several different policies

• Write-through

• Delayed write: modified blocks are written back after some time

• Write on close: modified blocks are written back when the file is closed
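
The three policies differ only in when modified blocks leave the client. The sketch below models the server as a dictionary and the periodic write-back timer as an explicit `tick()` call; the class `CachedFile`, the policy strings, and these simplifications are assumptions for illustration.

```python
class CachedFile:
    """Client-side view of one file under a chosen writing policy (illustrative)."""

    def __init__(self, server, name, policy):
        self.server = server      # dict standing in for the server: name -> {block: data}
        self.name = name
        self.policy = policy      # "write-through", "delayed", or "write-on-close"
        self.dirty = {}           # modified blocks not yet sent to the server
        self.server_writes = 0    # number of blocks actually sent

    def write(self, block_no, data):
        self.dirty[block_no] = data
        if self.policy == "write-through":
            self.flush()          # every write goes to the server immediately

    def tick(self):
        """Stands in for the periodic timer of the delayed-write policy."""
        if self.policy == "delayed":
            self.flush()

    def close(self):
        self.flush()              # all policies write back whatever is left on close

    def flush(self):
        if self.dirty:
            self.server.setdefault(self.name, {}).update(self.dirty)
            self.server_writes += len(self.dirty)
            self.dirty.clear()
```

Overwriting the same block twice costs two server writes under write-through but only one under the delayed policies, at the price of a window in which the server copy is stale.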

Cache Consistency

• Schemes to guarantee consistency

– Server-initiated approach

• Servers inform the cache managers whenever the data in client caches becomes stale

• Cache managers can retrieve the new data when needed

– Client-initiated approach

• Cache managers validate data with the server before returning it to the clients

– Limited caching
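
The first two schemes can be contrasted in a few lines. In this sketch the server keeps a version number per file; one cache manager registers for invalidation callbacks, the other checks the version before every read. All class and method names are hypothetical, and the version-number check is just one possible way to validate cached data.

```python
class Server:
    def __init__(self):
        self.files = {}      # name -> (version, data)
        self.callbacks = {}  # name -> cache managers holding a copy

    def read(self, name, cache=None):
        if cache is not None:
            # Remember who caches the file so it can be notified later.
            self.callbacks.setdefault(name, set()).add(cache)
        return self.files[name]

    def version(self, name):
        return self.files[name][0]

    def write(self, name, data):
        version = self.files.get(name, (0, None))[0] + 1
        self.files[name] = (version, data)
        # Server-initiated: tell every registered cache its copy is stale.
        for cache in self.callbacks.pop(name, set()):
            cache.invalidate(name)


class ServerInitiatedCache:
    def __init__(self, server):
        self.server = server
        self.entries = {}

    def invalidate(self, name):
        self.entries.pop(name, None)

    def read(self, name):
        if name not in self.entries:  # fetch new data only when needed
            self.entries[name] = self.server.read(name, cache=self)
        return self.entries[name][1]


class ClientInitiatedCache:
    def __init__(self, server):
        self.server = server
        self.entries = {}

    def read(self, name):
        cached = self.entries.get(name)
        # Client-initiated: validate with the server before returning data.
        if cached is None or cached[0] != self.server.version(name):
            self.entries[name] = self.server.read(name)
        return self.entries[name][1]
```

The server-initiated variant makes the server keep per-client state, while the client-initiated variant contacts the server on every read; this is the trade-off between the two approaches.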

Availability

• Availability is an important issue in distributed file systems

– Replication is the primary mechanism for enhancing the availability of files in distributed file systems

• Replication

– Unit of replication

– Replica management

Scalability

• Scalability deals with the suitability of the design to support more clients

– Caching helps reduce the client response time

– Server-initiated cache invalidation

– Some clients can be used as servers

– The structure of the server process also plays a major role in scalability

Semantics

• The semantics of a file system characterize the effects of accesses on files

– For example, a read operation should return the data stored by the latest write operation

– Guaranteeing these semantics when caching is employed is difficult and expensive
