Non Home Directory Storage
FOCUS
Judy Richards
July 1st 1999
email: Judy.Richards@cern.ch
CERN
Scope of this presentation
Data - places for storing it today
• AFS project space
• AFS workspaces
• HPSS
• local disk
• NOT possible future solutions: Eurostore, SAN, ...
Backup Services
Archiving
AFS
500Gbytes of home directories.
700Gbytes of project space used, not all of it backed up.
Can be used from both Unix and NT.
Files cached onto client machine.
Transfer speed - 1Mbyte/sec with a ‘following wind’.
• 2 mins. to transfer a 100Mbyte file.
Subsequent reads at local disk speed.
‘Born’ in late 1980s - not designed with today’s big cheap files/disks in mind.
AFS (2)
New project space is on PCs with EIDE disks attached.
Space currently on order, using 25Gbyte disks, costs about 40Sfr/Gbyte.
Space managed centrally for the users.
• Space is managed by ‘volumes’ which have a quota.
• Volumes can be moved from one server to another transparently to the user, transfer rate about 2Gbytes/hour.
• Volumes moved by automatic procedures as disks fill up (see the sketch below).
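A minimal sketch of the underlying AFS tools this management relies on (the project path, volume, server and partition names are invented for illustration):
• fs listquota /afs/cern.ch/project/myproject   - show the quota and current usage of the volume behind a directory
• vos move p.myproject afsdb1 /vicepa afsdb2 /vicepb   - move that volume to another server and partition, transparently to its users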
AFS (3)
Maximum volume size is 2Gbytes.
Users and particularly administrators want bigger volumes.
Larger volumes can be created but can’t be backed up or moved.
Support of larger volumes promised - but no date.
AFS is convenient for smaller files, particularly those used by multiple people.
AFS workspaces
‘Project space’ for individual users.
Implemented on cheaper PC disks.
AFS volumes can be allocated to a user from the group’s project space budget.
Mounted under the user’s home directory as ‘~/w’, ‘~/w2’, ‘~/w3’, …
Allocated using the afs_admin command (example below)
• afs_admin create_workspace -p project-name -u userid -q nnn
May want to think about this before making your COCOTIME requests!
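As an invented example (the project name, userid and quota value are purely illustrative), a workspace charged to the atlas project budget for user jdoe would be requested with:
• afs_admin create_workspace -p atlas -u jdoe -q 500
and would then appear under the user’s home directory as ~/w (or ~/w2, ~/w3 for further workspaces).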
Evolution of AFS home directory space
AFS Home Directory Space for large groups
Group      COCOTIME end 1999   Used 30/6/99   Average Mb/User
ALEPH             180                91               49
DELPHI             90                60               44
L3                 40                26               57
OPAL               30                15               46
ATLAS              25                25               44
CMS                45                37               41
ALICE              20                17               24
LHCB               40                17               25
COMPASS             5                 5               41
NA48               25                15               61
Evolution of AFS Project space
AFS Project Space for large groups
Group      COCOTIME end 1999   Allocated 30/6/99
ATLAS             250                 166
CMS               160                  78
ALICE              65                  24
LHCB               80                  42
ALEPH              53                  16
DELPHI             45                  40
L3                 50                  30
OPAL               80                  37
COMPASS            22                   9
NA48              120                  60
Hierarchical Storage Management
In an ‘HSM’ system files are moved from a disk pool to a tape pool on a ‘least recently used’ basis.
Currently used implementation is HPSS - High Performance Storage System.
Designed for managing ‘larger’ files.
Data transfer speed limited by network connection.
• 5 Mbyte/sec for a server in the Computer Centre.
• 1 Mbyte/sec to the desktop over the general purpose network.
HPSS could provide a complement to AFS/NICE home directories and AFS project space for users who don’t have sufficient data to justify private tapes.
HPSS
Concern over management and administration costs.
Limit its use to modest amounts of data and straightforward applications.
Use via a CERN interface so that the underlying system can be changed (almost?) transparently to the user.
User data would be migrated if we change the underlying HSM.
email: Judy.Richards@cern.ch
CERN
Judy Richards
C5
June 4th, 1999
12
HSM Proposal
Allows the user to copy data between local disk and a hierarchically managed file store.
• hsm [get,put,query,delete …etc] local-file remote-file   (see the sketch below)
Allows copying of files to and from stagepools (for PLUS/WGS users) as well as local disk on a Unix or NT workstation.
Files must be > 20Mbytes.
Current maximum size is 2Gbytes but larger sizes possible ‘next year’.
Only 1 version of a given filename may be stored.
Include a function to copy directories into a tar file.
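A sketch of how the proposed command might be used (the remote path names are invented and the exact subcommand spellings are still to be settled; only get, put, query and delete are taken from the proposal):
• hsm put run1234.data /hsm/jdoe/run1234.data     - copy a local file (> 20Mbytes) into the managed store
• hsm query /hsm/jdoe/run1234.data                - check that the stored copy exists
• hsm get /hsm/jdoe/run1234.data run1234.data     - copy it back to local disk or a stagepool
• hsm delete /hsm/jdoe/run1234.data               - remove the stored copy
• tar cf mydir.tar mydir ; hsm put mydir.tar /hsm/jdoe/mydir.tar   - storing a directory by hand as a tar file, which the proposed tar function would automate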
HSM Proposal (2)
Files are ‘world readable’ so no confidential data should be stored.
No quota restrictions on individual users (hopefully!).
At the end of each year, files that have not been accessed (read or modified) for 5 years will be deleted.
This process will be advertised extensively, but individual file owners will NOT be contacted.
Files belonging to users whose last CERN computer account is deleted will also be deleted.
User Local Disks
(too) Cheap!
For a PC the current cost is around 30Sfr/Gbyte.
Since disks are so cheap, users could buy two and use mirroring to protect against hardware (but not human) failures (see the sketch below).
• Should we include mirroring as an option in our standard installations?
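A rough sketch of what mirroring could look like on a Linux PC with the software RAID tools (raidtools) of the day - the device names are invented, the exact fields should be checked against the Linux Software-RAID HOWTO, and NT offers its own disk mirroring facility:
# /etc/raidtab - mirror two EIDE disks into a single device /dev/md0
raiddev /dev/md0
    raid-level            1
    nr-raid-disks         2
    nr-spare-disks        0
    persistent-superblock 1
    chunk-size            4
    device                /dev/hda1
    raid-disk             0
    device                /dev/hdc1
    raid-disk             1
# then: mkraid /dev/md0 ; mke2fs /dev/md0 ; mount /dev/md0 /data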
Backup of desktop machines is a problem!
Backup of User Local Disk
Backup of desktop systems must be considered as an ‘exceptional service’ for cases that can be individually justified.
For the system, re-installation is faster than restore - keep your system ‘standard’!
Very valuable files should be in home directories or project space in AFS or NICE.
Copying files into HSM provides protection for data files
• data is backed up
• reasonable protection against human error - deleting the ‘local’ copy doesn’t delete the copy in HPSS
Backup services
Three parallel services in the Computer Centre.
• Can we rationalize them?
Stand-alone system specific to AFS - must be retained.
• Only alternative with AFS support is ADSM
• Performance insufficient for file base of our size.
Legato service
ADSM service
Legato Service
Used for backing up servers in the Computer Centre.
Main clients are PDP-managed servers, NICE and Oracle.
Backups done directly to tape - requires a good network connection.
Architecture appropriate for larger critical servers.
• Cycles of full, differential, incremental backups
Decision taken to move IT/Computer Centre servers from ADSM to Legato.
A disaster recovery plan with remote physical backup copies has to be implemented.
• (This exists for AFS and ADSM today.)
ADSM
In use since about 1990 (servers on CERNVM).
Designed to back up remote systems while minimizing network traffic.
Incremental backups only.
Data to be backed up is cached onto the local disk of the server.
Copied to tape when the disk cache is full.
Data for a given node spread over numerous tapes.
Restore times can be very long.
Very little central control over what clients back up.
Off-site disaster/recovery copies are implemented.
ADSM (2)
350 clients split over 2 servers.
• Relatively small percentage are physics desktops.
2.5 Terabytes of backups.
35 Gbytes of directories (420 million files).
1 Terabyte of archive - a significant part is effectively full backups.
Monolithic directory of files
• large size is a source of concern
• took 5 days to rebuild database after a corruption
ADSM (3)
Also used for ‘pubarch’ - archive facility for AFS files
• 850K files, 450Gbytes, 100 tapes
• few hundred files per day
Also holds ‘vmarch’ - archive of CERNVM filebase
• total of 75Gbytes.
• still some low-level activity - 40 files restored in week 19
ADSM future
Do we need to provide a remote backup service for a limited number of clients?
• Probably yes
Recent growth in requests for backup of Linux systems.
Standardizing on use of Legato in the Computer Centre.
Review pros and cons of moving the ADSM service for remote machines to Legato.
• non-negligible investment in manpower and money
• mustn’t forget its use for archiving
Target date for review, November 1999.
Archiving
Options today
• pubarch - for AFS files
• ADSM archive for ADSM clients
• hsm (HPSS)
• user makes a copy to ‘traditional’ tape
Services missing today
• archive for the Windows world
• archive for email
Archiving (2)
What are the user requirements?
• Keep everything for ever!
What are the system constraints?
• Volumes of data and directories of data must be kept within manageable sizes.
So how long should we keep archives?
• e.g. how long do we keep Vmarchive?
• If we drop ADSM, what do we do with the pubarch ADSM archive?
• Users will certainly request that we move them to a replacement system!
Proposal for an Archive policy, October 1999
Main points for Discussion
comments on evolution of AFS space usage?
feedback on the HSM proposal
feedback on backup of desktop machines
feedback on archiving requirements