This document will guide you through the process of sending/receiving data to/from HPC
Wales’ storage systems through Secure File Transfer Protocol (sFTP and scp) and will also provide some further information on these systems. HPC Wales does not recommend or support using File Transfer Protocol (FTP) due to its inherent insecurity.
Secure File Transfer Protocol (sFTP) is used to transfer files from one computer to another over a network, such as the Internet, in a secure manner. It is often used to upload data and other documents from a private development machine to a public server. Unlike FTP, sFTP data transport is encrypted, so no credentials or data are sent in clear text, where third parties could intercept them.
This document is intended to provide a quick overview of Data Storage and Movement. More detailed information can be found in the User Guide and further help can be obtained through the Support Desk:
Telephone: 08452 572 207
Email: support@hpcwales.co.uk
Website: https://hpcwprod.service-now.com/
Throughout this document, type in this font indicates what is displayed, whilst bold font denotes user input.
Transferring Data To and From the HPC Wales Systems
Transferring Files Between HPC Wales Sites
How do I Transfer Lots of Small Files / How do I Zip Files?
HPC Wales provides two primary types of storage for all users: NFS and Lustre.
Each file system has a targeted use. NFS is provided for long-term storage, with a unique allocation at each site that is backed up nightly. Lustre, by contrast, is intended for temporary use during large, high-performance parallel computations on the HPC Wales production hubs. See the Infrastructure Quick Guide for more on the HPC Wales systems.
Each HPC Wales site has its own NFS-based home directory storage mounted on nodes at
/home.
As the Access Nodes (login.hpcwales.co.uk) are based in Cardiff, they see the same user home directories as the Cardiff HPC resources.
The two HPC Wales hub sites have Lustre based temporary file systems mounted at
/scratch.
Note : It is important that regardless of any backup policies in place, you back up your valuable data frequently. If you would like advice on this please contact Support.
Once data is on an HPC Wales system, it can be freely and quickly moved between the different HPC Wales sites over the private HPC Wales network.
Page 2 of 6 Ref:HPCW-HT-14-004
Via the Linux Command Line
To create an sFTP or scp connection on a Linux command line platform (e.g. Fedora, SuSE,
CentOS, Ubuntu etc.) you will need to use the Terminal application, which is built into the
Linux operating system and in most cases available through the menu bar.
If you are familiar with the syntax of FTP, you can open an sFTP connection to the Access
Nodes and the Cardiff home directories using:
$ sftp username@sftp.hpcwales.co.uk
This opens a session in which traditional FTP commands such as GET and PUT operate in a secure manner. Alternatively, to send an individual file or directory in a single command, you can use scp:
$ scp sourcefile username@sftp.hpcwales.co.uk:
For example, if your account username was jones.steam then you would use:
$ scp sourcefile jones.steam@sftp.hpcwales.co.uk:destfile
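The sFTP session described above can also be driven from a file of commands rather than typed interactively. The following is a minimal sketch only: the file names input.dat and results.dat are illustrative assumptions, username is a placeholder, and the final command is printed rather than executed. Batch mode (the -b option of the OpenSSH sftp client) cannot prompt for a password, so this assumes a non-interactive login method such as SSH keys is available, which this guide does not confirm.

```shell
#!/bin/sh
# Write a batch file of sFTP commands.
# input.dat and results.dat are assumed example file names.
BATCH_FILE="$(mktemp)"
cat > "$BATCH_FILE" <<'EOF'
put input.dat
get results.dat
ls -l
bye
EOF

# Print the command that would run the batch; remove "echo" to execute it.
# Assumption: non-interactive (e.g. key-based) login works for your account,
# because sftp -b cannot ask for a password.
echo sftp -b "$BATCH_FILE" username@sftp.hpcwales.co.uk
```

If only password login is available, the same put and get commands can be typed one at a time at the sftp prompt instead.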
Note: If you intend to transfer many small files, it is recommended that you zip them up into a single archive file. There is a considerable performance overhead in sending lots of small files from your local workstation to the HPC Wales storage system.
Instructions on how to achieve this are detailed later in this document.
Further Examples:
$ scp data username@sftp.hpcwales.co.uk:repo_dir
Copy the file called data into a directory on the HPC Wales Cardiff system called repo_dir.
$ scp -r data_dir username@sftp.hpcwales.co.uk:
Recursively copy (-r option) a directory called data_dir and all of its contents to your HPC Wales Cardiff home directory.
$ scp -r data_dir username@sftp.hpcwales.co.uk:repo_dir
Recursively copy a directory called data_dir and all of its contents into a directory called repo_dir in the root of your HPC Wales Cardiff home directory.
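The examples above all copy data to HPC Wales. Copying in the other direction only swaps the source and destination arguments. The sketch below illustrates this under stated assumptions: results.dat and output_dir are made-up names, username is a placeholder, and run is a hypothetical helper that prints each command instead of executing it unless DRY_RUN is set to 0, so the block can be tried before any connection is made.

```shell
#!/bin/sh
# Hypothetical helper: print the command unless DRY_RUN=0, in which case run it.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Placeholder account name; replace with your own HPC Wales username.
HPCW_USER="username"
HPCW_HOST="sftp.hpcwales.co.uk"

# Download a single file (results.dat is an assumed example name) from the
# Cardiff home directory into the current local directory.
run scp "${HPCW_USER}@${HPCW_HOST}:results.dat" .

# Recursively download a directory (output_dir is an assumed example name).
run scp -r "${HPCW_USER}@${HPCW_HOST}:output_dir" .
```

Setting DRY_RUN=0 in the environment before running the script performs the transfers for real.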
sFTP with a Graphical Interface (GUI) on Windows & Linux Platforms
The Windows platform does not have a built-in sFTP client. You will need to install an sFTP client (e.g. WinSCP, FileZilla) to achieve this functionality. This document will give examples using FileZilla as this program is available on both Windows and Linux.
Please note: If you do not have administrative rights on your local workstation and are unable to install programs, FileZilla can be downloaded as a Portable Application, which an unprivileged user can set up and run. For further information please contact the HPC Wales Support Desk.
Once you have installed FileZilla, you should see the following screen on start-up, into which you must input your HPC Wales user credentials and server details (for credentials see section: Logging into an Access Node):
Once connected, you can use this interface to copy files between your PC and the HPC
Wales systems. A fuller description of the use of FileZilla is beyond the scope of this document, but to upload a file from your computer, you would:
Select a location in the right pane to upload your file to (e.g. /home/<username>)
Locate your file to be imported in the left pane
Right click the file and select Upload
A similar process exists to download files from HPC Wales. Files can also be dragged between local and remote using the mouse.
If you would like further help please contact the Support Desk.
To directly transfer data to or from one of the HPC Wales sites from the Internet, connect your sFTP/scp client to the hostname 'sftp.hpcwales.co.uk' and you will be automatically placed in your Cardiff home directory.
To access non-Cardiff home directories, browse to one of the following directories:
/hpcw/ab/username/
/hpcw/ba/username/
/hpcw/gl/username/
/hpcw/sw/username/
The two-letter code in each path identifies the site whose home directory you wish to access.
AB is Aberystwyth, BA is Bangor, GL is Glamorgan and SW is Swansea.
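The mapping from site to directory can be captured in a few lines of shell. This is an illustrative sketch, not part of any HPC Wales tooling: site_home_path is a hypothetical helper name, and username is a placeholder for your own account name. The site names and two-letter codes are those listed above.

```shell
#!/bin/sh
# Hypothetical helper: print the path of a user's home directory at a given
# site, as seen when connected to sftp.hpcwales.co.uk.
site_home_path() {
    site="$1"
    user="$2"
    case "$site" in
        aberystwyth) code="ab" ;;
        bangor)      code="ba" ;;
        glamorgan)   code="gl" ;;
        swansea)     code="sw" ;;
        *)
            echo "unknown site: $site" >&2
            return 1
            ;;
    esac
    echo "/hpcw/${code}/${user}/"
}

# Example: build the remote path for the Swansea home directory.
remote_path="$(site_home_path swansea username)"
echo "$remote_path"
```

The printed path could then be used as the remote part of an scp destination, for example username@sftp.hpcwales.co.uk:/hpcw/sw/username/, assuming scp accepts these paths in the same way as an sFTP session does.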
Use scp in the same way as above, but with the various site login servers (as used for access to submit jobs) as the destination. There is no need to specify username here, as usernames are common across all HPC Wales sites.
E.g.:
[username@cf-log-002 ~]$ scp -r data gl-log-001:
Copies a directory 'data' from Cardiff to Glamorgan.
When transferring a large number of small files over the Internet, you can experience a significant performance loss. This is caused by the stop/start overhead incurred after each file is transferred and the next one is prepared. To avoid it, we recommend zipping/tarring your files into one large file.
In Windows this can be achieved using WinZip or 7-Zip Portable.
Note: If you do not have administrator rights on your PC and are unable to install software, you can download 7-Zip as a portable application. Portable applications do not require installation and can be run from USB disk drives (for portability).
If you are using Linux this can be achieved by using the tar program, i.e.:
$ tar -cf archive_name.tar <files>
For example:
$ tar -cf archive.tar foo bar
This would create an archive called archive.tar containing files foo and bar. The archive could then be transferred as a single file to an HPC Wales system (or between HPC Wales systems).
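Before and after a transfer it can be useful to check what an archive contains and to unpack it again. The sketch below shows one way to do this with the standard tar options -t (list) and -x (extract). It works in a temporary directory with two small sample files, and the directory name unpacked is an arbitrary choice.

```shell
#!/bin/sh
set -e
# Work in a scratch directory so no existing files are touched.
WORK_DIR="$(mktemp -d)"
cd "$WORK_DIR"

# Create two small sample files named as in the example above.
echo "first file" > foo
echo "second file" > bar

# Create the archive, then list its contents to confirm both files are inside.
tar -cf archive.tar foo bar
tar -tf archive.tar

# Unpack into a separate directory, as you would after the transfer.
mkdir unpacked
tar -xf archive.tar -C unpacked
ls unpacked
```

Listing the archive first is a cheap way to catch a mistyped file name before spending time on the transfer.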
If you are using the Internet you may wish to take advantage of compression (supported in
WinZip, 7-Zip and tar). This can be achieved in tar by passing an extra flag (-z) during the creation process, i.e.:
$ tar -czf archive.tgz foo bar
The above example would create a compressed archive of files foo and bar called archive.tgz. It will take longer to create the archive but the end result should be an archive file that is smaller in size (allowing for a faster file transfer over the Internet). Compression is recommended for large files being sent over the Internet. Wildcards can be used here, so to include all files in the current directory, the asterisk * could be used instead of a list of specific files to include. If you would like any assistance with this process please contact the
Support Desk.
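The effect of compression and of the wildcard can be tried locally before anything is sent. The following sketch generates two sample files of numbered lines (an arbitrary, highly compressible choice), archives them with and without -z, and prints both sizes. The archives are deliberately written to a different directory: if an archive from an earlier run sat in the current directory, the * wildcard would match it as well.

```shell
#!/bin/sh
set -e
# Scratch directory holding the sample data.
WORK_DIR="$(mktemp -d)"
cd "$WORK_DIR"

# Generate two highly compressible sample files (numbered lines of text).
seq 1 20000 > foo
seq 1 20000 > bar

# Build an uncompressed and a gzip-compressed archive of every file here.
# The archives go to a separate directory so "*" does not match them.
ARCHIVE_DIR="$(mktemp -d)"
tar -cf  "$ARCHIVE_DIR/archive.tar" *
tar -czf "$ARCHIVE_DIR/archive.tgz" *

# Compare the two sizes in bytes.
plain_size="$(wc -c < "$ARCHIVE_DIR/archive.tar")"
gzip_size="$(wc -c < "$ARCHIVE_DIR/archive.tgz")"
echo "uncompressed: $plain_size bytes"
echo "compressed:   $gzip_size bytes"

# List the compressed archive without unpacking it.
tar -tzf "$ARCHIVE_DIR/archive.tgz"
```

How much is saved depends heavily on the data: text usually shrinks considerably, whereas files that are already compressed (images, existing zip files) gain little.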
A range of documentation is available, including a comprehensive ‘User Guide’. For the documentation and information available, please visit the HPC Wales Info Hub: http://www.hpcwales.co.uk/resources