As of the fall of 2010, the ME department has over 700 total processor cores running the Linux OS.
Systems with more processors and more memory tend to use Linux rather than Windows.
In the ‘bang for the buck’ arena, servers that do not require terminals are far more economical for computational needs. A 24-core/48 GB Linux server costs about the same as three dual-core Windows desktops.
ECN provides licenses for certain advanced software and compilers only in the Linux environment.
Linux offers a true multi-user OS. You can start jobs and let them run for months, literally. We have machines that have been up for over 300 days straight without a reboot or power cycle.

Differences:
 Linux is a multi-user system with a multi-user kernel
 Windows is a single-user system with a single-user kernel

Commonalities:
 It’s all just a framework to do something
 Many applications are cross-platform; the OS merely provides the environment to run them


Advantages?????
Linux:
 It’s free and it’s widely available
 Its core is suitable for servers, workstations and appliance applications
 Linux can be run as a large petabyte-capacity system or configured to run off a 1.44 MB floppy disk with no local hard drive.

Windows:
 Very common, pervasive OS. It’s pretty much what most computers run.
 Very large assortment of software available


Disadvantages?????
Linux:
 It’s complex. Simply put: the more capable you are, the more complex it becomes.
 Many features are command line driven.

Windows:
 At its core, it’s really a single-user operating system. (Fast user switching does not fix this.)
 Most features are GUI driven.




Linux is a UNIX-like operating system, much as BSD is. The defining characteristic of Linux is its kernel. The Linux kernel is used in different flavors (distributions) of Linux such as Fedora, Red Hat, SUSE, Debian, Ubuntu etc. OpenBSD, FreeBSD and MacOS do not use the Linux kernel and therefore are not Linux.
One of the defining factors is the kernel (hardware is the other). The kernel is the ‘core’ or ‘guts’ of the operating system. It provides the gateway between the software and the hardware. Common ‘flavors’ share the same kernel and, in general, can share applications without recompiling (this may require LOTS of work with library locations though!).
MacOS uses a kernel built partly from BSD code. It also uses the Aqua graphical interface rather than the X Window System common to other UNIX variants. What this means is that MacOS is a UNIX OS but it is not Linux, and apps compiled for Linux likely will not run on MacOS.
Solaris is another UNIX variant, traditionally tied to SPARC (non-PC) hardware, although x86 versions also exist. There is a large collection of UNIX variants on non-PC hardware (IRIX, AIX, Tru64, Solaris, HP-UX etc.). VMS, which ran on VAX hardware, is a separate non-UNIX operating system.

Relationship of disk space/home directory to the machine itself
 CPUs (processors) are tied to a machine, and jobs run on CPUs, so jobs are tied to machines. You log in to a machine to run a job on that machine.
 Disk space transcends the machine itself. Files reside on disks, and disk space is NOT tied to a machine, so a single file can be available on multiple machines at the same time. This is accomplished using an NFS (Network File System) server.


[Diagram: Machine 1, Machine 2 and Machine 3, each with its own CPU(s), connected to shared Disk Space]

The UNIX/Linux file systems form a ‘Tree’ where disk space on one
machine is available on all machines.
 Each File has an ‘address’ to find it. That address is its absolute
UNIX path.

home
    camp
        a          /home/camp/a
    project
        a          /home/project/a
    gadget
        a          /home/gadget/a
    robusta
        a          /home/robusta/a
        b          /home/robusta/b
    coep
        a          /home/coep/a
    tribe
        a          /home/tribe/a

Backups:
 ECN offers backup services for a fee
▪ Backups are done each night and restores are available through the ECN webpage.
▪ Costs are License ($200 or $600) + $2/GB of capacity. For instance, the coep server with a 750 GB HD costs $600 for the license and $1500 for data, for a total cost of $2100.
 Some machines are backed up each night, some are not. For specifics, ask your advisor or the e-shop. Most research machines are NOT backed up. Most departmental file servers are.
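The backup cost rule (license fee plus $2 per GB) can be checked with a few lines of shell arithmetic. This is only a sketch: the function name backup_cost is made up for illustration, and the figures are the ones quoted on the slide.

```shell
#!/bin/sh
# Backup cost = license fee + $2 per GB of capacity (figures from the slide).
# backup_cost is a hypothetical helper name, not an ECN tool.
backup_cost() {
    license=$1       # license fee in dollars (200 or 600)
    capacity_gb=$2   # disk capacity in GB
    echo $(( license + 2 * capacity_gb ))
}

# The coep example: $600 license and a 750 GB disk.
backup_cost 600 750
```

For the coep example this prints 2100, matching the total given above.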
CLUSTER MACHINES      NON-CLUSTER MACHINES
robusta               tribe
robusta01             coep
robusta02             seanm
robusta03             navy
robusta04             pande
robusta05             cater
robusta06             sharif
robusta07             apfel
What’s the difference??? - mostly just how they are used!!!
CLUSTERS
 Tend to have a common home directory on the ‘head node’ that each ‘node’ points to
 Tend to have dedicated network segments
 *may* have a specialized high speed interconnect (robusta, steele, coates)

NON-CLUSTER
 Each machine has its own home directory or points to the departmental file server
 Tend to share their network segment with other machines
 Parallel jobs *are* run on non-cluster machines

There is a LOT of overlap in usage patterns of machines

There are a few main ways to ‘connect’ to a
Linux/UNIX machine
 Console login (sitting in front of the machine)
 Remote text only connection (SSH)
 Remote graphical connection (X, always tunneled through SSH)

Connecting to a Windows machine from Linux?
 Use the ‘rdesktop’ command from the Linux command line
▪ e.g.: rdesktop 128.46.184.220

Text only (SSH) – command line
 This is the ‘telnet’ type interface, but encrypted.
 In its base mode it’s text only, which makes it *very* quick over fast connections and quite usable even over dialup connections.
 SSH provides the base to ‘tunnel’ graphical applications when paired with the proper client.
 SFTP is a secure file transfer client, similar to FTP, based on the SSH protocol

Applications include: Cygwin, SecureCRT, SSH Secure
Shell and SecureFX for Windows. Unix systems usually
have the ssh command built in as well.

Graphical – remote connections
 Linux is multi-user by default so a single Linux machine can
support many simultaneous graphical sessions
 Remote connections REQUIRE the connection to be ‘tunneled’ via SSH. From Cygwin on a PC, this can be accomplished using the ‘ssh -X’ command.
 Graphical connections can be VERY network intensive, so beware of attempting complex rendering over slow network links.
 The graphical ‘command line’ window is known as an xterm (X terminal).

Requires a local ‘X’ server to be running. For PCs, an application such as Cygwin or Exceed is required. The ‘server’ application needs to be on the machine you are sitting at. The ‘client’ application is on the remote system.
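As a rough sketch of the tunneled graphical connection, the helper below wraps ‘ssh -X’ so that an xterm started on the remote machine shows up on the local X server. The function name remote_xterm and the user/host arguments are placeholders for illustration only; it assumes a local X server is already running.

```shell
#!/bin/sh
# Hypothetical helper: open an xterm from a remote machine on the local display.
# Assumes a local X server (for example the one shipped with Cygwin) is running.
remote_xterm() {
    user=$1
    host=$2
    # -X asks ssh to tunnel X11 traffic back to the local X server.
    ssh -X "$user@$host" xterm
}

# Example call (not executed here; both names are placeholders):
# remote_xterm myuser somehost
```

Any other graphical program can be started in place of xterm, keeping in mind the warning above about slow network links.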

The shell is the command interpreter. Unlike Windows, Linux offers many different shells with different feature sets.
 tcsh, csh (C shell and C shell variants)
 sh, ash, dash, bash, zsh (Bourne Shell and
Variants)
 ksh (Korn shell)
 Plus a host of restricted shells, customized feature
set shells and other ‘exotic’ shells.

By default, users in ME start with tcsh
(this is because it is the shell I use and it’s easier on me to get people started)

Customization files for tcsh
 .login, .cshrc, .tcshrc, .logout

Customization files for the bash shell
 .profile, .bash_profile, .bash_login, .bash_logout, .bashrc

Customization files for the zsh shell
 .zprofile, .zlogin, .zshrc

Customization files for the ksh shell
 .profile

Customization files for the sh shell
 .profile

Customizations
 All user-controlled customization files should be in the root of your home directory
 Can set the path variable, which defines where to search for programs/applications
 Can set aliases that map commonly used programs to simpler commands
 Can configure your command prompt with history or different formats
 Can execute commands on login or logout
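The customizations listed above might look like the following in a startup file. This sketch uses bash syntax (for ~/.bashrc) so that all examples here stay in one language; the directory and alias names are examples only. In tcsh, the ME default, the equivalent lines use `set path = (...)`, `alias ll 'ls -al'` and `set prompt = ...` instead.

```shell
# Sketch of lines that could go in ~/.bashrc (bash syntax; names are examples).

# Search a personal bin directory first when looking for programs.
PATH="$HOME/bin:$PATH"
export PATH

# Shorter name for a commonly used command.
alias ll='ls -al'

# Prompt showing user, host and current directory.
PS1='\u@\h:\w\$ '
```

Changes take effect in new shells, or in the current one after re-reading the file with `source ~/.bashrc`.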

The Linux shell has a set of predefined variables. These variables are known as your environment variables. They can be set up on the command line or in one of the customization files for the shell.
 If you custom build software, you may need to modify your environment
 If you use certain software packages, you may need to set up variables in your environment
 You may want to use this for customizing your ‘experience’ while working in Linux
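A small sketch of how environment variables behave, in Bourne shell syntax (tcsh uses `setenv NAME value` instead of `export`). The variable names MY_SOLVER_HOME and SCRATCH_DIR are invented for this illustration and do not refer to any real package.

```shell
#!/bin/sh
# A plain shell variable is visible only inside the current shell.
SCRATCH_DIR=/tmp/scratch

# Exporting a variable makes it part of the environment that child
# processes inherit. MY_SOLVER_HOME is a made-up name for illustration.
MY_SOLVER_HOME="$HOME/apps/solver"
export MY_SOLVER_HOME

# A child shell sees the exported variable but not the plain one.
sh -c 'echo "MY_SOLVER_HOME=$MY_SOLVER_HOME SCRATCH_DIR=${SCRATCH_DIR:-unset}"'
```

This is why software that reads its settings from the environment needs the variable exported, not just assigned.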

Linux has three levels of permissions, the user
level, group level and world level.
 Each level offers three options – read (y/n), write (y/n) and execute (y/n)
 It is legal and permissible to be allowed to write to a file but not read it.
 For directories, traversal rights are controlled by the execute bit
 Permissions are set using the chmod command.
▪ chmod g+r <filename>
▪ chmod -R 775 <directory>
Permission blocks
drwxr-xr-x
d       rwx     r-x     r-x
Flag    User    Group   World
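To see the permission block change, the sketch below creates a throwaway file, sets its mode with both the numeric and the symbolic form of chmod, and prints the resulting block. The file name results.dat is just an example.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
touch "$workdir/results.dat"

# Numeric form: user rw- (6), group r-- (4), world --- (0).
chmod 640 "$workdir/results.dat"

# Symbolic form: additionally give the group write access.
chmod g+w "$workdir/results.dat"

# The first ten characters of 'ls -l' are the flag plus the
# user, group and world permission blocks.
perms=$(ls -l "$workdir/results.dat" | cut -c1-10)
echo "$perms"

rm -rf "$workdir"
```

The printed block is -rw-rw----: a regular file, readable and writable by the user and group, with no access for the world.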


Linux shells are applications themselves. They can be ‘stacked’ such that you may use tcsh as the default but can invoke a bash shell as needed on top of the tcsh login shell.
Scripts work the same way. They create their own non-login shells in which to run. You can invoke a script within your current shell (by sourcing it) or have it run in its own environment space.

The basics
▪ (books are written and classes are taught on this subject)
 To run in its own environment space, a shell script must have a ‘sha-bang’ line as the first line of the file.
▪ #!/bin/sh
▪ #!/usr/local/bin/tcsh
 A shell script must be ‘executable’, i.e. have the ‘x’ bit set
 Each shell has varying capabilities for loops, variables, math functionality and external communications. The ‘sh’ shell is the most common, with ‘ksh’ being the most common for programming.
 Python, Perl, PHP etc. are interpreters, not shells. They can do scripting and ‘application’ functions but lack the user level interactivity that normally defines a ‘shell’
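A minimal sketch of the two requirements above (a ‘sha-bang’ first line and the ‘x’ bit): it writes a tiny script into a temporary directory, makes it executable and runs it. The script name hello.sh and its contents are examples only.

```shell
#!/bin/sh
# Create a small script in a temporary directory (file name is an example).
workdir=$(mktemp -d)
cat > "$workdir/hello.sh" <<'EOF'
#!/bin/sh
# The first line tells the system which interpreter runs this file.
count=3
i=1
while [ "$i" -le "$count" ]; do
    echo "step $i of $count"
    i=$((i + 1))
done
EOF

# Set the 'x' bit for the user, then run the script in its own shell.
chmod u+x "$workdir/hello.sh"
output=$("$workdir/hello.sh")
echo "$output"

rm -rf "$workdir"
```

Without the chmod step the shell refuses to run the file directly, although `sh hello.sh` would still work because the interpreter is then named explicitly.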

A simple listing of the fundamental
commands that are useful to know
ls
cd
pwd
cat
rm
rmdir
chmod
man
vi
pico
grep
tail
head
ssh
sftp
logout
more
less
ps
nohup
kill
top
uptime
who

Examples:
Listing all files in the current directory, with permission bits and hidden files
• ls -al
Showing all of the processes on a machine for the user loganm
• ps -ef | grep loganm
Killing a task (PID from above command)
• kill -9 <PID>
e.g.: kill -9 23999
Removing a directory and all files/directories it contains
• rm -rf <directory name>
e.g.: rm -rf test_program1
Making a script executable
• chmod u+x <scriptname>
e.g.: chmod u+x testprog.sh
Starting a process independent of the terminal session
• nohup <script>
e.g.: nohup important_job.ksh
Finding a file recursively
• find . -name "mysql.h" -print
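The nohup and kill examples can be combined into one short sketch: start a job in the background, record its PID, check that it is alive and then stop it. Here `sleep 30` stands in for a real job and the log file is a temporary file. Sending the default signal first and keeping `kill -9` as a last resort is a suggestion, not something the slides prescribe.

```shell
#!/bin/sh
# Start a long-running command in the background, immune to hangups.
# 'sleep 30' stands in for a real job; the log file is a temporary file.
logfile=$(mktemp)
nohup sleep 30 > "$logfile" 2>&1 &
job_pid=$!    # PID of the background job just started
echo "started job with PID $job_pid"

# 'kill -0' sends no signal; it only checks that the process exists.
kill -0 "$job_pid" && echo "job is running"

# Stop the job with the default signal, then collect its exit status.
kill "$job_pid"
wait "$job_pid" 2>/dev/null || true
rm -f "$logfile"
```

Saving `$!` right after starting the job avoids having to search the `ps` output for the PID later.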

The shell environment also lends itself to interpreted scripts. These are languages that offer much greater flexibility and capability than simple shell scripts, without the complexity of compiled languages.
Common languages supported include (but are not limited to):
Perl
Python
PHP
Ruby
Java

We also support the compiled languages.
These are the most versatile and the most
complicated to use. There is full support for
mathematical libraries (BLAS, LAPACK),
socket programming, database connectivity
etc. These use compilers to create the
executable and we support the following
suites.
GNU Suite
Intel Suite
Portland Group

There are several common locations for software to be found. Many are in the default ‘path’ of the machine.
 /usr/bin
 /bin
 /usr/local/bin
 /usr/opt/bin
 Packages are installed in either the Red Hat default (core Linux applications) or in /package (NFS shared disk)
One of the great features of Linux is the ‘sandbox’ capability of applications. Users can build and run fully functional versions of applications within their own user space. Each application is dependent only on itself.
 If you build your own packages, be sure to set the install directory (prefix) with the configure script. The defaults, which point at system directories, will *never* work.


The basic process to build an application is below
1. Install prerequisite software
2. Run the ‘configure’ script with proper local install options
3. Run the ‘make’ command
4. Run the ‘make install’ command to install the software
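The build steps above can be sketched as a command sequence. The helper below only prints the commands rather than running them, since no real source tree is assumed here. The package name and install prefix are hypothetical, and `--prefix` is the usual option of autoconf-style configure scripts; individual packages may differ.

```shell
#!/bin/sh
# Hypothetical install location inside the user's home directory.
PREFIX="$HOME/apps/mypackage"

# Print the command sequence for building into a given prefix.
# (Printed rather than executed, because no real source tree is assumed.)
build_steps() {
    echo "./configure --prefix=$1"
    echo "make"
    echo "make install"
}

build_steps "$PREFIX"
```

After a real install, the programs typically end up under the prefix's bin directory, which then needs to be added to your path.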

The X display server’s built-in display manager is very basic. To enhance usability, several ‘Graphical Display Managers’ (desktop environments) have been developed.
 KDE
 Gnome
 CDE
 Many others
These GDMs offer user-level customizations with significant options. I can go through some examples during the Q&A portion for adding applications and managing menus for Gnome.

Linux offers several means for utilizing multiple CPU cores
to accomplish your tasks
 MPI based code
 Parallel aware applications
 Threads

It should be noted that the efficiency of multiple CPU usage depends on your code/job. Some jobs parallelize easily, others do not. It is up to you, the designer of the job, to determine the best way to run your job.

Parallel computing is not free. Parallel-aware applications require more licenses for parallel jobs. The speed increase is not linear and, in some cases, running in parallel can even be slower.

MPI (Message Passing Interface) is the most common way to do parallel computing with C and Fortran based codes.
 Can be used on the same machine, across Ethernet with multiple machines or through Infiniband
 To use MPI, you must compile your code with the ‘wrapped’ MPI-aware compiler. All of the major compilers have ‘wrapped’ versions available.
 MPICH2 does require configuration of your user environment and the use of a ‘daemon’ application on each node (MPD and MPD rings).


Abaqus and Fluent both offer native, easy-to-use parallel computing support
 Abaqus offers two means for multi-cpu usage
▪ cpus=n command line option
▪ mpi implementation using HP-MPI (not implemented on
ECN)
 Fluent offers a parallel solver (see the Fluent docs
for details)

Be aware – parallel processing requires more
licenses from a limited pool!


Matlab is fundamentally a uni-processor application.
Recently, Matlab added some rudimentary parallel computing functionality through the Parallel Computing Toolbox and ‘worker’ instances of Matlab
 Parallel ‘for loops’
 Batch command parallel execution
 Distributed data for very large arrays

Read the toolbox documentation!

 Threads: think of the fork command but with communication. Different languages provide different options.
 Named pipes: a lower-level way for processes to pass data to each other. Again, implementation is VERY language dependent.
 Socket-level programming: fundamentally, this is an implementation strategy using standard network libraries. MPI can be built on this framework.
 SCSI backplane or fabric backplane: this is the level Infiniband and Myrinet are implemented on. There can be some advantages to gain in latency and bandwidth. This is also the level at which GPU programming (CUDA chips) falls.
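Of the mechanisms listed, named pipes are the easiest to try from the shell. The sketch below has a background process write one line into a FIFO created with mkfifo while the main shell reads it. The pipe name and the message are examples only.

```shell
#!/bin/sh
# Two processes exchanging data through a named pipe (FIFO).
workdir=$(mktemp -d)
mkfifo "$workdir/channel"

# Writer: a background process sends one line into the pipe.
echo "result from worker" > "$workdir/channel" &

# Reader: blocks until the writer has opened the pipe, then reads the line.
read -r message < "$workdir/channel"
wait
echo "received: $message"

rm -rf "$workdir"
```

Each side blocks until the other has opened the pipe, so the two processes synchronize without any extra code.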


We have over 700 processor cores running in ME 242A and over 750 cores in servers in the ME department as a whole. On top of this, we have and support another 30+ Linux workstations, many of which are multi-core machines.

Publicly Available Departmental Linux Machines
 ME – (5) dual-core workstations (steam, wind, water, air, cog), (1) 24-core server (horsepower)
 Herrick – (2) single-core workstations (stave, manon), (4) 8-core servers (herrick, cohen, bernhard, fontaine)
 ECN – (1) 16-core server (riptide) <shared by the College of Engineering>

Research Groups (servers only)
 Sadeghi – 164 cores in 12 servers (coep, tribe, pande, sharif, cater, apfel, navy, seanm, cat, schaeffler, ashtekar, evansville)
 Frankel – 124 cores in 17 nodes – Supremo + Robusta clusters + coffeeexpress (robusta has Infiniband)
 Wassgren – 76 cores in 14 nodes – Camp cluster
 Key – 72 cores in 9 servers (Ubuntu non-networked cluster – Zucrow)
 Shin – 68 cores in 6 servers (femtosim, picosim, lampsim, mansim, microsim, ultrasim)
 Lucht – 56 cores in 7 servers (densitymatrix – densitymatrix7)
 Ramani – 48 cores in 2 servers (shape, kernel)
 Xu – 48 cores in 2 servers (bncws1, bncws2)
 Mongea – 24 cores in 1 server (tfm)
 Siegmund – 16 cores in 2 servers (fracture, asterix)
 Ruan – 16 cores in 4 servers – nanoenergy cluster
 Martini – 8 cores in 1 server (vader)
 Subbarayan – 8 cores in 1 server (magenta)
 Son – 4 cores in 1 server (gadget)
 Fisher – 4 cores in 1 server (edwards)
 Li – 2 cores in 1 server (noise)