Introduction to Linux
Alan Orth
April 17, 2010
ILRI, Nairobi
What is Linux?
… An Operating System
- (just like Windows and Mac!)
- Created in the 1990s by Linus Torvalds
- Microsoft DOS was too limiting
- UNIX was expensive and restrictive
- Linux was born
What is Linux?
Examples of Linux operating systems, called “distributions”:
- Ubuntu (obvious?)
- Debian
- Fedora
- Redhat
- CentOS
- SuSE
Big list at: http://distrowatch.com
Why Linux?
- Linux is “free”
- Free (money)
- Free (freedom... “open source”)
- Peer reviewed
- … makes Linux a good match for science
Why Linux for Bioinformatics?
- Bioinformatics == the application of information
technology and computer science to the field of
molecular biology
- Data sets are getting bigger, so we need more
processing power!
- … computers with that kind of power use Linux
- extremely efficient and stable
- excellent in text processing
Get Your Feet Wet
Most research institutions and universities have
Linux servers.
Use an SSH client like
“putty” to connect to our
Linux server from
Windows:
Server: hpc.ilri.cgiar.org
Username: user1
Password: user1
Getting Familiar
Linux has a graphical environment like Windows,
but the real power lies in its command line mode.
In Linux, you type commands in the “shell.” After
you have entered a command you press Enter to
run the command.
Familiarize yourself with your environment:
whoami – print the name of the current user
id – print information about the current user
who – print a list of other users who are logged in
date – print the current date and time on the server
cal – print a calendar for the current month
echo – print a text string to the screen
Getting Familiar
Linux commands come in various forms. Some are
simple, and can be used by themselves:
whoami
cal
Other times you can add “arguments” to change the
behavior of the command. Arguments are
separated by one or more spaces:
cal 4 2009
Still other commands require arguments (they
don’t make sense to run by themselves).
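A minimal sketch of the three cases, using only standard commands. The `+%Y` format argument and the `||` operator are not covered in these slides; they are included here only to make the example self-explanatory.

```shell
# A command on its own: print the current date and time.
date

# The same command with an argument: print only the four-digit year.
date +%Y

# Arguments are separated by spaces; extra spaces make no difference.
echo one     two   three

# mkdir makes no sense without an argument, so it fails with an error.
# The || part runs only when the command before it fails.
mkdir || echo "mkdir needs a directory name"
```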
Navigating the File System
Files and folders are organized in a hierarchical
fashion. The top of the hierarchy is called the
“root.”
Here is the standard directory structure in Linux:
/
  bin
  etc
  home
    james
    alan
      work
      pics
- the “root” directory is often represented by “/”
- “directory” is a fancy word for “folder”
Navigating the File System
Before we can start solving world hunger, we have
to learn how to move around the file system
comfortably.
Analyze your directory structure using some of the
following commands:
pwd – print the current “working” directory
ls – list the contents of the current directory
cd – change to another directory
mkdir – create a new directory
Navigating the File System
Create some directories and get the hang of moving
around them:
mkdir one
mkdir two
mkdir two/three
cd one
What if you want to move to two now? There is no
two in the current directory (verify with ls). Our
directory structure looks like this:
user1
  one      (You are here)
  two
    three
Navigating the File System
If we want to move to the directory two we have to
first move back up in the directory hierarchy. Once
we move back to user1 we will be able to move into
two.
cd ..
cd two
In Linux “..” means “parent directory.” Once we have
moved up to the parent, we can move into two.
Other special directories include “.” (the current
directory), and “~” (your home directory).
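The moves described above can be tried safely in a throwaway directory. This is only a sketch: the variable names `scratch` and `here` are made up for the example, and `mktemp -d` is used so that nothing in your home directory is touched.

```shell
# Work in an empty temporary directory so your real files are untouched.
# mktemp -d creates the directory; $( ) captures its name in a variable.
scratch=$(mktemp -d)
cd "$scratch"

mkdir one
mkdir two
mkdir two/three

cd one            # down into one
cd ..             # ".." is the parent, so we are back where we started
cd two/three      # down two levels in a single step
pwd               # the path printed ends in two/three

cd ../../one      # up two levels, then down into one
cd .              # "." is the current directory, so nothing changes
here=$(pwd)       # remember where we ended up
echo "$here"

echo ~            # "~" expands to the path of your home directory
```

Paths such as `../../one` combine several steps into one `cd`, which saves typing once the hierarchy is familiar.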
Working With Files & Folders
Commands used for managing files and folders:
cp – copy a file
mv – move a file (this is how you rename)
rm – delete a file
file – print the type of file
more – read a text file
less – read a text file (less is more, but better!)
head – print the beginning of a file
cat – print a file to the screen
Working With Files & Folders
Reference for some basic commands which use or
require additional arguments:
ls -lh            (“long” list of files)
ls -la            (“long” list of all files, including hidden ones)
ls -lh file       (“long” list of file)
mv file file1     (rename file to file1)
cp file filecopy  (copy file to filecopy)
rm file           (delete file)
rm -i file        (delete file, but ask first)
rm -r folder      (delete folder and everything in it)
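A short practice run of these commands, as a sketch in a temporary directory. The file and folder names match the reference above; `touch`, which creates an empty file, is not part of these slides and is used only to have something to work with.

```shell
scratch=$(mktemp -d)      # empty temporary directory to practice in
cd "$scratch"

touch file                # create an empty file called "file"
cp file filecopy          # now there are two: file and filecopy
mv file file1             # rename: file is gone, file1 appears
ls -lh                    # long listing shows file1 and filecopy

mkdir folder
cp filecopy folder/       # copy a file into a directory
rm filecopy               # delete a single file
rm -r folder              # delete the directory and everything in it
ls                        # only file1 is left
```

Note that `rm` does not use a recycle bin: once a file is deleted this way it is gone, which is why `rm -i` is worth the extra keystrokes while learning.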
Working With Files & Folders
Copy files from Windows → Linux?
Use WinSCP!
SCP is the “secure copy” protocol which uses the
same username and password you use with Putty.
You can also download files from the Internet using
the following commands:
wget – “web get” utility for HTTP and FTP
ftp – “file transfer protocol”
links – simple, text-based web browser
Your First File
Use the text editor nano to create a new file named
“hello”:
cd ~
nano hello
Type a simple message and then save the file by
writing it to the disk: ^O (Control-O)
In the world of Linux, the “^” character in key
combinations signifies pressing the Control key.
Exit the text editor by pressing ^X
(Control-X)
Working With Files & Folders
Make a copy of your new text file:
cp hello hello2
cat hello
more hello
Press “q” when you're done to quit more. Do you
see how the two are different?
cat hello hello2
cat simply prints a file to the screen, while more is
used to interactively view a text file one page at a
time. Programs like more are called “pagers.”
I/O Redirection
By default, command line programs print to “stdout”
(standard out).
I/O redirection manipulates the
input/output of Linux programs, allowing you to
capture it or send it somewhere else.
Make a copy of hello (without using cp):
cat hello > hello3
cat hello3
The “>” character performs a “redirect,” taking the
output of the cat command and putting it into the
file hello3.
I/O Redirection
Now try using echo:
echo "My name is Alan" > hello3
cat hello3
What happened to hello3?... It was overwritten!
The “>” operator creates a new file to store the
output, but if the file already exists it will be
overwritten!
Use “>>” to append to a file:
echo "Appended" >> hello3
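The difference between the two operators is easiest to see side by side. A minimal sketch, using a made-up file name (`notes.txt`) in a temporary directory:

```shell
scratch=$(mktemp -d)
cd "$scratch"

echo "first" > notes.txt      # creates notes.txt containing "first"
echo "second" > notes.txt     # overwrites: the file now holds only "second"
echo "third" >> notes.txt     # appends: the file now holds "second" and "third"
cat notes.txt                 # prints "second" and then "third"
```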
I/O Redirection
Another useful technique is to redirect one
program's output into another program's input; this
is done using a “pipe.”
For example: when a command produces a lot of
output, and you want to read the output one page
at a time:
who | more
This is an important technique and will come in
handy when you begin using Linux for text
processing.
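Pipes can be chained, with each command reading the output of the one before it. The sketch below uses `printf` to produce a few made-up lines so that the result does not depend on who is logged in.

```shell
# Three commands chained with pipes.
# printf prints five lines, grep keeps those containing "an",
# and wc -l counts what is left (alan, anna, alan: three lines).
printf 'alan\njames\nanna\nalan\nuser1\n' | grep "an" | wc -l

# head is handy at the end of a pipe when you only want the first lines.
ls / | head -3
```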
Text Processing Basics
See how many times a certain user is logged in.
grep prints lines which match a given string:
who | grep "aorth" | wc -l
wc counts words, but can also count lines if you
pass it the “-l” argument. You can also do the same
thing, using grep's counting argument:
who | grep -c "aorth"
Count the number of sequences in a fasta file:
grep -c ">" Tutorial.fna
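To try the counting trick without Tutorial.fna, the sketch below builds a tiny FASTA file first. The file name `demo.fna` and its contents are invented for illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# A tiny, made-up FASTA file: every sequence starts with a ">" header line.
printf '>seq1 example\nACGT\n>seq2 example\nGGCC\nTTAA\n>seq3 example\nA\n' > demo.fna

grep -c ">" demo.fna      # counts lines containing ">" anywhere
grep -c "^>" demo.fna     # counts only lines that start with ">"
grep "^>" demo.fna        # print the header lines themselves
```

Both counts are 3 here. Anchoring the pattern with `^` is slightly safer, because it ignores any ">" that happens to appear later in a header line.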
More Text Processing
sed, the stream editor, can do powerful things with
text files. One common example is a search and
replace:
echo Hello
echo Hello | sed 's/Hello/Goodbye/'
Delete blank lines from a file using sed:
cat myfile | sed '/^$/d' > mynewfile
tr can also be used to translate text:
echo "HELLO" | tr 'A-Z' 'a-z'
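A few variations on the commands above, as a sketch with made-up input. The `g` flag on the `s` command is not used in the slide's example; it is shown here because it is the usual next step.

```shell
# Without g, sed replaces only the first match on each line.
echo "Hello Hello" | sed 's/Hello/Goodbye/'     # Goodbye Hello
# With g ("global"), every match on the line is replaced.
echo "Hello Hello" | sed 's/Hello/Goodbye/g'    # Goodbye Goodbye

# Delete blank lines: three lines go in, two come out.
printf 'one\n\ntwo\n' | sed '/^$/d'

# tr works on single characters: here, lower case to upper case.
echo "hello" | tr 'a-z' 'A-Z'                   # HELLO
```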
More Text Processing
But sed is the king of text substitution; we can use something
called “regular expressions” to match complex text patterns
and act on them.
In this example, nucleotides were to be replaced with integers,
A = 1, C = 2, G = 3, T = 4:
sed -e 's/\bA\b/1/g' autosomes.txt | less
Here we search for an “A” bordered by a word boundary on both
sides (standing alone), and replace it with a “1”. The trailing “g”
replaces every match on a line, not just the first.
Now add a substitution for the “C”, etc. Eventually you would
want to redirect your output to a new file instead of less:
sed -e 's/\bA\b/1/g' -e 's/\bC\b/2/g' autosomes.txt > autosomes_int.txt
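The full four-letter substitution might look like the sketch below. The layout of autosomes.txt is not shown in these slides, so the input here (`demo.txt`) is invented, and `\b` is assumed to be available, which is the case in GNU sed as found on most Linux systems.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Made-up genotype data; the real autosomes.txt is not shown in the slides.
printf 'SNP1 A C\nSNP2 G T\nSNP3 T A\n' > demo.txt

# \b is a word boundary (a GNU sed extension); g replaces every match on a line.
sed -e 's/\bA\b/1/g' -e 's/\bC\b/2/g' \
    -e 's/\bG\b/3/g' -e 's/\bT\b/4/g' demo.txt > demo_int.txt
cat demo_int.txt
```

Because of the word boundaries, letters inside longer words such as the marker names are left alone; only nucleotides standing by themselves are replaced.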
Shell Scripts
A shell script is a text file with a list of commands
inside. Shell scripts are good for automating tasks
you use often, or running batch jobs.
Enter the following in a new file, script.sh:
echo "Date and time is:"
date
echo "Your current directory is:"
pwd
Run the script like this: sh script.sh
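A script can also be made directly executable. The sketch below writes the same script from the command line using a "here document" (a shell feature not covered in these slides) and then runs it both ways; the `#!/bin/sh` first line is a common convention rather than something the slides require.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Write the script with a "here document": everything up to the line EOF
# goes into script.sh. The quotes around 'EOF' stop the shell from
# expanding anything inside.
cat > script.sh <<'EOF'
#!/bin/sh
# The first line tells Linux which program should run this file.
echo "Date and time is:"
date
echo "Your current directory is:"
pwd
EOF

sh script.sh          # run it by handing the file to sh
chmod +x script.sh    # or mark it executable ...
./script.sh           # ... and run it directly
```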
More Shell Scripts
A more advanced shell script utilizing a loop:
for num in 1 2 3
do
echo "We are on $num..."
done
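Loops become more useful when they run over files instead of fixed numbers. A minimal sketch, with three made-up text files in a temporary directory:

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Three small made-up files to loop over.
printf 'a\nb\n' > x.txt
printf 'c\n' > y.txt
printf 'd\ne\nf\n' > z.txt

# *.txt expands to every matching file name; the loop body runs once per file.
for f in *.txt
do
    # "<" feeds the file to wc, and $( ) captures the resulting count.
    lines=$(wc -l < "$f")
    echo "$f has $lines lines"
done
```

The same pattern works for running an analysis command over every sequence file in a directory.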
Where to Get Help
You can always read the manual! To see the “man
page” for the ls command:
man ls
Web resources:
LinuxQuestions.org
UbuntuForums.org
Me:
a.orth@cgiar.org