Introduction to Linux/Unix

NMR Spectroscopy and Protein Structures
Chem 991A Special Topics in Physical Chemistry
Lectures: MWF 10:30am-11:20am, Rm 733 Hamilton Hall
Class Projects & Exams: Thur. 6:00-8:00pm, Rm 733 Hamilton Hall
COURSE OUTLINE
Instructor: Dr. Robert Powers
Office
Address: 722 HaH
Phone: 472-3039
e-mail:rpowers3@unl.edu
web page: http://bionmr.unl.edu/
Labs
721 HaH
Phone: 472-5316
Office Hours: 11:30am-12:30pm MWF or by Special Appointment.
Required Text:
J. N. S. Evans, Biomolecular NMR Spectroscopy, Oxford University Press
Recommended Text:
M. H. Levitt, Spin Dynamics – Basics of Nuclear Magnetic Resonance, Wiley
Course Outline (cont.)
Some Other Recommended Resources
“NMR of Proteins and Nucleic Acids” Kurt Wuthrich
“Protein NMR Spectroscopy: Principles and Practice”
John Cavanagh, Arthur Palmer, Nicholas J. Skelton, Wayne Fairbrother
“Principles of Protein Structure” G. E. Schulz & R. H. Schirmer
“Introduction to Protein Structure” C. Branden & J. Tooze
“Enzymes: A Practical Introduction to Structure, Mechanism, and Data Analysis” R. Copeland
“Biophysical Chemistry” Parts I to III, C. Cantor & P. Schimmel
“Principles of Nucleic Acid Structure” W. Saenger
Course Outline (cont.)
Some Important Web Sites:
RCSB Protein Data Bank (PDB)
http://www.rcsb.org/pdb/
Database of NMR & X-ray Structures
BMRB (BioMagResBank)
http://www.bmrb.wisc.edu/
Database of NMR resonance assignments
CATH Protein Structure Classification
http://www.cathdb.info/
Classification of All Proteins in PDB
SCOP: Structural Classification of Proteins
http://scop.berkeley.edu
Classification of All Structures into Families, Superfamilies, etc.
PDBeFold
http://www.ebi.ac.uk/msd-srv/ssm/
Compares 3D Structures of Proteins to Determine Structural Similarities of New Structures
NMR Information Server
http://www.spincore.com/nmrinfo/
NMR Groups, News, Links, Conferences, Jobs
NMR Knowledge Base
http://www.spectroscopynow.com/
A lot of useful NMR links
Course Outline (cont.)
Course Work:
Oral Reports (2):       100 pts   (variable due dates)
Ubiquitin Assignment:   100 pts   (due Dec. 13)
Problem Set:            100 pts   (due Dec. 13)
Exam 1:                 100 pts   (Thur., Oct. 3)
Exam 2:                 100 pts   (Thur., Nov. 7)
Final Exam:             200 pts   (Fri., Dec. 20, 10am-12pm)
Total:                  700 pts
Answer keys for the problem sets and exams will be posted on BlackBoard.
Grading scale: A+=95%; A=90%; A-=85%; B+=80%; B=75%; B-=70%;
C+=65%; C=60%; C-=55%; D=50%; D-=45%; F=40%
Course Outline (cont.)
Class Participation
• Reading assignments should be completed prior to each lecture. The required
text will only supplement the lecture material. A vast majority of the material for
the class will come from the lectures.
• You are expected to participate in ALL classroom discussions
Exams
• All exams (except the final) will take place at 6 pm in Hamilton Hall Rm. 733 on
the scheduled date.
• The length of each exam (except the final) will be open-ended. You will have as
much time as needed to complete the exam.
• Bring a TI-89 style calculator or a simpler model, and an approved translator if
required.
• A review session will take place during the normal class time prior to each exam.
• ALWAYS SHOW ALL WORK!!!!
Lecture Topics (Tentative Schedule)
Date               Topic
I. Overview of Protein Structures
Aug 26             Introduction
Aug 28             Linux and Awk
Aug 30             Protein Structures from an NMR Perspective
Sept 4
Sept 6
Sept 9
Sept 11
Sept 13
Sept 16
Sept 18
Sept 20
Sept 23
Sept 25            Protein Modeling Software
Sept 27
Sept 30
Oct 2
Oct 3              EXAM 1
Oct 4              Molecular Mechanics and Dynamics
Oct 7
Oct 9              Comparison of X-ray and NMR Structures
Oct 11
Oct 14             Isotope Labeling of Proteins
Oct 16
II. NMR Assignment Problem
Oct 18             NMR Software
Oct 21 to Oct 22   Fall Break
Chapter: 4; 3.9; 3.5-3.9; 4.2.2 – 4.2.3; 2; 3.9
Lecture Topics (continued)
Date               Topic
Oct 23
Oct 25             2D NMR
Oct 28
Oct 30             3D NMR
Nov 1              4D NMR
III. NMR Structure Determination
Nov 4              NOEs
Nov 6
Nov 7              EXAM 2
Nov 8
Nov 11             Chemical shifts, Coupling constants, Amide Exchanges
Nov 13
Nov 15             Stereospecific assignments, RDCs
Nov 18             Quality of NMR Structures
Nov 20
IV. Protein Dynamics
Nov 22             T1, T2, NOE & S2
Nov 25
Nov 27 to Nov 29   Thanksgiving
V. Protein-Ligand Structures
Dec 2              SAR by NMR, Other 1D and 2D Methods
Dec 4              Transfer NOE
Dec 6              Filtered & edited NMR experiments
Dec 9              Metabolomics
Dec 11
Dec 13             Problem Set & Ubiquitin Assignment due
Dec 20             FINAL EXAM
Chapter: 2.1; 2.2; 2.3; 3; 3.1; 4.1.4, 3.2, 4.1.3, 5.2; 4.1.2; 3.10; 1.3, 1.4; 6.3; 6.5; 6.7
ORAL PRESENTATION OF STRUCTURE PAPERS
– Two 20 minute Oral Presentations
• Thursday Evenings at 6pm in HaH 733
• Audience Participation is Expected (like a journal club)
• Presentation Dates Randomly Assigned (see syllabus page 4)
• 50 points per presentation – total of 100 points
– Paper of Your Choice
• A Protein Structure Should be a Major Focus of the Paper
• The Paper Topic Should be of General Interest and of Significant Impact
• Send an Electronic Copy of the Paper to the Class Prior to Your
Presentation
– Some Recommended Sources
• Nature Structural Biology, Science, Nature, Cell, Molecular Cell,
Structure, Protein Science, PNAS, Journal of Molecular Biology,
Biochemistry, and Journal of Biomolecular NMR.
• The paper may cover a protein structure or a protein-complex (small
molecule, protein, DNA, RNA, etc).
ORAL PRESENTATION OF STRUCTURE PAPERS
– Presentation Goal
• Present a Clear Understanding of the Goals and Findings of the
Paper to the Class
• Why was the particular protein the target of the paper?
• How was the structure determined? Were there any challenging
issues?
• What structure was determined for the protein (fold?)
• What are some interesting features of the structure (dynamics)?
• Are there any unique structural differences compared to other
members of the family?
• What structural features are important to function?
• How was the structure used to support or refute the biological focus
of the paper?
• Does the structure actually support the conclusion or did the
authors over-interpret the data?
• Does the data/structure suggest other equally plausible
conclusions?
ORAL PRESENTATION OF STRUCTURE PAPERS
– Grading
• Combination of My Assessment and the Other Students’ Assessment
• Each Student will be Limited to Giving Approximately 30% As, 55% Bs, and 15% Cs
• Default Grade is a B; an A or C will Require Justification
• All the assessments will be averaged together to determine the number of points
Average Assessed Grade: A: 50pts, B+: 45pts, B: 40pts, B-: 35pts, C+: 30pts, C: 25pts
– Assessing the Presenter
• How well did the presenter understand the material?
• How clearly did the presenter discuss the material?
• Was the chosen paper of general interest and biologically significant?
• Was the structure relevant and important to the paper?
• How well did the presenter answer questions?
• Did the paper lead to an interesting discussion?
ORAL PRESENTATION OF STRUCTURE PAPERS
Tentative Schedule
Oral Presentation Schedule
9/19
9/26
9/5
9/12
Jonathan Catazaro
Jeffrey Jeppson
Mark Carter
Jessica Periago
10/17
10/24
Mark Carter
Jessica Periago
12/12
Bradley Worley
Shulei Lei
10/10
Teklab Gebregiworgis
Darrell Marshall
Jonathan Catazaro
Jeffrey Jeppson
11/14
11/21
12/5
Bradley Worley
Shulei Lei
Teklab Gebregiworgis
Darrell Marshall
Course Assignments
– Two Separate Graded Assignments
• A standard problem set – included at the end of the syllabus
• An NMR assignment problem
• Due date for both assignments is the beginning of class on Fri. Dec. 13
• Late Problem Sets will NOT be accepted
– Grading - General
• Each Assignment is worth 100 pts. (200 pts. total)
• Show ALL work to receive full credit
• You must submit your own set of answers
– Some Additional Considerations
• Please start both assignments NOW!
• Please work together
• Please visit my office hours for assistance
Course Assignments
– The Standard Problem Set Has Two Sections
• Writing simple AWK programs to manipulate files
• Using Xplor and other software to analyze protein structures
• Due date for both assignments is the beginning of class on Fri. Dec. 13
• Late Problem Sets will NOT be accepted
– Grading – Standard Problem Set
• No unique answer for the programming section; either it works or it doesn’t
• E-mail me your scripts and I will run them
• If it works, full credit; if not, zero points
• The analysis of the protein structures section will have defined answers
• Please submit the answers to the protein structure section on the due date
Course Assignments
– NMR Assignment Problem Set
• Determine the backbone NMR Assignments for Ubiquitin
Sequence:
1       10         20         30         40
MQIFVKTLTG KTITLEVEPS DTIENVKAKI QDKEGIPPDQ
        50         60         70     76
QRLIFAGKQL EDGRTLSDYN IQKESTLHLV LRLRGG
• The completed project should include a cover page that summarizes your
assignments using the following template:
Res   HN   15N   Ca   Cb   Ca(i-1)   Cb(i-1)   CO(i-1)
M1
Q2
I3
.
.
.
G76
• Include a peak-pick list from the six spectra used to assign the protein
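The residue labels for the cover-page template (M1, Q2, ... G76) can be generated from the sequence rather than typed by hand. The following is only a sketch and not part of the assignment; the variable names seq and template are made up for this illustration.

```shell
# Ubiquitin sequence from the slide, with the spaces between blocks removed
seq="MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"

# Print the template header, then one label per residue (letter + position)
template=$(echo "$seq" | awk '{
    print "Res HN 15N Ca Cb Ca(i-1) Cb(i-1) CO(i-1)"
    for (i = 1; i <= length($0); i++) print substr($0, i, 1) i
}')
echo "$template"
```

The remaining columns are left blank to be filled in from the peak-pick lists.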
Course Assignments
– NMR Assignment Problem Set
• You will ALL have access to a standard dataset of NMR spectra:
• 2D 1H-15N HSQC, 2D 1H-13C HSQC, 3D HNCO, 3D HNCA, 3D CBCANH, and
3D CBCACONH
• Data will be available on the computers in the Research Instrument NMR
Facility (HaH 832)
• All the necessary software for the processing and analyzing of the data
will also be available on these computers
– Goal
• Assign the minimal set of backbone resonances (HN, 15N, 13CO, Ca, Cb)
• Provide practical experience with using NMR data to assign a protein
• Complete as much of the backbone assignments as possible
– Grading – NMR Assignment Problem Set
• Based on how complete the assignments are
• Scaled based on overall success of the class
Introduction to Linux/Unix
Linux: A UNIX-like operating system developed as free and open-source software
The traditional user interface is a (sometimes cumbersome) command line in a shell (window)
“Linux is for Adults” – Stephan Grzesiek
There are a number of flavors (distributions) of Linux with different graphical
user interfaces (GUIs) or desktop interfaces (which attempt to be Mac- or Windows-like)
- Debian, Fedora, Ubuntu, Mageia, Linux Mint, etc.
Similarly, there are a number of free and commercial look-alikes of PC software
(WORD, EXCEL, etc.). Linux was initially expected to replace the Windows PC
Very popular in academia because it is free and open for development
Introduction to Linux/Unix
Typical Linux Shell Environment
Simple “command line” execution of
programs or editing of files
Typical Linux “Windows” Environment
Mimics PC/Mac desktop GUI environment
Introduction to Linux/Unix
Connecting from a PC by a Terminal
Emulation Software (PuTTY)
command line environment
Connecting from a PC by Samba
PC/Mac folder environment
Introduction to Linux/Unix
– Graphical User Interface (GUI) or PC/MAC Desktop Environment
• You can use the Desktop like a PC, but it can be cumbersome
- Minimal (if any) standards; everything in the environment needs to be configured
- Downside of open-source (free) software – many contributors with few or no managers
– More common to work in a shell using the command line
• Primitive (“Old School”)
- Minimal mouse functions, pull-down menus or other common features we are accustomed to
• Need to memorize commands and options (“flags”)
• Need to open a Terminal, Window or Shell
- Right click the mouse and select “open terminal”
Introduction to Linux/Unix
– Three Common Linux Commands: pwd, ls and cd
• pwd – identifies the current path or directory
• ls – list the files and folders in the current directory
• cd path - move to the defined path (change directory)
‒ cd .. (move up one directory),
‒ cd ../.. ( move up two directories)
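A short session showing the three commands together. This is a sketch run in a throwaway directory; the directory names project and data are invented for the example.

```shell
# Work in a scratch directory so nothing real is touched
top=$(mktemp -d)
mkdir -p "$top/project/data"

cd "$top/project/data"        # move to a defined path
pwd                           # print the current directory
cd ..                         # up one directory: now in project
one_up=$(basename "$(pwd)")
ls                            # lists the folder "data"
cd data
cd ../..                      # up two directories: back in the scratch directory
two_up=$(pwd)
```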
Introduction to Linux/Unix
‒ For a Complete List of Linux Commands and Explanations see
• http://linuxcommand.org/
• Or the book “Linux in a Nutshell”
‒ Some Other Common Commands
• echo “text” – display or print text
• exit – close a terminal
• clear – clear all text in a terminal
• mkdir – make a new directory
• rm – remove/delete a file
• mv – move (or rename) files
• cp – copy files
• ps – lists all active user programs and displays a PID (process identification number)
• kill pid – will kill (stop) the process with the listed pid number
• man command – will display the manual for the listed command
• cat file – display the contents of a file (also used to combine or concatenate multiple files)
• vi file – will open file with a primitive text editor
• chmod mode file – will change or set the permissions for file as defined by mode
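Several of these commands can be tried safely in a scratch directory. The sketch below uses invented file names (notes, a.txt, and so on) and also uses > to send output to a file.

```shell
work=$(mktemp -d)                # scratch directory
cd "$work"

mkdir notes                      # make a new directory
echo "first line" > notes/a.txt  # echo prints text; > sends it to a file
cp notes/a.txt notes/b.txt       # copy a file
mv notes/b.txt notes/c.txt       # move (rename) a file
cat notes/a.txt notes/c.txt > notes/both.txt   # concatenate two files
rm notes/c.txt                   # delete a file
listing=$(ls notes)              # a.txt and both.txt remain
cat notes/both.txt               # display the combined file
```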
Introduction to Linux/Unix
‒ It Gets More Complicated!
‒ A number of commands have a range of options that are implemented
on the command line with a “flag”
• ls -l – lists files and folders with associated permissions
• rm -R – remove/delete a folder and its contents
• mv -i – prompt before overwriting an existing file with the same name
• cp -n – do not overwrite an existing file with the same name
• cp -u – only overwrite an older file with the same name
• ps aux – lists the detailed status of every process on the system with the
name of the user
• chmod 755 file – change file’s permission such that file's owner may read,
write, and execute the file. All others may only read and execute the file.
‒ Multiple flags can be used simultaneously
• Again, man pages, Linux web sites and reference books provide more details
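A few of these flags in action, as a sketch with invented file names. Flags are typed with a plain hyphen. Some versions of cp report a skipped copy as an error, so the example ignores the exit status of cp -n.

```shell
work=$(mktemp -d)
cd "$work"
echo "old" > target.txt
echo "new" > source.txt

cp -n source.txt target.txt || true   # -n: target.txt exists, so it is left alone
kept=$(cat target.txt)                # still the original contents

mkdir -p old/dir
rm -R old                             # -R: remove a folder and everything in it

ls -l -a                              # two flags given separately...
ls -la                                # ...or combined into one
```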
Introduction to Linux/Unix
‒ One More Very Useful Command
‒ sort
• Quickly re-order or sort the rows of a tabular file with any number of columns
sort -rn -k n filename > newfilename
- -k n – the number of the column that will be sorted
- -r – sort in reverse order
- -n – sort based on the numeric value of the string
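As a sketch of sort on a small two-column table: with the standard sort command the column is selected with the -k flag, and -k 2,2 restricts the sort key to column 2 only. The file names peaks.txt and sorted.txt are invented for the example.

```shell
work=$(mktemp -d)
cd "$work"
# Three rows: peak ID and NH chemical shift
printf '1.00 9.35\n2.00 9.10\n3.00 9.73\n' > peaks.txt

# Sort on column 2, numerically, largest value first
sort -rn -k 2,2 peaks.txt > sorted.txt
cat sorted.txt
```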
Permissions
‒ You can’t read, write, edit or execute a file without permission!
[Figure: annotated ls -l listing. Labels: Directory, File Owner, Number of files in Directory, Size of file in kilobytes, Group the Owner belongs to, Filename, File Date or Time Stamp]
Permissions
‒ Reading and understanding permissions
Permissions
Permissions
‒ Where did the 755 come from in the chmod command?
Think of the permission settings as a series of bits :
rwx rwx rwx = 111 111 111
rw- rw- rw- = 110 110 110
rwx --- --- = 111 000 000
and so on...
rwx = 111 in binary = 7
rw- = 110 in binary = 6
r-x = 101 in binary = 5
r-- = 100 in binary = 4
-wx = 011 in binary = 3
-w- = 010 in binary = 2
--x = 001 in binary = 1
--- = 000 in binary = 0
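The conversion from a permission string to the number given to chmod can be written out as a small helper. This is a sketch only: to_octal is a made-up function name, and it assumes a plain 9-character string such as rwxr-xr-x (read counts 4, write 2, execute 1 in each group of three).

```shell
# Convert a 9-character permission string to its three-digit octal form
to_octal() {
    echo "$1" | awk '{
        out = ""
        for (g = 0; g < 3; g++) {          # owner, group, others
            v = 0
            if (substr($0, 3*g + 1, 1) == "r") v += 4
            if (substr($0, 3*g + 2, 1) == "w") v += 2
            if (substr($0, 3*g + 3, 1) == "x") v += 1
            out = out v
        }
        print out
    }'
}

mode=$(to_octal "rwxr-xr-x")
f=$(mktemp)                           # scratch file
chmod "$mode" "$f"                    # apply the computed mode
perms=$(ls -l "$f" | cut -c1-10)      # first 10 characters of the ls -l line
echo "$mode $perms"
```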
Pipes and Redirection
| (pipe) – passes the output of one Linux command to the input of a
second command
• Example: ls | wc (wc counts the number of lines, words and characters)
• Not limited to just one pipe; multiple pipes can be strung together
>, < – redirection of files
• command > filename – output of command (or program) is sent to a file
called filename instead of being displayed on the screen
- Example: ls > file_list
• command < filename – the file filename is the input to the command or
program
- Example: xplor < psf.inp
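Pipes and redirection can be combined freely. The sketch below uses three empty, invented .pdb files; tr -d ' ' only strips the padding that some versions of wc add.

```shell
work=$(mktemp -d)
cd "$work"
touch a.pdb b.pdb c.pdb

count=$(ls | wc -l | tr -d ' ')           # two pipes in a row: count the files
last=$(ls | sort -r | head -n 1)          # reverse the listing, keep the first line

ls > file_list                            # listing goes to a file, not the screen
lines=$(wc -l < file_list | tr -d ' ')    # file_list is the input to wc
echo "$count $last $lines"
```

Note that file_list lists itself, because the shell creates the file before ls runs.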
Background Calculations
‒ For long calculations don’t want the process directly associated with
the window or shell
• Window must remain open and active during the calculation
• Window is “locked” until the program is finished
• Calculations will be stopped if the window is closed
• An intense calculation can overwhelm the shell environment, causing the
window to crash or even slowing down your computer
• Output displayed in the window can be lost, lock the window or crash the computer
‒ Instead, submit your “job” to the “background”
• Lowers the calculation’s priority to access the CPU
• Any interactive calculation has the highest priority
• Example: background - xplor < psf.inp > psf.out &
interactive - xplor < psf.inp
‒ Use ps command to monitor status of background jobs
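A single & at the end of the command line submits the job to the background. In the sketch below, a one-second sleep stands in for a long xplor run (an assumption for illustration), and job.out is an invented file name.

```shell
work=$(mktemp -d)
cd "$work"

( sleep 1; echo done ) > job.out &    # & puts the job in the background
pid=$!                                # PID of the background job

ps -p "$pid" || true                  # show the status line for this PID
running=no
if kill -0 "$pid" 2>/dev/null; then   # kill -0 only tests that the process exists
    running=yes
fi

wait "$pid"                           # block until the job finishes
result=$(cat job.out)
echo "$running $result"
```

The shell prompt is free for other commands between the & line and the wait.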
vi – Primitive Text Editor
‒ Opens any text based file for reading, editing and writing
• Only simple text or ASCII files can be edited with vi
• You will see gibberish with *.doc, *.pdf, etc.
‒ Like Linux, vi uses a number of simple command line functions
• A number of the functions require a key combination (ctrl key + another key)
• For a Complete List of vi Commands and Explanations see
- “The Vi Lovers Home Page” http://thomer.com/vi/vi.html
- Or “Learning the vi Editor” by L. Lamb, O’Reilly & Associates, Inc
‒ vi filename
• If filename exists, vi will open the file for editing
• If filename doesn’t exist, vi will create the file for editing
vi – Primitive Text Editor
[Figure: vi screen. Labels: Cursor; Editing Mode; Cursor Location (Line number, Column number); What part of the text is shown (All, Top, Bot, Percentage)]
vi – Primitive Text Editor
‒ Working with files
• :q – quits only if no changes to the file have been made
• :q! – force vi to quit without saving any changes
• :wq filename – quits and writes the contents of the file to a new file named
filename
• :wq! – quits and writes the file to the current filename
• :r filename – inserts the contents of the file filename into the current file at
the cursor location
‒ Moving around the file
• :number – jumps to the specified line number in the text
• G or :$ – jump to the last line
• Ctrl-g – gives the current line number
• Ctrl-f or Ctrl-d – move forward (a full screen or half a screen)
• Ctrl-b or Ctrl-u – move back (a full screen or half a screen)
• Arrow Keys – allow you to move around the file and position the cursor
vi – Primitive Text Editor
‒ Adding to a file
• Enter Key – adds a blank line at the cursor position
• Esc key – exits or leaves the active vi function
• i or a – enters insert mode, allows text to be typed into the file at the
location of the cursor
• R – enters replace mode, allows text to be typed into the file at the location
of the cursor replacing any existing text
‒ Deleting
• dd – deletes the line at the position of the cursor
• dw – deletes the word at the position of the cursor
• x – deletes the character under the cursor
• r – replace the character under the cursor
• D – deletes from the cursor position to the end of the line
• u – undo the last edit or change
• U – undo all the edits on a single line
Place a number in front of a command and the command will be executed that
many times
vi – Primitive Text Editor
‒ Copying and Pasting Text
• number yy– yanks (copy) the specified number of lines (starting at the
cursor)
• p– put (pastes) the previously yanked (copied) lines in the text after the
cursor
• J – joins two lines at the position of the cursor
‒ Global Search and Replace
• /text – moves the cursor to the next location of text in the file
• n – moves to the next occurrence of text in the file
• :%s/search_string/replacement_string/g – globally replace search_string
with replacement_string
Awk/Nawk – Primitive (but Powerful)
Programming Language
‒ Interpreted (not compiled) language
• C-like
• A file containing the software code needs to be passed to Awk
awk -f awk_script.awk infilename > outfilename
- awk_script.awk – the Awk program
- infilename – the file used by the Awk program
- outfilename – the output generated by the Awk program
‒ Awk significantly simplifies writing a quick program
• Automatically handles opening and reading files and inputting data into
standard variables
• Structured to read a file composed of rows and columns
• IMPORTANT – sequentially reads each row as it executes the program
- If 10 rows, the main section of the program gets executed 10 times – major source of confusion
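The row-by-row behavior is easy to see with a tiny input file. In this sketch the file names match the command line above, and the three-row input is invented.

```shell
work=$(mktemp -d)
cd "$work"
printf 'a 1\nb 2\nc 3\n' > infilename      # a 3-row input file

# The Awk program lives in its own file
cat > awk_script.awk <<'EOF'
{ print "read:", $0 }
EOF

awk -f awk_script.awk infilename > outfilename
cat outfilename
```

The single print statement runs three times, once for each row of the input.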
Awk Program Structure
BEGIN
INPUT
logic statements (if, and, or, not)
arithmetic
looping
table arrays
printing
END
OUTPUT
BEGIN/END
• All of the commands in the section defined by BEGIN occur BEFORE the file is
read
• All of the commands in the section defined by END occur AFTER the file is read
• To comment out a line of text in a script, add “#” before the text
- The line is skipped by Awk
# This script politely introduces itself
BEGIN {
    print "Hello, world"
}
{
    # Main – Does Nothing, but still reads file
}
END {
    print "Bye, world"
}
BEGIN/END
• The BEGIN section is commonly used to set or define the value of variables
used by the MAIN program
• Also used to read input data or information from other files
• The END section is commonly used to print out the results of the Awk Program
BEGIN {
    CAmax[0] = "65.52"
    CAmin[0] = "43.00"
    CBmax[0] = "38.70"
    CBmin[0] = "0.00"
    Res[0] = "A"
    n = 0
    # read the reference file ref.pck before the input file is touched
    while ((getline < "ref.pck") > 0) {
        n++
        CAref[n] = $1
        CBref[n] = $2
    }
}
{
    # Main – Does Nothing, but still reads file
}
END {
    for (i = 1; i <= n; i++) {
        print CAref[i], CBref[i]
    }
}
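Reading a second file inside BEGIN can be tested on its own. This is a sketch: the contents of ref.pck and input.txt are invented, and the file name given to getline must be a quoted string.

```shell
work=$(mktemp -d)
cd "$work"
printf '53.19 40.06\n59.42 31.90\n' > ref.pck    # hypothetical reference values
printf 'x\n' > input.txt                         # main input, ignored here

out=$(awk 'BEGIN {
    n = 0
    # getline returns 1 for each line read, 0 at end of file
    while ((getline < "ref.pck") > 0) {
        n++
        CAref[n] = $1
        CBref[n] = $2
    }
    close("ref.pck")
}
{ }
END {
    for (i = 1; i <= n; i++) print CAref[i], CBref[i]
}' input.txt)
echo "$out"
```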
MAIN
• The various functions of Awk perform the tasks you want as the
program sequentially reads the input file
Consider the following input file:
PkID  NH    N15     CA     CB     CAi    CBi    COi
1.00  9.35  126.75  53.19  40.06  63.53  69.87  172.90
2.00  9.10  126.69  59.42  31.90  52.92  43.03  174.94
3.00  9.73  126.68  60.73  38.11  54.64  31.38  171.92
4.00  7.80  126.57  57.28  33.99  56.10  30.75  172.60
5.00  8.84  126.52  58.35  28.58  53.25  40.03  173.12
6.00  8.14  125.85  65.85  31.89  53.03  41.07  173.12
7.00  9.01  125.35  62.57  42.15  52.70  41.84  171.99
8.00  8.15  125.24  54.86  40.69  55.79  30.35  171.17
$1    $2    $3      $4     $5     $6     $7     $8
Awk sequentially reads each row, redefining the value of each standard variable ($1 to $8)
- NF is set to the number of fields (columns), 8 in this example
- NR is set to the number of rows read so far; it reaches 9 in this example (header + 8 rows)
- $0 is a string corresponding to the entire row
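The standard variables can be printed directly to see how they change from row to row. A sketch using the header and first two rows of the input file shown in this section; peaks.txt is an invented file name.

```shell
work=$(mktemp -d)
cd "$work"
cat > peaks.txt <<'EOF'
PkID NH N15 CA CB CAi CBi COi
1.00 9.35 126.75 53.19 40.06 63.53 69.87 172.90
2.00 9.10 126.69 59.42 31.90 52.92 43.03 174.94
EOF

# For every row: row number, number of columns, first and last column
out=$(awk '{ print NR, NF, $1, $8 }
           END { print "rows read:", NR }' peaks.txt)
echo "$out"
```

NR keeps its final value in the END section, so it can be used there as the total row count.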
MAIN
• The primary Awk functions can be grouped
into 5 categories
– Logic statements
– Arithmetic
– Looping
– Arrays
– Printing
MAIN
• As the file is being read in, you can now write instructions to test,
change or manipulate the original data
• You can define your own variable names (names are case-sensitive)
• You can do any number of arithmetic functions
– Basic math +, -, *, /, ^
– General functions – cos(x), exp(x), sqrt(x), etc.
BEGIN {
    CAmax[0] = "65.52"
    CAmin[0] = "43.00"
    CBmax[0] = "38.70"
    CBmin[0] = "0.00"
    Res[0] = "A"
}
{
    PkID = $1
    NH = $2
    N15 = $3
    CAiatom = $4
    CBiatom = $5
    CAatom = $6
    CBatom = $7
    COi = $8
    CAup = sqrt(CAmax[0] - CAiatom)
    CBdn = CBiatom / CBmax[0]    # CBmin[0] is 0.00; dividing by it would stop the program
    CO2 = COi^2
}
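The arithmetic operators can be checked on a single row of the peak table. This sketch uses its own variable names (diff, ratio, root, sq), chosen only for the illustration.

```shell
# One row of the peak table: PkID NH N15 CA CB CAi CBi COi
row="1.00 9.35 126.75 53.19 40.06 63.53 69.87 172.90"

out=$(echo "$row" | awk '{
    CAmax = 65.52
    diff  = CAmax - $4      # subtraction
    ratio = $5 / $4         # division
    root  = sqrt($4)        # general function
    sq    = $2 ^ 2          # exponent
    printf("%.2f %.3f %.3f %.4f\n", diff, ratio, root, sq)
}')
echo "$out"
```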
Functions
• Logic statements
– if (logical test of a parameter/variable)
– Probably most important logic command
– General call structure is
• if (statement to test) {action}
• Example: if ($1 == "HAPPY")
– Reads “if column 1 equals HAPPY”
– If this is true then we do something
– else
‒ Used to perform an action when the if statement is false
‒ else {action}
‒ Example
BEGIN {
    $1 = "HAPPY"
    if ($1 == "HAPPY") print "I am HAPPY"
    else print "I am SAD"
}
Functions
• Logic statements
– ! (not) true if not a match
• Example: if ($1 != "HAPPY")
• True if $1 is NOT EQUAL to "HAPPY"
– && (and) true only if both conditions are met
• Example: if ($1 > $2 && $1 > $3)
• True if $1 is larger than BOTH $2 and $3
– || (or) true if at least one of multiple conditions is met
• Example: if ($1 > $2 || $1 > $3)
• $1 only needs to be larger than either $2 or $3 for the statement
to be true
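The logic operators can be compared side by side on a few rows of numbers. A sketch with invented input values:

```shell
out=$(printf '5 3 4\n2 3 1\n1 3 4\n' | awk '{
    if ($1 > $2 && $1 > $3)      print $1, "is larger than both"
    else if ($1 > $2 || $1 > $3) print $1, "is larger than one"
    else                         print $1, "is larger than neither"
    if ($1 != 5) print "  (and", $1, "is not 5)"
}')
echo "$out"
```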
Functions
• Loops – allow you to repeat a set of instructions until a condition is met
• Two loop functions
– for
– while
• Major source of problems – infinite loop
– The exit condition is never met
BEGIN {
    n = 0
    while ((getline < "ref.pck") > 0) {
        n++
        CAref[n] = $1
        CBref[n] = $2
    }
}
{
    for (i = 1; i <= NF; i++) {
        if ($i >= 54.0 && $i <= 55.0) count++
    }
}
END {
    for (i = 1; i <= n; i++) {
        print CAref[i], CBref[i]
    }
}
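Both loop types can be tried on one row of numbers. A sketch with invented values; awk keywords such as for and while must be lowercase.

```shell
row="53.19 54.30 54.90 60.73 55.00"

out=$(echo "$row" | awk '{
    # for loop: visit every field and count those between 54.0 and 55.0
    count = 0
    for (i = 1; i <= NF; i++) {
        if ($i >= 54.0 && $i <= 55.0) count++
    }
    # while loop: the counter is managed by hand
    i = 1
    total = 0
    while (i <= NF) {
        total += $i
        i++          # without this line the loop would never end
    }
    print count, total
}')
echo "$out"
```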
Functions
• Arrays – allow you to assign multiple values to a single variable
• Effectively allows you to sort or group information
• Two types of Arrays
– 1D: CA[0]
– 2D: CB[0,0]
BEGIN {
    i = 1
}
{
    PkID[i] = $1
    NH[i] = $2
    N15[i] = $3
    CAiatom[i] = $4
    CBiatom[i] = $5
    CAatom[i] = $6
    CBatom[i] = $7
    COi[i] = $8
    i++
}
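Once the rows are stored in arrays they can be revisited in any order in the END section. A sketch with three invented rows, using NR as the array index and a 2D array called table (a made-up name):

```shell
out=$(printf '1.00 9.35\n2.00 9.10\n3.00 9.73\n' | awk '{
    PkID[NR] = $1            # 1D arrays indexed by row number
    NH[NR]   = $2
    table[NR, 1] = $1        # 2D array: row, column
    table[NR, 2] = $2
}
END {
    # print the rows in reverse order
    for (i = NR; i >= 1; i--) print PkID[i], NH[i]
    print table[2, 2]
}')
echo "$out"
```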
Functions
• printf – primary mechanism of reporting the results of the Awk
program to the user
• Extremely flexible; a large number of options are available to format output
– Can do calculations within a print statement
– Can be frustrating to get it right
• Two types of print statements
– print: no formatting, just prints the value of the variable
– printf: full range of formats available
BEGIN {
    state = "HAPPY"
}
{
    for (i = 1; i <= 10; i++) {
        print i
        print i*i
        printf("%s\n", state)
    }
}
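The difference between the two statements shows up as soon as a non-integer value is printed. A minimal sketch:

```shell
out=$(awk 'BEGIN {
    x = 1 / 3
    print x                  # print: default formatting
    print x * 3              # a calculation inside print
    printf("%.2f\n", x)      # printf: you choose the format
}')
echo "$out"
```

Note that print adds a new line automatically, while printf only prints a new line where \n appears in the format.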
Functions
‒ Examples of different formatting options with printf
• Each variable needs a type definition:
- %d – decimal integer
- %s – string
- %f – floating point
- %e – floating point with scientific notation
• Formatting is “literal”
printf ("%s%s\n", $1,$2)
– print all the characters in column 1 (%s) and column 2 (%s)
– \n print new line
– no spacing
» $1 = HAPPY and $2 = SAD the output would be HAPPYSAD
Functions
‒ Examples of different formatting options with printf
• Spacing, Tabs and Justification
- The number of spaces between type definitions will be printed
- \t – Tab, using system defined tab locations
- \n – print new line
- Can use any number or combination of tabs, spaces and new lines
- Default printing is right justified
- For left justification, place a - in front of the type classification (e.g. %-10s)
printf ("%s %s\n", $1,$2)
– single space
» $1 = HAPPY and $2 = SAD the output would be HAPPY SAD
printf ("%s     \t%s\n\n", $1,$2)
– five spaces then tab
» $1 = HAPPY and $2 = SAD the output would be HAPPY, five spaces, a tab, then SAD
» Followed by two new lines
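The spacing and justification rules can be checked by putting brackets around a field, which makes the padding visible. A sketch using the HAPPY and SAD strings as input:

```shell
out=$(echo "HAPPY SAD" | awk '{
    printf("%s%s\n", $1, $2)      # no spacing
    printf("%s %s\n", $1, $2)     # one literal space
    printf("%s\t%s\n", $1, $2)    # tab
    printf("[%10s]\n", $1)        # right justified in 10 columns
    printf("[%-10s]\n", $1)       # left justified in 10 columns
}')
echo "$out"
```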
Functions
‒ Examples of different formatting options with printf
• Precision Modifier
- “Fine tunes” how the variable is printed
- Defines both spacing and number of characters or significant figures printed
- Simply place a number in front of the type classification (e.g. %5.3f)
printf ("%10s%5s\n", $1,$2)
– 10 spaces for the first string and 5 spaces for the second string
– Spaces include the number of characters in the string
» $1 = HAPPY and $2 = SAD the output would be
     HAPPY  SAD
» 5 spaces in front of HAPPY (5 spaces + 5 characters in HAPPY = 10)
» 2 spaces in front of SAD (2 spaces + 3 characters in SAD = 5)
» OR printing of $1 will end on column 10 and printing of $2 will end on column 15
printf ("%f %5.3f\n", $1,$1)
» $1 = 1/3, the output would be 0.333333 0.333
» %f – printed with the default of 6 digits after the decimal point
» 5 in %5.3f indicates a minimum of 5 characters are printed (including the decimal point)
» 3 in %5.3f indicates 3 digits are printed to the right of the decimal point
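Width and precision can be verified with the value 1/3 and the two strings. A sketch; the brackets are only there to make the field width visible.

```shell
out=$(awk 'BEGIN {
    x = 1 / 3
    printf("%f %5.3f\n", x, x)             # default precision, then 3 decimals
    printf("[%8.3f]\n", x)                 # width 8: padded on the left
    printf("%10s%5s\n", "HAPPY", "SAD")    # strings end on columns 10 and 15
    printf("%e\n", 12345.678)              # scientific notation
}')
echo "$out"
```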
Functions
‒ Examples of different formatting options with printf
• Printing is “literal”
- Anything within the quotes is printed
printf ("%s HELLO %s\n", $1,$2)
» $1 = HAPPY and $2 = SAD the output would be HAPPY HELLO SAD
printf ("Hello World\n")
» Don’t need to print a variable
» The output would simply be: Hello World
• Print to a File
- Simply redirect the output of the print or printf statement to a quoted file name
printf ("Hello World\n") > "helloworld.txt"
Functions
‒ Examples of different formatting options with printf
• Can do Math within the print and printf statement
printf ("%f %5.3f\n", $1^2,sqrt($2))
» $1 = 1/3 and $2 = 1/3, the output would be 0.111111 0.577
• This is a general feature of Awk; functions can be embedded within other
functions
• For more information on Awk, see
• The book “sed & awk” by Dale Dougherty, O’Reilly and Associates
• The GNU Awk User’s Guide: http://www.gnu.org/software/gawk/manual/gawk.html
• Effective Awk Programming: http://www.gnu.org/software/gawk/manual/
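Math inside printf and printing to a file can be combined in one statement. A sketch with invented input values; result.txt is a made-up file name, and inside Awk the file name must be a quoted string.

```shell
work=$(mktemp -d)
cd "$work"

# Square column 1, take the square root of column 2, write both to a file
echo "0.5 16" | awk '{
    printf("%f %5.3f\n", $1^2, sqrt($2)) > "result.txt"
}'
result=$(cat result.txt)
echo "$result"
```

Floating-point results need %f or %e; with %d only the integer part would be printed.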
Linux & AWK – Final Thoughts
• These lectures are only meant to serve as a general introduction
to both Linux and Awk
• There is a lot more detail and many other topics that were not
covered; entire courses are dedicated to these topics
• Mastering an operating system and computer programming will only
come from extensive effort and practice
• The best way to learn is by doing!!