Developing GUI Microarray Analysis Tools
Keith Satterley
Bioinformatics, WEHI, Nov. 15 2005
Overview.
1. R, Environment, tools & resources
2. Graphical tools.
3. LimmaGUI and AffylmGUI.
4. Example Analysis.
5. Resources available.
6. Future Developments.
The Walter and Eliza Hall Institute of Medical Research
The R Project for Statistical Computing
• R is a language and environment for statistical computing and graphics, released as free software under the GNU General Public License.
• It compiles and runs on a wide variety of platforms, including Unix variants, Windows and MacOS.
• S was developed by John Chambers and colleagues at Bell Labs. R can be considered a different implementation of S.
• R was initially written by Robert Gentleman and Ross Ihaka of the Statistics Department of the University of Auckland.
• Since mid-1997 a large group of individuals has contributed to R by sending code and bug reports.
• The R URL is http://www.r-project.org/
The R Project for Statistical Computing
• R has an effective data handling and storage facility.
• It has a suite of operators for calculations on arrays, in particular matrices.
• It provides a vast number of useful statistical tools, many of which have been painstakingly tested.
• It produces publication-quality graphics in a variety of formats, including JPEG, PostScript, EPS, PDF and BMP.
• It is built on a well-developed, simple and effective programming language.
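As a small illustration of the array and matrix operators mentioned above, the following sketch uses only base R. The matrices A and B are made-up values chosen for the example, not data from the talk.

```r
# Two small matrices; matrix() fills column by column.
A <- matrix(c(2, 0, 1, 3), nrow = 2)   # columns are (2, 0) and (1, 3)
B <- diag(2)                           # 2 x 2 identity matrix

elementwise <- A * A       # element-by-element product
product     <- A %*% B     # true matrix product; equals A because B is the identity
A_inverse   <- solve(A)    # matrix inverse
column_sums <- colSums(A)  # sum of each column

print(product)
print(A %*% A_inverse)     # recovers the identity, up to rounding error
```

Note that * works element by element, while %*% is the matrix product; confusing the two is a common slip for newcomers.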
The R Project for Statistical Computing
• R allows users to add additional
functionality by defining new functions.
• C, C++ and Fortran code can be linked
and called at run time.
• R can be extended (easily) via packages.
• There are about eight packages supplied
with the R distribution and many more are
available through the CRAN family of
Internet sites
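A minimal sketch of adding functionality by defining a new function. The name log_ratio and its offset argument are hypothetical, chosen only to fit the microarray theme; they do not come from limma or any other package.

```r
# Hypothetical helper: log2 ratio of red to green intensities,
# with an optional offset added to both channels.
log_ratio <- function(red, green, offset = 0) {
  stopifnot(length(red) == length(green))
  log2((red + offset) / (green + offset))
}

# Made-up intensities for three spots.
M <- log_ratio(red = c(800, 100, 400), green = c(200, 100, 1600))
print(M)
```

Once defined, the function is called exactly like a built-in one, which is what makes it easy to grow a collection of such functions into a package.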
Resources for R
• Frequently Asked Questions:
http://www.ci.tuwien.ac.at/%7Ehornik/R/R-FAQ.html
• Archives: CRAN (see the next slide).
• Mailing lists:
– r-help@lists.r-project.org
– r-devel@lists.r-project.org
– r-sig-mac@stat.math.ethz.ch
• Bug-tracking System: http://bugs.r-project.org/
Resources for R
• CRAN = Comprehensive R Archive
Network.
• CRAN is a network of FTP and web servers around the world that store identical, up-to-date versions of code and documentation for R.
• Australia
– http://cran.au.r-project.org/ PlanetMirror, Brisbane
– http://cran.ms.unimelb.edu.au/ University of Melbourne
• Austria
– http://cran.at.r-project.org/ Technische Universitaet Wien
• Brasil
– http://cran.br.r-project.org/ Universidade Federal do Parana
– http://www.insecta.ufv.br/CRAN/ Federal University of Vicosa
– http://cran.fiocruz.br/ Oswaldo Cruz Foundation, Rio de Jane[…]
– http://lmq.esalq.usp.br/CRAN/ University of Sao Paulo, Piracicaba
– http://www.vps.fmvz.usp.br/CRAN/ University of Sao Paulo, Sao Paulo
• Canada
– http://cran.stat.sfu.ca/ Simon Fraser University, Burnaby
– http://probability.ca/cran/ University of Toronto
• China
– http://www.lmbe.seu.edu.cn/CRAN/ Southeast University, Nanjing
• Denmark
– http://cran.dk.r-project.org/ dotsrc.org, Aalborg
• France
– http://cran.fr.r-project.org/ CICT, Toulouse
– http://cran.univ-lyon1.fr/ Dept. of Biometry & Evol. Biology, University of Lyon
– http://mirror.internet.tp/cran/ Boese Internet, Paris
• Germany
– http://cran.r-mirror.de/ Stefan Drees, Berlin
– http://pangora.org/cran/ Pangora GmbH, Hamburg
– http://cran.miscellaneousmirror.org/ Miscellaneousdata.de, Koeln
– http://umfragen.sowi.uni[…]mainz.de/CRAN/ University of Mainz
– http://cran.mirrorplus.org/ mirrorplus.org, Muenchen
• Hungary
– http://cran.hu.r-project.org/ Semmelweis University
• Italy
– http://cran.arsmachinandi.it/ Ars Machinandi, Arezzo
– http://microarrays.unife.it/CRAN/ Universita di Ferrara
– http://rm.mirror.garr.it/mirrors/CRAN/ Garr Mirror, Milano
– http://dssm.unipa.it/C[…] Universita degli Studi di Palermo
• Israel
– http://cran.active.co.il/ Activetech Ltd, Tel-Aviv
• Japan
– ftp://ftp.u-aizu.ac.jp/pub/lang/R/CRAN University of Aizu
– http://cran.md.tsukuba.ac.jp/ University of Tsukuba
• Korea
– http://bibscvs.snu.ac.kr/R/ Seoul National University
• Netherlands
– http://cran.nedmirror.nl/ Nedmirror, Amsterdam
• Poland
– http://novum.am.lublin.pl/CRAN/ Skubiszewski Medical University, Lublin
– http://r.meteo.uni.wroc.pl/ University of Wroclaw
• Portugal
– http://cran.pt.r-project.org/ Universidade do Porto
• Slovenia
– http://www.fastmirrors.org/cran/ Fastmirrors.org, Besnica
– http://www.wsection.com/cran/ Wsection.com, Ljubljana
• South Africa
– http://cbio.uct.ac.za/CRAN/ University of Cape Town
– http://cran.za.r-project.org/ Rhodes University
• Spain
– http://cran.es.r-project.org/ Spanish National Research Network, Madrid
• Switzerland
– http://cran.ch.r-project.org/ ETH Zuerich
– http://www.imsv.unibe.ch/cran/ Universitaet Bern
– http://cran.prokmu.com/ Prokmu Hosting, Bern
• Turkey
– http://godel.cs.bilgi.edu.tr/mirror/cran/ Istanbul Bilgi University
• Taiwan
– http://cran.cs.pu.edu.tw/ Providence University, Taichung
– http://cran.csie.ntu.edu.tw/ National Taiwan University, Taipei
• UK
– http://cran.uk.r-project.org/ University of Bristol
– http://www.sourcekeg.co.uk/cran/ Sourcekeg, London
• USA
– http://cran.cnr.Berkeley.edu University of California, Berkeley, CA
– http://cran.stat.ucla.edu/ University of California, Los Angeles, CA
– http://cran.ssds.ucdavis.edu/ University of California[…]
– http://rh-mirror.linux.iastate.edu/CRAN/ Iowa State University, Ames, IA
– http://www.biometrics.mtu.edu/CRAN/ Michigan Technological University, Houghton, MI
– http://cran.wustl.edu/ Wa[…] University, St. Louis, MO
– http://www.ibiblio.org/pub/languages/R/CRAN/ University of North Carolina, Chapel Hill, NC
– http://cran.us.r-project.org/ Pair Networks, Pittsburgh, PA
– http://lib.stat.cmu.edu/R/CRAN/ Statlib, Carnegie Mellon University, Pittsburgh, PA
– http://cran.hostingzero.com/ Hosting Zero, Dallas, TX
– http://cran.fhcrc.org/ Fred Hutchinson Cancer R[…] Center, Seattle, WA
CRAN Mirrors – 475 packages
Resources for R
• Features of R.
– Graphical abilities.
– Package System.
– Objects in R.
Graphical Capabilities in R
• On Unix (including Mac OS X), X11 is used.
• On MS Windows, R uses the MS Windows system commands.
• This is not a GUI, but a graphics device for
plotting and drawing.
• There are high level, low level and interactive
plotting commands.
• plot(x) is a high level command.
– If x is a time series, this produces a time-series plot.
– If x is a numeric vector, it produces a plot of the
values in the vector against their index in the vector.
– If x is a complex vector, it produces a plot of
imaginary versus real parts of the vector elements.
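The three behaviours of plot(x) listed above can be tried with the short sketch below. The data values are made up for the example, and pdf(NULL) is used here only so that the code runs without opening a window or writing a file; on an interactive session that line and dev.off() can be left out.

```r
pdf(NULL)   # null PDF device: no window is opened and no file is written

x_numeric <- c(3, 1, 4, 1, 5, 9, 2, 6)
x_series  <- ts(x_numeric, start = 2000)   # the same values as a yearly time series
x_complex <- complex(real = c(0, 1, 2), imaginary = c(1, -1, 0.5))

plot(x_numeric)   # values plotted against their index in the vector
plot(x_series)    # time-series plot
plot(x_complex)   # imaginary parts plotted against real parts

dev.off()         # close the device
```

The same function name produces three different displays because plot() inspects what kind of object it has been given.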
Graphical Capabilities in R
• Low-level plotting commands can be used
to add extra information (such as points,
lines or text) to the current plot.
• abline(a, b)
– Adds a line of slope b and intercept a to the
current plot.
• title(main, sub)
– Adds a title main to the top of the current plot and, optionally, a sub-title sub at the bottom.
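A minimal sketch of these low-level commands layered on top of a high-level plot. The data are invented for the example so that the points lie exactly on the line drawn by abline(); pdf(NULL) again only keeps the example free of windows and files.

```r
pdf(NULL)   # null PDF device: no window is opened and no file is written

x <- 1:10
y <- 2000 + x            # made-up points on the line with intercept 2000, slope 1

plot(x, y)                                    # high-level command creates the plot
abline(a = 2000, b = 1, lwd = 2)              # line with intercept a and slope b
points(5, 2005, pch = 19)                     # add a single filled point
text(5, 2005, labels = "midpoint", pos = 3)   # label it, placed above the point
title(main = "Low-level additions",
      sub = "abline, points, text and title")

dev.off()
```

Low-level commands never start a new plot; they only draw on whatever the last high-level command produced.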
An R command line Example
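library(limma)                                   # load the limma package
setwd("C:/aaa-R/swirl/")                         # directory holding the swirl data files
getwd()                                          # confirm the working directory
list.files()                                     # list the files in it
targets <- readTargets("SwirlTargetsFile.txt")   # read the targets file
targets
RG <- read.maimages(targets$FileName, source="spot")   # read the image-analysis output files
RG
par(fg="yellow", bg="green")                     # set foreground and background colours
plot(RG$R, lwd=3)                                # plot the red-channel intensities
abline(2000, 1, lwd=5, col="black")              # add a line with intercept 2000 and slope 1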
R Graphics
R Graphics (cont.)
[Figure: histogram titled "PM Intensity distribution for PreS2"; x-axis: log2(PM Intensity), y-axis: Frequency]
Bioconductor Graphics
R Packages
• Packages provide a mechanism for loading code and its attached documentation.
• Packaging automatically checks the package and creates the various documentation files from one source.
• It creates distributable win.binary (.zip), mac.binary (.tgz) or source (.tar.gz) files.
• Packages can specify dependent or suggested packages.
R Packages (cont.)
• install.packages() can install a package and all its dependencies (and their dependencies…): the essential ones and, optionally, the suggested ones (which may be needed for examples etc.).
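The choice between essential and suggested dependencies is made through the dependencies argument of install.packages(). The sketch below wraps it in a helper; the name install_if_missing, its with_suggested argument and the repository URL are assumptions made for this example, not part of the talk.

```r
# Hypothetical helper: install a package from a CRAN mirror only if it is missing.
install_if_missing <- function(pkg, repos = "https://cran.r-project.org",
                               with_suggested = FALSE) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))   # already installed, nothing to do
  }
  # dependencies = NA (the default) pulls in the essential dependencies;
  # dependencies = TRUE also pulls in the suggested packages of 'pkg'.
  install.packages(pkg, repos = repos,
                   dependencies = if (with_suggested) TRUE else NA)
  invisible(TRUE)
}

# "stats" ships with R, so this call returns without contacting a mirror.
installed_now <- install_if_missing("stats")
print(installed_now)
```

Calling the helper with with_suggested = TRUE on a package that is not yet installed would also fetch the packages it merely suggests, which is useful when you want to run its examples.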
Objects in R
• The entities R operates on are technically
known as objects.
• The class of an object determines how it
will be treated by what are known as
generic functions.
• For example print, plot or summary will
react according to what sort of object they
are called to work on.
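A minimal sketch of how a generic function reacts to the class of an object, using R's S3 system. The class name array_summary and the object chip are hypothetical and exist only for this example.

```r
# A list tagged with a made-up class name.
chip <- structure(list(name = "chip1", intensities = c(120, 450, 980)),
                  class = "array_summary")

# Method used when print() is called on an "array_summary" object.
print.array_summary <- function(x, ...) {
  cat("Array", x$name, "with", length(x$intensities), "spots\n")
  invisible(x)
}

# Method used when summary() is called on an "array_summary" object.
summary.array_summary <- function(object, ...) {
  summary(object$intensities)
}

print(chip)                    # dispatches to print.array_summary
print(1:3)                     # an integer vector still gets the default method
chip_summary <- summary(chip)  # dispatches to summary.array_summary
print(chip_summary)
```

The caller writes print(chip) or summary(chip) in both cases; which code runs is decided by the class attribute of the argument.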
Bioconductor
• The URL is http://www.bioconductor.org/
• Bioconductor is an open source and open
development software project for the analysis
and comprehension of genomic data.
• The Bioconductor core team is based primarily
at the Fred Hutchinson Cancer Research
Center.
• Aims to promote high-quality documentation and
reproducible research.
• Aims to provide access to a wide range of
powerful statistical and graphical methods for
the analysis of genomic data.
Bioconductor
• R and the R package system are the main
vehicles for designing and releasing
software.
• Bioconductor has a commitment to full open source discipline. All contributions are expected to exist under an open source license such as GPL2 or BSD.
Bioconductor
• Features of the Bioconductor site.
– Packages – code
– Packages – metadata
– Version management system
Bioconductor Packages
140 code packages listed:

Package         Version  Description
aCGH            1.4.0    Classes and functions for Array Comparative Genomic Hybridization data.
affxparser      1.2.0    Affymetrix File Parsing SDK
affy            1.8.1    Methods for Affymetrix Oligonucleotide Arrays
affycomp        1.6.0    Graphics Toolbox for Assessment of Affymetrix Expression Measures
affydata        1.6.0    Affymetrix Data for Demonstration Purpose
affylmGUI       1.4.0    GUI for affy analysis using limma package
affypdnn        1.4.0    Probe Dependent Nearest Neighbours (PDNN) for the affy package
affyPLM         1.6.0    Methods for fitting probe-level models
affyQCReport    1.8.0    QC Report Generation for affyBatch objects
altcdfenvs      1.4.0    alternative cdfenvs
~~~~~~
limma           2.2.0    Linear Models for Microarray Data
limmaGUI        1.6.0    GUI for limma package
~~~~~~
vsn             1.8.0    Variance stabilization and calibration for microarray data
webbioc         1.2.0    Bioconductor Web Interface
widgetInvoke    1.2.0    Evaluation widgets for functions
widgetTools     1.6.0    Creates an interactive tcltk widget
xcms            1.2.0    LC/MS and GC/MS Data Analysis

PLUS 250 metadata packages.

From:
ag              1.10.0   Affymetrix Arabidopsis Genome Array Annotation Data (ag)
agahomology     1.10.0   A data package containing annotation data for agahomology
To:
zebrafishcdf    1.10.0   zebrafishcdf
zebrafishprobe  1.10.0   Probe sequence data for microarrays of type zebrafish
Bioconductor – uses the Subversion version management system
• Subversion: http://svnbook.red-bean.com/en/1.1/svn-book.html
• Subversion is a free, open-source version control system (it replaces CVS).
• That is, Subversion manages files and
directories over time.
• Subversion clients can access their
repository across networks, which allows
the version repository to be accessed by
many users simultaneously.
Bioconductor – Version management system
• Subversion remembers every change ever written to it. A client can ask historical questions like:
– “What did this directory contain last Wednesday?”
– “Who was the last person to change this file, and what changes did they make?”
• Subversion uses a Copy-Modify-Merge solution,
rather than a Lock-Modify-Unlock procedure.
Graphical User Interfaces
• The items that make up a GUI, such as buttons, labels and menus, are known as widgets.
• Tcl/Tk is a tool for creating and interacting with widgets.
• Tcl/Tk runs on Unix, Windows and Mac OS X.
Tcl/Tk
• Tcl/Tk needs to be installed on the
computer as well as R.
• There are prewritten libraries of Tcl/Tk tools, for example TkTable.
• The R package tcltk needs to be installed
in R.
• The tcltk R package is an interface
between the R language and Tcl/Tk
commands.
GUI Programs
• On Windows, Tcl/Tk talks to the MS Windows graphical window system.
• On Unix (and Mac), Tcl/Tk talks to the X Window System, hence X11 must be started first.
• To start a GUI program:
– 1. Run X11 (on Unix and Mac).
– 2. Load the R package tcltk using library(tcltk).
– 3. Load the GUI package, for example library(affylmGUI) (affylmGUI will actually load tcltk automatically).
R tcltk example
• This can be used to test whether tcltk (or Tcl/Tk) is working correctly:
> library(tcltk)
> tt <- tktoplevel()
> lbl <- tklabel(tt, text="Hello, World!")
> tkpack(lbl)
> but <- tkbutton(tt, text="OK")
> tkpack(but)
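In the test above the OK button does nothing when pressed. As a hedged extension, the sketch below attaches a callback that closes the window. The function name make_hello_window is hypothetical, and the guard at the end is an assumption added so that the code only opens a window in an interactive session where Tcl/Tk is available.

```r
# Hypothetical wrapper around the test window, with a working OK button.
make_hello_window <- function() {
  tt  <- tcltk::tktoplevel()
  lbl <- tcltk::tklabel(tt, text = "Hello, World!")
  # 'command' is an R function that Tcl/Tk calls when the button is pressed.
  but <- tcltk::tkbutton(tt, text = "OK",
                         command = function() tcltk::tkdestroy(tt))
  tcltk::tkpack(lbl)
  tcltk::tkpack(but)
  invisible(tt)
}

# Only open the window when a user is present and R was built with Tcl/Tk.
if (interactive() && capabilities("tcltk")) {
  make_hello_window()
}
```

Passing an R function as the command of a widget is the basic building block of an R-Tcl/Tk GUI: each menu item or button simply calls an R function.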
R tcltk testing tools
• To check the path that Tcl/Tk uses to find libraries:
> tclvalue("auto_path")
[1] "{C:\\R\\rw2020\\R-2.2.0/Tcl/lib/tcl8.4}
C:/R/rw2020/R-2.2.0/Tcl/lib ./lib
C:/R/rw2020/R-2.2.0/Tcl/lib/tk8.4
C:/R/rw2020/R-2.2.0/library/tcltk/exec"
• To add an extra path to search, use:
> addTclPath("C:/bin")
> tclvalue("auto_path")
[1] "{C:\\R\\rw2020\\R-2.2.0/Tcl/lib/tcl8.4}
C:/R/rw2020/R-2.2.0/Tcl/lib ./lib
C:/R/rw2020/R-2.2.0/Tcl/lib/tk8.4
C:/R/rw2020/R-2.2.0/library/tcltk/exec C:/bin"
• For a list of package commands:
> ls("package:tcltk")
Help Commands in R
• help(mean) # help window on the mean function
• ?mean # same as help(mean)
• help.search("regression") # help files with alias, concept or title matching 'regression', using fuzzy matching
• help.start() # opens a browser on the R documentation
• The browser shows links to the R Language Definition, Installation & Administration of R, package writing, package documentation, FAQs etc.
Some Useful R Commands for the GUI user!
• getwd() # get the working directory.
• setwd() # set the working directory.
• list.files() # list files in the working directory.
• ls() # list objects in the workspace.
• rm(list=ls()) # remove all objects (recommended at the start of a session).
• savehistory(file="History.txt") # save the command history to a file.
• source(file="C:/path/to/filename/file.R", echo=TRUE) # reads commands from file.R and executes them.
• installed.packages() # detailed info on all installed packages.
• summary(RG) # displays basic data about object RG.
• library(limmaGUI) # loads the limmaGUI package.
Cross Platform Issues
• Installation issues are varied
• MS Windows – can be installed in C:\R by an ordinary user.
• Unix – can be installed by an ordinary user, but installations are duplicated if multiple users do so.
• Mac OS X – special procedures are necessary.
LimmaGUI
• limmaGUI is a Graphical User Interface
(GUI) based on R-Tcl/Tk for the
exploration and linear modelling of data
from two-colour spotted microarray
experiments, especially the assessment of
differential expression in complex
experiments.
• Swirl Example Analysis.
AffylmGUI
• AffylmGUI enables the user to perform
quality assessment, low-level analysis and
linear modeling of data from Affymetrix
GeneChips®, with the ultimate goal of
identifying differentially expressed genes.
• Estrogen Example Analysis
WEHI website Resources
• WEHI Bioinformatics home page
http://bioinf.wehi.edu.au/
• Microarray Data Analysis
http://bioinf.wehi.edu.au/marray/index.html
• LIMMA: Linear Models for Microarray Data
http://bioinf.wehi.edu.au/limma/index.html
• limmaGUI: http://bioinf.wehi.edu.au/limmaGUI/
• affylmGUI: http://bioinf.wehi.edu.au/affylmGUI/
• James Wettenhall's Bioinformatics Home Page:
http://bioinf.wehi.edu.au/folders/james/
• R-Tcl/Tk Examples, Worked Examples for limma/affylmGUI at
http://bioinf.wehi.edu.au/limmaGUI/R/library/limmaGUI/doc/DocIndex.html
Future Directions for AffylmGUI
• additional plots to aid in quality assessment of a
set of chips, including RNA degradation plots;
• calculation and display of QC parameters
recommended by Affymetrix (Affymetrix, 2004),
such as percent present, ratios of 3’/5’
expression for hybridization controls and the
like;
• fitting of mixed linear models where there is
technical replication;
• support for other single-channel platforms.
Future Directions for LimmaGUI
• additional plots to aid in quality
assessment of a set of chips;
• fitting of mixed linear models where there
is technical replication;
• fitting of mixed linear models where there
is biological replication;
• ?
Acknowledgments
• James Wettenhall
• Gordon Smyth
• Ken Simpson
• Terry Speed
• Bioinformatics – many seminars on
microarrays!