Lecture 1 - introduction to course

Data Handling & Analysis
Andrew Jackson
Zoology, School of Natural Sciences
Statistics in science
• Analysis of data is central to science
– Metaphorically
– Literally
(Introduction -> methods -> results -> discussion)
• Underpins one’s own research
– Your own research project
• Essential in understanding others’ research
– To question what they did
– To incorporate their ideas in your own research
The scientific method
• Ask a question about the world around you
– Why are vultures the only obligate scavengers
among the extant terrestrial vertebrates?
The scientific method
• Decide what measurable outcome you will use to test a
specific hypothesis
– Physiology of vultures favours this mode of life
– Compare metabolic costs of flight across vertebrate taxa
• Design an experiment or field study to test this idea
• Use statistics to determine whether your predictions
are supported by the data
• Frame your findings within the broader background of
the preceding science (introduction and discussion)
Course Outline
• 8th Oct – 12th Oct
– Introduction to R and statistics
• 21st Jan – 25th Jan
– General Linear Models
• 29th April – 3rd May
– Generalised Linear Models
– Multivariate methods
• On the Friday ending each week, you will be
asked to either
– submit an assessment, or
– complete an online exam assessing your
proficiency in data analysis using R
Learning outcomes
• NB slightly different from course handbooks
• summarise and communicate quantitative results
graphically and textually to scientific standards.
• apply appropriate statistical analyses of
commonly encountered data types.
• discuss the context of the analyses within a
hypothesis driven framework of scientific logic.
• use the R statistical computing language for data
analysis.
Course structure
• Series of lectures / tutorials and computer
practicals
• Lectures will be as interactive as possible
• Computer practicals
– Use R to analyse data
– Follow video podcasts for instruction
– Demonstrators present to help
Week 1
• Lectures / Tutorials
– Monday to Thursday 10-12
• Computer sessions
– Monday to Thursday 14-16
– Botany Hut computer rooms
Summary of statistics covered
• Linear regression
• General linear models
– As a way to ask increasingly complex questions of
our data using a common framework (ANCOVA /
multiple regression)
• Generalised linear models
– Extending these concepts to deal with non-normal
data types (binary / survival / count data)
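The three families of model listed above share a common formula syntax in R, which is a large part of the "common framework". The following is a minimal sketch on simulated data. The variable names, sample size and effect sizes are invented for illustration and do not come from the course material.

```r
# Simulated data purely for illustration; none of these numbers come from the course.
set.seed(1)
n <- 100
x <- runif(n, 0, 10)                              # continuous covariate
group <- factor(rep(c("a", "b"), each = n / 2))   # two-level factor

# Normally distributed response: simple linear regression, then ANCOVA
y <- 2 + 0.5 * x + ifelse(group == "b", 1, 0) + rnorm(n, sd = 1)
fit_regression <- lm(y ~ x)
fit_ancova <- lm(y ~ x + group)

# Count response: generalised linear model with Poisson errors and a log link
counts <- rpois(n, lambda = exp(0.2 + 0.15 * x))
fit_poisson <- glm(counts ~ x, family = poisson)

# Binary response: generalised linear model with binomial errors and a logit link
success <- rbinom(n, size = 1, prob = plogis(-2 + 0.4 * x))
fit_binomial <- glm(success ~ x, family = binomial)

# Estimated coefficients for two of the fitted models
coef(fit_ancova)
coef(fit_poisson)
```

Only the response variable and the family argument change between models. The right-hand side of the formula is written the same way throughout.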
Statistical software - R
• R is software with a command-line interface
– Scary the first few times
– Incredibly powerful and adaptable
– Open development
• Time-tabled computer sessions
– Complete video-podcast and examples in your own
• When Googling for R-related topics, add “cran” to
your search terms
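To give a flavour of what working at the command line looks like, here is a minimal sketch of a first R session. The object name and the measurements are invented for illustration.

```r
# Commands are typed at the prompt (or run from a script file) one line at a time.
wing_length <- c(10.2, 11.5, 9.8, 12.1)   # invented example measurements

mean(wing_length)       # arithmetic mean
sd(wing_length)         # sample standard deviation
summary(wing_length)    # minimum, quartiles, mean and maximum

# Help is built in. At the prompt, typing
#   ?mean
# opens the help page for the mean() function.
```

Every result is produced by a typed command, so the whole analysis can be saved as a script and re-run exactly.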
Delivery of course content
• Attendance at lectures/tutorials is
• Moodle website associated with course
– Lectures will be posted
– Web-based discussions
– Links to video-podcasts
• Statistics: An Introduction Using R. Michael
J. Crawley. Wiley. ISBN 0-470-02298-1
Basic Experimental Design
For more details see Experimental
Design for the Life Sciences by Ruxton
and Colegrave
Relationship between hormone levels in male
chimpanzees and #females
• Measure hormone levels of
male chimps and then count
how many females they are
foraging with.
• Higher hormone levels are
expected when there are
more females to mate with.
• However, hormone levels are
influenced by age, diet, time
of day etc.
Male hormones and #females
• Hormone level difference could be due to age,
diet, time of day OR #females
Relationship between hormone levels in male
chimpanzees and #females
• If all chimps are matched for age, diet, and time of
day, any hormone level difference can be attributed to #females
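The confounding problem on these slides can be illustrated with a small simulation. This is a sketch under invented assumptions: the data are simulated, and the assumed links between age, number of females and hormone level are hypothetical, not measurements from real chimpanzees. The slides control confounders by design (matching chimps). The sketch instead shows the problem itself, and a statistical alternative in which age is included as a covariate.

```r
# Simulated chimpanzee data, invented for illustration only.
set.seed(42)
n <- 200
age <- runif(n, 10, 40)                          # age in years (hypothetical range)
# Assumption for this sketch: older males associate with more females
females <- rpois(n, lambda = exp(0.05 * age))
# Assumption for this sketch: hormone level depends on both age and #females
hormone <- 5 + 0.3 * age + 0.5 * females + rnorm(n, sd = 1)

# Ignoring age: the slope for females also absorbs part of the age effect
fit_naive <- lm(hormone ~ females)
# Including age as a covariate separates the two effects
fit_adjusted <- lm(hormone ~ females + age)

coef(fit_naive)[["females"]]      # inflated relative to the simulated effect of 0.5
coef(fit_adjusted)[["females"]]   # close to the simulated effect of 0.5
```

Holding age, diet and time of day constant in the design removes the problem at source. Adjusting for a measured confounder in the model is the fallback when that is not possible.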
Class Exercise
Come up with a scientific question
and plot your predictions
Computer Session
• Work through 3 podcasts on my website
– http://www.tcd.ie/Zoology/research/research/the
1. Opening R for the first time
2. Working with script files
3. Importing data into R
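As a preview of the third podcast topic, here is a minimal sketch of importing a CSV file into R. The file contents and column names are invented, and the podcasts may use a different workflow. A temporary file is written first so that the example is self-contained.

```r
# Write a small invented data set to a temporary CSV file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("individual,mass_kg",
             "a,7.5",
             "b,2.1",
             "c,6.8"),
           csv_path)

# Import the file into a data frame and inspect it
my_data <- read.csv(csv_path, header = TRUE)
str(my_data)             # column names and types
head(my_data)            # first few rows
mean(my_data$mass_kg)    # summarise one column
```

With your own data, replace csv_path with the path to your file.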