Short_course2 - Ganesha Associates

Basic Skills in Scientific Research
and Publication
Lecture 2: Hypotheses and Search
August 2014
13/08/2013
Ganesha Associates
Experimental vs. Observational studies
Observational studies
• No modification of experimental variables
• Useful to discover trends and associations
• Cannot directly be used to infer causality
Experimental studies
• Compare responses to different treatments
• Designed to avoid misleading results, e.g. by randomisation
• Can be used to infer cause and effect
9 September 2013
Ganesha Associates CC BY 3.0
2
Experimental and observational types of
research
The scientific process involves making
models of how things work
• These evolving models are described in the
scientific literature
• Sometimes the models are wrong, often they are
incomplete
• Scientific progress is driven by the
communication and publication of the results of
new research, and the reinterpretation of older
work
• The tool which makes all of this possible is the
hypothesis
Main learning points
• Student projects fall into three categories
– No hypothesis, i.e. observational
– Weak hypothesis
– Strong hypothesis
• The work will be published in a
– National journal
– Low impact factor journal
– High impact factor journal
• Starting with a strong hypothesis improves your
chances of getting published in a good journal
What is a strong hypothesis?
• A strong hypothesis is based on a series of
premises – things that are already known with
some certainty
• Each premise must be supported by
references back to the (international) primary
literature
• So a strong hypothesis will be backed by
references to recent papers in high quality
journals
Coin-tossing - an example
• I wonder how many heads or tails I will get if I toss
this coin 100 times
– No model
• The frequency distribution of heads and tails will be
approximated by a binomial distribution with n=100
and p=0.5
– Simple model, based on symmetry
• A detailed analysis of the dynamics reveals that the
probability of a head is 0.51
– Complex model, based on asymmetry, aerodynamics, etc
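The two quantitative models above can be compared with a short calculation. This is a minimal sketch: the function name binomial_pmf is our own, and the values n=100, p=0.5 and p=0.51 are the ones quoted in the bullets.

```python
from math import comb, sqrt

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n independent tosses."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 100
for p in (0.5, 0.51):
    mean = n * p                      # expected number of heads
    sd = sqrt(n * p * (1 - p))        # standard deviation of the head count
    # Probability of seeing more heads than tails in 100 tosses
    p_more_heads = sum(binomial_pmf(k, n, p) for k in range(51, n + 1))
    print(f"p={p}: mean={mean:.1f}, sd={sd:.2f}, P(heads > 50)={p_more_heads:.3f}")
```

The expected head counts differ by only one (50 vs. 51) while the standard deviation is about 5, so 100 tosses are far too few to tell the simple model from the complex one.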
Coin-tossing – impact on CV
1. None, or possibly negative
2. R. A. Fisher and others did perform this experiment in the early
days of biological statistics, before the advent of computers, as a proof
that the binomial distribution tended towards a normal one at high
levels of n.
Interestingly they all found that the probability of a head p was usually
slightly higher than 0.5, but this difference was ignored.
3. Persi Diaconis, Susan Holmes and Richard Montgomery (Stanford,
2004) published a paper, ‘Dynamical bias in the coin toss’, showing
that a tossed coin is slightly more likely (probability about 0.51) to
land the same way up as it started.
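The statement in item 2, that the binomial distribution tends towards a normal one at high n, can be checked numerically. The sketch below is illustrative only; max_abs_gap is a name introduced here, and it measures the largest difference between the binomial probabilities and the matching normal density.

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n tosses."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution with the given mean and sd."""
    return exp(-((x - mean) ** 2) / (2 * sd**2)) / (sd * sqrt(2 * pi))

def max_abs_gap(n: int, p: float = 0.5) -> float:
    """Largest gap between binomial(n, p) and the normal curve with the same mean and sd."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    return max(abs(binomial_pmf(k, n, p) - normal_pdf(k, mean, sd))
               for k in range(n + 1))

for n in (10, 100, 1000):
    print(n, max_abs_gap(n))
```

The gap shrinks as n grows, which is the behaviour the early coin-tossing experiments were meant to demonstrate.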
Coin tossing - relevance
• Children with unilateral hearing loss (UHL) have been found to have lower language
scores, and increased rate of speech therapy, grade failures, or needing Individualized
Education Plans. The objective of this study was to determine whether language
skills and educational performance improved or worsened over time in a cohort of
children with UHL.
• To determine factors associated with physical therapy or occupational therapy
evaluation and speech or swallow therapy evaluation in hospitalized children with
traumatic brain injury; to describe when during the hospital stay the initial therapy
evaluations typically occur; and to quantify any between-hospital variation in therapy
evaluation.
• Articulation disorders in young children are due to defects occurring at a certain
stage in sensory and motor development. Some children with functional articulation
disorders may also have sensory integration dysfunction (SID). We hypothesized that
speech therapy would be less efficacious in children with SID than in those without
SID.
• The present study provides data that support the hypothesis that children who
stutter and typically developing children differ on both composite temperament
factors and temperament scales. The findings were interpreted within existing
frameworks of temperament development, as well as with regard to previous studies
of temperament in CWS.
Case study: Hummingbird territorial
behaviour
Hummingbird territorial behaviour
Most hummingbird species demonstrate strong territorial
behavior
If a bluffing charge attack does not work, the resident bird
may engage the trespasser in a brief but intense physical
battle
So why do hummingbirds defend territories?
H0: Hummingbirds are randomly distributed in
space and time.
Hummingbird territorial behaviour
H1: If territory = F(energy), then behaviour should be seasonal but not
species-dependent
H2: If territory = F(mating), then behaviour should be species- and
sex-dependent
H3: If…
H4: If…
Territorial behaviour: status 1971
• Time, Energy, and Territoriality of the Anna
Hummingbird (Calypte anna). Science 173 (1971) 818-821.
• When territory quality decreases defenders may
switch to less expensive forms of defense because
the energy savings outweigh the loss of resources
• Augmented territorial defense during the breeding
season is made possible by increased feeding
efficiency due to the availability at this time of very
nectar-rich flowers.
• Individuals with large territories are more successful
reproductively.
Hummingbird territoriality since
• Hovering performance of hummingbirds in hyperoxic
gas mixtures. J Exp Biol. 2001 Jun;204(Pt 11):2021-7.
• Adipose energy stores, physical work, and the
metabolic syndrome: lessons from hummingbirds.
Nutr J. 2005 Dec 13;4:36.
• Neural specialization for hovering in hummingbirds:
hypertrophy of the pretectal nucleus Lentiformis
mesencephali. J Comp Neurol. 2007 Jan 10;500(2):211-21.
• Three-dimensional kinematics of hummingbird flight.
J Exp Biol. 2007 Jul;210(Pt 13):2368-82.
Hypothesis lecture learning points
• Hypotheses can be weak (observational) or
strong (mechanism-based)
• For example, a hypothesis which predicts that
a tossed coin will end up ‘heads’ 50% of the
time is much weaker than one that can predict
the exact sequence of ‘heads’ and ‘tails’
• So hypothesis ‘quality’ is important
• A quick test for quality?
Hypothesis lecture learning points
• Good hypotheses build directly onto previous
work
• So they need to become technically more
sophisticated over time moving from the
general to the particular
• A given problem can be associated with a
number of very different hypotheses – your
experiments should include tests to exclude
these alternative explanations
Search
Some sources of scientific content
• Google
• PubMed/Medline (NLM)
• Scopus (Elsevier)
• Web of Science (Thomson Reuters)
• Google Scholar
• PubMed Central, PubMed Central Europe
• SciELO, Biblioteca Virtual em Saude
• Science Direct, Ovid, SpringerLink, Wiley Online
Library, BiomedCentral, Public Library of Science,
SWETSwise…
• CAPES Portal de Periódicos
Each source is different
• Free
– Google, Google Scholar, PubMed Central
• Subscription
– Scopus, ScienceDirect
• Abstracts and citations only
– PubMed, Web of Science
• Full text, single publisher
– SpringerLink
• Full text, many publishers
– PubMed Central, SwetsWise Online Content
Classify sources of content
[Grid for the exercise: rows are "Abstract only" and "Full text"; columns are "Free access" and "Subscription"]
You can get access if…
• The journal is subscribed to by CAPES
• You have a personal subscription
• The journal is of the ‘Open Access’ type
– Note: some journals only make their content ‘Open Access’ after 6
months or longer. Some journals contain a mixture of OA and non-OA
articles. See http://europepmc.org/journalList for more info.
• Journals in the ‘red’ categories are available anywhere.
• Most journals subscribed to by CAPES will be available from
more than one source.
• CAPES journals are only available from computers within the
University network unless you have remote access privileges.
So which sources should I use ?
• No single source contains all of the articles
relevant to your research
• Google has the broadest coverage, but not all
of the documents you find will be peer-reviewed articles
• Scopus, WoS and PubMed give you the best
balance between quality and quantity, and, in
theory, should link to all the content
subscribed to by CAPES, plus OA content.
Components of a bibliographic database
• Content such as abstracts and full-text articles
[or a pointer to where these may be found]
• Metadata [data about data]
• Index
• Search engine
• Ranking/relevance algorithm
• Plus many additional features
Content (Basic PDF)
Content (HTML)
The basis of search: Indexing
• The purpose of an index is to optimize speed and performance
in finding relevant documents for a search query.
• Without an index, the search engine would have to scan every
document in the corpus, which would require considerable time
and computing power.
• Metadata helps the indexing algorithm to select different
classes of terminology from which to make an index, so a search
can be carried out on just the authors' names, for example.
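The idea of an index built with the help of metadata can be sketched in a few lines. This is a toy illustration, not how any particular database works: the three records, the field names and the functions build_index and search are all invented for the example.

```python
from collections import defaultdict

# Hypothetical mini-corpus; the records are invented for illustration.
documents = {
    1: {"authors": "Brown K", "title": "Hovering flight in hummingbirds"},
    2: {"authors": "Green P; White L", "title": "Territory defence and nectar supply"},
    3: {"authors": "Green P", "title": "Brown adipose tissue in small birds"},
}

def build_index(docs):
    """Map (field, word) -> set of ids of documents with that word in that field."""
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for field, text in fields.items():
            for word in text.lower().replace(";", " ").split():
                index[(field, word)].add(doc_id)
    return index

index = build_index(documents)

def search(word, field):
    """Look the word up in the index instead of scanning every document."""
    return index.get((field, word.lower()), set())

print(search("Brown", "authors"))  # only documents whose author field contains "brown"
print(search("Brown", "title"))    # only documents whose title contains "brown"
```

Because the field name is part of the index key, the same word gives different results for an author search and a title search, and each lookup avoids scanning the corpus.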
Search: how the result list is ranked
• Date of publication
• Relevance
– Frequency with which search terms occur in the
document
– Proximity of search terms
• Google’s PageRank algorithm also uses “link
popularity”: a document is ranked higher if
there are more links to it
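The two relevance signals listed above, term frequency and proximity, can be combined into a simple score. The sketch below is an assumption for illustration only: the weighting, the relevance function and the three sample texts are invented, and real search engines use far more elaborate formulas.

```python
def relevance(text, terms):
    """Score = number of term occurrences + a bonus when the terms sit close together."""
    words = text.lower().split()
    positions = [[i for i, w in enumerate(words) if w == t] for t in terms]
    if any(not p for p in positions):
        return 0.0  # treat the query as AND: every term must occur
    frequency = sum(len(p) for p in positions)
    # Proximity: smallest distance between the first and the last query term
    gap = min(abs(i - j) for i in positions[0] for j in positions[-1])
    return frequency + 1.0 / (1 + gap)

# Hypothetical documents, invented for the example
docs = {
    "a": "hummingbird territory defence depends on nectar",
    "b": "the hummingbird is small and its territory is large the hummingbird feeds often",
    "c": "territory size in birds",
}
query = ["hummingbird", "territory"]
ranked = sorted(docs, key=lambda d: relevance(docs[d], query), reverse=True)
print(ranked)
```

With this weighting, document "b" outranks "a" because the extra occurrence of a search term counts for more than the closer spacing in "a"; a different weighting would reverse the order, which is one reason result lists differ between databases.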
The question behind the query
• Search engines think in terms of words, but users
think in terms of sentences, specific problems!
– How do you spell Bousfield?
– What do we know about BRCA1?
– Given these symptoms, what is the most likely
diagnosis?
– What are the side effects of aspirin?
– Has this chemical structure been synthesized before?
• “Cancer causes X” vs. “Y causes cancer”
What real queries look like - Google
• pharmacogenomics and disorders
• bacteria growth casein media effect
• waal pseudomonas
• TRPM2 PCR mouse
• Chitinases in carnivorous plants
• glycerophosphoinositol 4-phosphate
• Dai N, Gubler C, Hengstler P, Meyenberger C,
Bauerfeind P. Improved capsule endoscopy after
bowel preparation. Gastrointest Endosc 2005;61(1)
28-31.
Query changes people actually make
• Query series 1
– latrunculin
– latrunculin fm3a cell arrest
– latrunculin fm3a arrest
– latrunculin fm3a
– latrunculin FM3A
• Query series 2
– cytokinin signalling in arabidopsis
– "cytokinin signalling in arabidopsis"
– cytokinin delta
– spindly arabidopsis
• Results
– Remember to look beyond the first page. Compare the results of
Query 1 in PubMed and Google (add the term PubMed)
Anatomy of a query - PubMed
• invasive fungal infections in young children
• invasive[All Fields] AND ("mycoses"[MeSH
Terms] OR "mycoses"[All Fields] OR
("fungal"[All Fields] AND "infections"[All
Fields]) OR "fungal infections"[All Fields]) AND
("Young Child"[Journal] OR ("young"[All Fields]
AND "children"[All Fields]) OR "young
children"[All Fields])
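The way a plain-language query turns into the longer fielded query above can be imitated with a small sketch. This is not PubMed's actual algorithm: the lookup table MESH_LOOKUP, the function expand_phrase and the manual split of the query into three phrases are assumptions made for illustration, and the journal-title match seen in the real translation is left out.

```python
# Hypothetical lookup table standing in for a phrase-to-MeSH mapping step.
MESH_LOOKUP = {"fungal infections": "mycoses"}

def expand_phrase(phrase):
    """Expand one phrase into an OR-group of fielded alternatives."""
    words = phrase.split()
    parts = []
    mesh = MESH_LOOKUP.get(phrase)
    if mesh:
        parts.append(f'"{mesh}"[MeSH Terms]')
        parts.append(f'"{mesh}"[All Fields]')
    if len(words) > 1:
        # Also accept the individual words anywhere in the record
        parts.append("(" + " AND ".join(f'"{w}"[All Fields]' for w in words) + ")")
    parts.append(f'"{phrase}"[All Fields]')
    return "(" + " OR ".join(parts) + ")" if len(parts) > 1 else parts[0]

# The phrase boundaries are chosen by hand here (an assumption of this sketch).
phrases = ["invasive", "fungal infections", "young children"]
query = " AND ".join(expand_phrase(p) for p in phrases)
print(query)
```

The point of the sketch is that each phrase becomes a group of alternatives joined by OR, and the groups are joined by AND, so a short query can match many more records than its literal words suggest.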
Boolean terms
Improving search accuracy
• Wild card characters
– "a * saved is a * earned"
• Operators
– jaguar speed -car
– Pandas -site:wikipedia.org
– “ribosome”
• Synonyms
– MeSH terms
• Boolean terms
– AND, OR, NOT
• Faceted search
– GO terms
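The Boolean terms AND, OR and NOT correspond to set operations on the lists of documents that contain each word. The sketch below uses the "jaguar speed -car" example from the list above; the document ids in postings are invented for illustration.

```python
# Hypothetical posting lists: ids of the documents that contain each word.
postings = {
    "jaguar": {1, 2, 3, 4},
    "speed": {2, 3, 5},
    "car": {3, 4},
}

# AND keeps documents containing both words (set intersection)
jaguar_and_speed = postings["jaguar"] & postings["speed"]

# OR keeps documents containing either word (set union)
jaguar_or_speed = postings["jaguar"] | postings["speed"]

# NOT removes documents containing the excluded word (set difference),
# which is what the "-car" operator asks for
jaguar_speed_not_car = jaguar_and_speed - postings["car"]

print(jaguar_and_speed, jaguar_or_speed, jaguar_speed_not_car)
```

AND narrows a search, OR broadens it, and NOT removes an unwanted sense of a word, here the car rather than the animal.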
So…
• Using the same search terms will produce
different results in different databases
because:
– Content different
– Preparation of search terms will be different, e.g.
only PubMed uses MeSH terms
– Indexing process, implementation of stemming,
removal of stop words will be different
– Ranking algorithms will be different
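The effect of stemming and stop-word removal can be seen by running the same query through two different text-preparation pipelines. Everything in this sketch is an assumption for illustration: the stop-word list is arbitrary and crude_stem is a deliberately rough stand-in for a real stemmer.

```python
STOP_WORDS = {"in", "the", "of", "and", "a"}

def tokens_plain(text):
    """Pipeline A: lower-case and split, nothing else."""
    return text.lower().split()

def crude_stem(word):
    """Very rough stemming: drop a final 's' from longer words (illustration only)."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

def tokens_normalised(text):
    """Pipeline B: lower-case, remove stop words, then stem."""
    return [crude_stem(w) for w in text.lower().split() if w not in STOP_WORDS]

def matches(query, doc, tokenizer):
    """True if every query token appears among the document tokens."""
    return set(tokenizer(query)) <= set(tokenizer(doc))

doc = "Fungal infection of the lung"
query = "fungal infections"
print(matches(query, doc, tokens_plain))       # plural "infections" is not found
print(matches(query, doc, tokens_normalised))  # stemming makes the plural match
```

The same query and the same document give a miss in one pipeline and a hit in the other, so two databases with identical content can still return different result lists.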
Quick tour
Learning points
• Google, PubMed, Scopus and WoS are
good places from which to start building
a hypothesis
• Learn to use several information resources
because they are all different!
• Modify your search terms during the
course of a search session
• Understand how the results are ranked
and don’t just look on the first page