2015 Workshop Description

GCAT-SEEKquence
The Genome Consortium
for Active Teaching
NextGen Sequencing
Undergraduate Education
Workshop
Juniata College,
Huntingdon, PA
June 8 to 12, 2015
Morgan State University
Baltimore, MD
June 15 to 19, 2015
Overview. Workshops will be organized into whole-group and small breakout sessions divided
by application type: transcriptomics, bacterial genomics, metagenomics, and eukaryotic
genomics. A keynote address on integration of genomics projects into the
undergraduate classroom will be provided for the whole group on the first day. Days
two through four will be divided into breakout sessions by application, where wet lab
sample preparation and bioinformatic approaches will be covered. Early evening
sessions will develop educational modules and assessment strategies that
incorporate the workshop experiences into the participants’ courses. Late evening
poster sessions will provide an opportunity for current and former participants and
instructors to present their research, with the goal of devising strategies to leverage
NextGen sequencing to advance the projects. On day five participants will briefly
present their customized teaching module to other members, giving all members an
overview of the alternative applications and teaching approaches.
Target Audience: Undergraduate educators and students who are novices with respect to
NextGen sequencing technology and bioinformatics.




• We will assist in all key stages, from experimental design through assessment.
• Our goal is to make it easier for faculty to integrate NextGen sequencing into
classes.
• The workshop focuses on
  o using raw data as a catalyst for learning by workshop participants,
  o publishing our teaching modules for the broader benefit of the GCAT-SEEK
  network.
• Teaching modules developed will emphasize strategies for deep, rather than
superficial “black box”, active learning.
GCAT-SEEK 2015 Workshop Description
Page 1
Costs: Support from the National Science Foundation and the Howard Hughes Medical
Institute (to Juniata College) funds the workshop, including housing and meals for all
participants and instructors, as well as partial support for sequencing runs for samples
prepared at the workshop. Limited travel funding is also available and should cover most
costs.
Objectives: Upon completion of the workshop, participants will be able to:
• Design experiments using next-generation sequencing technologies
• Prepare nucleic acid samples and assess quality
• Sequence and analyze their samples
• Teach modules that integrate next-generation sequencing research into the classroom
• Assess student learning goals and track outcomes
Who may apply?
• Any GCAT-SEEK network member (see www.GCAT-SEEK.org to join) working with
undergraduates. Sorry, this workshop is not intended for graduate students or high school
teachers.
• Pairs of faculty from the same institution across disciplines (Bio/Chem/IT/Math/Physics)
• Pairs of faculty from the same discipline across institutions
• Excellent students with leadership credentials or potential are invited to apply with a
sponsor. A maximum of 5 student/faculty pairs will be accepted.
• Applications will be accepted from single researchers open to pairing with an individual
from another institution with a similar area of research interest (only one project will be
sequenced, depending on funds available and size of project).
• Individuals without projects or teammates will be considered if we need more people to
fill the workshop.
What will be proposed?
• Nature of project (feasibility). You aren’t expected to know all the details, but the more
detail you provide, the easier it will be to judge the potential of the project and whether the
project will fit within the teaching framework described below. Please see the
description of the workshops below to determine the kind of projects that would be a
closer fit for the planned educational content.
• Description of the course into which the module will be integrated (feasibility for curricular
integration), including the number of students in the course per year. Describe the class,
how you see the module fitting in, and when you think the modification will take place. Is
next-generation sequencing already a topic covered in the class?
Criteria for application evaluation: Potential to impact undergraduate education, directly and
through network adoption
• Is the project scientifically sound?
• Is the project feasible given our limited sequencing funds?
• How many students are involved in the class where it will be used, and will changes be
implemented soon?
• Is the project likely to provide authentic research opportunities for many students
through subsequent bioinformatic projects?
• If a student applicant is proposed, how strong are the student’s leadership qualifications?
• Does the investigator have a track record of accomplishment in research and education?
Why pairs?

Previous experience of GCAT-CHIP (M. Campbell, pers. comm.) and of HHMI more
generally indicates that attendance by faculty teams increases the likelihood of successful
curricular integration once professors return to their home institutions. Their sense of
isolation is greatly decreased, and enthusiasm for the challenge of change is
maintained. In particular, faculty combinations of biology, information technology,
biochemistry, mathematical modeling, and statistics would be highly beneficial for
subsequent collaboration and integration among disciplines at the home institution.
The ability to communicate and collaborate across disciplines is a core competency identified
by the Vision and Change dialogues, and collaboration within an institution would
model such activities for students. Faculty pairs of biologists from different institutions
will foster a sense of community and collaboration that is not usually possible at small
colleges.

We plan to aggressively edit and customize teaching module templates at the workshop
for publication on our web site for the network at large. Working in teams will make that
more feasible.

Up to five faculty may opt, rather than bringing a faculty colleague, to invite an excellent
undergraduate research student with established leadership credentials or potential to
attend the workshop. Student participation helps create student leaders by further
developing their knowledge, communication skills and credentials. Student participation
fosters a national community of student researchers. As was documented in the Vision
and Change Dialogues, the student perspective is important, often overlooked, and
ultimately deepens the conversation.
Eukaryotic genomics breakout session.
The goals for the eukaryotic genome analysis section will be for participants to perform
de novo and/or reference-based assembly, automated annotation, and comparative genomics.
In advance of the workshop, participants will be asked to submit DNA samples from their
organism and complete a few basic computer tutorials. During the eukaryotic genomics
breakout session we will review different types of next-generation sequencer output, file types,
quality scores, linkers, barcodes, assembly principles, algorithms (de novo vs resequencing),
and programs. Participants will perform error correction and assembly of a practice
GCAT-SEEK dataset using SOAPdenovo and/or other assemblers on the Juniata College HHMI
cluster and/or iPlant Atmosphere. Participants will perform gene annotation using MAKER.
Additional analyses will include identification of SNPs, comparative analysis of orthologous and
paralogous gene clusters, pairwise alignment of syntenic regions from closely related species
that have available genome sequences, and RADseq analysis using Stacks and R.
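
As a small illustration of the assembly principles mentioned above, the sketch below computes N50, a common summary statistic for comparing assemblies. It is a minimal Python example and not part of the workshop materials; the function name and the contig lengths are made up for illustration.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list).

    N50 is the length of the contig at which the running total of
    lengths, taken from longest to shortest, first reaches half of the
    total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths (in bp) from a toy assembly.
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))  # prints 400
```

A larger N50 generally means a more contiguous assembly, although it says nothing about whether the contigs are correct.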
Prokaryotic genomics breakout session.
The goals for the prokaryotic genome analysis section will be for participants to prepare
library-construction quality DNA and appropriate documentation from their organism of interest
and to perform de novo and/or reference-based assembly, and automated annotation of a real
dataset. In advance of the workshop, participants may be asked to submit samples of their
organism, provide any special growth information, and register to use specific bioinformatics
sites. The first day of the prokaryotic genomics breakout session will begin with an overview
and comparison of different approaches for gDNA isolation. Participants will then isolate gDNA
from the organism of interest and set up a quality-control PCR using 16S rRNA universal
primers. In the afternoon of the same day, participants will work in the computer lab to review
different types of next-generation sequencer output, file types, quality scores, linkers, barcodes,
assembly principles, algorithms (de novo vs resequencing), and programs. Participants will
begin assembly of a practice GCAT-SEEK dataset using the NextGENe, Geneious, and/or CLC
Workbench suites on the Juniata College GCAT-SEEK cluster. While the assemblies are
running, participants will review annotation methods, focusing on the RAST (Rapid Annotation
using Subsystem Technology), NCBI, and DOE-JGI IMG (Integrated Microbial Genomes)
tools. Upon completion, sequence assemblies will be reviewed, and finishing
strategies discussed. Assembled sequences and reference sequences (if not already present)
will then be loaded into RAST and/or IMG, and annotation run overnight.
During the second and third days of the prokaryotic genomics breakout session
participants will assess DNA quality by Qubit quantification and electrophoresis of gDNA and
PCR products. Participants will prepare documents to be sent with samples to the sequencing
facility. DNA samples that pass quality control standards will be packaged and sent for
sequencing. During the afternoon, participants will review the annotation results to determine
subsystems present and correlate these to the organism’s phenotypes. Genomes of related
organisms will be compared using several phylogenomic metrics, such as average amino acid
identity (AAI), average nucleotide identity (ANI), and estimated DNA-DNA hybridization values.
Gene content and order will be assessed to determine core and unique genes, and synteny.
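
The core versus unique gene comparison described above can be pictured with simple set operations. The following Python sketch is only illustrative: it assumes genes have already been grouped into shared families (for example by an annotation service), and the strain names and gene labels are invented.

```python
def core_and_unique(genomes):
    """Split gene families into core and genome-specific sets.

    genomes: dict mapping a genome name to an iterable of gene family IDs.
    Returns (core, unique), where core is the set of families present in
    every genome and unique maps each genome name to the families found
    in that genome only.
    """
    gene_sets = {name: set(genes) for name, genes in genomes.items()}
    if not gene_sets:
        return set(), {}
    core = set.intersection(*gene_sets.values())
    unique = {}
    for name, genes in gene_sets.items():
        # Union of the gene families of all other genomes.
        others = set().union(*(g for n, g in gene_sets.items() if n != name))
        unique[name] = genes - others
    return core, unique


# Hypothetical gene content for three related strains.
toy_genomes = {
    "strain_A": ["dnaA", "recA", "gyrB", "lacZ"],
    "strain_B": ["dnaA", "recA", "gyrB", "nifH"],
    "strain_C": ["dnaA", "recA", "tetM"],
}
core, unique = core_and_unique(toy_genomes)
print(sorted(core))                               # prints ['dnaA', 'recA']
print({k: sorted(v) for k, v in unique.items()})
```

Real comparisons depend heavily on how orthologous families are defined, which this sketch takes as given.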
Metagenomics breakout session.
The goals for the metagenomic analysis workshop will be for participants to prepare their
high quality DNA samples for 16S/18S rRNA gene, ITS region, or functional gene sequencing
and to learn relevant bioinformatics analyses of these datasets. In advance of the workshop,
participants will be asked to submit high quality DNA extracts, provide the relevant metadata,
and register to use specific open-source bioinformatics tools.
The first day of the metagenomics breakout session will begin with an overview of
sample preparation for sequencing. Participants interested in targeted gene sequencing will
subsequently perform PCR amplification with the appropriate Illumina barcoded primers and
sequencing adaptors. In the afternoon of the same day, we will have an introduction to working
in Linux and compute cluster environments, followed by an introduction to the analysis of
16S/18S rRNA gene/ITS/functional gene sequences using the QIIME software. Briefly, this tutorial
will cover de-multiplexing, quality filtering, clustering and annotation of sequences, as well as an
introduction to multivariate statistical approaches to comparing different samples.
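
To make the de-multiplexing and quality-filtering steps concrete, here is a minimal Python sketch. It is not how QIIME implements these steps; it assumes inline barcodes of equal length at the start of each read, exact barcode matches, Phred+33 quality encoding, and a mean-quality cutoff of 20, all of which are illustrative choices. The function names, barcodes, and reads are hypothetical.

```python
def mean_phred(quality_string, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 by default)."""
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores) if scores else 0.0


def demultiplex(reads, barcode_to_sample, min_mean_quality=20):
    """Assign reads to samples by barcode and drop low-quality reads.

    reads: iterable of (sequence, quality_string) pairs.
    barcode_to_sample: dict mapping equal-length barcodes to sample names.
    Returns a dict mapping each sample name to its list of trimmed reads.
    """
    if not barcode_to_sample:
        return {}
    barcode_length = len(next(iter(barcode_to_sample)))
    by_sample = {sample: [] for sample in barcode_to_sample.values()}
    for sequence, quality in reads:
        sample = barcode_to_sample.get(sequence[:barcode_length])
        if sample is None:
            continue  # barcode not recognized, so the read is discarded
        trimmed_seq = sequence[barcode_length:]
        trimmed_qual = quality[barcode_length:]
        if mean_phred(trimmed_qual) >= min_mean_quality:
            by_sample[sample].append(trimmed_seq)
    return by_sample


# Hypothetical barcodes and reads.
barcodes = {"ACGT": "soil", "TGCA": "water"}
toy_reads = [
    ("ACGTTTGACCA", "IIIIIIIIIII"),  # soil, high quality: kept
    ("TGCAGGCATTA", "IIII#######"),  # water, low quality: dropped
    ("GGGGACGTACG", "IIIIIIIIIII"),  # unknown barcode: dropped
]
print(demultiplex(toy_reads, barcodes))  # prints {'soil': ['TTGACCA'], 'water': []}
```

Production pipelines typically also tolerate barcode sequencing errors and handle barcodes supplied in separate index reads.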
On the morning of the second day, our PCR-amplified libraries (or other single-gene
libraries) will be quantified and quality checked using a Qubit Fluorometer and an Agilent
Bioanalyzer. In the afternoon we will continue working on data analysis and interpretation
within QIIME and provide participants with a brief overview of other statistical analyses that
can be performed outside of QIIME.
On the third day of the workshop, we will have a brief lecture on preparation of shotgun
metagenomic libraries followed by a hands-on tutorial of shotgun metagenomic bioinformatics.
First, participants will be introduced to open-source metagenomics data analysis tools, including
MG-RAST, IMG/M, and CAMERA, in addition to in-house pipelines available on the HHMI compute
cluster. Participants will also be given an introduction to performing metagenomics
analyses on the Amazon Elastic Compute Cloud.
RNA-seq breakout session.
The RNA-seq analysis breakout sessions will walk participants through the four major
phases of RNA-seq analysis: RNA isolation and library preparation, transcriptome assembly,
gene annotation, and analysis of expression differences. Prior to the workshop, participants will
be asked to submit their samples of interest, provide details about the samples including
treatment differences, and register for access to relevant bioinformatic tools. Following an
overview of RNA isolation methods and a discussion of alternative approaches, participants will
extract RNA from their samples, or participants may bring extracted samples to the workshop.
These RNA samples may be used for library preparation at the workshop.
Participants will then be introduced to the bioinformatic tools and approaches for
RNA-seq analysis using both the online Galaxy interface and basic Linux command line approaches.
Assembly basics, including next-generation sequencer file types, quality scores, barcodes,
assembly principles, and assembly programs will be introduced. Participants will assemble a
practice GCAT-SEEK dataset using the Trinity, Velvet, OASES, and/or SeqMan NGEN
programs on the Juniata College GCAT-SEEK cluster. Participants will then review annotation
methods, focusing on alignment to NCBI protein databases and Blast2GO, and prepare to
annotate the assembled practice dataset.
Finally, participants will map the sequenced reads to the newly assembled
transcriptome and/or a reference transcriptome. The statistical basis for using coverage from
this mapping to measure gene expression will be discussed, and various tools for analysis will
be introduced. The mapping results will then be used to quantify gene expression differences
using DESeq to identify group differences, the stochastic expectation maximization algorithm
(SEM) to identify co-variation with phenotypes, and/or weighted gene correlation network
analysis (WGCNA) to identify gene networks, all in the R statistical environment. Strategies for
interpreting and visualizing the functional significance of the results (gene ontology analysis,
metabolic pathway analysis, polymorphisms etc.) will be explored using the practice dataset.
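
Before expression differences can be tested, read counts must be made comparable across libraries of different sequencing depth. The sketch below illustrates the median-of-ratios normalization used by DESeq, written in plain Python rather than R purely for illustration. It covers only the normalization step, not DESeq's statistical testing, and the gene names and counts are invented.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    counts: dict mapping gene name to a list of raw read counts, with
    one entry per sample. Genes with a zero count in any sample are
    skipped because their geometric mean across samples is zero.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for gene_counts in counts.values():
        if any(c <= 0 for c in gene_counts):
            continue
        log_geo_mean = sum(math.log(c) for c in gene_counts) / n_samples
        for j, c in enumerate(gene_counts):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(ratios)) for ratios in log_ratios]


def normalize(counts, factors):
    """Divide each sample's counts by that sample's size factor."""
    return {
        gene: [c / f for c, f in zip(gene_counts, factors)]
        for gene, gene_counts in counts.items()
    }


# Hypothetical counts: sample 2 was sequenced about twice as deeply.
toy_counts = {
    "geneA": [100, 200],
    "geneB": [50, 100],
    "geneC": [10, 20],
    "geneD": [0, 5],
}
factors = size_factors(toy_counts)
print(factors)
print(normalize(toy_counts, factors))
```

In this toy example the second sample's size factor is twice the first, so after normalization geneA, geneB, and geneC have equal values in both samples, while geneD still differs.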
Evening Sessions on Pedagogical Integration.
We propose to use informal evening homework sessions each night to focus on teaching
module design. This part of the workshop will be coordinated by Nancy Trun, assisted
by the facilitators of each breakout session. Teams will meet in a lounge area
in their dormitory each evening and adapt materials provided from the day’s workshop session
to fit the research project they are working on, and address goals of the course they identified in
their application. Workshop presenters will create teaching modules aimed at lower-division
students as laboratory or case-study, active-teaching experiences. Modules will contain goals,
technical details and protocols, student activities and assessments that can be used by
participants to teach NextGen technology. Some material and details may need to be adjusted
depending on the details of the experimental design and goals, and considerations such as
class size and level. The workshop participants will present to the entire group the outline of
their module on the last day of the workshop. The facilitators will make sure the information in
the presentation is accurate and that the presentation is designed to engage the other workshop
participants.
The Day 1 evening session will involve editing biological context and experimental
design content. Days 2 and 3 will involve editing wet lab and bioinformatic protocols. Day 4
will involve editing the student activities and assessment tools and revising the teaching module
as a whole. The Day 5 morning session will involve faculty teams presenting to the whole group for 15-20
minutes, and submitting their module to GCAT-SEEK for further peer review, editing, and
publishing. The information to be included in the presentation will focus on: i) how their
sequencing technology works, ii) what question they are using the technology for, iii) a brief
overview of how their data will be analyzed, and iv) limitations of the technology in addressing
the research goal. Presentations will be video recorded and published so that network
members can see how the modules are supposed to work from the designers themselves. The
evening homework sessions will have three purposes. First, they will reinforce what participants
have learned during the workshop. Second, they will give the workshop and network
participants an overview of all of the NextGen sequencing technologies. Third, they will
produce accessible draft teaching modules that can be widely adopted.
GCAT-SEEK 2015 Workshop Description
Page 7
Assessment group session.
A core objective of this project is to provide participating faculty with the skills and tools
necessary to incorporate research-based next generation sequencing pedagogy into their
classrooms. We hypothesize that the research-based pedagogical innovations fostered by this
project will enhance student learning, particularly in the core competencies outlined in the Vision
and Change 2011 report. We intend to test this hypothesis by applying the GCAT-SEEK
assessment instrument, which was developed based on these core competencies, across all
GCAT-SEEK projects. During the proposed workshops, participants will thus learn to analyze
student progress on both consortium and course-specific learning goals using the GCAT-SEEK
assessment instrument. The resulting data can then be used to assess student progress as
well as to modify and improve course modules. This assessment process will help ensure that
student learning continually improves and that, as technologies change, faculty skills
and course modules across the GCAT-SEEK consortium change with them.
By the end of the workshop, participants will have prepared their samples for
sequencing at the workshop and ideally have sent them off to be processed. They will have
practiced bioinformatic procedures on datasets similar to those they are producing (or on their
actual data in some cases), and will have gained experience editing and customizing an online
student assessment tool. They will have produced education modules that fit their research
interests and their data when it arrives from the sequencing core. Network members will benefit from
these efforts by having available numerous completed, parallel projects with which they can
engage genomics students. Network members can either i) use the customized modules
and data without having to perform wet lab experiments themselves, or ii) use the generic
modules and customize them themselves.