Bioinf proposal additional info 20101010

Interdisciplinary Bioinformatics Ph.D. Program
Resources Required
1. Facilities
Bioinformatics and Biomedical Computing Laboratory: 1100 ft² (Duthie Center, Room 238).
The Bioinformatics and Biomedical Computing Laboratory is located in the Department of
Computer Engineering and Computer Science at the University of Louisville. The facilities
within the lab will be available to students in the Bioinformatics Ph.D. program.
Equipment available:
- Six generic workstations (dual AMD MP2800 2.13GHz, 2GB RAM, 120GB hard drive, dual boot SUSE Linux 10.0 and Windows® XP Professional)
- Four Alienware Area 51 workstations (dual quad-core Intel™ i7 CPU 2.80 GHz, dual 500GB hard drives, 6GB RAM)
- Three Dell Precision T7400 workstations (dual quad-core Intel Xeon 2.33GHz, 16GB RAM, 1.5TB hard drive space, dual boot Windows® XP or Windows® 7 Professional and Ubuntu 9.10 Linux)
- A 1 TB network attached storage (NAS) and backup system that is student accessible
- A large study table seating eight
Health Sciences Center Bioinformatics Space. The Kentucky Biomedical Research Infrastructure Network
(KBRIN) has secured $250,000 in funding for renovating space at the Health Sciences Center
campus for Bioinformatics use. As soon as this space is identified and renovated, students
within the Bioinformatics Ph.D. program will have access to these facilities.
Major Equipment
Adelie Cluster. Adelie is an AMD Opteron cluster running SUSE Linux. It has 2 master nodes
and 27 batch nodes, 22 of which are dual-core, dual-processor nodes and the rest of which are
single-core, dual-processor nodes. Each node has 4GB of memory and uses a network RAID
array of 5 terabytes, with a SCSI tape drive for backup purposes. The RAID storage is
managed by dual NAS heads connected by Fibre Channel to dual host RAID controllers for high
availability. Adelie also contains 4 dual-processor, dual-core nodes with 24 GB of internal
memory for the exclusive use of the biostatistics and bioinformatics department at the UL medical
school.
KYBRIN Beowulf Cluster. In August 2003, the Bioinformatics Lab acquired a 16 node, 32
CPU Beowulf cluster running RedHat Linux (kybrin.louisville.edu). Kybrin is a computational
cluster composed of two master (or login) nodes (one in the process of operating system
upgrade) and 16 "standard" job nodes, each containing dual AMD Athlon processors with 2GB
of memory connected by gigabit ethernet. A "web server" node provides access to a
development environment for internet-based applications. Two additional job nodes are being
added, each containing 24 GB of memory and AMD Opteron dual core, dual processor
motherboards. An older NAS system connects a 1.2 terabyte RAID 5 array of disks to dual NAS
heads providing high availability and failover of the disk drive system. A newer set of disks has
just arrived and is being added to the system. The new RAID array contains 4 terabytes of
space (raw) and will include active failover between NAS heads and RAID controllers for both the
new and older RAID systems. Both of these systems are powered through an industrial-grade
uninterruptible power supply with a diesel backup generator to minimize downtime and to
maintain other systems in the room during power outages. The room contains burglar alarms,
fire suppression equipment, water alarms and a separate dual-compressor air conditioning
system to provide protection against overheating. A temperature alarm phones key personnel in
case of air conditioning failure to prevent equipment damage and data loss. Home directory
data is fully backed up several times a week. In addition to the cluster, there are six dual-boot
(Linux RedHat/Windows XP) workstations available in the Bioinformatics laboratory connected
directly to the cluster through a gigabit Ethernet switch.
Cardinal Research Cluster. The University of Louisville’s central research-computing
infrastructure includes multiple systems serving the research needs of the entire university. The
Cardinal Research Cluster (CRC) includes a high-performance distributed-memory cluster, a
high-memory SMP machine, an informatics data management system, a visualization server,
and other support servers. The distributed-memory cluster has 312 IBM iDataplex nodes with a
total of 2496 processor cores. The SMP machine has 16 IBM Power6 CPUs and 128 GB of
memory. The visualization server contains two quad-core AMD Opteron processors and an
NVIDIA Quadro FX5600 graphics processing unit with 128 processor cores. The informatics data
management system has 20 TB of dedicated storage and is optimized for transaction
processing in Oracle; MySQL database management systems are also provided. All research
systems share 100 TB of data storage and archiving space. All the systems are housed in the
University's secure data center and are administered by a team of specialized HPC system
administrators, supported by a team of research computing consultants with experience in HPC
software and database design, development, and optimization.
Linux Server. One dual-processor AMD Athlon MP 2GHz compute server running SUSE Linux
10.0 (kbrin.a-bldg.louisville.edu). The primary use for this server is the development of
microarray design and analysis software and the development of lab management database
systems. Software installed on the Linux server includes EMBOSS; R; Bioconductor; NCBI
BLAST; WU-BLAST; BioPerl; as well as locally developed programs for microarray analysis. This
server is accessible via the web and can serve as the web host for research
dissemination.
Windows Server. One dual-processor server running Windows 2000 (neurogene.spd.louisville.edu).
The main uses of the Windows server are Windows license management as well as DNA and
protein sequence database development using Microsoft SQL Server. Software installed on the
Windows server includes a dynamic site license for Insightful’s S-Plus version 6.2 and
ArrayAnalyzer as well as locally created databases for GenBank data sets.
Visualization Wall. 3x6 Dell visualization wall consisting of nine Dell PowerEdge 2950 servers
with Intel Xeon 5400 series processors and another PowerEdge 2950 server as the master
node. The 18 display units in the wall are Dell UltraSharp 3007WFP wide screen monitors. Dell
PowerConnect 6224 Ethernet switches provide the interconnect.
Software
The following software is available to students enrolled in the Bioinformatics Ph.D. program:
Informax Vector NTI and Vector XPression; R; Bioconductor; NCBI BLAST; WU-BLAST; sim4;
BioPerl; EMBOSS; JEMBOSS; EMBASSY; Gibbs sampler; ClustalW-MPI; mpiBLAST; MEME;
Genesis; MaskerAid; RepeatMasker; MATLAB; Primer3.
2. Library
The University Library has provided an assessment of its bioinformatics resources (see
attached). In addition, the library requests an additional $15,000 to be spent on monographs in the
subject area, along with $4,174 in year one, $4,382.70 in year two, and $4,601.84 in year three for
electronic access to the journals Bioinformatics, Briefings in Bioinformatics, and the Journal of
Computational Biology (see attachment from the library). In addition to the library holdings, the
University of Louisville Bioinformatics Laboratory has over 90 bioinformatics monographs worth
nearly $6,600. The complete bioinformatics laboratory library catalog can be browsed at
[http://bioinformatics.louisville.edu/lab/library/opac/]. Drs. Kalbfleisch and Rouchka
are members of the International Society for Computational Biology and can obtain online
and print copies of the above-mentioned journals at the reduced rates of $262 for Bioinformatics,
$295 for Briefings in Bioinformatics, and $170 for the Journal of Computational Biology (online
access only). Dr. Rouchka and Dr. Kalbfleisch will subscribe to these journals using research
incentive funds (RIF). Current issues of print copies will be made available in the Bioinformatics
Laboratory. This leaves a gap of $2,800 per year for the purchase of new monographs. At this point,
the library resources are adequate to proceed with the bioinformatics Ph.D. program. In
the future, the gap in holdings may need to be revisited as new advances are made in
the field.
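The three electronic-access figures are consistent with a 5% annual increase applied to the year-one price. The proposal does not state that rate, so the sketch below treats it as an assumption and simply checks whether it reproduces the quoted amounts. The names `ASSUMED_INCREASE` and `journal_costs` are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

YEAR_ONE_COST = Decimal("4174.00")  # year-one electronic access cost, from the text
ASSUMED_INCREASE = Decimal("0.05")  # assumption: 5% per year (not stated in the proposal)


def journal_costs(years: int = 3) -> list[Decimal]:
    """Return the subscription cost for each year, rounded to cents."""
    costs = []
    cost = YEAR_ONE_COST
    for _ in range(years):
        costs.append(cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))
        cost = cost * (1 + ASSUMED_INCREASE)  # escalate for the following year
    return costs


if __name__ == "__main__":
    for year, cost in enumerate(journal_costs(), start=1):
        print(f"Year {year}: ${cost:,}")
```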
3. Faculty
Faculty identified for involvement with launching the bioinformatics Ph.D. program are listed in
Table 1. Dr. Eric Rouchka and Dr. Ted Kalbfleisch are involved more directly in the organization
of the program and sit on the executive committee. The remaining faculty have been identified
as potential mentors for students entering the program. Select faculty will also serve on the
executive committee to handle curriculum, admissions, and recruitment issues.
Table 1: Faculty associated with launching the Bioinformatics Ph.D. program.
Name             Department  School         Rank                 Percent Effort  Role
Eric Rouchka     CECS        SPEED          Assistant Professor  10%             Board Chair
Ted Kalbfleisch  BMB         MEDICINE       Assistant Professor  5%              Board Vice Chair
Guy Brock        B&B         PUBLIC HEALTH  Assistant Professor  3%              Mentor
Nigel Cooper     ASNB        MEDICINE       Professor            3%              Mentor
Susmita Datta    B&B         PUBLIC HEALTH  Professor            3%              Mentor
Seong Ho Kim     B&B         PUBLIC HEALTH  Assistant Professor  3%              Mentor
Jiaxu Li         MATH        A&S            Assistant Professor  3%              Mentor
Hunter Moseley   CHEM        A&S            Assistant Professor  3%              Mentor
Ming Ouyang      CECS        SPEED          Assistant Professor  3%              Mentor
Xiang Zhang      CHEM        A&S            Associate Professor  3%              Mentor
A current curriculum vitae is provided for each of the participating members as an attachment.
Expenditures
Personnel Costs. Faculty involved within the program will be supported by their home
departments as part of their annual work plan. Matriculating students will count towards
departmental graduation counts based on the home department of their dissertation advisor in
addition to counting towards the School of Interdisciplinary and Graduate Studies (SIGS)
graduation rates.
The School of Interdisciplinary and Graduate Studies will provide staff support for the program
at 10% FTE. This support is calculated at a base rate of $3,500 per year plus a 28.5%
fringe rate, with a 3% increase per year. This cost comes to a total of $4,498 for Y01;
$4,633 for Y02; and $4,772 for Y03.
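The staff support figures can be reproduced from the base rate, fringe rate, and annual increase given above. The sketch below assumes each year's cost is rounded to the nearest dollar before the next 3% increase is applied. That rounding order is not stated in the proposal; it is simply the order that matches the quoted amounts. Function names are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

BASE_SUPPORT = Decimal("3500")     # 10% FTE staff support per year, from the text
FRINGE_RATE = Decimal("0.285")     # 28.5% fringe rate, from the text
ANNUAL_INCREASE = Decimal("0.03")  # 3% increase per year, from the text


def round_dollars(amount: Decimal) -> Decimal:
    """Round to whole dollars, with halves rounded up."""
    return amount.quantize(Decimal("1"), rounding=ROUND_HALF_UP)


def staff_support_costs(years: int = 3) -> list[Decimal]:
    """Return the staff support cost for each year.

    Assumption: each year's figure is rounded before the next
    increase is applied.
    """
    cost = round_dollars(BASE_SUPPORT * (1 + FRINGE_RATE))
    costs = [cost]
    for _ in range(years - 1):
        cost = round_dollars(cost * (1 + ANNUAL_INCREASE))
        costs.append(cost)
    return costs


if __name__ == "__main__":
    for year, cost in enumerate(staff_support_costs(), start=1):
        print(f"Y0{year}: ${cost:,}")
```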
Over the first three years of the program, it is anticipated that four non-candidacy
students will be supported per year, after which the students will be financially supported by a
faculty mentor. The total cost for Graduate Assistant stipends will be $96,504 per year for four
stipends, based on a per-stipend cost of $24,126:
- $22,000 stipend
- $1,946 health insurance
- $180 fringe rate (fees)
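As a quick check on the stipend arithmetic, the per-student package and the annual cost for four students can be summed as follows. This is a minimal sketch using only the amounts listed above; the function names are illustrative.

```python
STIPEND = 22_000          # annual stipend, from the text
HEALTH_INSURANCE = 1_946  # annual health insurance, from the text
FEES = 180                # fringe rate (fees), from the text
STUDENTS_PER_YEAR = 4     # non-candidacy students supported per year


def stipend_package() -> int:
    """Cost of one Graduate Assistant stipend package."""
    return STIPEND + HEALTH_INSURANCE + FEES


def annual_stipend_cost(students: int = STUDENTS_PER_YEAR) -> int:
    """Total stipend cost per year for the given number of students."""
    return students * stipend_package()


if __name__ == "__main__":
    print(f"Per student: ${stipend_package():,}")
    print(f"Per year:    ${annual_stipend_cost():,}")
```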
Operating Costs. Operating costs for the first three years can be broken down as follows:
- Supplies, including equipment maintenance: $10,000 annually
- Library costs: $2,200 per year for new monographs; $4,174 for the first year, $4,383 for the second year, and $4,602 for the third year for subscriptions to the journals Bioinformatics, Briefings in Bioinformatics, and the Journal of Computational Biology
- Student support, tuition remission: based on four non-candidacy out-of-state tuition rates of $25,362 per student for year one, $26,123 per student for year two, and $26,907 per student for year three, totaling $101,448 for year one, $104,492 for year two, and $107,628 for year three
The total operating cost per year is:
- Year 1: $117,822
- Year 2: $121,075
- Year 3: $124,430
Total program expenditures over the first three years of the program are anticipated to be:
- $218,824 for 2010-2011
- $222,212 for 2011-2012
- $225,706 for 2012-2013
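The yearly totals can be recomputed from the line items as a cross-check. The sketch below assumes that total program expenditure is the sum of operating costs, four stipend packages, and SIGS staff support. The proposal does not spell out that composition, so treat it as an assumption, and treat the output as a check on the arithmetic, not as authoritative budget figures.

```python
STUDENTS = 4
SUPPLIES = 10_000                                         # annual, from the text
MONOGRAPHS = 2_200                                        # annual, from the text
JOURNALS = {1: 4_174, 2: 4_383, 3: 4_602}                 # subscriptions by year
TUITION_PER_STUDENT = {1: 25_362, 2: 26_123, 3: 26_907}   # out-of-state tuition by year
STIPEND_PACKAGE = 22_000 + 1_946 + 180                    # stipend + insurance + fees
STAFF_SUPPORT = {1: 4_498, 2: 4_633, 3: 4_772}            # SIGS staff support by year


def operating_cost(year: int) -> int:
    """Supplies, library costs, and tuition remission for one year."""
    tuition = STUDENTS * TUITION_PER_STUDENT[year]
    return SUPPLIES + MONOGRAPHS + JOURNALS[year] + tuition


def total_expenditure(year: int) -> int:
    """Assumed composition: operating costs + stipends + staff support."""
    stipends = STUDENTS * STIPEND_PACKAGE
    return operating_cost(year) + stipends + STAFF_SUPPORT[year]


if __name__ == "__main__":
    for year in (1, 2, 3):
        print(
            f"Year {year}: operating ${operating_cost(year):,}, "
            f"total ${total_expenditure(year):,}"
        )
```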
Sources of Revenue
For the first three years of the program, the KBRIN parent grant P20RR16481 will provide
$10,000 annually in equipment and maintenance. In addition, one student stipend will be
provided by this grant for Y01 and Y02 at an annual rate of $24,126 ($22,000 stipend; $1,946
health insurance; $180 fringe). For Y03, the KBRIN parent grant will support two student
stipends at a rate of $24,126 each, for a total of $48,252.
For the first two years of the program, the KBRIN supplement P20RR16481S1 will provide one
graduate student stipend annually at a rate of $24,126 (see above).
For the first two years of the program, the DOE earmark DE-EM0000197 will provide two
student stipends annually at a rate of $24,126 for a total of $48,252 annually. In addition,
$50,724 will be provided in Y01 to cover out-of-state tuition for two non-candidacy students at a
rate of $25,362 per student. $52,246 will be provided in Y02 to cover two students at a rate of
$26,123 per student, allowing for a 3% increase per year.
The School of Interdisciplinary and Graduate Studies (SIGS) will provide $50,724 in Y01,
$52,246 in Y02, and $53,814 ($26,907 per student) in Y03 to cover out-of-state tuition for
non-candidacy students. In Y03 and subsequent years, SIGS will provide two fellowships
covering a stipend of $22,000 and out-of-state tuition fees. For Y03, this will total $102,066.
In addition, SIGS will provide staff support for the program at 10% FTE. This support is
calculated at a base rate of $3,500 per year plus a 28.5% fringe rate, with a 3%
increase per year. This cost comes to a total of $4,498 for Y01; $4,633 for Y02; and $4,772 for
Y03.
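The revenue commitments described in this section can be tallied by source and year. The sketch below uses only the amounts quoted above and assumes the four listed sources are the complete set. The Y03 SIGS fellowship total is entered as stated, not derived. The dictionary layout and function name are illustrative.

```python
STIPEND_PACKAGE = 24_126  # per-student stipend package, from the text

# Revenue by source and program year (Y01 to Y03), using amounts from the text.
REVENUE = {
    "KBRIN parent grant P20RR16481": {
        1: 10_000 + STIPEND_PACKAGE,      # equipment/maintenance + one stipend
        2: 10_000 + STIPEND_PACKAGE,
        3: 10_000 + 2 * STIPEND_PACKAGE,  # two stipends in Y03
    },
    "KBRIN supplement P20RR16481S1": {
        1: STIPEND_PACKAGE,
        2: STIPEND_PACKAGE,
        3: 0,                             # first two years only
    },
    "DOE earmark DE-EM0000197": {
        1: 2 * STIPEND_PACKAGE + 50_724,  # two stipends + tuition for two students
        2: 2 * STIPEND_PACKAGE + 52_246,
        3: 0,                             # first two years only
    },
    "SIGS": {
        1: 50_724 + 4_498,                # tuition + staff support
        2: 52_246 + 4_633,
        3: 53_814 + 102_066 + 4_772,      # tuition + two fellowships + staff support
    },
}


def revenue_for_year(year: int) -> int:
    """Sum the commitments from every listed source for one year."""
    return sum(by_year[year] for by_year in REVENUE.values())


if __name__ == "__main__":
    for year in (1, 2, 3):
        print(f"Y0{year}: ${revenue_for_year(year):,}")
```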