Text-mining System for Disease-Associated Genes
Objective
Researchers find it increasingly difficult to follow the changing status of disease candidate genes because of the exponential increase in gene mapping studies. The Text-mined Hypertension, Obesity, and Diabetes candidate gene database (T-HOD) was developed to help trace existing research on three complex diseases (hypertension, obesity, and diabetes) by regularly and semi-automatically extracting HOD-related genes from newly published literature.
T-HOD employs state-of-the-art text-mining technologies, including a named entity recognition and gene normalization (GN) system and a disease-gene relation extraction system. Unlike manually constructed disease gene databases, such as the Genetic Association Database (GAD), the content of T-HOD is regularly updated by our text-mining system and verified by domain experts.
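To make the two text-mining stages concrete, here is a minimal sketch of gene mention recognition with normalization, followed by disease-gene relation extraction. It is not the T-HOD implementation: the toy lexicon `GENE_LEXICON`, the `Relation` type, and the function `extract_relations` are hypothetical, and sentence-level co-occurrence stands in for whatever relation extraction method the real system uses.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicon: surface form (lowercased) -> normalized gene symbol.
# A real gene normalization system would map mentions to database identifiers.
GENE_LEXICON = {
    "angiotensinogen": "AGT",
    "agt": "AGT",
    "leptin": "LEP",
    "lep": "LEP",
}

# The three diseases covered by the database.
DISEASE_TERMS = {"hypertension", "obesity", "diabetes"}


@dataclass(frozen=True)
class Relation:
    gene: str      # normalized gene symbol
    disease: str   # disease term found in the same sentence
    sentence: str  # evidence sentence, kept for expert verification


def tokenize(sentence):
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9\-]+", sentence.lower())


def extract_relations(text):
    """Pair every normalized gene with every disease in the same sentence.

    Assumption: co-occurrence within one sentence is treated as a candidate
    relation; this is a simplification chosen for illustration.
    """
    relations = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = tokenize(sentence)
        genes = {GENE_LEXICON[t] for t in tokens if t in GENE_LEXICON}
        diseases = {t for t in tokens if t in DISEASE_TERMS}
        for gene in sorted(genes):
            for disease in sorted(diseases):
                relations.append(Relation(gene, disease, sentence))
    return relations
```

Keeping the evidence sentence with each candidate relation reflects the semi-automatic workflow described above, in which domain experts verify what the system extracts.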
System
The flowchart of T-HOD database construction.
User Interface
User interface of the T-HOD database. The interface is divided into four regions so that each function can be introduced clearly.
The Network Viewer presents a graphical gene-gene network for a selected candidate gene.
Statistics of the extracted hypertension candidate genes. The blue bars indicate the number of genes extracted each year, while the yellow bars show the number of newly discovered genes each year.
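The two bar series in this chart can be derived from a list of (year, gene) extraction records, as in the sketch below. It rests on assumptions not stated in the source: "extracted" is read as the number of distinct genes found in that year's literature, and "newly discovered" as genes appearing for the first time in that year. The function name `yearly_gene_counts` is hypothetical.

```python
from collections import defaultdict


def yearly_gene_counts(records):
    """Return {year: (genes_extracted, genes_newly_discovered)}.

    `records` is an iterable of (publication_year, gene_symbol) pairs.
    Duplicate pairs are counted once per year.
    """
    genes_by_year = defaultdict(set)
    for year, gene in records:
        genes_by_year[year].add(gene)

    seen = set()  # genes observed in any earlier year
    stats = {}
    for year in sorted(genes_by_year):
        genes = genes_by_year[year]
        new_genes = genes - seen  # first appearance is this year
        stats[year] = (len(genes), len(new_genes))
        seen |= genes
    return stats
```

In the chart, the first value of each pair would correspond to the blue bar for that year and the second to the yellow bar.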
Applying data-mining techniques to broad gene screening and endophenotype construction
Wen-Lian Hsu (許聞廉), Research Fellow
Intelligent Agent Systems Lab
