International Journal of Advancements in Research & Technology, Volume 3, Issue 3, March-2014
ISSN 2278-7763
Bioinformatics based confirmatory test for identification of
Disease Putative Genes*
BIKRAM NAYAK (Author)
Email: vn.jobdb@gmail.com
ABSTRACT
In this paper, several bioinformatics-based approaches are used to obtain a confirmatory classification of the genes on mouse chromosome 11, in the region from 69 Mb to 104 Mb, as lethal or viable and as duplicate or singleton, and to examine how their location determines their properties. The disease lethal (DL) genes found within this region are AK144590, AL591436, X63190, DQ832277, AL591436, X51983, X07750, X07751, X07752, BC046795, AL590963, CH466556, AK078233, AL590963, CH466556, AK078233, AL590963 and CH466556.
The gene with MGI ID 2448712 is not available in the gene trap database, nor are the 7 genes AF465352, AK039558, AK170258, BC052502, BC052734, CH466596 and AL845465 (GO ID 0005737) available in the Gene Ontology, so these are treated as disease unknown genes.
MGI:2137026 also has no matching record in the gene trap database. Genes with GO ID 0005887, however, are located in the plasma membrane, so they are grouped as disease viable (DV) genes; they have very few edges (no more than 1 or 2) in the PPI network. The max binding protein (ID 109150; start 74644422, end 74659227; Entrez Gene ID 17428) is clearly a disease lethal gene: its GO ID 0005634 indicates that it is located in the nucleus, it has tumorigenic properties and is listed under adenocarcinoma in the MeSH dictionary, and it has more than 5 edges connected to different hubs. The sequences of all the DL genes listed above, taken from gene start position to gene end position, were compared against human and mouse with megablast, and applying an e-value threshold (printed as 10^20, presumably 10^-20) highlighted the duplicates in human/mouse.
Keywords: mutagenesis, duplicacy, lethality, viability, PPI network, MeSH dictionary, e-value, FDR
1 INTRODUCTION
Disease genes can broadly be divided into two types: essential and non-essential. An essential gene is one that is necessary for the organism's survival. Essential disease genes are those for which knockout of the mouse ortholog confers lethality, and non-essential disease genes are those for which the mouse knockout is viable at birth. If no mouse knockout data are available, the gene is treated as a disease unknown gene. Essential disease genes are therefore termed disease lethal (DL) genes, and non-essential disease genes are termed disease viable (DV) genes.
2. Procedure: Tools and methods for accessing wet lab datasets
The mouse genes were collected from the following URL (http://www.mouse-genome.bcm.tmc.edu/Bioinformatics/MouseGeneSearchList2.asp) by filling in the submission form, which returned 793 known and unknown loci, from gene ID O08826 to the Mapt gene.
Copyright © 2014 SciResPub.
Materials and Methods:
The human-mouse orthology and protein-coding gene data from Mouse Genome Informatics (http://www.informatics.jax.org) were obtained through BioMart. BioMart is a simple and robust data integration system for large-scale data querying and data warehouse extraction. These mouse data are an appropriate proxy for gene essentiality in humans and are referred to here as viable and lethal.
http://biomart.informatics.jax.org/biomart/martview/28d343acfd5d3bf0896340a4965d54a9
If the same gene ID was returned with 4 different transcript IDs, it was assumed that the gene has 4 predicted transcripts.
Dataset
Mus musculus genes (NCBIM37)
Filters
Chromosome: 11
Gene Start (bp): 69000000
Gene End (bp): 104000000
with EMBL ID(s): Only
Ensembl Gene ID(s): [ID-list specified]
Gene type : protein_coding
Source : ensembl
Status (gene) : KNOWN
Evidence code (GO Cellular component) : IC
Orthologous Human Genes: Only
Attributes
Ensembl Gene ID
Ensembl Transcript ID
Chromosome Name
Ensembl Protein ID
Gene Start (bp)
Gene End (bp)
GO Term Accession
EntrezGene ID
EMBL (Genbank) ID
MGI ID
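The query described above can also be written down as a BioMart XML query. The following is only a minimal sketch: the dataset name `mmusculus_gene_ensembl` and the internal filter and attribute names are assumptions based on common Ensembl mart naming, they differ between releases and may not match the mart used here, and the remaining filters (EMBL ID, status, GO evidence code, human orthologs) are left out. The helper `build_biomart_query` is hypothetical and only builds the XML text; it does not contact any server.

```python
import xml.etree.ElementTree as ET

def build_biomart_query(dataset, filters, attributes):
    """Build a BioMart-style XML query string (no network access)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")

# Assumed internal names: verify them against the mart before use.
FILTERS = {
    "chromosome_name": "11",
    "start": 69000000,
    "end": 104000000,
    "biotype": "protein_coding",
}
ATTRIBUTES = [
    "ensembl_gene_id", "ensembl_transcript_id", "chromosome_name",
    "ensembl_peptide_id", "start_position", "end_position",
    "go_id", "mgi_id",
]

xml_query = build_biomart_query("mmusculus_gene_ensembl", FILTERS, ATTRIBUTES)
print(xml_query)
```

Keeping the filters and attributes in plain Python structures makes it easy to compare the query with the filter list above before submitting it.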
After the dataset was obtained, all the genes were evaluated and analysed according to the following functional parameters:
a. Cellular localisation/function
b. Biological function
c. Physiological function
d. Protein-protein interaction
e. Mode of inheritance
f. Evolutionary history/gene age
g. Singleton/duplication event
1. Disease genes localise to different cellular compartments.
Disease viable and disease lethal genes differ in the cellular compartments to which they are localised. DL genes are found predominantly in the nucleus, whereas DV genes are enriched for localisation to the plasma membrane and the extracellular region. This may explain why DL genes show a greater number of PPIs: they have a higher probability of localisation within the nucleus.
E.g. all 18 genes AK144590, AL591436, X63190, DQ832277, AL591436, X51983, X07750, X07751, X07752, BC046795, AL590963, CH466556, AK078233, AL590963, CH466556, AK078233, AL590963 and CH466556, which have GO ID 0005634, are present in the nucleus and are suggested to be DL genes. Genes bearing GO ID 0005887 are present in the plasma membrane, and the rest of the genes are disease unknown.
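The localisation rule used in this section can be sketched in a few lines. This is an illustrative sketch only: the function name `classify_by_localisation`, the precedence given to the nuclear term when a gene carries both terms, and the gene `hypothetical_membrane_gene` are assumptions made for the example, not part of the original analysis.

```python
NUCLEUS = "GO:0005634"          # nucleus
PLASMA_MEMBRANE = "GO:0005887"  # plasma membrane term used in the text

def classify_by_localisation(go_ids):
    """Label a gene from its GO cellular component IDs."""
    ids = set(go_ids)
    if NUCLEUS in ids:           # assumption: the nuclear term takes precedence
        return "DL"
    if PLASMA_MEMBRANE in ids:
        return "DV"
    return "unknown"

annotations = {
    "AK144590": ["GO:0005634"],                    # nuclear, as stated in the text
    "hypothetical_membrane_gene": ["GO:0005887"],  # made-up gene for illustration
    "AF465352": ["GO:0005737"],                    # neither of the two terms
}
labels = {gene: classify_by_localisation(ids) for gene, ids in annotations.items()}
for gene, label in labels.items():
    print(gene, label)
```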
2. Disease viable and disease lethal genes perform different biological functions
The function of a protein depends closely on its cellular localisation; for example, transcription factors must be present in the nucleus to activate gene expression. GO annotations suggest that essential genes localise to the nucleus, and DL genes are enriched for nucleic acid binding when compared with all genes. For example, in our BioMart output the gene with MGI ID 2150020 has a nucleotide binding property. DV genes are enriched for calcium binding, which suggests a role in signal transduction. DV genes are over-represented in signal transduction functions and in hydrolase activity more than in any of the other metabolic function categories, which indicates that hydrolase activity is a specific feature of DV genes. DL genes, in contrast, are enriched for involvement in embryonic development, as suggested by biological process annotations.
E.g. as found in the gene trap column/GO database:
[Figure: Disease lethal and disease viable genes involved in biological processes]
[Figure: Differentiation of disease lethal and disease viable genes on the basis of molecular function]
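Statements that a gene class is "enriched" or "over-represented" for a GO term are usually backed by an over-representation test. The sketch below uses a one-sided hypergeometric test as one common choice; the paper does not state which test was applied, so this is an assumption, and the counts in the example are hypothetical rather than results from this study.

```python
from math import comb

def hypergeom_enrichment_p(total, annotated, sample, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    total     -- number of genes in the background set
    annotated -- background genes carrying the GO term
    sample    -- number of genes in the class being tested (e.g. DL genes)
    hits      -- genes in the class that carry the GO term
    """
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, k) * comb(total - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(total, sample)

# Hypothetical counts for illustration only.
p_value = hypergeom_enrichment_p(total=100, annotated=30, sample=18, hits=12)
print(f"p = {p_value:.3g}")
```

A small p-value here would mean that so many annotated genes in the class are unlikely to be a chance draw from the background set.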
3. Disease viable and disease lethal genes perform different physiological functions
Disease symptoms generally reflect an abnormality in particular organ systems or physiological processes. Disease lethal genes are statistically over-represented among cancer genes, which directly affect cell growth and cell death mechanisms. DL genes also have a higher representation in skeletal disorders and bone disease. This difference may be due to essential genes being involved in developmental patterning of the body axis and the skeletal system, but not in bone metabolism. DV genes are also associated with disease, but with different diseases from DL genes: they are involved in nutritional, psychiatric and neurological disorders, are enriched in psychiatric and immune system diseases, and are under-represented among cardiovascular diseases.
4. Protein-protein interaction networks distinguish disease lethal and disease viable genes.
Disease lethal genes are highly connected in the protein-protein interaction network, while disease viable genes have fewer connections. To create the human protein-protein interaction network, data were derived from BioGRID, BIND, HPRD and GeneRIF (NCBI), then viewed and analysed in the "Cytoscape" and "Navigator" tools, and the number of hubs and hub-hub connections in the network was recorded.
[Figure: Disease lethal gene cluster represented as a PPI network in Cytoscape/netscape]
The graphical representation of the PPI network above, produced with various Java-based plugins of Cytoscape/netscape, suggests that the disease lethal group has a more complex network than the DV genes, with more interactions and less fragmentation, and that the number of edges is greatest at the highest-degree hub when the same datasets are used.
Below is the interaction map of the DV genes, separated from the DL group. Here the DV genes are less interconnected and more fragmented, and the hubs have few or no edges.
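The connectivity comparison described above comes down to counting edges per gene. The following is a minimal sketch in plain Python rather than the Cytoscape workflow used in the paper. The thresholds (`hub_min=6`, i.e. more than 5 edges, and `sparse_max=2`) are assumptions taken from the wording of the abstract, and the edge list is a made-up toy network.

```python
from collections import defaultdict

def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(partners) for node, partners in neighbours.items()}

def label_by_connectivity(degrees, genes, hub_min=6, sparse_max=2):
    """Label genes as DL-like hubs, DV-like sparse nodes, or intermediate."""
    labels = {}
    for gene in genes:
        degree = degrees.get(gene, 0)
        if degree >= hub_min:
            labels[gene] = "DL-like"
        elif degree <= sparse_max:
            labels[gene] = "DV-like"
        else:
            labels[gene] = "intermediate"
    return labels

# Toy network: H is a hub with six partners, X and Y form an isolated pair.
toy_edges = [("H", p) for p in "ABCDEF"] + [("X", "Y")]
degrees = node_degrees(toy_edges)
labels = label_by_connectivity(degrees, ["H", "X", "Z"])
print(labels)
```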
[Figure: DV genes viewed as a PPI network in the Cytoscape/netscape plugin]
5. Functional parameters: mode of inheritance
Disease lethal genes show a dominant mode of inheritance more often than DV genes.
It was observed that the DL gene set showed a higher proportion of autosomal dominant mutations, while DV genes are over-represented for autosomal recessive inheritance.
6. Disease genes vary in their evolutionary history/gene age/phylogenetic distance.
DL genes are expected to have the oldest evolutionary history. Using reference genomes representative of each taxonomic category, orthologs were identified for all disease genes, representing the earliest ancestor gene for each human gene. The taxonomic categories are ordered by evolutionary distance, with H. sapiens as the closest and Fungi/Metazoa as the category with the most distant evolutionary origin. DL genes show a higher frequency of orthologs originating in the most distant Fungi/Metazoa or Bilateria classes. Compared with all human genes, a much higher proportion of DL genes have their oldest ancestor in the Chordata class or earlier. DV genes, however, have a higher proportion of genes whose oldest ortholog originates in one of the evolutionarily more recent categories: Tetrapoda, Amniota, Mammalia, Theria and Eutheria. When compared with all human genes, the genes in our annotated categories do not have their oldest ortholog arising in the most recent evolutionary lineages, such as Euarchontoglires, Primates, Catarrhini, Hominidae, Homininae or Homo sapiens. In summary, DL genes have a more ancient evolutionary origin, and a greater number of orthologs, than the other gene classes analysed.
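The idea of assigning each gene to the most distant category in which an ortholog is found can be sketched as follows. The category list contains only the categories named in this section (with the subfamily spelled Homininae), ordered from most ancient to most recent; any intermediate categories used in the actual analysis are not known and are therefore omitted. The function name `oldest_category` is hypothetical.

```python
# Categories named in the text, ordered from most ancient to most recent.
CATEGORIES = [
    "Fungi/Metazoa", "Bilateria", "Chordata", "Tetrapoda", "Amniota",
    "Mammalia", "Theria", "Eutheria", "Euarchontoglires", "Primates",
    "Catarrhini", "Hominidae", "Homininae", "Homo sapiens",
]
RANK = {name: position for position, name in enumerate(CATEGORIES)}

def oldest_category(categories_with_ortholog):
    """Return the most ancient listed category that contains an ortholog."""
    known = [c for c in categories_with_ortholog if c in RANK]
    if not known:
        return None
    return min(known, key=RANK.__getitem__)

# Hypothetical ortholog distributions for two genes.
print(oldest_category(["Mammalia", "Bilateria", "Primates"]))
print(oldest_category(["Eutheria", "Theria"]))
```

Under this scheme a gene whose oldest category is Fungi/Metazoa or Bilateria would count as ancient, which is the pattern the text reports for DL genes.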
Gene age:
Much work on essentiality is under way to provide insights for candidate gene analysis and to identify new disease loci. One example concerns gene age, which is measured using the phylogenetic breadth of the distribution of homologous genes among different lineages: old genes are those present in more distantly related species, whereas young genes are those present only in closely related species such as chimpanzee and macaque.
[Figure: Comparison of gene age between humans and other, lower-order organisms]
When all the genes with MGI accessions were searched again in the gene trap database, many phenotypic characters were retrieved, including Mammalian Phenotype browser accession IDs, orthologs, ontological evidence for their location and OMIM relations.
For example, the expression of gene MGI:99423 is associated with tumor cell invasion and metastasis, and its GO annotation GO:0005634 [nucleus] (evidence: IC) suggests that it is a disease lethal gene.
Gene MGI:2150020 also has the protein-coding biotype. Its gene tree in Newick format is:
(((((((((((((ENSSTOP00000000315_Stri_:0.0330,
ENSDORP00000000165_Dord_:0.1161):0.0048,
((ENSMICP00000005418_Mmur_:0.0116,
ENSOGAP00000007601_Ogar_:0.0509):0.0047,
ENSTBEP00000013197_Tbel_:0.0487):0.0053):0.0000,
((((((ENSPTRP00000051614_Ptro_:0.0000,
ENSP00000372088_Hsap_:0.0013):0.0171,
ENSGGOP00000013922_Ggor_:0.0506):0.0000,
(ENSMMUP00000033829_Mmul_:0.0094,
ENSCJAP00000038748_Cjac_:0.0157):0.0140):0.1059,
ENSPPYP00000007198_Ppyg_:0.0000):0.0327,
ENSTSYP00000000841_Tsyr_:0.0302):0.0016,
((((ENSDNOP00000013522_Dnov_:0.0076,
ENSCHOP00000000528_Chof_:0.0292):0.0051,
((ENSPCAP00000000302_Pcap_:0.0307,
ENSLAFP00000007349_Lafr_:0.0346):0.0072,
ENSETEP00000002471_Etel_:0.0494):0.0072):0.0076,
(((((((ENSSSCP00000005138_Sscr_:0.0000,
ENSSSCP00000005140_Sscr_:0.0749):0.0251,
ENSBTAP00000003788_Btau_:0.0210):0.0032,
ENSTTRP00000010038_Ttru_:0.0127):0.0048,
ENSCAFP00000013557_Cfam_:0.0348):0.0000,
(((ENSMLUP00000003893_Mluc_:0.0278,
ENSPVAP00000001739_Pvam_:0.0280):0.0000,
ENSFCAP00000013649_Fcat_:0.0830):0.0022,
ENSVPAP00000003573_Vpac_:0.0463):0.0000):0.0000,
ENSSARP00000002591_Sara_:0.0708):0.0018,
ENSECAP00000015324_Ecab_:0.0119):0.0147):0.0057,
((ENSOCUP00000014527_Ocun_:0.0243,
ENSOPRP00000014620_Opri_:0.0500):0.0179,
ENSCPOP00000012705_Cpor_:0.0458):0.0069):0.0024):0.0063):0.0153,
((ENSRNOP00000053270_Rnor_:0.0305,
ENSRNOP00000041615_Rnor_:0.1258):0.0000,
ENSMUSP00000028795_Mmus_:0.0332):0.0374):0.0592,
(ENSMEUP00000013631_Meug_:0.0268,
ENSMODP00000000325_Mdom_:0.1465):0.0753):0.0737,
ENSOANP00000014708_Oana_:0.2186):0.0000,
(((ENSGALP00000032107_Ggal_:0.0011,
ENSMGAP00000002659_Mgal_:0.0296):0.0878,
ENSTGUP00000007594_Tgut_:0.1159):0.0332,
ENSACAP00000013598_Acar_:0.1269):0.0290):0.0631,
ENSXETP00000034829_Xtro_:0.1406):0.0470,
(((ENSTRUP00000010366_Trub_:0.0517,
ENSTNIP00000022389_Tnig_:0.0569):0.0251,
(ENSGACP00000016197_Gacu_:0.0667,
ENSORLP00000022299_Olat_:0.0997):0.0455):0.0658,
(ENSDARP00000060705_Drer_:0.0000,
ENSDARP00000102254_Drer_:0.2574):0.1346):0.0524):0.1086,
ENSCSAVP00000002327_Csav_:0.3665):0.0401,
FBpp0084956_Dmel_:0.3818):0.0988,
((((((ENSDARP00000053846_Drer_:0.2500,
ENSGACP00000006597_Gacu_:0.4557):0.0000,
((((((((ENSSTOP00000014167_Stri_:0.0520,
ENSCPOP00000009649_Cpor_:0.0895):0.0076,
ENSTBEP00000008426_Tbel_:0.0614):0.0024,
(ENSPCAP00000012864_Pcap_:0.0390,
ENSLAFP00000012636_Lafr_:0.0471):0.0285):0.0035,
(((((ENSBTAP00000025252_Btau_:0.0220,
ENSSSCP00000002496_Sscr_:0.0599):0.0149,
ENSECAP00000011815_Ecab_:0.0389):0.0045,
(ENSCAFP00000024255_Cfam_:0.0519,
ENSEEUP00000001317_Eeur_:0.0934):0.0066):0.0000,
(((ENSFCAP00000007091_Fcat_:0.0588,
ENSSARP00000001222_Sara_:0.1050):0.0077,
ENSPVAP00000011016_Pvam_:0.0468):0.0030,
(ENSTTRP00000014685_Ttru_:0.0213,
ENSVPAP00000009741_Vpac_:0.0445):0.0107):0.0014):0.0183,
ENSMICP00000009918_Mmur_:0.0348):0.0065):0.0000,
((((((((ENSPTRP00000039267_Ptro_:0.0000,
ENSP00000419881_Hsap_:0.0000):0.0026,
ENSPPYP00000006740_Ppyg_:0.0091):0.0026,
ENSMMUP00000020875_Mmul_:0.0092):0.0108,
ENSCJAP00000027894_Cjac_:0.0198):0.0241,
ENSETEP00000008672_Etel_:0.1043):0.0017,
(ENSDORP00000010143_Dord_:0.1035,
ENSOPRP00000004397_Opri_:0.1484):0.0086):0.0000,
(((ENSMUSP00000078490_Mmus_:0.0478,
ENSRNOP00000016459_Rnor_:0.0548):0.1360,
ENSOCUP00000000520_Ocun_:0.1224):0.0242,
ENSTSYP00000006542_Tsyr_:0.0439):0.0022):0.0023,
ENSMLUP00000000626_Mluc_:0.0776):0.0000):0.0078,
ENSCHOP00000012287_Chof_:0.0365):0.0795,
ENSMODP00000012669_Mdom_:0.1692):0.0647,
(((ENSMGAP00000012489_Mgal_:0.0076,
ENSGALP00000015437_Ggal_:0.0793):0.0629,
ENSTGUP00000011847_Tgut_:0.0781):0.1183,
ENSACAP00000005463_Acar_:0.1986):0.0749):0.3462):1.0094,
(((((ENSTNIP00000013172_Tnig_:0.0561,
ENSTRUP00000026857_Trub_:0.1996):0.0783,
ENSGACP00000027559_Gacu_:0.1800):0.0383,
ENSORLP00000019089_Olat_:0.1402):0.1668,
ENSDARP00000029740_Drer_:0.2887):0.1059,
((((ENSTGUP00000017411_Tgut_:0.0041,
ENSTGUP00000005057_Tgut_:0.0070):0.1025,
(ENSGALP00000003458_Ggal_:0.0182,
ENSMGAP00000004024_Mgal_:0.1614):0.1141):0.1267,
(((((((((ENSMUSP00000018985_Mmus_:0.0262,
ENSRNOP00000036257_Rnor_:0.0300):0.1150,
ENSCPOP00000017272_Cpor_:0.0890):0.0204,
((((ENSGGOP00000000358_Ggor_:0.0019,
ENSP00000378090_Hsap_:0.0551):0.0194,
ENSPTRP00000015369_Ptro_:0.0057):0.0109,
ENSPPYP00000009193_Ppyg_:0.0097):0.0086,
ENSCJAP00000027151_Cjac_:0.0261):0.0390):0.0065,
ENSVPAP00000011247_Vpac_:0.1432):0.0005,
(((ENSSTOP00000005497_Stri_:0.0783,
ENSOCUP00000012289_Ocun_:0.1399):0.0040,
((ENSECAP00000012028_Ecab_:0.0522,
ENSMLUP00000011374_Mluc_:0.0698):0.0053,
((ENSFCAP00000000230_Fcat_:0.0271,
ENSCAFP00000027051_Cfam_:0.0436):0.0358,
ENSBTAP00000025404_Btau_:0.0815):0.0060):0.0086):0.0017,
((ENSEEUP00000001339_Eeur_:0.1162,
ENSSARP00000000796_Sara_:0.1880):0.0030,
(ENSOGAP00000008835_Ogar_:0.0559,
ENSOPRP00000010741_Opri_:0.1132):0.0216):0.0042):0.0016):0.0008,
((((ENSTBEP00000011110_Tbel_:0.0687,
ENSMICP00000001136_Mmur_:0.0808):0.0253,
ENSPVAP00000003694_Pvam_:0.0811):0.0023,
ENSTSYP00000002243_Tsyr_:0.1576):0.0037,
ENSTTRP00000014683_Ttru_:0.0614):0.0281):0.0276,
ENSDNOP00000003927_Dnov_:0.0568):0.0029,
((ENSLAFP00000015357_Lafr_:0.0294,
ENSPCAP00000006649_Pcap_:0.0888):0.0099,
ENSETEP00000009798_Etel_:0.1368):0.0158):0.1028,
(ENSMODP00000023889_Mdom_:0.0358,
ENSMEUP00000007724_Meug_:0.0805):0.1221):0.1544):0.0638,
ENSXETP00000019650_Xtro_:0.5555):0.1321):1.5927):0.3322,
((((((ENSMODP00000018294_Mdom_:0.0380,
ENSMEUP00000006550_Meug_:0.1082):0.0896,
(((ENSOGAP00000004411_Ogar_:0.0574,
ENSOPRP00000000782_Opri_:0.0755):0.0076,
ENSCHOP00000006029_Chof_:0.0557):0.0000,
(((((((ENSP00000336701_Hsap_:0.0000,
ENSPTRP00000016062_Ptro_:0.0039):0.0027,
ENSPPYP00000009244_Ppyg_:0.0039):0.0084,
ENSMMUP00000011981_Mmul_:0.0051):0.0192,
ENSTSYP00000008137_Tsyr_:0.0329):0.0024,
((((ENSECAP00000003008_Ecab_:0.0270,
ENSMLUP00000005665_Mluc_:0.0469):0.0012,
(ENSTTRP00000016289_Ttru_:0.0202,
ENSSSCP00000018702_Sscr_:0.0425):0.0023):0.0000,
(((ENSVPAP00000000334_Vpac_:0.0307,
ENSBTAP00000020012_Btau_:0.0327):0.0072,
ENSPVAP00000014496_Pvam_:0.0226):0.0016,
ENSCAFP00000025883_Cfam_:0.0391):0.0027):0.0048,
(ENSLAFP00000002822_Lafr_:0.0493,
ENSDNOP00000003626_Dnov_:0.0580):0.0140):0.0069):0.0000,
(ENSDORP00000001895_Dord_:0.0672,
ENSTBEP00000013404_Tbel_:0.0695):0.0185):0.0000,
(((((ENSCJAP00000011941_Cjac_:0.0020,
ENSCJAP00000020026_Cjac_:0.0040):0.0058,
ENSCJAP00000035050_Cjac_:0.0079):0.0266,
ENSMICP00000014682_Mmur_:0.0356):0.0000,
ENSEEUP00000001708_Eeur_:0.1161):0.0000,
(((ENSOCUP00000001958_Ocun_:0.0000,
ENSOCUP00000017133_Ocun_:0.0196):0.0597,
ENSPCAP00000002180_Pcap_:0.2204):0.0000,
((((ENSRNOP00000008846_Rnor_:0.0309,
ENSMUSP00000007790_Mmus_:0.1641):0.0791,
ENSSTOP00000002592_Stri_:0.0494):0.0000,
ENSCPOP00000004133_Cpor_:0.0808):0.0120,
ENSETEP00000001932_Etel_:0.0886):0.0046):0.0027):0.0000):0.0044):0.1051):0.0621,
ENSOANP00000013505_Oana_:0.1106):0.0631,
(((ENSGALP00000038538_Ggal_:0.0067,
ENSMGAP00000007189_Mgal_:0.0264):0.0699,
(ENSTGUP00000007629_Tgut_:0.0050,
ENSTGUP00000015360_Tgut_:0.0138):0.0967):0.0858,
ENSACAP00000010810_Acar_:0.2435):0.0733):0.0699,
ENSXETP00000002461_Xtro_:0.3031):0.1191,
((((ENSTRUP00000004854_Trub_:0.0975,
ENSTNIP00000018013_Tnig_:0.1402):0.1259,
ENSORLP00000024828_Olat_:0.2694):0.0330,
ENSGACP00000000375_Gacu_:0.0683):0.2500,
ENSDARP00000090614_Drer_:0.2005):0.1508):0.6278):0.2354,
FBpp0084486_Dmel_:1.9909):1.2160,
Y43C5A.6a_Cele_:0.3502):0.2385):0.0000,
YER095W_Scer_:0.4878):0.0000;
Alternatively, if a cladogram with node distances is prepared after running ClustalW, gene age and inheritance can be assessed and the DL genes can easily be separated from the DV genes.
http://www.ensembl.org/Mus_musculus/Gene/Compara_Ortholog?db=core;g=ENSMUSG00000007646
http://www.ensembl.org/Mus_musculus/Gene/Compara_Tree?db=core;g=ENSMUSG00000007646
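A Newick gene tree of the kind shown above can be summarised by listing which species it contains, which gives a rough view of how far back the gene family reaches. The sketch below is a simple regular-expression reader, not a full Newick parser; it assumes that every leaf label has the form `<protein ID>_<four-letter species code>_:<branch length>`, as in the tree above, and the function name `parse_leaves` is hypothetical.

```python
import re
from collections import Counter

# Leaf labels look like ENSP00000372088_Hsap_:0.0013
LEAF_PATTERN = re.compile(r"([A-Za-z0-9.]+)_([A-Za-z]{4})_:(\d+(?:\.\d+)?)")

def parse_leaves(newick):
    """Return (protein_id, species_code, branch_length) for every leaf."""
    return [
        (protein_id, species, float(length))
        for protein_id, species, length in LEAF_PATTERN.findall(newick)
    ]

# Three leaves copied from the tree above.
snippet = """((ENSPTRP00000051614_Ptro_:0.0000,
ENSP00000372088_Hsap_:0.0013):0.0171,
ENSGGOP00000013922_Ggor_:0.0506):0.0000;"""

leaves = parse_leaves(snippet)
species_counts = Counter(species for _, species, _ in leaves)
for protein_id, species, length in leaves:
    print(protein_id, species, length)
```

Applied to the full tree, the presence of codes such as `Scer`, `Cele` and `Dmel` among the leaves indicates homologs in yeast, nematode and fruit fly.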
[Figure: graphical representation of this gene tree]
Duplication and retention
To examine gene duplication events in our categories of disease genes, similarity methods were used to identify paralogs of all human disease genes. The proportion of genes with paralogs, or duplicates, was analysed for each gene category. DL genes are much more likely to be duplicates, while DV genes tend to be singletons. The high proportion of singleton genes in the DV class suggests a difference in retention in the human genome following whole-genome duplications for these genes, with many duplicates or paralogs being lost after duplication.
Duplicate and singleton identification method
Sequences were retrieved from Ensembl through BioMart. BLAT v.32 was used for the sequence similarity search, and an e-value threshold (printed as 10^20, presumably 10^-20) was used to identify duplicate and singleton genes.
The sequences of all the genes AK144590, AL591436, X63190, DQ832277, AL591436, X51983, X07750, X07751, X07752, BC046795, AL590963, CH466556, AK078233, AL590963, CH466556, AK078233, AL590963 and CH466556, taken from gene start position to gene end position, were compared against human/mouse with megablast, and applying the same e-value threshold highlights the duplicate genes in human/mouse.
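The duplicate/singleton split can be sketched as a filter over similarity hits. This is an illustrative sketch only: the default cutoff of `1e-20` assumes the threshold was meant as 10^-20, the rule that one sufficiently strong hit to a different gene makes a gene a duplicate is an assumption, and the gene names and e-values below are made up.

```python
def split_duplicates(genes, hits, evalue_cutoff=1e-20):
    """Split genes into duplicates and singletons.

    hits is an iterable of (query_gene, subject_gene, evalue) tuples.
    A gene counts as a duplicate if it has at least one hit to a
    different gene with an e-value at or below the cutoff.
    """
    gene_set = set(genes)
    duplicates = set()
    for query, subject, evalue in hits:
        if query in gene_set and query != subject and evalue <= evalue_cutoff:
            duplicates.add(query)
    singletons = gene_set - duplicates
    return sorted(duplicates), sorted(singletons)

# Hypothetical hits for illustration only (not results from the paper).
example_genes = ["geneA", "geneB", "geneC"]
example_hits = [
    ("geneA", "geneA", 0.0),    # self hit, ignored
    ("geneA", "geneB", 1e-45),  # strong hit to another gene
    ("geneB", "geneA", 1e-45),
    ("geneC", "geneA", 1e-3),   # too weak to count as a duplicate
]
duplicates, singletons = split_duplicates(example_genes, example_hits)
print("duplicates:", duplicates)
print("singletons:", singletons)
```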
[Figure: frequency of duplicate and singleton genes among DL and DV genes]
MGI:2150020
Chr.11(-): 87190152-87218268 [NCBI37]
Entrez Gene 114714, Chr.11(-): 87190146-87217940 [NCBI37]
GO:0000166 [nucleotide binding] evidence: IEA
GO:0048476 [Holliday junction resolvase complex] evidence: IC
Phenotype: lethality-prenatal/perinatal
MGI:98742 was not found in the gene trap database, which suggests that it is a disease unknown gene. Its nucleotide binding property, however, suggests that it is a disease lethal gene.
7.1 Appendices:
The mouse mutagenesis mutants with developmental defects in the 69 Mb to 104 Mb region of Mus musculus chromosome 11 are listed below.
Category        Mutants
craniofacial    30
eye             3
fertility       8
growth          80
lethal          89
metabolism      10
neurological    76
skeletal        32
skin and coat   31
undefined       3
urogenital      1
CRANIOFACIAL:
MGI Accession   Lab Name    Phenotype
MGI:2671740     crfm02Jus   craniofacial, affected testers are smaller, have shorter snouts.
MGI:2671741     crfm05Jus   craniofacial, patchy hair loss.
MGI:2671742     crfm06Jus   craniofacial, testers have shorter faces and are smaller than carrier siblings.
MGI:2671743     crfm08Jus   craniofacial, testers have a subtle short snout phenotype.
MGI:2671832     crfm18Jus   smaller head, short snout, not completely penetrant.
MGI:3046702     crfm26Jus   testers are smaller, hydrocephalous, do not live very long past 8 or 12 weeks.
FERTILITY:
Reference for all entries: Clark et al., Biology of Reproduction 70, 1317-1324, 2004.
MGI Accession   Lab Name    Phenotype
MGI:2671711     infm02Jus   male infertility, low sperm count, normal morphology.
MGI:2671710     infm03Jus   female infertility.
MGI:2671707     infm04Jus   male infertility, low sperm count, not motile, unusual morphology. also in the same complementation group as inf08 and inf09.
MGI:2671706     infm05Jus   male infertility.
MGI:2671699     infm07Jus   female infertility.
MGI:2671697     infm08Jus   in the same complementation group as inf04 and inf09.
MGI:2671691     infm09Jus   in the same complementation group as inf04 and inf08.
GROWTH:
MGI Accession   Lab Name    Phenotype
MGI:2671718     grom01Jus   small animals seen in small litters of 5/4 pups. small runs about 7-9 gm, while other pups are 13-16gm size.
MGI:2671720     grom22Jus   affected testers 3/4 size of normal littermates, low serum cholesterol.
MGI:2671721     grom40Jus   testers are 1/2 size of unaffected siblings.
MGI:2671722     grom41Jus   2 of 5 testers 3/4 size of carrier sibs, appears to be on chromosome 11 but not completely penetrant.
MGI:2671723     grom42Jus   testers small, 3/4 size of carrier siblings.
MGI:3046714     grom79Jus   small, 3/4, 10gm vs 16gm at n1f1.
LETHAL:
MGI Accession   Lab Name   Time of Death
MGI:2671871     l11Jus01   5.5 - 8.5 dpc
MGI:2671872     l11Jus02   5.5 - 8.5 dpc
MGI:2671873     l11Jus03   5.5 - 8.5 dpc
MGI:2671874     l11Jus04   5.5 - 8.5 dpc
MGI:2671876     l11Jus05   9.5 - 12.5 dpc
MGI:2671877     l11Jus06   9.5 - 12.5 dpc
MGI:2671878     l11Jus07   5.5 - 8.5 dpc
MGI:2671879     l11Jus08   9.5 - 12.5 dpc
MGI:2671880     l11Jus09   9.5 - 12.5 dpc
MGI:2671881     l11Jus10   peri-natal lethal
MGI:2671882     l11Jus11   5.5 - 8.5 dpc
MGI:2671883     l11Jus12   5.5 - 8.5 dpc
MGI:2671884     l11Jus13   peri-natal lethal
MGI:2671885     l11Jus14   9.5 - 12.5 dpc
MGI:2671886     l11Jus15   13.5 - 18.5 dpc
MGI:2671887     l11Jus16   peri-natal lethal
MGI:2671888     l11Jus17   9.5 - 12.5 dpc
MGI:2671889     l11Jus18   9.5 - 12.5 dpc
MGI:2671890     l11Jus19   9.5 - 12.5 dpc
MGI:2671891     l11Jus20   9.5 - 12.5 dpc
MGI:2671892     l11Jus21   peri-natal lethal
MGI:2671893     l11Jus22   peri-natal lethal
MGI:2671894     l11Jus23   peri-natal lethal
MGI:2671896     l11Jus24   peri-natal lethal
MGI:2671897     l11Jus25   post-natal lethal
MGI:2671898     l11Jus26   post-natal lethal
MGI:2671899     l11Jus27   9.5 - 12.5 dpc
MGI:2671900     l11Jus28   9.5 - 12.5 dpc
MGI:2671901     l11Jus29   13.5 - 18.5 dpc
MGI:2671902     l11Jus30   peri-natal lethal
MGI:2671903     l11Jus31   peri-natal lethal
MGI:2671904     l11Jus32   peri-natal lethal
MGI:2671906     l11Jus33   peri-natal lethal
MGI:2671907     l11Jus34   5.5 - 8.5 dpc
MGI:2671908     l11Jus35   5.5 - 8.5 dpc
MGI:2671909     l11Jus36   9.5 - 12.5 dpc
MGI:2671910     l11Jus37   9.5 - 12.5 dpc
MGI:2671911     l11Jus38   5.5 - 8.5 dpc
MGI:2671912     l11Jus39   9.5 - 12.5 dpc
MGI:2671913     l11Jus40   peri-natal lethal
MGI:2671914     l11Jus41   9.5 - 12.5 dpc
MGI:2671915     l11Jus42   5.5 - 8.5 dpc
MGI:2671916     l11Jus43   peri-natal lethal
MGI:2671917     l11Jus44   peri-natal lethal
MGI:2671918     l11Jus45   9.5 - 12.5 dpc
MGI:2671919     l11Jus46   9.5 - 12.5 dpc
MGI:2671920     l11Jus47   9.5 - 12.5 dpc
MGI:2671921     l11Jus48   5.5 - 8.5 dpc
MGI:3034009     l11Jus49   5.5 - 8.5 dpc
MGI:3034010     l11Jus50   after 12.5 dpc
MGI:2671922     l11Jus51   post-natal lethal
MGI:2671923     l11Jus52   post-natal lethal
MGI:2671924     l11Jus53   post-natal lethal
MGI:2671925     l11Jus54   post-natal lethal
MGI:2671926     l11Jus55   post-natal lethal
MGI:2671927     l11Jus56   post-natal lethal
MGI:2671928     l11Jus57   post-natal lethal
MGI:3034011     l11Jus58   9.5 - 12.5 dpc
MGI:3043663     l11Jus59   still to be determined
METABOLISM:
MGI Accession   Lab Name   Phenotype
MGI:2671716     hemm04     low rbc/hgb/hct 11/17/52.
MGI:2671712     hem1       low wbc, neutrophilic blasts.
MGI:2671713     hem2       low wbc, cf c: 1-2 vs 8.
MGI:2671715     hem3       low rbc/hgb/hct 6/10/32.
NEUROLOGICAL:
MGI Accession   Lab Name    Phenotype
MGI:2671725     nurm01Jus   tester animals are hyperactive, nervousness, tremors. previously known as jittery 1.
MGI:2671727     nurm02Jus   testers are hyperactive. previously known as shaky 3.
MGI:2671729     nurm03Jus   small, hyperactive, some affecteds show craniofacial abnormalities. previously known as small hyper 3.
MGI:2671730     nurm04Jus   affected animals have a quivering phenotype, the phenotype was late onset. previously known as shaky 4.
MGI:2671731     nurm05Jus   affected animals have a quivering phenotype, noticeable when they move. phenotype is late onset, do not see it until at least 3 months old. previously called shaky 5.
MGI:2671732     nurm06Jus   affected animals hyperactive and 1/2 size of siblings, 6 of 9 testers, 2 of 24 carriers affected, may be outside of inversion on 11. previously called small hyper 5.
MGI:2671733     nurm07Jus   smaller, lethargic, testers have reduced open field activity, develop late onset tremors upon movement. previously known as small lethargic.
MGI:2671734     nurm08Jus   hyper, seizing, previously known as flicker.
MGI:2671735     nurm09Jus   hyperactive, jittery weaving gait, hearing loss. previously known as jittery 2.
SKIN AND COAT:
MGI Accession   Lab Name    Phenotype
MGI:2671724     skcm01Jus   greasy looking hair, previously known as greasy coat.
MGI:3038892     skcm02Jus   scruffy, hair sticks out straight, previously known as pete rose hair.
4 CONCLUSION
For many years, the mapping and identification of disease-causing genes in humans has been carried out in many laboratories using different methodologies. Today, classical map-based gene discovery has been augmented by sequence-based gene discovery, as the Human Genome Project has produced high-precision tools for disease gene locus mapping and identification. So far, genetic defects have been successfully characterised in more than 1600 human monogenic diseases. Mapping common and genetically complex human disease traits has proved more difficult, but even in these cases a number of mutations associated with complex human diseases have been identified. Just as there are confirmatory tests for the different types of radicals in the chemistry laboratory, there should be a systematic procedure for the identification of disease genes. In many cases, mouse knockout gene datasets are used as an alternative means of understanding the role of disease genes in humans. Because mouse chromosome 11 is similar to human chromosome 17, it was taken in this article as a suitable proxy model for finding disease loci. Here, bioinformatics-based analyses were carried out to classify genes from mouse chromosome 11 in the mutagenesis screen region (69 Mb to 104 Mb) and to address further questions: how many genes are involved in duplication, where they are actually located, and whether the function of a gene is affected by its location.
This classification of DL and DV genes, together with their under-expressed and over-expressed characteristics, helps in studying human disease, ageing and biosenescence. Bioinformatics-based classification combined with a few statistical parameters helps to make such predictions more accurate.
ACKNOWLEDGMENT
I cordially thankful to all my students, laboratory staffs and especially to the Director Mr P.K.Boss.chemistry Head and prof
John Pejjulo,PhD in biostatistics for his online support during my work.
Glossary:
1. Cladogram: a phylogenetic tree showing the relationships between species.
2. Homologues: two genes are homologous if they evolved from the same common ancestor.
3. Pattern: conserved residues that can be used as a functional signature.
4. TrEMBL: Translated EMBL, which contains all the putative protein sequences contained in the nucleotide databases. Its US counterpart is the non-redundant (nr) database.
5. E-value: expected value, i.e. how likely it is that the similarity between a query sequence and a database sequence is due to chance. The lower the e-value, the more significant the match: an e-value of 10^-32 is better than one of 10^-4.
6. P-value: in statistical testing, the p-value indicates whether some effect (such as a difference in the average value of some quantity between two groups, or a correlation between two numerical variables) is statistically significant. Statistically significant generally means that the observed result (for example, the difference in some average between two groups) is very unlikely to have arisen only from random fluctuations in the observed sample if there is truly no difference between the two groups in the whole population. The p-value is the probability of getting results at least as convincing as those actually observed if there is really nothing going on but random fluctuations. If the p-value is less than some small number (often set at 0.05), the results are said to be statistically significant.
References
1. Hentges, K.E., Pollock, D.D., Liu, B. and Justice, M.J. (2007) Regional variation in the density of essential genes in mice. PLoS Genet.
2. McKusick, V. (1998) Mendelian Inheritance in Man. A Catalog of Human Genes and Genetic Disorders.
3. Bult, C.J., Eppig, J.T., Kadin, J.A., Richardson, J.E. and Blake, J.A. (2008) The Mouse Genome Database (MGD): mouse biology and model systems. Nucleic Acids Res, 36, D724-8.
4. Maere, S., Heymans, K. and Kuiper, M. (2005) BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics, 21, 3448-9.
5. Lowe, H.J. and Barnett, G.O. (1994) Understanding and using the medical subject heading (MeSH) vocabulary.
6. Smedley, D., Haider, S., Ballester, B., Holland, R., London, D., Thorisson, G. and Kasprzyk, A. (2009) BioMart - biological queries made easy. BMC Genomics, 10, 22.
7. Kent, W.J. (2002) BLAT - the BLAST-like alignment tool. Genome Res, 12, 656-64.