R Tutorial

Tutorial
Gene-Screening Strategies
Tova Fuller, Steve Horvath
Correspondence: suprtova@ucla.edu, shorvath@mednet.ucla.edu
Abstract
Here we identify genes potentially involved in mouse obesity by using GSweight
(absolute correlation of a gene with mouse body weight), GSSNP19 (absolute correlation
of a gene with an mQTL on chromosome 19), and kME (module eigengene-based
intramodular connectivity) to prioritize genes inside the blue module of a previously
studied BxH F2 mouse intercross.
This work is in press, and appears in Table 3 of:
Tova Fuller, Anatole Ghazalpour, Jason Aten, Thomas A. Drake, Aldons J. Lusis,
Steve Horvath (2007) Weighted gene coexpression network analysis strategies
applied to mouse weight. Mamm Genome, in press.
The data are described in:
Anatole Ghazalpour, Sudheer Doss, Bin Zhang, Susanna Wang, Eric E. Schadt,
Thomas A. Drake, Aldons J. Lusis, Steve Horvath (2006) Integrating Genetics and
Network Analysis to Characterize Genes Related to Mouse Weight. PLoS Genetics
This document and data files can be found at the following webpage:
http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/DifferentialNetworkAnalysis
More material on weighted network analysis can be found here:
http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/
Method Description:
The data are described in the PLoS article cited above.
We utilize four criteria for choosing genes which each identify 8-9 candidate genes. We
review the genes selected by each method, noting putative links to obesity-related
syndromes such as hypertension, hypercholesterolemia and insulin resistance after both
an initial gene ontology database search and a brief literature search. Below is an outline
of the initial four criteria for choosing genes:

Criteria 1: Strict GSweight thresholding (8 genes identified)
o GSweight threshold at the 97.5%ile value
o GSSNP19 threshold at the 75%ile value
o kME threshold at the 75%ile value

Criteria 2: Strict GSSNP19 thresholding (9 genes identified)
o GSweight threshold at the 75%ile value
o GSSNP19 threshold at the 90%ile value
o kME threshold at the 75%ile value

Criteria 3: Strict kME thresholding (8 genes identified)
o GSweight threshold at the 75%ile value
o GSSNP19 threshold at the 75%ile value
o kME threshold at the 95%ile value

Criteria 4: Balanced thresholding (9 genes identified) – This corresponds
with Table 3 of Fuller et al.
o GSweight threshold at the 85%ile value
o GSSNP19 threshold at the 85%ile value
o kME threshold at the 85%ile value
We create a fifth and final criterion that includes all of the most relevant genes found by
the previous four criteria.

Criteria 5: Strict GSweight & kME thresholding, relaxed GSSNP19
thresholding (16 genes identified)
o GSweight threshold at the 85%ile value
o GSSNP19 threshold at the 75%ile value
o kME threshold at the 85%ile value
Appendix 1 contains tables of genes that meet each of these criteria. Appendix 2 contains
gene ontology information for all genes recovered in these screening strategies.
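The percentile-based screening outlined above can be sketched on simulated data. This is only an illustration under stated assumptions: the helper screenGenes and the simulated vectors are hypothetical and are not part of NetworkFunctions.txt or the tutorial data. Note also that the tutorial code itself uses rounded cutoffs with a strict ">" for GSweight and kME and a rank-based cutoff for GSSNP19, so the counts it reports need not match a direct quantile() cutoff exactly.

```r
# Hypothetical helper: keep genes at or above the given percentile of each measure.
screenGenes <- function(GSweight, GSSNP, kME, pWeight, pSNP, pKME) {
  GSweight >= quantile(GSweight, pWeight, na.rm = TRUE) &
    GSSNP >= quantile(GSSNP, pSNP, na.rm = TRUE) &
    kME >= quantile(kME, pKME, na.rm = TRUE)
}

set.seed(1)                    # simulated values, NOT the mouse data
nGenes <- 535
simGSweight <- runif(nGenes)
simGSSNP <- runif(nGenes)
simKME <- runif(nGenes)

# Analogue of Criteria 4 (balanced) and Criteria 5 (relaxed SNP threshold)
balanced <- screenGenes(simGSweight, simGSSNP, simKME, 0.85, 0.85, 0.85)
relaxed <- screenGenes(simGSweight, simGSSNP, simKME, 0.85, 0.75, 0.85)
table(balanced)
table(relaxed)
```

Because Criteria 5 only lowers the GSSNP19 percentile relative to Criteria 4, every gene passing the balanced screen also passes the relaxed one; the same holds for the simulated vectors above.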
CODE TUTORIAL
# In this tutorial, I will demonstrate how to determine the genes with the highest
# significance for chromosome 19. GSSNP, the SNP based gene significance measure, is
# the absolute value of the correlation between expression of a gene and SNP value (0, 1
# or 2).
# Please adapt the following paths
setwd("/Users/TovaFuller/Documents/HorvathLab2007/MouseProject2.0/GeneSearch/")
source("/Users/TovaFuller/Documents/HorvathLab2006/NetworkFunctions/NetworkFunctions.txt")
# read in the R libraries.
library(MASS) # standard, no need to install
library(class) # standard, no need to install
library(cluster)
library(sma) # install it for the function plot.mat
library(impute) # install it for imputing missing value
library(faraway) # this library is useful for some of the linear model diagnostics
# Read in expression data related to the blue module
dat1=read.csv("/Users/TovaFuller/Documents/HorvathLab2006/MouseTutorials/Tutorial4/BluemoduleGenesWeightandSNPs.csv",header=TRUE)
# this data frame contains annotation and other information on the genes
datSummary= data.frame(dat1[-c(1:10), c(1:8, 144:158)])
# this data frame contains the gene expression data (rows are samples, columns are
# genes)
datExprBlue=data.frame(t(dat1[-c(1:10), c(9:143)]))
# This vector contains the module color (blue) for each gene
color1=rep("blue",dim(datExprBlue)[[2]] )
# This defines the module eigengene
PC1=ModulePrinComps1(datExprBlue,color1)[[1]]$PCblue
# This data frame contains the SNP markers of the mQTLs for the mice
# Rows are mQTL SNP markers and columns are female mouse liver samples
SNP= data.frame(dat1[1:9, c(9:143) ])
dimnames(SNP)[[1]]=as.character(dat1[1:9,1])
# body weight of each mouse
weight=as.numeric(dat1[10, c(9:143) ])
# This defines the weight based gene significance measure
GSweight=as.numeric(abs(cor(weight, datExprBlue,use="p")))
# This defines the SNP (mQTL) based gene significance measure
GSSNP=data.frame(matrix(NA,
nrow=dim(SNP)[[1]],ncol=dim(datExprBlue)[[2]] ))
for (i in c(1:dim(SNP)[[1]]) ){GSSNP[i,]=
as.numeric(abs(cor(as.numeric(SNP[i,]), datExprBlue,use="p")))}
dimnames(GSSNP)[[1]]=paste("GS",as.character(dat1[1:9,1]),sep="")
dimnames(GSSNP)[[2]]=paste(as.character(dat1[-c(1:10),1]))
dim(GSSNP)
GSSNP19 = GSSNP[9,]
dimnames(GSSNP19)[[2]]=paste(as.character(dat1[-c(1:10),1]))
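# As an aside, the marker-by-gene significance values can also be obtained with a
# single call to cor(), which returns a numeric matrix rather than a data frame.
# The sketch below uses made-up data; demoExpr and demoSNP are hypothetical stand-ins
# for datExprBlue and SNP, and it only checks that both routes give the same numbers.

```r
set.seed(3)
# 40 simulated samples (rows) by 6 simulated genes (columns)
demoExpr <- matrix(rnorm(40 * 6), nrow = 40,
                   dimnames = list(NULL, paste0("gene", 1:6)))
# 3 simulated markers (rows) coded 0, 1 or 2 across the same 40 samples
demoSNP <- matrix(sample(0:2, 3 * 40, replace = TRUE), nrow = 3,
                  dimnames = list(paste0("snp", 1:3), NULL))

# One call: rows of the result are markers, columns are genes
demoGSSNP <- abs(cor(t(demoSNP), demoExpr, use = "p"))

# Row-by-row loop, as in the tutorial code
loopGSSNP <- demoGSSNP * NA
for (i in 1:nrow(demoSNP)) {
  loopGSSNP[i, ] <- abs(cor(demoSNP[i, ], demoExpr, use = "p"))
}
all.equal(demoGSSNP, loopGSSNP)
```

# A row of a numeric matrix, such as demoGSSNP[3, ], is an ordinary numeric vector,
# so functions that require atomic input can be applied to it directly.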
# This defines the intramodular connectivity
# Note that this assumes beta=6 used for the power adjacency function
kIN=as.numeric(apply(abs(cor(datExprBlue,use="p"))^6,2,sum))
# This defines the module eigengene based connectivity measure.
kME= as.numeric(abs(cor(PC1,datExprBlue,use="p")))
# Note that kME and kIN are highly correlated, which is always true for module genes.
# See Horvath, Dong, Yip 2006
cor.test(kME,kIN)
#         Pearson's product-moment correlation
# data:  kME and kIN
# t = 38.4575, df = 533, p-value < 2.2e-16
# alternative hypothesis: true correlation is not equal to 0
# 95 percent confidence interval:
#  0.8331551 0.8783076
# sample estimates:
#       cor
# 0.8573722
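# The relationship between kME and kIN can be illustrated on a simulated module.
# This is a sketch under assumptions: the data are simulated, and the first principal
# component from prcomp() is used as a stand-in for the module eigengene, which the
# tutorial obtains from ModulePrinComps1 in NetworkFunctions.txt.

```r
set.seed(42)
nSamples <- 200
nGenes <- 50
moduleSignal <- rnorm(nSamples)            # shared signal driving the simulated module
loadings <- runif(nGenes, 0.5, 2)          # how strongly each gene follows the signal
simExpr <- sapply(loadings, function(a) a * moduleSignal + rnorm(nSamples))

# Stand-in eigengene: first principal component of the scaled expression matrix
simPC1 <- prcomp(simExpr, scale. = TRUE)$x[, 1]

# Same definitions as in the tutorial, with beta = 6 for the power adjacency
simKME <- as.numeric(abs(cor(simPC1, simExpr, use = "p")))
simKIN <- as.numeric(apply(abs(cor(simExpr, use = "p"))^6, 2, sum))
cor(simKME, simKIN)
```

# Genes that follow the shared signal closely score high on both measures, which is
# why either one can serve as the connectivity criterion in the screening below.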
# There are different criteria we might use for choosing genes: 1. GSweight (the
# correlation between expression and our phenotype of interest), 2. GSSNP (the
# correlation between expression and SNP numerical value), 3. kME or kIN (our
# connectivity measures). A priori there is no hard and fast mathematical precedent for
# choosing genes based on these criteria. As such, we might try different methods, then
# look to the literature available to date to determine which method selects the most
# biologically relevant subgroup.
# We try 4 different methods: 1. stringent GSweight, 2. stringent GSSNP, 3.
# stringent connectivity and 4. a balanced approach. After finding genes with each
# method, we will analyze each subgroup subjectively for biological relevance of genes
# chosen. I have also added another method - a balanced approach with more relaxed
# thresholds.
# Now we wish to find the genes with highest GSSNP for mQTL19.047. This is in row
# 9 of the data frame GSSNP.
quantile(probs=c(0.75,0.8, 0.825, 0.85,0.9,0.95,0.975), GSweight)
#       75%       80%     82.5%       85%       90%       95%     97.5%
# 0.4959823 0.5123367 0.5260120 0.5407020 0.5663083 0.6034069 0.6319880
quantile(GSSNP19)
# Error in sort(x, partial = unique(c(lo, hi))) :
#   'x' must be atomic
# Because there was an error in finding the quantiles, I used the following work-around:
# with 535 genes, the top fraction p of genes corresponds to rank(-GSSNP19) <= 535*p.
535*0.25
# [1] 133.75 - For determining 75%ile, round up to 134
535*0.225
# [1] 120.375 - For determining 77.5%ile, round down to 120
535*0.15
# [1] 80.25 - For determining 85%ile, round down to 80
535*0.1
# [1] 53.5 - For determining 90%ile, round down to 53
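# The error arises because GSSNP19 is a one-row data frame (a list of columns) rather
# than an atomic vector. One possible alternative to the rank-based work-around,
# sketched here on made-up values, is to flatten the row with unlist() first.
# toyGS is a hypothetical stand-in for GSSNP19.

```r
# One-row data frame with five made-up gene significance values
toyGS <- data.frame(matrix(c(0.10, 0.40, 0.25, 0.30, 0.05), nrow = 1))

toyVec <- unlist(toyGS[1, ])     # named numeric vector
is.atomic(toyVec)
quantile(toyVec, probs = c(0.6, 0.9))

# Selecting the top 2 of 5 genes by rank or by the 60%ile gives the same genes here
topByRank <- rank(-toyVec) <= 2
topByQuantile <- toyVec >= quantile(toyVec, 0.6)
```

# With ties, or when 535*p is not a whole number, the two selections can differ by a
# gene at the boundary, so the rank-based cutoffs used below are kept as they are.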
quantile(probs=c(0.75,0.8,0.825,0.85,0.9,0.95), kME)
#       75%       80%     82.5%       85%       90%       95%
# 0.7111583 0.7245374 0.7347036 0.7514269 0.7745479 0.8024132
# Thresholding:
# 1. Stringent GSweight cutoff: Let's choose the 97.5%ile for GSweight, the 75%ile for
# GSSNP19 and the 75%ile for kME as preliminary thresholds.
criteria1=GSweight > 0.631 & rank(-GSSNP19)<=134 & kME > 0.711
table(criteria1)
# criteria1
# FALSE  TRUE
#   527     8
# 2. Stringent GSSNP19 cutoff: Let's choose the 75%ile for GSweight, the 90%ile for
# GSSNP19 and the 75%ile for kME as preliminary thresholds.
criteria2=GSweight > 0.496 & rank(-GSSNP19)<=53 & kME > 0.711
table(criteria2)
# criteria2
# FALSE  TRUE
#   526     9
# 3. Stringent kME cutoff: Let's choose the 75%ile for GSweight, the 75%ile for
# GSSNP19 and the 95%ile for kME as preliminary thresholds.
criteria3=GSweight > 0.496 & rank(-GSSNP19)<=134 & kME > 0.802
table(criteria3)
# criteria3
# FALSE  TRUE
#   527     8
# 4. Balanced threshold: Let's choose the 85%ile for all three variables. This
# corresponds to Table 3 of Fuller, et al.
criteria4=GSweight > 0.541 & rank(-GSSNP19)<=80 & kME > 0.751
table(criteria4)
# criteria4
# FALSE  TRUE
#   526     9
# 5. Strict GSweight and kME thresholds, relaxed GSSNP19 threshold: Let's choose the
# 85%ile threshold for GSweight and kME, but the 75%ile threshold for GSSNP19.
criteria5=GSweight > 0.541 & rank(-GSSNP19)<=134 & kME > 0.751
table(criteria5)
# criteria5
# FALSE  TRUE
#   519    16
# We might be interested in the actual value of GSSNP19 that is a cutoff in this
# circumstance.
GSSNP19[rank(-GSSNP19)==134]
#              MMT00025527
# GSmQTL19.047   0.1939032
GSSNP19[rank(-GSSNP19)==135]
#              MMT00029178
# GSmQTL19.047   0.1927229
# So, the cutoff for GSSNP19 in this case is somewhere between 0.1927 and 0.1939.
# Correlation between expression and trait (GSweight):
signedGSweight=as.numeric(cor(weight, datExprBlue,use="p"))
# Correlation between SNP and expression (GSSNP):
signedGSSNP=data.frame(matrix(NA,
nrow=dim(SNP)[[1]],ncol=dim(datExprBlue)[[2]] ))
for (i in c(1:dim(SNP)[[1]]) ){signedGSSNP[i,]=
as.numeric(cor(as.numeric(SNP[i,]), datExprBlue,use="p"))}
dimnames(signedGSSNP)[[1]]=paste("GS",as.character(dat1[1:9,1]),sep="")
dimnames(signedGSSNP)[[2]]=paste(as.character(dat1[-c(1:10),1]))
dim(signedGSSNP)
signedGSSNP19 = signedGSSNP[9,]
dimnames(signedGSSNP19)[[2]]=paste(as.character(dat1[-c(1:10),1]))
# Correlation between SNP and trait (COR.weight):
signedCOR.weight=data.frame(matrix(NA,
nrow=dim(SNP)[[1]],ncol=length(weight)))
for (i in c(1:dim(SNP)[[1]])){ signedCOR.weight[i,]=
as.numeric(cor(as.numeric(SNP[i,]), weight,use="p"))}
signedCOR.weight19=signedCOR.weight[9,]
# Now we check to make sure that the sign of GSSNP*GSweight is the same as the sign
# of COR.weight.
# signedCOR.weight19 is positive
MRgenes=signedGSSNP19[criteria5]*signedGSweight[criteria5]>0
table(MRgenes)
# MRgenes
# TRUE
#   16
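# The sign check can be tried on a toy example in which the direction of every effect
# is known. All names and effect sizes below are made up for illustration: geneA is
# raised by the marker, geneB is lowered by it, and weight rises with geneA and falls
# with geneB, so the marker raises weight through both genes.

```r
set.seed(7)
n <- 120
toySNP <- sample(0:2, n, replace = TRUE)              # marker coded 0, 1 or 2
toyExpr <- cbind(geneA = 0.8 * toySNP + rnorm(n),     # expression raised by the marker
                 geneB = -0.8 * toySNP + rnorm(n))    # expression lowered by the marker
toyWeight <- toyExpr[, "geneA"] - toyExpr[, "geneB"] + rnorm(n)

signSNPexpr <- sign(as.numeric(cor(toySNP, toyExpr, use = "p")))
signExprWeight <- sign(as.numeric(cor(toyWeight, toyExpr, use = "p")))
signSNPweight <- sign(cor(toySNP, toyWeight, use = "p"))

# A gene is consistent when sign(SNP-expression) * sign(expression-weight)
# equals sign(SNP-weight)
consistent <- signSNPexpr * signExprWeight == signSNPweight
table(consistent)
```

# Both toy genes pass, including geneB, whose two correlations are each negative but
# multiply to a positive product.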
# We create tables displaying the genes chosen for each criteria.
# Criteria 1
Criteria1Col1= t(GSSNP19[criteria1])
Criteria1Col2= GSweight[criteria1]
Criteria1Col3= kME[criteria1]
Criteria1Col4 = kIN[criteria1]
Criteria1Col5 = as.character(datSummary$genesymbol[criteria1])
Criteria1Col6 = as.character(datSummary$cytogeneticLoc[criteria1])
Criteria1Col7 = datSummary$CHROMOSOME[criteria1]
Criteria1Table=data.frame(cbind(Criteria1Col1, Criteria1Col2,
Criteria1Col3, Criteria1Col4, Criteria1Col5, Criteria1Col6,
Criteria1Col7))
colnames(Criteria1Table)=c("GSmQTL19","GSweight","kME","kIN","Symbol","Locus","Chr")
write.csv(Criteria1Table,file="Criteria1Table.csv")
# Criteria 2
Criteria2Col1= t(GSSNP19[criteria2])
Criteria2Col2= GSweight[criteria2]
Criteria2Col3= kME[criteria2]
Criteria2Col4 = kIN[criteria2]
Criteria2Col5 = as.character(datSummary$genesymbol[criteria2])
Criteria2Col6 = as.character(datSummary$cytogeneticLoc[criteria2])
Criteria2Col7 = datSummary$CHROMOSOME[criteria2]
Criteria2Table=data.frame(cbind(Criteria2Col1, Criteria2Col2,
Criteria2Col3, Criteria2Col4, Criteria2Col5, Criteria2Col6,
Criteria2Col7))
colnames(Criteria2Table)=c("GSmQTL19","GSweight","kME","kIN","Symbol","Locus","Chr")
write.csv(Criteria2Table,file="Criteria2Table.csv")
# Criteria 3
Criteria3Col1= t(GSSNP19[criteria3])
Criteria3Col2= GSweight[criteria3]
Criteria3Col3= kME[criteria3]
Criteria3Col4 = kIN[criteria3]
Criteria3Col5 = as.character(datSummary$genesymbol[criteria3])
Criteria3Col6 = as.character(datSummary$cytogeneticLoc[criteria3])
Criteria3Col7 = datSummary$CHROMOSOME[criteria3]
Criteria3Table=data.frame(cbind(Criteria3Col1, Criteria3Col2,
Criteria3Col3, Criteria3Col4, Criteria3Col5, Criteria3Col6,
Criteria3Col7))
colnames(Criteria3Table)=c("GSmQTL19","GSweight","kME","kIN","Symbol","Locus","Chr")
write.csv(Criteria3Table,file="Criteria3Table.csv")
# Criteria 4
Criteria4Col1= t(GSSNP19[criteria4])
Criteria4Col2= GSweight[criteria4]
Criteria4Col3= kME[criteria4]
Criteria4Col4 = kIN[criteria4]
Criteria4Col5 = as.character(datSummary$genesymbol[criteria4])
Criteria4Col6 = as.character(datSummary$cytogeneticLoc[criteria4])
Criteria4Col7 = datSummary$CHROMOSOME[criteria4]
Criteria4Table=data.frame(cbind(Criteria4Col1, Criteria4Col2,
Criteria4Col3, Criteria4Col4, Criteria4Col5, Criteria4Col6,
Criteria4Col7))
colnames(Criteria4Table)=c("GSmQTL19","GSweight","kME","kIN","Symbol","Locus","Chr")
write.csv(Criteria4Table,file="Criteria4Table.csv")
# Criteria 5
Criteria5Col1= t(GSSNP19[criteria5])
Criteria5Col2= GSweight[criteria5]
Criteria5Col3= kME[criteria5]
Criteria5Col4 = kIN[criteria5]
Criteria5Col5 = as.character(datSummary$genesymbol[criteria5])
Criteria5Col6 = as.character(datSummary$cytogeneticLoc[criteria5])
Criteria5Col7 = datSummary$CHROMOSOME[criteria5]
Criteria5Table=data.frame(cbind(Criteria5Col1, Criteria5Col2,
Criteria5Col3, Criteria5Col4, Criteria5Col5, Criteria5Col6,
Criteria5Col7))
colnames(Criteria5Table)=c("GSmQTL19","GSweight","kME","kIN","Symbol","Locus","Chr")
write.csv(Criteria5Table,file="Criteria5Table.csv")
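# The five table-building blocks above repeat the same steps. A possible way to
# shorten them is a small helper, sketched below with made-up inputs.
# makeCriteriaTable and its argument names are hypothetical and are not part of
# NetworkFunctions.txt. It assumes the SNP-based significance is passed as a numeric
# vector (for example unlist(GSSNP19)). Building the table with data.frame() instead
# of cbind() keeps the numeric columns numeric.

```r
# Hypothetical helper: one row per selected gene, one column per measure
makeCriteriaTable <- function(keep, GSsnp, GSweight, kME, kIN, symbol, locus, chr) {
  data.frame(GSmQTL19 = GSsnp[keep], GSweight = GSweight[keep],
             kME = kME[keep], kIN = kIN[keep],
             Symbol = symbol[keep], Locus = locus[keep], Chr = chr[keep],
             stringsAsFactors = FALSE)
}

# Toy inputs (made-up values) to show the call
toyKeep <- c(TRUE, FALSE, TRUE)
toyTable <- makeCriteriaTable(toyKeep,
  GSsnp = c(0.2, 0.1, 0.3), GSweight = c(0.6, 0.5, 0.7),
  kME = c(0.8, 0.7, 0.9), kIN = c(20, 10, 25),
  symbol = c("geneA", "geneB", "geneC"),
  locus = c("0", "0", "0"), chr = c(1, 2, 3))
str(toyTable)
```

# The result could then be written out with write.csv(), as in the blocks above.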
# Code ends here
APPENDIX 1: Gene Tables
Table 1: Criteria 1 - stringent GSweight threshold. Shaded cells in tables 1-4 demonstrate a gene is
discovered by two or more criteria (excluding criteria 5).
Symbol         ID           GSmQTL19     GSweight     kME          kIN          Locus       Chr
Anxa2          MMT00067823  0.199332005  0.649756971  0.858735713  27.46522635  9_37.0_cM   9
F7             MMT00078851  0.264531063  0.667600913  0.852072156  27.39089814  8_7.0_cM    8
Kng2           MMT00065159  0.238095673  0.657918826  0.813741354  21.10660724  0           16
9430028I06Rik  MMT00078732  0.251091211  0.677714137  0.775352538  17.04628191  0           3
Slc43a1        MMT00061313  0.220163199  0.684326146  0.780452876  15.28572522  0           2
Tubb2          MMT00006300  0.196766683  0.65886256   0.714639559  13.10651526  13_16.0_cM  13
Apom           MMT00030931  0.283209065  0.683942018  0.733975103  12.43188671  0           17
Avpr1a         MMT00031229  0.228304509  0.63565915   0.749624652  10.9795002   0           10
Table 2: Criteria 2 - stringent GSSNP19 threshold
Symbol         ID           GSmQTL19     GSweight     kME          kIN          Locus       Chr
F7             MMT00078851  0.264531063  0.667600913  0.852072156  27.39089814  8_7.0_cM    8
Pdir           MMT00008463  0.253024502  0.617847121  0.797604763  19.225064    0           16
Slc30a2        MMT00071411  0.25034153   0.584915878  0.785751252  18.99907753  4_65.7_cM   4
9430028I06Rik  MMT00078732  0.251091211  0.677714137  0.775352538  17.04628191  0           3
Ang1           MMT00064235  0.286826688  0.605352022  0.795856801  17.01403746  14_18.0_cM  14
Fsp27          MMT00039459  0.305643173  0.612743555  0.754988325  15.47686791  0           6
Gpld1          MMT00016835  0.273250987  0.543108251  0.7709886    15.33582603  13_13.0_cM  13
Apom           MMT00030931  0.283209065  0.683942018  0.733975103  12.43188671  0           17
C86987         MMT00018643  0.337407792  0.547482326  0.712211385  10.86948446  0           13
Table 3: Criteria 3 - stringent kME threshold
Symbol         ID           GSmQTL19     GSweight     kME          kIN          Locus       Chr
Anxa2          MMT00067823  0.199332005  0.649756971  0.858735713  27.46522635  9_37.0_cM   9
F7             MMT00078851  0.264531063  0.667600913  0.852072156  27.39089814  8_7.0_cM    8
Anxa5          MMT00056866  0.219243631  0.602986463  0.840368035  25.12158728  3_19.2_cM   3
AI324046       MMT00026028  0.196410814  0.536962981  0.820409646  22.5415714   0           12
Kng2           MMT00065159  0.238095673  0.657918826  0.813741354  21.10660724  0           16
0              MMT00081689  0.210815145  0.542111685  0.814462318  20.06842564  0           12
Msx2           MMT00028683  0.204991147  0.505218479  0.803718296  18.81991381  13_32.0_cM  13
Fetub          MMT00067079  0.195641933  0.562950147  0.811471942  17.66042261  0           16
Table 4: Criteria 4 - Balanced threshold
Symbol         ID           GSmQTL19     GSweight     kME          kIN          Locus       Chr
F7             MMT00078851  0.264531063  0.667600913  0.852072156  27.39089814  8_7.0_cM    8
Kng2           MMT00065159  0.238095673  0.657918826  0.813741354  21.10660724  0           16
Pdir           MMT00008463  0.253024502  0.617847121  0.797604763  19.225064    0           16
Slc30a2        MMT00071411  0.25034153   0.584915878  0.785751252  18.99907753  4_65.7_cM   4
9430028I06Rik  MMT00078732  0.251091211  0.677714137  0.775352538  17.04628191  0           3
Ang1           MMT00064235  0.286826688  0.605352022  0.795856801  17.01403746  14_18.0_cM  14
Fsp27          MMT00039459  0.305643173  0.612743555  0.754988325  15.47686791  0           6
Gpld1          MMT00016835  0.273250987  0.543108251  0.7709886    15.33582603  13_13.0_cM  13
Sh3d4          MMT00013759  0.237280683  0.604388051  0.788981054  14.93406009  14_34.5_cM  14
Table 5: Criteria 5 - stringent GSweight & kME thresholds, but relaxed GSSNP19
threshold. Genes that are in shaded boxes were elucidated by at least two of the first four
criteria. Genes with names in red were found by only one of the first four criteria.
Unshaded genes with names in black are new genes found by Criteria 5.
Symbol         ID           GSmQTL19     GSweight     kME          kIN          Locus       Chr
Anxa2          MMT00067823  0.199332005  0.649756971  0.858735713  27.46522635  9_37.0_cM   9
F7             MMT00078851  0.264531063  0.667600913  0.852072156  27.39089814  8_7.0_cM    8
Anxa5          MMT00056866  0.219243631  0.602986463  0.840368035  25.12158728  3_19.2_cM   3
Kng2           MMT00065159  0.238095673  0.657918826  0.813741354  21.10660724  0           16
0              MMT00081689  0.210815145  0.542111685  0.814462318  20.06842564  0           12
Itih1          MMT00081331  0.216393156  0.583683222  0.78349104   19.31456836  0           14
Pdir           MMT00008463  0.253024502  0.617847121  0.797604763  19.225064    0           16
Slc30a2        MMT00071411  0.25034153   0.584915878  0.785751252  18.99907753  4_65.7_cM   4
Fetub          MMT00067079  0.195641933  0.562950147  0.811471942  17.66042261  0           16
9430028I06Rik  MMT00078732  0.251091211  0.677714137  0.775352538  17.04628191  0           3
Ang1           MMT00064235  0.286826688  0.605352022  0.795856801  17.01403746  14_18.0_cM  14
Fsp27          MMT00039459  0.305643173  0.612743555  0.754988325  15.47686791  0           6
Gpld1          MMT00016835  0.273250987  0.543108251  0.7709886    15.33582603  13_13.0_cM  13
Slc43a1        MMT00061313  0.220163199  0.684326146  0.780452876  15.28572522  0           2
Sh3d4          MMT00013759  0.237280683  0.604388051  0.788981054  14.93406009  14_34.5_cM  14
Mat1a          MMT00013203  0.220571538  0.556804531  0.76276254   14.1241317   0           14
APPENDIX 2 – Gene Ontology Information
All information below was retrieved from Gene Ontology Classifications on the Mouse
Genome Informatics website [i].
Criteria 1 - Stringent GSweight
Anxa2 (annexin A2):
o processes: angiogenesis, collagen fibril organization, fibrinolysis
o function: calcium ion binding, calcium-dependent phospholipid binding,
  cytoskeletal protein binding, phospholipase inhibitor activity, protein binding
F7 (coagulation factor 7):
o processes: blood coagulation, metabolism, proteolysis
o function: calcium ion binding, coagulation factor VIIa activity, hydrolase activity,
  oxidoreductase activity, peptidase activity, serine-type endopeptidase activity
Kng2 (kininogen 2): no ontology data available on this website
9430028I06Rik (Lrrc39, leucine rich repeat containing 39):
o function: transferase activity
Slc43a1 (solute carrier family 43, member 1):
o processes: amino acid transport, L-amino acid transport, transport
o function: amino acid transporter activity, L-amino acid transporter activity
Tubb2 (tubulin, beta) - Note: this name is ambiguous, as it should be either Tubb2a or
Tubb2b. Gene ontology information below is derived from Tubb2a listings.
o processes: microtubule-based process
o function: GTP binding, nucleotide binding, structural constituent of cytoskeleton
Apom (apolipoprotein M):
o processes: lipid transport, transport
o function: binding, lipid transporter activity
Avpr1a (arginine vasopressin receptor 1a):
o processes: G-protein coupled receptor protein signaling pathway, signal
  transduction
o function: G-protein coupled receptor activity, receptor activity, rhodopsin-like
  receptor activity, signal transducer activity, vasopressin receptor activity
Criteria 2 - Stringent GSSNP19
F7 (coagulation factor 7):
o processes: blood coagulation, metabolism, proteolysis
o function: calcium ion binding, coagulation factor VIIa activity, hydrolase activity,
  oxidoreductase activity, peptidase activity, serine-type endopeptidase activity
Pdir (not found)
Slc30a2 (solute carrier family 30 - zinc transporter, member 2):
o processes: biological process unknown
o function: molecular function unknown
9430028I06Rik (Lrrc39, leucine rich repeat containing 39):
o processes: none available
o function: transferase activity
Ang1 (angiogenin, ribonuclease A family, member 1):
o processes: angiogenesis, cell differentiation, development, negative regulation of
  protein biosynthesis
o function: endonuclease activity, hydrolase activity, nuclease activity, nucleic acid
  binding, pancreatic ribonuclease activity
Fsp27 (aka Cidec, cell death-inducing DFFA-like effector c):
o processes: apoptosis, induction of apoptosis
o function: protein binding
Gpld1 (glycosylphosphatidylinositol specific phospholipase):
o processes: GPI anchor release
o function: glycosylphosphatidylinositol phospholipase D activity, hydrolase
  activity, lipid transporter activity, phospholipase D activity
Apom (apolipoprotein M):
o processes: lipid transport, transport
o function: binding, lipid transporter activity
C86987 (aka Ung2, uracil DNA glycosylase 2): no ontology data available on this
website
Criteria 3 - Stringent kME
Anxa2 (annexin A2):
o processes: angiogenesis, collagen fibril organization, fibrinolysis
o function: calcium ion binding, calcium-dependent phospholipid binding,
  cytoskeletal protein binding, phospholipase inhibitor activity, protein binding
F7 (coagulation factor 7):
o processes: blood coagulation, metabolism, proteolysis
o function: calcium ion binding, coagulation factor VIIa activity, hydrolase activity,
  oxidoreductase activity, peptidase activity, serine-type endopeptidase activity
Anxa5 (annexin A5):
o processes: blood coagulation, negative regulation of coagulation
o function: calcium ion binding, calcium-dependent phospholipid binding
AI324046:
o processes: none available
o function: antigen binding
Kng2 (kininogen 2): no ontology data available on this website
Msx2 (homeo box, msh-like 2):
o processes: development, embryonic limb morphogenesis, regulation of
  transcription, regulation of transcription, DNA-dependent
o function: DNA binding, protein binding, sequence-specific DNA binding,
  transcription factor activity
Fetub (fetuin beta):
o processes: none available
o function: cysteine protease inhibitor activity
Criteria 4 - Balanced thresholds
F7 (coagulation factor 7):
o processes: blood coagulation, metabolism, proteolysis
o function: calcium ion binding, coagulation factor VIIa activity, hydrolase activity,
  oxidoreductase activity, peptidase activity, serine-type endopeptidase activity
Kng2 (kininogen 2): no ontology data available on this website
Pdir (not found)
Slc30a2 (solute carrier family 30 - zinc transporter, member 2):
o processes: biological process unknown
o function: molecular function unknown
9430028I06Rik (Lrrc39, leucine rich repeat containing 39):
o function: transferase activity
Ang1 (angiogenin, ribonuclease A family, member 1):
o processes: angiogenesis, cell differentiation, development, negative regulation of
  protein biosynthesis
o function: endonuclease activity, hydrolase activity, nuclease activity, nucleic acid
  binding, pancreatic ribonuclease activity
Fsp27 (aka Cidec, cell death-inducing DFFA-like effector c):
o processes: apoptosis, induction of apoptosis
o function: protein binding
Gpld1 (glycosylphosphatidylinositol specific phospholipase):
o processes: GPI anchor release
o function: glycosylphosphatidylinositol phospholipase D activity, hydrolase
  activity, lipid transporter activity, phospholipase D activity
Sh3d4 (aka sorbin and SH3 domain containing 3):
o processes: cell adhesion, cell-substrate adhesion, negative regulation of
  transcription from RNA polymerase II promoter, positive regulation of MAPKKK
  cascade, transport
o function: protein binding, transcription factor binding
Criteria 5 - Stringent GSweight and kME thresholds, relaxed GSSNP19 threshold
Anxa2 (annexin A2):
o processes: angiogenesis, collagen fibril organization, fibrinolysis
o function: calcium ion binding, calcium-dependent phospholipid binding,
  cytoskeletal protein binding, phospholipase inhibitor activity, protein binding
F7 (coagulation factor 7):
o processes: blood coagulation, metabolism, proteolysis
o function: calcium ion binding, coagulation factor VIIa activity, hydrolase activity,
  oxidoreductase activity, peptidase activity, serine-type endopeptidase activity
Anxa5 (annexin A5):
o processes: blood coagulation, negative regulation of coagulation
o function: calcium ion binding, calcium-dependent phospholipid binding
Kng2 (kininogen 2): no ontology data available on this website
Itih1 (inter-alpha (globulin) inhibitor, H1 polypeptide, Intin1, Itih-1):
o processes: hyaluronan metabolism
o function: copper ion binding, endopeptidase inhibitor activity, serine-type
  endopeptidase inhibitor activity
Pdir (not found)
Slc30a2 (solute carrier family 30 - zinc transporter, member 2):
o processes: biological process unknown
o function: molecular function unknown
Fetub (fetuin beta):
o processes: none available
o function: cysteine protease inhibitor activity
9430028I06Rik (Lrrc39, leucine rich repeat containing 39):
o processes: none available
o function: transferase activity
Ang1 (angiogenin, ribonuclease A family, member 1):
o processes: angiogenesis, cell differentiation, development, negative regulation of
  protein biosynthesis
o function: endonuclease activity, hydrolase activity, nuclease activity, nucleic acid
  binding, pancreatic ribonuclease activity
Fsp27 (aka Cidec, cell death-inducing DFFA-like effector c):
o processes: apoptosis, induction of apoptosis
o function: protein binding
Gpld1 (glycosylphosphatidylinositol specific phospholipase):
o processes: GPI anchor release
o function: glycosylphosphatidylinositol phospholipase D activity, hydrolase
  activity, lipid transporter activity, phospholipase D activity
Slc43a1 (solute carrier family 43, member 1):
o processes: amino acid transport, L-amino acid transport, transport
o function: amino acid transporter activity, L-amino acid transporter activity
Sh3d4 (aka sorbin and SH3 domain containing 3):
o processes: cell adhesion, cell-substrate adhesion, negative regulation of
  transcription from RNA polymerase II promoter, positive regulation of MAPKKK
  cascade, transport
o function: protein binding, transcription factor binding
Mat1a (methionine adenosyltransferase I, alpha):
o processes: one-carbon compound metabolism
o function: ATP binding, magnesium ion binding, metal ion binding, methionine
  adenosyltransferase activity, nucleotide binding, potassium ion binding,
  transferase activity
[i] Mouse Genome Informatics, http://www.informatics.jax.org/ (last accessed 9/25/06).