Pathway analysis of genome-wide association study and transcriptome
data highlights new biological pathways in colorectal cancer
Abstract
Colorectal cancer (CRC) is a common malignancy that meets the definition of a complex
disease. Genome-wide association studies (GWAS) have identified several loci of weak
predictive value in CRC; however, these do not fully explain the risk of occurrence. Recently,
gene set analysis has allowed enhanced interpretation of GWAS data in CRC, identifying a
number of metabolic pathways as important for disease pathogenesis. Whether there are
other important pathways involved in CRC, however, remains unclear. We present a
systems analysis of KEGG pathways in CRC using (1) a human CRC GWAS dataset and (2)
a human whole transcriptome CRC case-control expression dataset. Analysis of the GWAS
dataset revealed significantly enriched KEGG pathways related to metabolism, immune
system and diseases, cellular processes, environmental information processing, genetic
information processing, and neurodegenerative diseases. Altered gene expression was
confirmed in these pathways using the transcriptome dataset. Taken together, these findings
not only confirm previous work in this area, but also highlight new biological pathways
whose deregulation is critical for CRC. These results contribute to our understanding of
disease-causing mechanisms and will prove useful for future genetic and functional studies
in CRC.
Keywords: colorectal cancer, GWAS, pathway analysis, transcriptome, metabolic
pathways
Introduction
Colorectal cancer (CRC), also called colon cancer or large bowel cancer, is the third most
common form of cancer and the second leading cause of cancer-related death in the
Western world with a lifetime risk in the United States of approximately 7%. CRC can be
considered a complex disease, with a combination of genetic variants and environmental
factors contributing to the illness as a whole (1). Relevant genetic variants have not been
completely defined; however, genome-wide association studies (GWAS) have recently
revealed several novel CRC susceptibility loci (http://www.genome.gov/gwastudies/). The
newly identified variants exert only very small risk effects and cannot fully explain the
underlying CRC genetic risk. A large proportion of the heritability of CRC is therefore yet
to be explained.
To overcome one of the well-documented limitations of GWAS (2-3), namely its tendency
to find large numbers of relatively weak predictors, investigators have developed a means
of analyzing GWAS data that aggregates information over sets of related genes, such as
genes in common pathways, to identify gene sets that are enriched for variants associated
with disease. This “gene set” analysis strategy has yielded important new insights into the
genetic mechanisms of many complex diseases (4), such as Alzheimer’s disease (5,6),
rheumatoid arthritis (7,8), Crohn’s disease, celiac disease, type 1 diabetes, multiple
sclerosis (9–12), schizophrenia (13,14), bipolar disorder (13), and cancers of the bladder
(15), breast (16) and lung (17).
In an attempt to further enhance the power of gene set analysis in situations in which there
are a large number of predictors compared with sample size, Chen et al. developed an
algorithm termed ‘gene set ridge regression in association studies’ (GRASS). When the
GRASS algorithm was validated on CRC GWAS data, the top two enriched pathways were
nicotinate and nicotinamide metabolism and transforming growth factor beta (TGF-beta)
signaling (Kyoto Encyclopedia of Genes and Genomes identifiers hsa00760 and hsa04350,
respectively) (18). Several other pathways related to metabolism were also found to be
enriched including glycosphingolipid biosynthesis – lacto and neolacto series (hsa00601),
beta-Alanine metabolism (hsa00410), phenylalanine metabolism (hsa00360) and inositol
phosphate metabolism (hsa00562) (18). In addition to GRASS, analysis of the CRC GWAS
data using the existing gene set method proposed by Wang et al. (3) revealed a further 15
significant pathways, of which 14 were related to metabolism (18).
To date, therefore, evidence from pathway analysis of CRC GWAS data has predominantly
implicated the involvement of altered metabolic pathways in CRC occurrence. To better
investigate whether there are other important pathways involved in CRC, we conducted a
systems analysis of KEGG pathways, using (1) a CRC GWAS dataset and (2) a human
whole genome CRC case-control gene expression dataset.
Materials and methods
CRC GWAS dataset
The data were obtained from the Colorectal Tumour Gene Identification (CoRGI)
consortium (19). This study included 922 cases and 927 controls. The CRC cases had at
least one first-degree relative affected by CRC and one or more of the following
phenotypes: CRC at age 75 or below; any colorectal adenoma at age 45 or below; three or
more colorectal adenomas at age 75 or below; or a large or aggressive adenoma at age 75
or below. Controls were spouses or partners unaffected by cancer and without a personal
family history of colorectal neoplasia. All cases and controls were of European ancestry
and from the UK. A total of 555,352 SNPs were genotyped using the Illumina Hap550
BeadChip Array. On analysis, satisfactory data were obtained from 550,163 SNPs (99.1%),
with mean individual sample call rates of 99.7% and 99.8% in cases and controls,
respectively. Of the SNPs satisfactorily genotyped, 2516 were monomorphic, leaving
547,647 SNPs for which genotype data were informative. After quality control, we used the
summary genotype data from autosomes 1–22, which included 534,050 SNPs. The
Chi-square test (2 × 2 allele) was used to investigate the association between each
polymorphism and CRC. All the chi-square tests were conducted using the R software
environment (http://www.r-project.org/).
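The allelic test described above can be illustrated with a minimal sketch. The authors ran their tests in R; the Python below is only an illustration of a Pearson chi-square test on a 2 × 2 table of allele counts. The function name, the absence of a continuity correction, and the allele counts are assumptions, not details from the source.

```python
import math


def allelic_chi_square(case_a, case_b, control_a, control_b):
    """Pearson chi-square test (1 df) on a 2 x 2 table of allele counts.

    Rows are cases and controls, columns are the two alleles of one SNP.
    Returns (chi-square statistic, p-value). No continuity correction is
    applied (an assumption; the source does not state which variant was used).
    """
    table = [[case_a, case_b], [control_a, control_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        # A monomorphic SNP (one allele never observed) cannot be tested.
        raise ValueError("every row and column needs a non-zero total")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts, not taken from the CoRGI data.
chi2, p = allelic_chi_square(1000, 844, 900, 954)
```

The sketch uses only the standard library; the closed-form tail via `math.erfc` holds for one degree of freedom only.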
Human CRC whole genome expression dataset
The expression dataset was originally analyzed by Lascorz et al. (20). They collected 23
CRC gene expression profiling studies and performed a meta-analysis. In the 23
independent GEP datasets, 1897 different gene identifiers were reported to be differentially
expressed (p < 0.05). The authors describe 1475 uniquely mapped genes including 603
up-regulated and 794 down-regulated genes from the original study. We used IDconverter
software to convert the gene symbols to Entrez gene IDs (21). In the end, we selected 575
up-regulated genes and 763 down-regulated genes with unique Entrez gene IDs for our
subsequent analysis.
Gene-based GWAS data analysis
ProxyGeneLD software was used to perform a gene-based test (22). This software is
designed to flexibly consider the complex linkage disequilibrium (LD) patterns of the
human genome and correct for the inflation of significance caused by gene length. The
program uses the LD structures in the HapMap genotyping data (CEU samples of HapMap
phase II, release 22). If a group of markers are in high LD in HapMap (r² > 0.8), they are
tied to a “proxy cluster” and taken as a single signal. Next, each marker in the CRC GWAS
with statistically significant evidence of association is evaluated to determine (a) whether it
belongs to any proxy cluster and (b) whether the marker itself or any marker in the cluster
is located in a genetic region. If a marker or cluster overlaps with a region extending across
a gene, it is assigned as a signal showing the possible association of that gene. Finally, a
p-value was assigned for each gene (22). Genes with p < 0.05 were considered to be
significant.
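The clustering and gene-assignment steps described above can be sketched roughly as follows. This is not ProxyGeneLD's actual code: treating a proxy cluster as a connected component of markers linked by r² > 0.8 is an assumption, and all function names, marker names, positions, and gene intervals are hypothetical.

```python
def proxy_clusters(markers, ld_pairs, r2_threshold=0.8):
    """Group markers into clusters linked by pairwise r^2 above a threshold.

    Uses union-find, so clusters are connected components (an assumption
    about how markers in high LD are tied together).
    """
    parent = {m: m for m in markers}

    def find(m):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for (a, b), r2 in ld_pairs.items():
        if r2 > r2_threshold and a in parent and b in parent:
            parent[find(a)] = find(b)

    clusters = {}
    for m in markers:
        clusters.setdefault(find(m), []).append(m)
    return sorted(sorted(c) for c in clusters.values())


def genes_hit_by_cluster(cluster, marker_pos, genes):
    """Return genes whose interval contains at least one marker of the cluster."""
    hits = set()
    for m in cluster:
        chrom, pos = marker_pos[m]
        for gene, (g_chrom, start, end) in genes.items():
            if chrom == g_chrom and start <= pos <= end:
                hits.add(gene)
    return sorted(hits)


# Hypothetical toy data.
markers = ["snp1", "snp2", "snp3", "snp4"]
ld_pairs = {("snp1", "snp2"): 0.90, ("snp2", "snp3"): 0.85, ("snp3", "snp4"): 0.50}
marker_pos = {"snp1": ("1", 100), "snp2": ("1", 250), "snp3": ("1", 900), "snp4": ("2", 50)}
genes = {"GENE_A": ("1", 200, 300), "GENE_B": ("2", 0, 100)}

clusters = proxy_clusters(markers, ld_pairs)
gene_hits = [genes_hit_by_cluster(c, marker_pos, genes) for c in clusters]
```

In this toy example a gene is credited once per cluster rather than once per marker, which is the point of treating markers in high LD as a single signal.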
Gene set (pathway) analysis
The WebGestalt toolkit (http://bioinfo.vanderbilt.edu/webgestalt/) was used to perform a
pathway analysis (23). For a given KEGG pathway, a hypergeometric test was used to
detect an overrepresentation of the CRC-related genes among all the genes in the pathway
(23). The p-value of observing more than K CRC-related genes in the pathway was
calculated by
S N S
(
)(
)
K
i mi
P  1 
,
N
i 0
( )
m
where N is the total number of genes that are of interest, S is the number of all CRC-related
genes, m is the number of genes in the pathway, and K is the number of CRC-related genes
in the pathway. The Benjamini & Hochberg (BH) method was used to correct for multiple
testing. Pathways with adjusted p < 0.05 were considered to be significant. To reduce the
multiple-testing issue, and to avoid testing overly narrow or broad pathways, we selected
pathways that contained at least 20 and at most 300 genes for subsequent analysis.
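The two calculations in this section, the hypergeometric tail probability and the Benjamini & Hochberg adjustment, can be sketched as below. WebGestalt performed these steps in the study; this standard-library Python is only an illustration, and the function names and toy inputs are assumptions.

```python
from math import comb


def hypergeom_upper_tail(N, S, m, K):
    """P(X > K) for a hypergeometric draw, using the symbols from the text.

    N: total genes of interest, S: CRC-related genes,
    m: genes in the pathway, K: CRC-related genes in the pathway.
    Summing the upper tail directly is equal to 1 - sum_{i=0}^{K}(...)
    but avoids cancellation when the p-value is very small.
    """
    upper = min(S, m)
    numerator = sum(
        comb(S, i) * comb(N - S, m - i) for i in range(K + 1, upper + 1)
    )
    return numerator / comb(N, m)


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical toy inputs, not values from the study.
p_toy = hypergeom_upper_tail(N=10, S=4, m=3, K=0)
adj_toy = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Some implementations report the probability of observing at least K genes instead; with this sketch that would correspond to calling `hypergeom_upper_tail(N, S, m, K - 1)`.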
Results
Pathway analysis of the CRC GWAS dataset
Using ProxyGeneLD, we identified 981 CRC genes with p < 0.05. Pathway analysis was
conducted on all 981 significant genes. We identified 34 significant KEGG pathways
(adjusted p < 0.05) that each included at least five CRC genes identified in the gene-based
method (Table 1). According to the KEGG classifications, the functions of these pathways
included metabolism (n = 8), cellular processes (n = 5), environmental information
processing (n = 5), genetic information processing (n = 5), neurodegenerative diseases (n =
3) and the immune system (n = 2). The eight metabolic pathways in which genes showed
significant enrichment included oxidative phosphorylation, glycerophospholipid
metabolism, fructose and mannose metabolism, purine metabolism, amino sugar and
nucleotide sugar metabolism, arginine and proline metabolism, arachidonic acid
metabolism and retinol metabolism. Detailed information is described in supplementary
Table 1.
Table 1
Pathway analysis of the human CRC whole genome expression dataset
For the expression dataset, to achieve internal comparison, we analyzed the significantly
up-regulated and down-regulated genes (CRC cases vs controls) separately. Using the
up-regulated genes, we identified 92 significant KEGG pathways (adjusted p < 0.05) that
each included at least five CRC genes. These pathways were mainly related to immune
system and diseases (n = 18), environmental information processing (n = 12), infectious
diseases (n = 11), cellular processes (n = 10), cancers (n = 9), metabolism (n = 5) and
cardiovascular diseases (n = 4). Importantly, we observed an enrichment of significantly
up-regulated genes in 25 of 34 significant KEGG pathways identified by analysis of the
GWAS data (Table 2). These results are provided in detail in supplementary Table 2.
Using the down-regulated genes, we identified 105 significant KEGG pathways (adjusted p
< 0.05) that included at least five CRC genes. These pathways were mainly related to
metabolism (n = 15), immune system and diseases (n = 13), cancers (n = 13), cellular
processes (n = 12), environmental information processing (n = 11), genetic information
processing (n = 10), infectious diseases (n = 9), neurodegenerative diseases (n = 5), and
cardiovascular diseases (n = 4). As with the over-expressed genes, we observed an
enrichment of significantly down-regulated genes in the majority of the significant KEGG
pathways identified by the GWAS analysis (27 of 34; Table 3). Detailed results are
provided in supplementary Table 3.
Table 2
Discussion
CRC is a complex disease that is likely to be caused by a combination of genetic and
environmental factors. Recently, Chen et al. applied two pathway analysis methods to CRC
GWAS data and identified several potentially important metabolic pathways (18). To
investigate whether deregulation of other pathways is critical for the development of CRC,
we conducted a systems analysis using GWAS and whole transcriptome data.
Our approach identified numerous hitherto unrecognized cellular pathways important for
CRC occurrence, and also confirmed the substantial involvement of metabolic pathways in
CRC that was first reported by Chen et al. (18). Evidence from recent epidemiologic
studies also supports an association with metabolic processes, showing a link between
metabolic syndrome (MS) and the risk for CRC. Kim et al. analyzed 2531 subjects
including 731 CRC cases and 1800 controls and found that the prevalence of MS was 17%
in patients with colorectal adenoma but only 11% in the control group (24). In keeping with
the latter findings, a similar study screened 1771 CRC patients and 4667 controls and
observed that metabolic risk factors such as high waist circumference, blood pressure, and
serum triglyceride levels were associated with an increased risk of CRC (25). In the latter
cohort MS was also associated with an increased risk of adenoma (OR = 1.44, 95% CI =
1.23–1.70) (25). MS increased the risk of right colon adenomas (OR = 1.50, 95% CI =
1.22–1.85), left colon adenomas (OR = 1.36, 95% CI = 1.05–1.76), and adenomas in
multiple anatomical locations (OR = 1.59, 95% CI = 1.19–2.12) (25).
Recently, Esposito et al. conducted a systematic review and meta-analysis to assess the
association between MS and the risk of different cancers (26). They analyzed 116 datasets
from 43 articles, encompassing 38,940 cases of cancer. The presence of MS was associated
with increased incidence of CRC in men (relative risk (RR) = 1.25, p < 0.001) and in women
(RR = 1.34, p = 0.006) (26). Aleksandrova et al. performed a nested case-control
study using 1093 CRC cases and 1093 controls (27). The results showed that among
individual components of MS, abdominal obesity (RR = 1.51; 95% CI: 1.16–1.96) and
abnormal glucose metabolism (RR = 2.05; 95% CI: 1.57–2.68) were associated with CRC
(27).
Teo et al. recently assessed the association between genetic risk factors of MS or related
conditions and clinical outcome in stage II CRC patients (28). They analyzed the
expression levels of several genes related to MS and associated alterations in two
equivalent but independent sets of stage II CRC patients. The results showed that a gene
expression profile constituted by genes previously related to MS was significantly
associated with clinical outcome of stage II CRC patients.
Despite the abundant evidence from epidemiological studies of an association between MS
and CRC, such reports provide little information on the precise metabolic mechanisms that
affect CRC risk. Our results therefore provide clues that may help to explain the link
between these two conditions and provide rational targets for potential therapeutic
intervention.
In addition to the metabolic pathways identified, we highlighted the involvement of
pathways related to cellular processes, environmental information processing, genetic
information processing, neurodegenerative diseases and the immune system in the
development of CRC. Indeed, the RNA transport pathway, one of the KEGG-defined
genetic information processing pathways, was the most significant pathway from analysis
of CRC GWAS data (Table 1). In keeping with this finding we also observed an enrichment
of significantly down-regulated genes in this pathway (Table 2). RNA transport from the
nucleus to the cytoplasm is fundamental for gene expression. The different RNA species
that are produced in the nucleus are exported through the nuclear pore complexes via
mobile export receptors. However, general mRNA export is mechanistically different.
Export of transcripts can be modulated in response to cellular signaling or stress (29).
mRNA export has been found to be consistently dysregulated in primary material from
many different forms of cancer. Aberrant expression of export factors can alter the export
of specific transcripts encoding proteins involved in proliferation, survival, and
oncogenesis. Thus, like transcription and translation, mRNA export may also play a critical
role in cancer genesis and maintenance (29), and strategies to target different aspects of this
pathway are now undergoing early-stage clinical trials.
A variety of software tools are available for gene set (pathway) analysis of GWAS data (30).
Some, such as SNP ratio test (31), GenGen (3), GRASS (18), and PLINK set-test (32),
accept raw genotype datasets as input data. Others, including ProxyGeneLD (22),
ALIGATOR, i-GSEA4GWAS, and GESBAP (30), require SNP p-values to be calculated
first and use them as input for the subsequent gene-set analysis. In the present study we selected
ProxyGeneLD for the initial gene-based test because we did not have access to raw CRC
genotype data. This software is capable of adjusting for gene length and LD patterns in the
human genome, reducing the sources of bias and increasing the reliability of the pathway
analysis.
In this study, we used pathways from the KEGG database rather than the GO database,
for the following reasons. First, the KEGG database is manually compiled on
the basis of biological evidence and does not have a hierarchical structure (33-34), whereas
the GO database is based mainly on computer predictions as well as human annotation and
has a hierarchical structure. GO analysis typically assumes that each functional category is
independent, and less than 1% of the GO annotations have been confirmed experimentally
(33-34). Second, Chen et al. performed a pathway analysis of CRC GWAS using KEGG
pathways (18). To compare our findings with those of Chen et al., we restricted our
analysis to KEGG pathways.
For the expression dataset, there are two strategies for enrichment analysis of pathways: the
analysis of all differentially expressed genes together or the analysis of up- and
down-regulated genes separately (35). Recently, Hong et al. examined the rationale for
these enrichment analysis strategies using gene expression profiles from five types of
tumors (35). They concluded that the separate analysis of up- and down-regulated genes
could identify more pathways that are genuinely pertinent to phenotypic difference than
analyzing all of the differentially expressed genes together (35). Based on the latter
findings, we analyzed the significantly up-regulated (CRC case vs. controls) and
down-regulated genes (CRC case vs. controls) separately in pathway analysis.
Despite these interesting results, we recognize some limitations in our study. First, our
study suffered from the usual drawback of gene set enrichment analysis resulting from use
of the hypergeometric distribution to assess enrichment in pathways. This assumes that
signals (i.e., pathways) are independent, which is unlikely to be the case (33-34, 36).
Second, multiple-testing corrections may not be sufficient to account for all biases. Ideally
the results from the CRC GWAS should be adjusted using a permutation test. However, the
original SNP genotype data for each individual were not available to us. Our
intention in future is to obtain these data and to perform additional pathway analysis using
SNP ratio test (31), GenGen (3), GRASS (18), and PLINK set-test (32), which can be used
to analyze the SNP genotype data, and to conduct a permutation test. Additionally, as with
all findings obtained from GWAS data, further analyses are required from independent
cohorts to confirm our findings.
In summary, we not only provide strong evidence of the involvement of specific metabolic
pathways in the development of CRC, but also highlight the involvement of pathways
related to cellular processes, environmental information processing, genetic information
processing, neurodegenerative diseases and the immune system. We believe that our results
may advance the understanding of CRC mechanisms and will be highly useful for future
genetic and clinical studies in CRC.
Table 1. KEGG pathways with P<0.05 by pathway analysis of CRC GWAS

| Pathway ID | Pathway Name | KEGG classification | C | O | E | R | rawP | adjP |
|---|---|---|---|---|---|---|---|---|
| hsa03013 | RNA transport | Genetic Information Processing | 151 | 18 | 3.42 | 5.26 | 1.14E-08 | 3.13E-07 |
| hsa04145 | Phagosome | Cellular Processes | 153 | 16 | 3.47 | 4.62 | 4.52E-07 | 8.29E-06 |
| hsa05016 | Huntington's disease | Neurodegenerative diseases | 183 | 15 | 4.15 | 3.62 | 2.05E-05 | 3.00E-04 |
| hsa00190 | Oxidative phosphorylation | Metabolism | 132 | 12 | 2.99 | 4.01 | 4.96E-05 | 5.00E-04 |
| hsa05012 | Parkinson's disease | Neurodegenerative diseases | 130 | 12 | 2.95 | 4.07 | 4.27E-05 | 5.00E-04 |
| hsa04350 | TGF-beta signaling pathway | Environmental Information Processing | 84 | 9 | 1.9 | 4.73 | 1.00E-04 | 7.00E-04 |
| hsa04141 | Protein processing in endoplasmic reticulum | Genetic Information Processing | 165 | 13 | 3.74 | 3.48 | 1.00E-04 | 7.00E-04 |
| hsa04144 | Endocytosis | Cellular Processes | 201 | 14 | 4.55 | 3.07 | 2.00E-04 | 1.00E-03 |
| hsa04360 | Axon guidance | Development | 129 | 11 | 2.92 | 3.76 | 2.00E-04 | 1.00E-03 |
| hsa03040 | Spliceosome | Genetic Information Processing | 127 | 11 | 2.88 | 3.82 | 2.00E-04 | 1.00E-03 |
| hsa00564 | Glycerophospholipid metabolism | Metabolism | 80 | 8 | 1.81 | 4.41 | 5.00E-04 | 2.10E-03 |
| hsa04810 | Regulation of actin cytoskeleton | Cellular Processes | 213 | 13 | 4.83 | 2.69 | 1.20E-03 | 4.40E-03 |
| hsa00051 | Fructose and mannose metabolism | Metabolism | 36 | 5 | 0.82 | 6.13 | 1.20E-03 | 4.40E-03 |
| hsa05110 | Vibrio cholerae infection | Infectious diseases: Bacterial | 54 | 6 | 1.22 | 4.9 | 1.40E-03 | 4.80E-03 |
| hsa03008 | Ribosome biogenesis in eukaryotes | Genetic Information Processing | 80 | 7 | 1.81 | 3.86 | 2.30E-03 | 7.40E-03 |
| hsa04080 | Neuroactive ligand-receptor interaction | Environmental Information Processing | 272 | 14 | 6.16 | 2.27 | 4.00E-03 | 1.16E-02 |
| hsa00230 | Purine metabolism | Metabolism | 162 | 10 | 3.67 | 2.72 | 4.00E-03 | 1.16E-02 |
| hsa00520 | Amino sugar and nucleotide sugar metabolism | Metabolism | 48 | 5 | 1.09 | 4.6 | 4.50E-03 | 1.24E-02 |
| hsa05010 | Alzheimer's disease | Neurodegenerative diseases | 167 | 10 | 3.78 | 2.64 | 5.00E-03 | 1.31E-02 |
| hsa00330 | Arginine and proline metabolism | Metabolism | 54 | 5 | 1.22 | 4.09 | 7.50E-03 | 1.88E-02 |
| hsa04260 | Cardiac muscle contraction | Circulatory system | 77 | 6 | 1.74 | 3.44 | 8.10E-03 | 1.94E-02 |
| hsa04010 | MAPK signaling pathway | Environmental Information Processing | 268 | 13 | 6.07 | 2.14 | 8.70E-03 | 1.99E-02 |
| hsa00590 | Arachidonic acid metabolism | Metabolism | 59 | 5 | 1.34 | 3.74 | 1.08E-02 | 2.38E-02 |
| hsa03015 | mRNA surveillance pathway | Genetic Information Processing | 83 | 6 | 1.88 | 3.19 | 1.15E-02 | 2.43E-02 |
| hsa05215 | Prostate cancer | Cancers: Specific types | 89 | 6 | 2.02 | 2.98 | 1.58E-02 | 3.04E-02 |
| hsa04540 | Gap junction | Cellular Processes | 90 | 6 | 2.04 | 2.94 | 1.66E-02 | 3.04E-02 |
| hsa00830 | Retinol metabolism | Metabolism | 64 | 5 | 1.45 | 3.45 | 1.50E-02 | 3.04E-02 |
| hsa04510 | Focal adhesion | Cellular Processes | 200 | 10 | 4.53 | 2.21 | 1.64E-02 | 3.04E-02 |
| hsa05211 | Renal cell carcinoma | Cancers: Specific types | 70 | 5 | 1.59 | 3.15 | 2.13E-02 | 3.66E-02 |
| hsa03320 | PPAR signaling pathway | Endocrine system | 70 | 5 | 1.59 | 3.15 | 2.13E-02 | 3.66E-02 |
| hsa04630 | Jak-STAT signaling pathway | Environmental Information Processing | 155 | 8 | 3.51 | 2.28 | 2.54E-02 | 4.23E-02 |
| hsa04612 | Antigen processing and presentation | Immune system | 76 | 5 | 1.72 | 2.9 | 2.92E-02 | 4.46E-02 |
| hsa04620 | Toll-like receptor signaling pathway | Immune system | 102 | 6 | 2.31 | 2.6 | 2.87E-02 | 4.46E-02 |
| hsa04370 | VEGF signaling pathway | Environmental Information Processing | 76 | 5 | 1.72 | 2.9 | 2.92E-02 | 4.46E-02 |

C, number of reference genes in the category; O, number of genes in the gene set and also in the category; E, expected number in the category; R, ratio of enrichment; rawP, p
value from hypergeometric test; adjP, p value adjusted by the multiple test adjustment.
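As a worked check on the columns of Table 1, the ratio of enrichment can be recomputed from the observed and expected counts. The footnote does not give the formula, so reading R as O divided by E is an assumption; the rows below, copied from Table 1, are consistent with it up to rounding of E. The function name is hypothetical.

```python
def enrichment_ratio(observed, expected):
    """Ratio of enrichment, assumed here to be observed / expected."""
    return observed / expected


# (pathway ID, C, O, E, R) copied from the first rows of Table 1.
table1_rows = [
    ("hsa03013", 151, 18, 3.42, 5.26),
    ("hsa04145", 153, 16, 3.47, 4.62),
    ("hsa05016", 183, 15, 4.15, 3.62),
    ("hsa00190", 132, 12, 2.99, 4.01),
]

for pathway_id, c, o, e, r_reported in table1_rows:
    r_recomputed = enrichment_ratio(o, e)
    # E / C is the implied fraction of reference genes expected by chance.
    implied_fraction = e / c
    print(pathway_id, round(r_recomputed, 2), r_reported, round(implied_fraction, 4))
```

The implied fraction E / C is nearly constant across rows, as expected if E scales with pathway size C.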
Table 2. KEGG pathways with P<0.05 using human CRC up-regulated genes

| Pathway ID | Pathway Name | Up C | Up O | Up E | Up R | Up rawP | Up adjP | Down C | Down O | Down E | Down R | Down rawP | Down adjP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hsa03013 | RNA transport | 151 | N | N | N | N | N | 151 | 13 | 2.65 | 4.9 | 3.00E-06 | 8.92E-06 |
| hsa04145 | Phagosome | 153 | 18 | 2.03 | 8.85 | 2.84E-12 | 8.99E-11 | 153 | 17 | 2.69 | 6.32 | 2.04E-09 | 1.41E-08 |
| hsa05016 | Huntington's disease | 183 | 14 | 2.43 | 5.76 | 1.91E-07 | 7.56E-07 | 183 | 21 | 3.22 | 6.53 | 1.42E-11 | 3.04E-10 |
| hsa00190 | Oxidative phosphorylation | 132 | 11 | 1.75 | 6.27 | 1.68E-06 | 4.99E-06 | 132 | 16 | 2.32 | 6.9 | 1.69E-09 | 1.29E-08 |
| hsa05012 | Parkinson's disease | 130 | 12 | 1.73 | 6.95 | 1.88E-07 | 7.56E-07 | 130 | 17 | 2.28 | 7.44 | 1.57E-10 | 1.68E-09 |
| hsa04350 | TGF-beta signaling pathway | 84 | 7 | 1.12 | 6.27 | 1.00E-04 | 2.00E-04 | 84 | 11 | 1.48 | 7.45 | 2.67E-07 | 1.10E-06 |
| hsa04141 | Protein processing in endoplasmic reticulum | 165 | 8 | 2.19 | 3.65 | 1.70E-03 | 2.20E-03 | 165 | 19 | 2.9 | 6.55 | 1.25E-10 | 1.49E-09 |
| hsa04144 | Endocytosis | 201 | 11 | 2.67 | 4.12 | 8.79E-05 | 2.00E-04 | 201 | 21 | 3.53 | 5.94 | 8.48E-11 | 1.30E-09 |
| hsa04360 | Axon guidance | 129 | 10 | 1.71 | 5.83 | 9.48E-06 | 2.50E-05 | 129 | 6 | 2.27 | 2.65 | 2.68E-02 | 2.68E-02 |
| hsa03040 | Spliceosome | 127 | 5 | 1.69 | 2.96 | 2.76E-02 | 2.79E-02 | 127 | 13 | 2.23 | 5.82 | 4.18E-07 | 1.60E-06 |
| hsa04810 | Regulation of actin cytoskeleton | 213 | 17 | 2.83 | 6.01 | 5.02E-09 | 3.67E-08 | 213 | 14 | 3.74 | 3.74 | 2.82E-05 | 6.16E-05 |
| hsa05110 | Vibrio cholerae infection | 54 | 6 | 0.72 | 8.36 | 8.06E-05 | 1.00E-04 | 54 | 5 | 0.95 | 5.27 | 2.60E-03 | 2.90E-03 |
| hsa04080 | Neuroactive ligand-receptor interaction | 272 | 9 | 3.61 | 2.49 | 1.11E-02 | 1.17E-02 | 272 | N | N | N | N | N |
| hsa00230 | Purine metabolism | 162 | 6 | 2.15 | 2.79 | 2.16E-02 | 2.21E-02 | 162 | 9 | 2.85 | 3.16 | 2.40E-03 | 2.70E-03 |
| hsa05010 | Alzheimer's disease | 167 | 13 | 2.22 | 5.86 | 4.26E-07 | 1.40E-06 | 167 | 17 | 2.94 | 5.79 | 7.80E-09 | 4.39E-08 |
| hsa00330 | Arginine and proline metabolism | 54 | N | N | N | N | N | 54 | 7 | 0.95 | 7.38 | 4.36E-05 | 7.78E-05 |
| hsa04260 | Cardiac muscle contraction | 77 | 7 | 1.02 | 6.84 | 7.59E-05 | 1.00E-04 | 77 | 10 | 1.35 | 7.39 | 1.01E-06 | 3.49E-06 |
| hsa04010 | MAPK signaling pathway | 268 | 16 | 3.56 | 4.49 | 7.60E-07 | 2.41E-06 | 268 | 23 | 4.71 | 4.88 | 5.64E-10 | 5.03E-09 |
| hsa03015 | mRNA surveillance pathway | 83 | N | N | N | N | N | 83 | 5 | 1.46 | 3.43 | 1.56E-02 | 1.61E-02 |
| hsa05215 | Prostate cancer | 89 | 10 | 1.18 | 8.46 | 3.15E-07 | 1.07E-06 | 89 | 9 | 1.56 | 5.75 | 2.77E-05 | 6.16E-05 |
| hsa04540 | Gap junction | 90 | 5 | 1.2 | 4.18 | 7.10E-03 | 7.80E-03 | 90 | 9 | 1.58 | 5.69 | 3.03E-05 | 6.48E-05 |
| hsa00830 | Retinol metabolism | 64 | N | N | N | N | N | 64 | 6 | 1.12 | 5.33 | 9.00E-04 | 1.10E-03 |
| hsa04510 | Focal adhesion | 200 | 28 | 2.66 | 10.54 | 2.15E-20 | 2.04E-18 | 200 | 19 | 3.52 | 5.41 | 3.31E-09 | 1.97E-08 |
| hsa05211 | Renal cell carcinoma | 70 | 6 | 0.93 | 6.45 | 3.00E-04 | 5.00E-04 | 70 | 11 | 1.23 | 8.94 | 3.88E-08 | 2.08E-07 |
| hsa03320 | PPAR signaling pathway | 70 | 5 | 0.93 | 5.38 | 2.40E-03 | 3.00E-03 | 70 | N | N | N | N | N |
| hsa04630 | Jak-STAT signaling pathway | 155 | 10 | 2.06 | 4.86 | 4.65E-05 | 9.40E-05 | 155 | 9 | 2.72 | 3.3 | 1.80E-03 | 2.10E-03 |
| hsa04612 | Antigen processing and presentation | 76 | 5 | 1.01 | 4.95 | 3.50E-03 | 4.10E-03 | 76 | 14 | 1.34 | 10.48 | 6.00E-11 | 1.07E-09 |
| hsa04620 | Toll-like receptor signaling pathway | 102 | 11 | 1.36 | 8.12 | 1.24E-07 | 5.61E-07 | 102 | 10 | 1.79 | 5.58 | 1.32E-05 | 3.07E-05 |
| hsa04370 | VEGF signaling pathway | 76 | 5 | 1.01 | 4.95 | 3.50E-03 | 4.10E-03 | 76 | 8 | 1.34 | 5.99 | 5.78E-05 | 9.82E-05 |

Up, CRC up-regulated genes; Down, CRC down-regulated genes. C, number of reference genes in the category; O, number of genes in the gene set and also in the category; E, expected number in the category; R, ratio of enrichment; rawP, p
value from hypergeometric test; adjP, p value adjusted by the multiple test adjustment. N, Null.