António Miguel de Jesus Domingues
Dear Bioconductor list,
I have a list of genes from a mouse array (custom design) for which I
want to perform an analysis with topGO. The package example runs fine
and I have read the vignettes (though I've probably missed something),
but when I run my own data an error is generated that seems to be
related to my custom gene-to-GO map.
The results are a table with several annotations and a custom measure
of significance. I've created a named numeric vector, geneList,
containing all the genes present on the array (Ensembl IDs) with the
corresponding measure of significance:
geneList <- abs(data[ ,2])
names(geneList) <- data[ ,1]
geneList[1:5]
ENSMUSG00000025903 ENSMUSG00000025903 ENSMUSG00000025903 ENSMUSG00000025903 ENSMUSG00000033813
              0.11               0.36               0.32               0.07               0.08
is(geneList)
[1] "numeric" "vector" "atomic" "EnumerationValue" "numeric or NULL" "vectorORfactor"
summary(geneList)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.0100  0.0600  0.2000  0.4568  0.5600 18.1600
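Note that the first four names printed by geneList[1:5] above are the same Ensembl ID, so the vector holds several scores for at least one gene. If a single score per gene is needed, a minimal sketch could look like the following. It assumes, purely for illustration, that keeping the largest score per gene is acceptable; the toy values are copied from the output above and the variable names ids, scores and geneList.unique are hypothetical:

```r
# Toy data copied from the geneList[1:5] output above.
ids    <- c(rep("ENSMUSG00000025903", 4), "ENSMUSG00000033813")
scores <- c(0.11, 0.36, 0.32, 0.07, 0.08)

# Group the scores by gene ID and keep the maximum of each group
# (the choice of max is an assumption, not part of the original analysis).
geneList.unique <- vapply(split(scores, ids), max, numeric(1))

# The result is a named numeric vector with one entry per gene.
print(geneList.unique)
```

Any other summary (mean, median, minimum) could be swapped in for max in the same way.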
# a function was then defined to select the significant genes, as in the vignette
topDiffGenes <- function(allScore) {
return(allScore > 1)
}
x <- topDiffGenes(geneList)
sum(x)
# so far so good
# because this is a custom array, the GO annotation was extracted from Ensembl using biomaRt.
# Ensembl 61 was used because of the gene ID format in my results
ensembl61 <- useMart('ENSEMBL_MART_ENSEMBL', dataset = 'mmusculus_gene_ensembl',
                     host = 'feb2011.archive.ensembl.org')
test.GO.BP <- getBM(attributes = c("ensembl_gene_id", "go_biological_process_id"),
                    filters = "ensembl_gene_id", values = All.genes.Ens,
                    mart = ensembl61)
head(test.GO.BP)
     ensembl_gene_id go_biological_process_id
1 ENSMUSG00000054310               GO:0006355
2 ENSMUSG00000054728
3 ENSMUSG00000021368               GO:0032313
4 ENSMUSG00000021368               GO:0031398
5 ENSMUSG00000051335               GO:0055114
6 ENSMUSG00000051335               GO:0008152
# but when creating the topGO object a problem appears:
GOdata <- new("topGOdata",
description = "GO analysis Test",
ontology = "BP",
allGenes = geneList,
geneSel = topDiffGenes,
annot = annFUN.gene2GO,
nodeSize = 5,
gene2GO = test.GO.BP)
Building most specific GOs ..... ( 0 GO terms found. )
Build GO DAG topology .......... ( 0 GO terms and 0 relations. )
Error in if (is.na(index) || index < 0 || index > length(nd)) stop(paste("selected vertex",  :
  missing value where TRUE/FALSE needed
From reading the vignette, I think that the object test.GO.BP, a
data.frame, needs to be converted to a list in which each gene is
mapped to its GO terms:
List of 6
 $ 068724: chr [1:5] "GO:0005488" "GO:0003774" "GO:0001539" "GO:0006935" ...
 $ 119608: chr [1:6] "GO:0005634" "GO:0030528" "GO:0006355" "GO:0045449" ...
 $ 049239: chr [1:13] "GO:0016787" "GO:0017057" "GO:0005975" "GO:0005783" ...
 $ 067829: chr [1:16] "GO:0045926" "GO:0016616" "GO:0000287" "GO:0030145" ...
 $ 106331: chr [1:10] "GO:0043565" "GO:0000122" "GO:0003700" "GO:0005634" ...
 $ 214717: chr [1:7] "GO:0004803" "GO:0005634" "GO:0008270" "GO:0003677" ...
Is this what I need to do next? If so, how do I do it? Or is it
something else?
Any help will be appreciated.
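A minimal sketch of the data.frame-to-list conversion described above, using only base R. It assumes that annFUN.gene2GO expects a named list with one character vector of GO IDs per gene (the shape of the str() output above) and that genes without a GO term can simply be dropped. The data frame here is a toy copy of the head(test.GO.BP) rows, not the full annotation, and the name annotated is hypothetical:

```r
# Toy stand-in for the getBM() result, copied from head(test.GO.BP) above.
test.GO.BP <- data.frame(
  ensembl_gene_id = c("ENSMUSG00000054310", "ENSMUSG00000054728",
                      "ENSMUSG00000021368", "ENSMUSG00000021368",
                      "ENSMUSG00000051335", "ENSMUSG00000051335"),
  go_biological_process_id = c("GO:0006355", "", "GO:0032313",
                               "GO:0031398", "GO:0055114", "GO:0008152"),
  stringsAsFactors = FALSE
)

# Drop rows with no GO term (an empty string in the output above; NA is
# handled as well in case the annotation comes back that way).
has.go <- !is.na(test.GO.BP$go_biological_process_id) &
  test.GO.BP$go_biological_process_id != ""
annotated <- test.GO.BP[has.go, ]

# Split the GO column by gene ID: one list element per gene, each a
# character vector of that gene's GO IDs. unique() removes repeated terms.
gene2GO <- split(annotated$go_biological_process_id, annotated$ensembl_gene_id)
gene2GO <- lapply(gene2GO, unique)

str(gene2GO)
```

The resulting gene2GO list would then take the place of test.GO.BP in the gene2GO argument of the new("topGOdata", ...) call.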
Session info:
> sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] C/en_US.UTF-8/C/C/C/C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
 [1] plyr_1.7.1 genefilter_1.36.0 hgu95av2_2.2.0 hgu95av2.db_2.6.3
 [5] org.Hs.eg.db_2.6.4 affyio_1.22.0 affydata_1.11.15 affy_1.32.1
 [9] multtest_2.10.0 ALL_1.4.11 topGO_2.6.0 SparseM_0.91
[13] GO.db_2.6.1 graph_1.32.0 mogene10sttranscriptcluster.db_8.0.1 org.Mm.eg.db_2.6.4
[17] RSQLite_0.11.1 DBI_0.2-5 AnnotationDbi_1.16.19 Biobase_2.14.0
[21] BiocInstaller_1.2.1 biomaRt_2.10.0 Biostrings_2.22.0 GenomicRanges_1.6.7
[25] IRanges_1.12.6
loaded via a namespace (and not attached):
 [1] MASS_7.3-17 RColorBrewer_1.0-5 RCurl_1.91-1 XML_3.9-4 annotate_1.32.3 colorspace_1.1-1 dichromat_1.2-4
 [8] digest_0.5.2 ggplot2_0.9.0 grid_2.14.2 lattice_0.20-6 memoise_0.1 munsell_0.3 preprocessCore_1.16.0
[15] proto_0.3-9.2 reshape2_1.2.1 scales_0.2.0 splines_2.14.2 stringr_0.6 survival_2.36-12 tools_2.14.2
[22] xtable_1.7-0 zlibbioc_1.0.1
--
António Miguel de Jesus Domingues, PhD
Neugebauer group
Max Planck Institute of Molecular Cell Biology and Genetics, Dresden
Pfotenhauerstrasse 108
01307 Dresden
Germany
e-mail: domingue@mpi-cbg.de
tel. +49 351 210 2481
The Unbearable Lightness of Molecular Biology