Hi,
I am trying to run a hyperGTest with GOstats on mouse Entrez gene IDs.
I imported the IDs from BioMart:
entrez_data_1 <- getBM(attributes = c("mgi_id", "entrezgene"),
                       filters = "mgi_id",
                       values = as.character(data_1$MGI), mart = mart)
head(entrez_data_1)
entrezID_Universe <- getBM(mart = mart,
                           attributes = c("mgi_id", "entrezgene"),
                           filters = "mgi_id",
                           values = as.character(MaxQuant18$MGI))
entrezID_Universe
params <- new("GOHyperGParams",
              geneIds = as.character(entrez_data_1[,2]),
              universeGeneIds = as.character(entrezID_Universe[,2]),
              annotation = "org.Mm.eg.db", ontology = "BP",
              pvalueCutoff = 0.05, conditional = FALSE,
              testDirection = "over")
I then ran the hyperGTest command, which succeeded:
MmOverBP <- hyperGTest(params)
MmOverBP
Gene to GO BP test for over-representation
3146 GO BP ids tested (118 have p < 0.05)
Selected gene set size: 1006
Gene universe size: 2935
Annotation package: org.Mm.eg
but then:
> summary(MmOverBP)
Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
value for "GO:2000021" not found
As far as I know, I have the latest version of both packages. I looked
in AmiGO to check whether this GO ID exists: it does.
Accession: GO:2000021
Ontology: Biological Process
Synonyms (related): regulation of electrolyte homeostasis;
regulation of negative regulation of crystal biosynthesis;
regulation of negative regulation of crystal formation
Is there a way of adding or annotating this specific item manually,
so that I can see it?
If not, is there a way of excluding this GO ID from the list of GO
categories, so that I can see the results?
Thanks a lot
Assa
> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] splines grid stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] GO.db_2.4.1 org.Mm.eg.db_2.4.6 biomaRt_2.6.0
[4] Heatplus_1.20.0 gplots_2.8.0 caTools_1.11
[7] bitops_1.0-4.1 gdata_2.8.1 gtools_2.6.2
[10] siggenes_1.24.0 multtest_2.7.1 Rgraphviz_1.29.0
[13] xtable_1.5-6 annotate_1.28.1 GOstats_2.16.0
[16] RSQLite_0.9-4 DBI_0.2-5 graph_1.28.0
[19] Category_2.16.0 AnnotationDbi_1.12.0 Biobase_2.10.0
loaded via a namespace (and not attached):
[1] genefilter_1.32.0 GSEABase_1.12.1 MASS_7.3-11 RBGL_1.26.0
[5] RCurl_1.5-0 survival_2.36-5 tcltk_2.12.2 tools_2.12.2
[9] XML_3.2-0
Hi Assa,
The error that you reported suggests that the GO ID you have mapped to
an Entrez gene ID inside of your org.Mm.eg.db (which is where that
GO2ALL mapping is from) is not present in your GO.db package, so I
think we should start by looking to see if your GO.db package is up
to date.
Looking at your sessionInfo() I can see that you have an old stale
version of GO.db (2.4.1). You should be using GO.db version 2.4.5 if
you want to use org.Mm.eg.db version 2.4.6. The annotations that are
released for each version of Bioconductor are meant to be used as a
matched set. You can avoid having to worry about all of this using
biocLite() to install all of the packages that you plan to use.
biocLite() should always install the appropriate version of a given
Bioconductor package for whichever version of R you happen to be
running.
You can read about biocLite() here on our website where we explain how
to install and update packages for Bioconductor:
http://www.bioconductor.org/install/
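A minimal sketch of the version check described above, using only base R. The variable names `pkgs` and `installed` are illustrative, and the commented-out install lines assume the biocLite() installer of this Bioconductor era (later releases replaced it with BiocManager::install()):

```r
# Report the installed versions of the two annotation packages so they
# can be compared against the versions of the matched release set.
pkgs <- c("GO.db", "org.Mm.eg.db")
installed <- rownames(installed.packages())
for (p in pkgs) {
  if (p %in% installed) {
    cat(p, as.character(packageVersion(p)), "\n")
  } else {
    cat(p, "is not installed\n")
  }
}

# To reinstall both as a matched set (needs network access):
# source("http://www.bioconductor.org/biocLite.R")
# biocLite(pkgs)
```

If the two version numbers do not belong to the same Bioconductor release, reinstalling both together is the simplest fix.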
Marc
On 04/07/2011 08:22 AM, Assa Yeroslaviz wrote:
> [...]
Hi Assa,
As far as I am aware, if the GO term comes up in your list, then there
should be genes annotated to it. I did a simple test to verify that
the GO term does exist:
> crud <- as.list(GOTERM)
> crud$'GO:2000021'
GOID: GO:2000021
Term: regulation of ion homeostasis
Ontology: BP
Definition: Any process that modulates the frequency, rate or extent
of ion homeostasis.
Synonym: regulation of electrolyte homeostasis
Synonym: regulation of negative regulation of crystal biosynthesis
Synonym: regulation of negative regulation of crystal formation
So far so good. Now let's look to see what genes are annotated to it:
> library(org.Mm.eg.db)
> mget('GO:2000021',org.Mm.egGO)
Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
value for "GO:2000021" not found
> mget('GO:2000021',org.Mm.egGO2EG)
Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
value for "GO:2000021" not found
> mget('GO:2000021',org.Mm.egGO2ALLEGS)
$`GO:2000021`
     ISO      ISO      ISO      ISO      IGI      IGI      IMP
 "11517"  "11684"  "11998"  "12000"  "12018"  "12028"  "12028"
     IGI      ISO      ISO      IMP      ISO      ISO      IDA
 "12043"  "12061"  "12257"  "12291"  "12349"  "12372"  "12389"
     ISO      ISO      ISO      ISO      ISO      IMP      ISO
 "12424"  "12558"  "13167"  "13489"  "13617"  "13666"  "14062"
     ISO      IDA      IMP      IMP      IGI      IGI      ISO
 "14126"  "14225"  "14225"  "14226"  "14629"  "14630"  "14652"
     ISO      IDA      IDA      ISO      IDA      ISO       IC
 "15171"  "15978"  "16818"  "16867"  "16963"  "17096"  "17131"
     ISO      IMP      IMP      IDA      IMP      ISO      ISO
 "18429"  "18439"  "18764"  "19264"  "20190"  "21333"  "21336"
     ISO      ISO      IMP      ISO      ISO      TAS      IDA
 "21803"  "21808"  "21819"  "21838"  "22041"  "22784"  "23832"
     ISO      ISO      ISO      ISO      ISO      ISO      ISO
 "24111"  "26361"  "50849"  "54140"  "76055"  "76757" "108837"
     ISO      IMP      ISO      ISO      IMP      ISO
"217369" "225908" "233081" "238276" "259277" "317757"
BTW, this was all using GO.db_2.4.5
From this information, there are no genes that are directly annotated
to your GO term, only indirect annotations (genes annotated to its
child terms). I know this doesn't help your current situation, but it
at least points towards the problem. I thought, however, that the
summary was prepared using the GO2ALLEGS mapping and not the direct
one. Perhaps someone more knowledgeable can figure out where in the
code the error is likely to be?
-Robert
Robert M. Flight, Ph.D.
University of Louisville Bioinformatics Laboratory
University of Louisville
Louisville, KY
PH 502-852-1809 (HSC)
PH 502-852-0467 (Belknap)
EM robert.flight at louisville.edu
EM rflight79 at gmail.com
Williams and Holland's Law:
If enough data is collected, anything may be proven by
statistical methods.
Well well,
I am ashamed to say that it is now working.
Apparently all I needed to do was update the packages.
I installed the new versions of GO.db and GOstats,
and it is working now.
However, I am still getting this error when trying to find which genes
are attached to the term:
> mget('GO:2000021',org.Mm.egGO)
Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
value for "GO:2000021" not found
> mget('GO:2000021',org.Mm.egGO2EG)
Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
value for "GO:2000021" not found
So I guess the earlier error message has nothing to do with the fact
that there are no genes from the mouse genome mapped to this GO
category.
When I checked in AmiGO whether there are any mouse genes under this
category, I found 83 genes.
Can anyone tell me, then, what this error means?
Is there a way to manually update the GO data set, so that I can map
these genes?
Thanks
Assa
> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] splines grid stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] GSEABase_1.12.1 org.Mm.eg.db_2.4.6 biomaRt_2.6.0
[4] Heatplus_1.20.0 ggplot2_0.8.9 proto_0.3-9.1
[7] reshape_0.8.4 plyr_1.4 gplots_2.8.0
[10] caTools_1.11 bitops_1.0-4.1 gdata_2.8.1
[13] gtools_2.6.2 siggenes_1.24.0 multtest_2.7.1
[16] Rgraphviz_1.29.0 xtable_1.5-6 annotate_1.28.1
[19] GO.db_2.4.5 GOstats_2.16.0 RSQLite_0.9-4
[22] DBI_0.2-5 graph_1.28.0 Category_2.16.0
[25] AnnotationDbi_1.12.0 Biobase_2.10.0
loaded via a namespace (and not attached):
[1] genefilter_1.32.0 MASS_7.3-11 RBGL_1.26.0 RCurl_1.5-0
[5] survival_2.36-5 tools_2.12.2 XML_3.2-0
On Thu, Apr 7, 2011 at 18:49, Robert M. Flight <rflight79@gmail.com> wrote:
> [...]
Hi Assa,
The reason you are getting no genes is that there are no genes
"directly" annotated to this term. I had the same error when I tried
to look up your GO term of interest using GO or GO2EG. You need to use
"org.Mm.egGO2ALLEGS" in this case to find the genes that are
indirectly annotated to this term via other terms. Also keep in mind
that AmiGO is updated regularly, whereas the Bioconductor packages are
updated every 6 months. This may lead to some discrepancies between
the results from AmiGO and Bioconductor.
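A minimal sketch of that lookup, assuming org.Mm.eg.db is installed (the guard skips the lookup otherwise) and that mget() on these map objects accepts ifnotfound = NA, as documented for AnnotationDbi. The names `go_id`, `direct` and `all_genes` are illustrative. Note also that org.Mm.egGO is keyed by Entrez gene ID rather than GO ID, so a GO ID lookup there fails regardless of how the term is annotated:

```r
go_id <- "GO:2000021"
if ("org.Mm.eg.db" %in% rownames(installed.packages())) {
  library(org.Mm.eg.db)
  # Direct annotations only; ifnotfound = NA returns NA for an
  # unmapped key instead of stopping with an error.
  direct <- mget(go_id, org.Mm.egGO2EG, ifnotfound = NA)
  # Genes annotated to the term itself or to any of its child terms.
  all_genes <- mget(go_id, org.Mm.egGO2ALLEGS, ifnotfound = NA)
  print(is.na(direct[[1]][1]))           # TRUE if nothing is mapped directly
  print(length(unique(all_genes[[1]])))  # number of distinct Entrez IDs
}
```

The same gene can appear more than once in the GO2ALLEGS result with different evidence codes, which is why the count is taken over unique IDs.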
-Robert
On Fri, Apr 8, 2011 at 01:43, Assa Yeroslaviz <frymor at gmail.com> wrote:
> [...]
I have run into this error, but I believe all of my packages are up to date. After following the suggestions throughout this previous post, I think I have a GO ID from org.Mm.eg.db that is not in GO.db. Any suggestions on how to extract the summary(hgOver) information would be appreciated.
> library("GOstats")
Attaching package: ‘GOstats’
The following object is masked from ‘package:AnnotationDbi’:
makeGOGraph
> library("AnnotationDbi")
> library("org.Mm.eg.db")
> total.genes <- dput(as.character(mito.prot.sort.entrez.nodups.clus$ENTREZID))
> hgCutoff = 0.001
> params <- new("GOHyperGParams",
+ geneIds=test.genes,
+ universeGeneIds=total.genes,
+ ontology="BP",
+ annotation= "org.Mm.eg.db",
+ pvalueCutoff=hgCutoff,
+ conditional=FALSE,
+ testDirection="over")
> paramsMF <- params
> ontology(paramsMF) <- "MF"
> paramsCC <- params
> ontology(paramsCC) <- "CC"
> hgOver <- hyperGTest(paramsCC)
> df <- summary(hgOver,categorySize=10)
Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
value for "GO:0097708" not found
> x <- as.list(GOTERM)
> x$'GO:0097708'
NULL
> mget('GO:0097708',org.Mm.egGO)
Error in .checkKeys(value, Lkeys(x), x@ifnotfound) :
value for "GO:0097708" not found
> mget('GO:0097708',org.Mm.egGO2ALLEGS)
$`GO:0097708`
[removed long output]
> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] GO.db_3.4.0 org.Mm.eg.db_3.4.0 GOstats_2.40.0 UniProt.ws_2.14.0 RCurl_1.95-4.8
[6] bitops_1.0-6 RSQLite_1.1-2 Category_2.40.0 Matrix_1.2-8 GSEABase_1.36.0
[11] graph_1.52.0 annotate_1.52.1 XML_3.98-1.5 AnnotationDbi_1.36.2 IRanges_2.8.1
[16] S4Vectors_0.12.1 Biobase_2.34.0 BiocGenerics_0.20.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.9 magrittr_1.5 splines_3.3.3 xtable_1.8-2 lattice_0.20-34
[6] R6_2.2.0 dplyr_0.5.0 tools_3.3.3 grid_3.3.3 AnnotationForge_1.16.1
[11] DBI_0.6 genefilter_1.56.0 assertthat_0.1 survival_2.40-1 RBGL_1.50.0
[16] digest_0.6.12 tibble_1.2 memoise_1.0.0
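One possible workaround, offered as a sketch rather than a confirmed fix: build a summary-like table directly from the per-term statistics, skipping the term-name lookup that fails. The helper `go_table` is hypothetical. The commented lines assume the accessor functions pvalues(), geneCounts() and universeCounts() from the Category package and keys() on GOTERM, and the filtering only approximates what summary() does with its p-value cutoff and categorySize arguments:

```r
# Hypothetical helper: build a table from named vectors of per-term
# statistics (names are GO IDs), without looking up any term names.
go_table <- function(pvals, counts, sizes, cutoff = 0.001, min_size = 10) {
  ids <- names(pvals)
  keep <- pvals < cutoff & sizes[ids] >= min_size
  out <- data.frame(GOID = ids[keep],
                    Pvalue = unname(pvals[keep]),
                    Count = unname(counts[ids][keep]),
                    Size = unname(sizes[ids][keep]),
                    stringsAsFactors = FALSE)
  out[order(out$Pvalue), ]
}

# With a real result object (assumes the Category accessors):
# df <- go_table(pvalues(hgOver), geneCounts(hgOver), universeCounts(hgOver))
# GO IDs in the result that GO.db does not know about:
# setdiff(df$GOID, keys(GOTERM))
```

Listing the IDs that are missing from GO.db in this way also shows whether GO:0097708 is the only mismatch between the two packages.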