How to get the gene sets from the online GO database
@889f9217

I'm doing enrichment analysis using clusterProfiler and WebGestaltR. For clusterProfiler (using the org.Bt.eg.db annotation package):

library(org.Bt.eg.db)
# All Entrez Gene IDs annotated to any GO term (including via child terms)
x <- unique(unlist(as.list(org.Bt.egGO2ALLEGS)))
length(x)  # 5586

There are 5586 genes with GO annotation.

In WebGestaltR:

library(WebGestaltR)

enrichD_BP <- loadGeneSet(organism = "btaurus", enrichDatabase = "geneontology_Biological_Process_noRedundant")
geneSet_BP <- enrichD_BP$geneSet
length(unique(geneSet_BP$gene))  # 9011

enrichD_CC <- loadGeneSet(organism = "btaurus", enrichDatabase = "geneontology_Cellular_Component_noRedundant")
geneSet_CC <- enrichD_CC$geneSet
length(unique(geneSet_CC$gene))  # 6224

enrichD_MF <- loadGeneSet(organism = "btaurus", enrichDatabase = "geneontology_Molecular_Function_noRedundant")
geneSet_MF <- enrichD_MF$geneSet
length(unique(geneSet_MF$gene))  # 7960

geneSet <- unique(c(unique(geneSet_BP$gene),unique(geneSet_CC$gene),unique(geneSet_MF$gene)))
length(geneSet)#10085

There are at least 10085 genes with GO annotation.
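
To compare the two packages on the same footing, the OrgDb counts can also be split by ontology, the way the three WebGestaltR databases are. The following is only a minimal sketch: `count_genes_by_ontology` is a hypothetical helper, the toy rows are made up, and the real query runs only if org.Bt.eg.db is installed. The `GOALL` and `ONTOLOGYALL` columns are the OrgDb counterparts of `org.Bt.egGO2ALLEGS`.

```r
# Hypothetical helper: number of distinct genes per ontology (BP, CC, MF).
# Expects the column names returned by AnnotationDbi::select().
count_genes_by_ontology <- function(ann) {
  ann <- ann[!is.na(ann$ONTOLOGYALL), , drop = FALSE]  # drop genes without GO
  tapply(ann$ENTREZID, ann$ONTOLOGYALL, function(g) length(unique(g)))
}

# Toy rows (made up) to show the shape of the input
toy <- data.frame(
  ENTREZID    = c("1", "1", "2", "3", "3"),
  ONTOLOGYALL = c("BP", "CC", "BP", "MF", NA),
  stringsAsFactors = FALSE
)
counts <- count_genes_by_ontology(toy)
print(counts)

# Real query, run only when the annotation package is available
if (requireNamespace("org.Bt.eg.db", quietly = TRUE)) {
  library(org.Bt.eg.db)
  ann <- AnnotationDbi::select(
    org.Bt.eg.db,
    keys    = keys(org.Bt.eg.db, keytype = "ENTREZID"),
    columns = c("GOALL", "ONTOLOGYALL"),
    keytype = "ENTREZID"
  )
  print(count_genes_by_ontology(ann))
}
```

Note that the two packages may also use different gene identifiers, so the raw counts are not necessarily comparable without mapping IDs first.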

WebGestaltR has more annotated genes than clusterProfiler. Which one is more up to date and consistent with the online GO database? And how can I get the gene sets directly from the online GO database? This question has puzzled me for days, because I think both packages are powerful and should give the same results.

@james-w-macdonald-5106

WebGestaltR is a CRAN package and is beyond the scope of this support site. In addition, this support site is primarily intended to help people with technical questions about using Bioconductor tools. You are asking a qualitative question about the underlying data source for two unrelated packages, which is really something you should be checking yourself. The data source for the org.Bt.eg.db package is provided by the show method.

> library(AnnotationHub)
> hub <- AnnotationHub()
snapshotDate(): 2022-04-21
> z <- hub[["AH100401"]]
loading from cache
> z
OrgDb object:
| DBSCHEMAVERSION: 2.1
| Db type: OrgDb
| Supporting package: AnnotationDbi
| DBSCHEMA: BOVINE_DB
| ORGANISM: Bos taurus
| SPECIES: Bovine
| EGSOURCEDATE: 2022-Mar17
| EGSOURCENAME: Entrez Gene
| EGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
| CENTRALID: EG
| TAXID: 9913
| GOSOURCENAME: Gene Ontology
| GOSOURCEURL: http://current.geneontology.org/ontology/go-basic.obo
| GOSOURCEDATE: 2022-03-10
| GOEGSOURCEDATE: 2022-Mar17
| GOEGSOURCENAME: Entrez Gene
| GOEGSOURCEURL: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA
| KEGGSOURCENAME: KEGG GENOME
| KEGGSOURCEURL: ftp://ftp.genome.jp/pub/kegg/genomes
| KEGGSOURCEDATE: 2011-Mar15
| GPSOURCENAME: UCSC Genome Bioinformatics (Bos taurus)
| GPSOURCEURL: 
| GPSOURCEDATE: 2021-Mar15
| ENSOURCEDATE: 2021-Dec21
| ENSOURCENAME: Ensembl
| ENSOURCEURL: ftp://ftp.ensembl.org/pub/current_fasta
| UPSOURCENAME: Uniprot
| UPSOURCEURL: http://www.UniProt.org/
| UPSOURCEDATE: Fri Apr  1 15:06:51 2022

Please see: help('select') for usage information

You can see that we downloaded the go-basic.obo file from geneontology.org on 2022-03-10 (GOSOURCEDATE) and parsed it using Entrez Gene IDs as the source ID (GOEGSOURCENAME, dated 2022-Mar17).
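
The same provenance fields can also be read from an installed copy of the package, without going through AnnotationHub. This is a minimal sketch, assuming org.Bt.eg.db is installed and that `metadata()` returns a data frame with `name` and `value` columns; `source_fields` is a hypothetical helper, and the toy rows are copied from the output above. An installed package may be a different build than AH100401, so its dates can differ.

```r
# Hypothetical helper: keep only the provenance rows
# (names ending in SOURCENAME, SOURCEURL or SOURCEDATE).
source_fields <- function(md) {
  md[grepl("SOURCE(NAME|URL|DATE)$", md$name), , drop = FALSE]
}

# Toy metadata table, values taken from the show() output in this thread
toy_md <- data.frame(
  name  = c("ORGANISM", "GOSOURCENAME", "GOSOURCEDATE", "GOEGSOURCEURL"),
  value = c("Bos taurus", "Gene Ontology", "2022-03-10",
            "ftp://ftp.ncbi.nlm.nih.gov/gene/DATA"),
  stringsAsFactors = FALSE
)
prov <- source_fields(toy_md)
print(prov)

# Real lookup, run only when the annotation package is available
if (requireNamespace("org.Bt.eg.db", quietly = TRUE)) {
  library(org.Bt.eg.db)
  org.Bt.eg.db                                  # show method, as above
  print(source_fields(metadata(org.Bt.eg.db)))  # same fields as a table
}
```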

You will have to read whatever documentation is provided by WebGestaltR to see where they get their data and decide for yourself which one you think is 'better'.


Thank you for the information. But I can't run this:

hub <- AnnotationHub()

Error in UseMethod("filter_") : no applicable method for 'filter_' applied to an object of class "c('tbl_SQLiteConnection', 'tbl_dbi', 'tbl_sql', 'tbl_lazy', 'tbl')"

Do you know why? How can I see the data source for the org.Bt.eg.db package? Thanks.


I have solved this problem. Thanks.


But the go-basic.obo file only records the GO terms and their subclass structure, not the list of genes annotated to each term. Do you think the gene lists come from goa_cow.gaf.gz?
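
Whichever file a given package actually uses (the thread does not confirm this; the metadata above lists the NCBI Gene DATA directory as GOEGSOURCEURL), a GAF annotation file can be tabulated directly in R. This is a minimal sketch, assuming a locally downloaded GAF 2.x file: `read_gaf` is a hypothetical helper, the file name in the last comment is a placeholder, and the toy rows are made up. In GAF 2.x, column 3 is the gene symbol, column 5 the GO ID and column 9 the aspect (P, F or C).

```r
# Hypothetical helper: read a GAF 2.x file (path or connection; gzip is fine)
# and keep the gene symbol, GO ID and aspect columns.
read_gaf <- function(con) {
  lines <- readLines(con)
  lines <- lines[!startsWith(lines, "!")]  # header/comment lines start with "!"
  gaf <- read.delim(text = lines, header = FALSE, quote = "",
                    comment.char = "", colClasses = "character")
  data.frame(gene = gaf[[3]], go_id = gaf[[5]], aspect = gaf[[9]],
             stringsAsFactors = FALSE)
}

# Toy GAF rows (made up, 17 tab-separated fields each)
gaf_row <- function(id, symbol, go_id, aspect) {
  paste("UniProtKB", id, symbol, "", go_id, "PMID:0", "IEA", "", aspect,
        "", "", "protein", "taxon:9913", "20220101", "UniProt", "", "",
        sep = "\t")
}
toy <- c(
  "!gaf-version: 2.2",
  gaf_row("X00001", "GENEA", "GO:0006915", "P"),
  gaf_row("X00001", "GENEA", "GO:0005634", "C"),
  gaf_row("X00002", "GENEB", "GO:0005515", "F")
)

tc <- textConnection(toy)
ann <- read_gaf(tc)
close(tc)

n_genes <- length(unique(ann$gene))             # genes with any GO annotation
genes_per_term <- split(ann$gene, ann$go_id)    # gene list for each GO term
print(n_genes)
print(genes_per_term)

# With a real download (placeholder path): ann <- read_gaf("goa_cow.gaf.gz")
```

These are direct annotations only. Counts comparable to `org.Bt.egGO2ALLEGS` would also require propagating each annotation to the ancestor terms defined in go-basic.obo, and the symbols would need mapping to Entrez Gene IDs.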
