Question

# Invalid keytype: GOALL. Please use the keytypes method to see a listing of valid arguments.


Nuria Mauri • 0


Hi everyone! I am trying to solve an issue with 'makeOrgPackage' so that I can use the gseGO function of the clusterProfiler package. Any help would be much appreciated. I need to run a gene set enrichment analysis of GO terms for my RNA-seq expression study in Quercus suber.

First, I looked for an available OrgDb from NCBI and found one, but sadly it doesn't include any GO annotations.

Second, I prepared GO annotation files to build another OrgDb with makeOrgPackage, using these columns: GID, CHROMOSOME, START, END, STRAND, GOALL, plus GO, ONTOLOGY and EVIDENCE. However, it seems that the GOALL column, which is the one the analysis needs, cannot be integrated by this tool, as was reported before in: Use of clusterProfiler : Error in testForValidKeytype(x, keytype)

So, do you know of any other way to build a new OrgDb, or to extend the existing one with the GO terms I already have? Thanks,

Nuri

library(AnnotationHub)
# Is Quercus suber already in the hub database?
# Load the whole AnnotationHub
hub <- AnnotationHub()
query(hub, c("suber", "orgdb"))
# AnnotationHub with 1 record
QS2 <- hub[["AH114342"]]
keytypes(QS2)
# [1] "ACCNUM"   "ALIAS"    "ENTREZID" "GENENAME" "GID"     
# [6] "PMID"     "REFSEQ"   "SYMBOL"
# no GO annotations

library(AnnotationDbi)
AnnotationDbi::keytypes(QS2)
AnnotationDbi::columns(QS2)

library(AnnotationForge)
# Tab-separated annotation tables
gene_info <- read.delim("gene_info.tsv")
go <- read.delim("go.tsv")
goall <- read.delim("goall.tsv")
makeOrgPackage(
  gene_info = gene_info,
  go = go,
  goall = goall,
  tax_id = "58331",                    # Taxonomy ID for Quercus suber
  genus = "Quercus",
  species = "suber",
  version = "0.99.0",
  outputDir = "."
)
#  Invalid keytype: GOALL. Please use the keytypes method to see a listing of valid arguments.

sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8      
 [2] LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                 
 [9] LC_ADDRESS=C              
[10] LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8
[12] LC_IDENTIFICATION=C       

time zone: Europe/Madrid
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats    
[3] graphics  grDevices
[5] utils     datasets 
[7] methods   base     

other attached packages:
 [1] tidyr_1.3.1            
 [2] dplyr_1.1.4            
 [3] biomaRt_2.60.1         
 [4] org.Qsuber.eg.db_0.99.0
 [5] AnnotationForge_1.46.0 
 [6] ggridges_0.5.6         
 [7] AnnotationDbi_1.66.0   
 [8] IRanges_2.38.1         
 [9] S4Vectors_0.42.1       
[10] Biobase_2.64.0         
[11] clusterProfiler_4.12.3 
[12] AnnotationHub_3.12.0   
[13] BiocFileCache_2.12.0   
[14] dbplyr_2.5.0           
[15] BiocGenerics_0.50.0    
[16] BiocManager_1.30.23

clusterProfiler makeOrgPackage gseGO • 1.0k views

written 6 months ago by Nuria Mauri • updated 6 months ago by James W. MacDonald

James W. MacDonald · Answer 1 · 2024-08-19

You can make your own with makeOrgPackageFromNCBI. It takes a while because it downloads all the annotation files from NCBI, but the resulting package has all the GO tables, including the "all" (GOALL) ones.

> library(AnnotationForge)

> makeOrgPackageFromNCBI("0.0.1", "me <me@mine.org>", "me", ".", "58331", "Quercus","suber")
preparing data from NCBI ...
starting download for 
[1] gene2pubmed.gz
[2] gene2accession.gz
[3] gene2refseq.gz
[4] gene_info.gz
[5] gene2go.gz
getting data for gene2pubmed.gz
extracting data for our organism from : gene2pubmed
getting data for gene2accession.gz
extracting data for our organism from : gene2accession
getting data for gene2refseq.gz
extracting data for our organism from : gene2refseq
getting data for gene_info.gz
extracting data for our organism from : gene_info
getting data for gene2go.gz
extracting data for our organism from : gene2go
processing gene2pubmed
processing gene_info: chromosomes
processing gene_info: description
processing alias data
processing refseq data
processing accession data
processing GO data
making the OrgDb package ...
Populating genes table:
genes table filled
Populating pubmed table:
pubmed table filled
Populating chromosomes table:
chromosomes table filled
Populating gene_info table:
gene_info table filled
Populating entrez_genes table:
entrez_genes table filled
Populating alias table:
alias table filled
Populating refseq table:
refseq table filled
Populating accessions table:
accessions table filled
Populating go table:
go table filled
table metadata filled

'select()' returned many:1 mapping between keys and columns
Dropping GO IDs that are too new for the current GO.db
Populating go table:
go table filled
Populating go_bp table:
go_bp table filled
Populating go_cc table:
go_cc table filled
Populating go_mf table:
go_mf table filled
'select()' returned many:1 mapping between keys and columns
Populating go_bp_all table:
go_bp_all table filled
Populating go_cc_all table:
go_cc_all table filled
Populating go_mf_all table:
go_mf_all table filled
Populating go_all table:
go_all table filled
Creating package in ./org.Qsuber.eg.db 
Now deleting temporary database file
complete!
[1] "org.Qsuber.eg.sqlite"

> install.packages("org.Qsuber.eg.db", repos = NULL, type = "source")
Installing package into 'C:/Users/jmacdon/AppData/Local/R/win-library/4.4'
(as 'lib' is unspecified)
* installing *source* package 'org.Qsuber.eg.db' ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
Warning messages:
1: package 'IRanges' was built under R version 4.4.1 
2: package 'S4Vectors' was built under R version 4.4.1 
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
Warning: package 'IRanges' was built under R version 4.4.1
Warning: package 'S4Vectors' was built under R version 4.4.1
** testing if installed package can be loaded from final location
Warning: package 'IRanges' was built under R version 4.4.1
Warning: package 'S4Vectors' was built under R version 4.4.1
** testing if installed package keeps a record of temporary installation path
* DONE (org.Qsuber.eg.db)
> library(org.Qsuber.eg.db)
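
Once the package loads, a quick sanity check is to confirm that GOALL is among the keytypes, and then shape the ranked gene list for gseGO. The following is only a minimal sketch under some assumptions: the gene IDs and fold changes are made up, `prepare_gene_list` is a hypothetical helper rather than part of any package, and `keyType = "ENTREZID"` assumes your expression data are keyed by NCBI Gene IDs (which is what a package built from NCBI data uses).

```r
# Hypothetical input: log2 fold changes named by NCBI (Entrez) Gene ID.
# These IDs and values are invented for illustration only.
lfc <- c("111990001" = -0.8, "111990002" = 2.1, "111990003" = 0.3,
         "111990002" = 1.7, "111990004" = NA)

# Hypothetical helper: drop NAs, keep the first entry per gene ID, and
# sort in decreasing order (gseGO expects a named, decreasing numeric vector).
prepare_gene_list <- function(x) {
  x <- x[!is.na(x)]
  x <- x[!duplicated(names(x))]
  sort(x, decreasing = TRUE)
}

gene_list <- prepare_gene_list(lfc)

# Only touch the Bioconductor packages if they are actually installed.
if (requireNamespace("clusterProfiler", quietly = TRUE) &&
    requireNamespace("org.Qsuber.eg.db", quietly = TRUE)) {
  orgdb <- org.Qsuber.eg.db::org.Qsuber.eg.db

  # TRUE here means the GOALL keytype is available in the new package.
  print("GOALL" %in% AnnotationDbi::keytypes(orgdb))

  # With a real ranked list (thousands of genes) the call would look like
  # this; it is left commented out because the toy list above is far too
  # small to give a meaningful result.
  # res <- clusterProfiler::gseGO(gene_list, ont = "BP", OrgDb = orgdb,
  #                               keyType = "ENTREZID")
}
```

If your expression data use other identifiers (locus tags, for example), they would first need to be mapped to one of the keytypes the package reports.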