I recently started using goseq for pathway analysis. It works well for human mRNA-seq DGE data, but I ran into a problem with a non-model organism and hope you can help.
The specific problem is with the cat genome. I looked up the supported genomes:
> supportedGenomes()[,1:4]
12 felCat5 Cat Sep. 2011 ICGSC Felis_catus-6.2
13 felCat4 Cat Dec. 2008 NHGRI catChrV17e
14 felCat3 Cat Mar. 2006 Broad Institute Release 3
I followed the example in the tutorial and constructed the DE gene vector:
> genes
ENSFCAG00000015719 1
ENSFCAG00000005390 0
…
Then I calculated the pwf:
> pwf=nullp(genes,"felCat5","ensGene")
> head(pwf)
                   DEgenes bias.data        pwf
ENSFCAG00000015719       1      1637 0.01063125
ENSFCAG00000031227       1       546 0.01063223
ENSFCAG00000014746       1       738 0.01063220
ENSFCAG00000005042       1      3636 0.01015309
ENSFCAG00000001898       1      2379 0.01058249
ENSFCAG00000023471       1       540 0.01063223
> tail(pwf)
                   DEgenes bias.data        pwf
ENSFCAG00000005390       0       492 0.01063224
ENSFCAG00000009612       0      1527 0.01063165
ENSFCAG00000027330       0       996 0.01063211
ENSFCAG00000012934       0      2844 0.01046660
ENSFCAG00000007036       0      1520 0.01063167
ENSFCAG00000004574       0      1420 0.01063179
However, when I tried to run the test, I got an error:
> GO.wall = goseq(pwf,"felcat5","ensGene")
Error in library(paste(orgstring, "db", sep = "."), character.only = TRUE) :
  there is no package called ‘NA.db’
As a novice R user, I figured that orgstring must be undefined (NA) in this case. I guess goseq or Bioconductor does not know which annotation package should be loaded for felCat5. This puzzles me because felCat5 is supposed to be supported in goseq.
Some online search results suggest that I should install the annotation package for the organism, but I have trouble finding db packages for cat on the Bioconductor web pages, and the error message does not say which package is missing.
So my question is how to make goseq work for cat DGE data in this case. Is there a package I should install to solve the problem? And how should I deal with this kind of situation (a supported organism whose packages are difficult to find) in the future?
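The ‘NA.db’ in the message is consistent with a failed lookup. Here is a minimal sketch in base R of how such a name could arise, assuming the package name is built by looking up the genome prefix in a table of annotation packages. The table `org_packages` and the prefix-stripping step are hypothetical stand-ins, not goseq's actual internals; only the `paste()` call is taken from the error message.

```r
# Hypothetical stand-in for a genome-prefix -> annotation-package lookup table.
# The entries are illustrative; this is not goseq's actual internal table.
org_packages <- c(hg = "org.Hs.eg", mm = "org.Mm.eg")

genome <- "felCat5"
prefix <- gsub("[0-9]+", "", genome)   # strip the version digits: "felCat"

# There is no entry for the cat prefix, so the lookup yields NA
orgstring <- as.character(org_packages[match(prefix, names(org_packages))])

# Same paste() call that appears in the error message
pkg <- paste(orgstring, "db", sep = ".")
print(pkg)
```

Under these assumptions `pkg` ends up as the string "NA.db", which `library()` then cannot find.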
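One possible route, sketched here under assumptions rather than as a confirmed fix, is to bypass the annotation package lookup by passing the gene-to-category mapping directly through goseq's `gene2cat` argument. The toy table `go_map` below is invented for illustration (the gene-to-GO pairings are not real annotations); in practice the mapping would come from an annotation source for cat.

```r
# Toy two-column mapping (gene ID, GO ID). The pairings are made up for
# illustration; real ones would come from an annotation source for cat.
go_map <- data.frame(
  gene = c("ENSFCAG00000015719", "ENSFCAG00000015719", "ENSFCAG00000005390"),
  go   = c("GO:0008150", "GO:0003674", "GO:0008150"),
  stringsAsFactors = FALSE
)

# Named list with one character vector of GO IDs per gene
gene2cat <- split(go_map$go, go_map$gene)
str(gene2cat)

# With a real pwf from nullp(), the test could then be run with the
# user-supplied mapping instead of an org.*.db package (not run here):
# GO.wall <- goseq(pwf, gene2cat = gene2cat)
```

Because the mapping is supplied explicitly, the genome string is only needed by nullp() for the gene length data in this sketch.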
Thanks in advance!
sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: CentOS release 6.5 (Final)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
 [1] stats4    parallel  tools     stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] org.Hs.eg.db_3.1.2   AnnotationDbi_1.30.1 Biobase_2.28.0       rtracklayer_1.28.7   GenomicRanges_1.20.5
 [6] GenomeInfoDb_1.4.1   IRanges_2.2.7        S4Vectors_0.6.3      BiocGenerics_0.14.0  goseq_1.20.0
[11] RSQLite_1.0.0        DBI_0.3.1            geneLenDataBase_1.4.0 BiasedUrn_1.06.1

loaded via a namespace (and not attached):
 [1] XVector_0.8.0          zlibbioc_1.14.0        GenomicAlignments_1.4.1 BiocParallel_1.2.20    lattice_0.20-33
 [6] grid_3.2.1             nlme_3.1-121           mgcv_1.8-7             lambda.r_1.1.7         futile.logger_1.4.1
[11] Matrix_1.2-2           futile.options_1.0.0   bitops_1.0-6           RCurl_1.95-4.7         biomaRt_2.24.0
[16] GO.db_3.1.2            GenomicFeatures_1.20.1 Biostrings_2.36.2      Rsamtools_1.20.4       XML_3.98-1.3
@xyliu00 Are you, by chance, taking the Statistical Genomics course taught by Jeff Leek through Coursera? I am having the same problem in the 4th week of the course. I would like to know whether you found a workaround, or whether this turned out not to be a problem after all. Matt C., mockrun (at) gmail.com