Hi all,
I have been working with RNA-seq data from dogs.
With goseq, I can only get enriched GO terms (length bias is corrected):
Since the dog genome assembly canFam3 is not supported in the goseq database, I have to make my own `bias.data` to run `nullp` in goseq. I was able to run `GO.wall=goseq(pwf,gene2cat=geneID2GO)` to get enriched GO terms that take length bias into consideration. However, "limiting GO categories and other category based tests", e.g. `GO.MF=goseq(pwf,gene2cat=geneID2GO, test.cats=c("GO:MF"))`, doesn't work for me. Neither does "KEGG pathway analysis".
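One possible reason the `test.cats` call has no effect: as far as I can tell, goseq only uses `test.cats` when it fetches the GO mapping itself, so the argument is ignored once `gene2cat` is supplied. A minimal sketch of a workaround, assuming `geneID2GO` is a named list of GO IDs per gene, is to filter the mapping down to one ontology before calling `goseq`. The helper `filter_gene2cat` and the toy data are hypothetical, and the commented lines assume the GO.db package is installed.

```r
# Hypothetical helper: keep only the GO terms of one ontology in a gene -> GO list.
# `gene2cat`    named list, gene ID -> character vector of GO IDs (like geneID2GO)
# `go_ontology` named character vector, GO ID -> "BP", "MF" or "CC"
filter_gene2cat <- function(gene2cat, go_ontology, keep = "MF") {
  filtered <- lapply(gene2cat, function(terms) {
    ont <- go_ontology[terms]            # NA for GO IDs missing from the lookup
    terms[!is.na(ont) & ont == keep]
  })
  filtered[lengths(filtered) > 0]        # drop genes left without any term
}

# Toy data standing in for the real mapping
geneID2GO_toy <- list(
  geneA = c("GO:0003677", "GO:0006355"),
  geneB = c("GO:0005634"),
  geneC = c("GO:0003677")
)
go_ontology_toy <- c("GO:0003677" = "MF", "GO:0006355" = "BP", "GO:0005634" = "CC")

geneID2GO_MF <- filter_gene2cat(geneID2GO_toy, go_ontology_toy, keep = "MF")

# With real data, the lookup could come from GO.db (assumption: package installed):
#   ann <- AnnotationDbi::select(GO.db::GO.db, keys = unique(unlist(geneID2GO)),
#                                columns = "ONTOLOGY", keytype = "GOID")
#   go_ontology <- setNames(ann$ONTOLOGY, ann$GOID)
#   GO.MF <- goseq(pwf, gene2cat = filter_gene2cat(geneID2GO, go_ontology, "MF"))
```

Filtering before the test, rather than subsetting the result table afterwards, also means that any multiple-testing correction is applied over molecular function terms only.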
Complete R script: https://www.biostars.org/p/191626/
With limma, I can only get KEGG pathways (length bias is not corrected):
I can get KEGG pathways with `Kegg_pathway=kegga(kegga_genes, species.KEGG = "cfa")`, but I don't know how to specify the `universe`, `prior.prob`, and `covariate` arguments shown in the usage below. With the defaults, I get one-sided hypergeometric tests, equivalent to Fisher's exact test, which do not correct for length bias. According to the manual, if prior probabilities are specified, a test based on Wallenius' noncentral hypergeometric distribution is used instead, to adjust for the relative probability that each gene will appear in a gene set, following the approach of Young et al. (2010).
`kegga(de, universe = NULL, species = "Hs", species.KEGG = NULL, convert = FALSE, gene.pathway = NULL, pathway.names = NULL, prior.prob = NULL, covariate=NULL, plot=FALSE, ...)`
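As a sketch of how these arguments might be filled in (this is my reading of the limma help page, so treat it as an assumption): `universe` is a vector of Entrez Gene IDs for all genes that were tested, `de` is the subset called differentially expressed, and `covariate` is a numeric vector of the same length as `universe`, for example gene length, from which `prior.prob` is estimated. The IDs and lengths below are made-up placeholders, and `run_kegga_length_adjusted` is a hypothetical wrapper.

```r
# Placeholder inputs (made-up values): every gene that was tested, its length,
# and the subset of genes called differentially expressed.
universe_ids <- c("403400", "403401", "403402", "403403", "403404")
gene_lengths <- c(1200, 3400, 800, 5600, 2100)
de_ids       <- c("403401", "403403")

# The covariate must line up with the universe, one value per gene
stopifnot(length(universe_ids) == length(gene_lengths),
          all(de_ids %in% universe_ids))

# Hypothetical wrapper: passing `covariate` lets kegga estimate prior.prob
# from gene length instead of assuming every gene is equally likely to be DE.
run_kegga_length_adjusted <- function(de, universe, lengths) {
  limma::kegga(de, universe = universe, species.KEGG = "cfa",
               covariate = lengths)
}
```

Calling `run_kegga_length_adjusted(de_ids, universe_ids, gene_lengths)` on a realistic number of genes needs limma and an internet connection, because `species.KEGG` makes `kegga` download the pathway annotation from KEGG. Both `covariate` and `prior.prob` are ignored if `universe` is left as `NULL`.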
- For goseq, does anybody know how to perform KEGG pathway analysis with non-native gene identifiers? Should I build my own `org.Cf.eg.db` package for dogs?
- For limma, does anybody know what inputs the `universe`, `prior.prob`, and `covariate` arguments expect? I couldn't figure it out by reading the package manual.
- Alternatively, are there any other packages that can correct for length bias and give enriched GO terms and KEGG pathways for dogs?
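One route that may avoid building an annotation package, sketched under assumptions: `goseq` accepts any gene-to-category mapping through `gene2cat`, so a gene-to-pathway list can be passed in the same way as `geneID2GO`. The toy table below imitates the two-column output (GeneID, PathwayID) of `limma::getGeneKEGGLinks()`; its rows are placeholders, and the row names of `pwf` are assumed to be the same Entrez Gene IDs.

```r
# Toy gene -> pathway table (placeholder rows) in the two-column shape
# returned by limma::getGeneKEGGLinks(species.KEGG = "cfa")
kegg_links_toy <- data.frame(
  GeneID    = c("403400", "403400", "403401", "403402"),
  PathwayID = c("path:cfa04110", "path:cfa04115", "path:cfa04110", "path:cfa03030"),
  stringsAsFactors = FALSE
)

# Convert to a named list: gene ID -> character vector of pathway IDs
gene2kegg <- split(kegg_links_toy$PathwayID, kegg_links_toy$GeneID)

# With real data (assumptions: limma and goseq installed, internet access,
# and rownames(pwf) holding Entrez Gene IDs):
#   kegg_links <- limma::getGeneKEGGLinks(species.KEGG = "cfa")
#   gene2kegg  <- split(kegg_links$PathwayID, kegg_links$GeneID)
#   KEGG.wall  <- goseq(pwf, gene2cat = gene2kegg)
```

If `pwf` is indexed by Ensembl IDs instead, the gene IDs would first have to be mapped to Entrez Gene IDs, since those are what KEGG uses for `cfa`.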
Thanks,
Candice
For length bias correction, one can also use the cqn package, which normalizes for GC content as well.
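A minimal sketch of what that might look like, assuming a genes-by-samples count matrix plus per-gene GC content and length are available. `normalize_cqn` is a hypothetical wrapper around `cqn::cqn`, and the inputs are simulated placeholders. Note that cqn adjusts the expression values themselves; it does not run an enrichment test.

```r
# Simulated placeholder inputs: counts (genes x samples), GC fraction, gene length
set.seed(1)
n_genes <- 200
counts <- matrix(rpois(n_genes * 4, lambda = 50), nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("sample", 1:4)))
gc_content   <- runif(n_genes, min = 0.3, max = 0.7)
gene_lengths <- sample(500:5000, n_genes, replace = TRUE)

# Hypothetical wrapper: fit conditional quantile normalization and return
# log2-scale expression values adjusted for GC content and gene length.
normalize_cqn <- function(counts, gc, lengths) {
  fit <- cqn::cqn(counts, x = gc, lengths = lengths,
                  sizeFactors = colSums(counts))
  fit$y + fit$offset
}
```

With real data, `normalize_cqn(counts, gc_content, gene_lengths)` requires the cqn package from Bioconductor.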