Question

How to resolve length bias and get GO/KEGG pathways for "non-standard" species in limma or goseq?

CandiceChuDVM ▴ 40

@candicechudvm-10038

Hi all,

I have been working with RNA-seq data from dogs.

In goseq, I can only get enriched GO terms (length bias is resolved):

Since the dog genome assembly canFam3 is not supported in the goseq database, I have to make my own `bias.data` to run `nullp` in goseq. I was able to run `GO.wall=goseq(pwf,gene2cat=geneID2GO)` to get enriched GO terms that take length bias into consideration. However, "limiting GO categories and other category based tests", as in `GO.MF=goseq(pwf,gene2cat=geneID2GO, test.cats=c("GO:MF"))`, doesn't work for me. Neither does "KEGG pathway analysis".
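The part of this workflow that does work can be sketched as follows. This is only an illustration under stated assumptions: the gene IDs, gene lengths, DE calls and the gene-to-GO list are all simulated placeholders, not real canFam3 data, and `plot.fit = FALSE` is set only to keep the example non-interactive.

```r
library(goseq)

set.seed(42)
n <- 2000
gene_ids <- paste0("gene", seq_len(n))   # placeholder IDs, not real dog genes

# bias.data: one gene length per gene, in the same order as the DE vector
gene_length <- round(rlnorm(n, meanlog = 7.5, sdlog = 0.8))

# Named 0/1 vector of DE calls (1 = DE), simulated so that longer genes
# are called DE more often, i.e. with a built-in length bias
p_de <- plogis(-2.5 + 0.8 * as.numeric(scale(log(gene_length))))
de <- rbinom(n, 1, p_de)
names(de) <- gene_ids

# gene2cat as a named list: names are gene IDs, entries are GO IDs
go_ids <- c("GO:0003677", "GO:0005524", "GO:0005634", "GO:0006355")
geneID2GO <- lapply(gene_ids, function(g) sample(go_ids, size = sample(1:2, 1)))
names(geneID2GO) <- gene_ids

# Fit the probability weighting function from the user-supplied lengths,
# then run the default Wallenius-based test with the custom mapping
pwf <- nullp(de, bias.data = gene_length, plot.fit = FALSE)
GO.wall <- goseq(pwf, gene2cat = geneID2GO)
head(GO.wall)
```

With real data, `de`, `gene_length` and `geneID2GO` would come from the differential expression results and the annotation for the species, keyed by the same gene identifiers.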

complete R script: https://www.biostars.org/p/191626/

In limma, I can only get KEGG pathways (length bias is not resolved):

I can get KEGG pathways through `Kegg_pathway=kegga(kegga_genes, species.KEGG = "cfa")`, but since I don't know how to specify the arguments `universe`, `prior.prob`, and `covariate` in the usage below, I get the default: one-sided hypergeometric tests equivalent to Fisher's exact test, which don't resolve length bias. One needs to specify prior probabilities; then a test based on the Wallenius' noncentral hypergeometric distribution is used to adjust for the relative probability that each gene will appear in a gene set, following the approach of Young et al (2010).

`kegga(de, universe = NULL, species = "Hs", species.KEGG = NULL, convert = FALSE, gene.pathway = NULL, pathway.names = NULL, prior.prob = NULL, covariate=NULL, plot=FALSE, ...)`

For goseq, does anybody know how to perform KEGG pathway analysis for non-native Gene Identifier? Should I build my own `org.Cf.eg.db` package for dogs?
For limma, does anybody know what the inputs for these arguments should be? I couldn't figure it out by reading the package manual.
Alternatively, are there any other packages that can help me resolve the length bias and generate enriched GO terms and KEGG pathways for dogs?

Thanks,

Candice

rna-seq limma goseq • 1.5k views

ADD COMMENT • link updated 8.7 years ago by Nadia Davidson ▴ 320 • written 8.7 years ago by CandiceChuDVM ▴ 40

For length bias correction, one can also use the cqn package, which normalizes for GC content as well.

ADD REPLY • link 8.2 years ago Lluís Revilla Sancho ▴ 760

score 1 · Answer 1 · 2016-05-16

Nadia Davidson ▴ 320

@nadia-davidson-5739

Australia

Hi Candice,

A recent question overlaps with at least part of your post: "goseq: not able to limit by GO categories when using wallenius approximation for over-representation test". You might find it useful. If you can obtain the KEGG pathway information from some source, you can pass it to goseq via `gene2cat`. In that case, you shouldn't specify anything for the `test.cats` option.
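As a rough sketch of that suggestion (not a tested recipe for dog data): the gene-to-pathway table below is simulated, with made-up pathway labels and stand-in gene IDs, and `gene2kegg` is a hypothetical name. For real data, one possible source of such a table is `limma::getGeneKEGGLinks("cfa")`, which downloads Entrez Gene ID to pathway links from KEGG and therefore needs an internet connection and Entrez IDs in the `pwf`.

```r
library(goseq)

set.seed(1)
n <- 2000
gene_ids <- as.character(seq_len(n))     # stand-ins for Entrez Gene IDs
gene_length <- round(rlnorm(n, meanlog = 7.5, sdlog = 0.8))

# Simulated, length-biased DE calls as a named 0/1 vector
de <- rbinom(n, 1, plogis(-2.5 + 0.8 * as.numeric(scale(log(gene_length)))))
names(de) <- gene_ids
pwf <- nullp(de, bias.data = gene_length, plot.fit = FALSE)

# Hypothetical two-column mapping: gene ID first, pathway (category) second
pathways <- sprintf("pathway_%02d", 1:10)   # made-up labels, not real KEGG IDs
gene2kegg <- unique(data.frame(
  gene    = sample(gene_ids, 1500, replace = TRUE),
  pathway = sample(pathways, 1500, replace = TRUE),
  stringsAsFactors = FALSE
))

# Pass the custom mapping through gene2cat and leave test.cats untouched
KEGG.wall <- goseq(pwf, gene2cat = gene2kegg)
head(KEGG.wall)
```

By the same logic, restricting a custom GO mapping to one ontology would presumably mean filtering the mapping before the call rather than relying on `test.cats`.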

Cheers,

Nadia.

ADD COMMENT • link 8.7 years ago Nadia Davidson ▴ 320

Hi Nadia,

Thank you for your reply. It's good to know that it just won't work. I will see if I can pass my own gene2cat to get KEGG to work.

Thanks,

Candice

ADD REPLY • link 8.7 years ago CandiceChuDVM ▴ 40
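On the limma half of the original question, which the thread leaves open, here is a hedged sketch of how `universe` and `covariate` are understood to fit together in `kegga`: `universe` is the vector of all gene IDs that were tested, and `covariate` is a numeric vector of the same length (here gene length) from which `prior.prob` is estimated. Everything below is simulated, and `gene.pathway` and `pathway.names` are made-up tables supplied only so the example runs offline. The bias-adjusted test also requires the BiasedUrn package to be installed.

```r
library(limma)

set.seed(7)
n <- 2000
universe <- as.character(seq_len(n))     # all tested genes (stand-in Entrez IDs)
gene_length <- round(rlnorm(n, meanlog = 7.5, sdlog = 0.8))   # same order as universe

# Simulated, length-biased DE calls; de holds the IDs of the DE genes only
is_de <- rbinom(n, 1, plogis(-2.5 + 0.8 * as.numeric(scale(log(gene_length))))) == 1
de <- universe[is_de]

# Made-up annotation: column 1 is the gene ID, column 2 is the pathway ID
pathways <- sprintf("pathway_%02d", 1:10)
gene.pathway <- unique(data.frame(
  GeneID    = sample(universe, 1500, replace = TRUE),
  PathwayID = sample(pathways, 1500, replace = TRUE),
  stringsAsFactors = FALSE
))
pathway.names <- data.frame(
  PathwayID   = pathways,
  Description = paste("Made-up pathway", 1:10),
  stringsAsFactors = FALSE
)

# Supplying universe plus covariate requests the length-adjusted test
Kegg_adj <- kegga(de, universe = universe, covariate = gene_length,
                  gene.pathway = gene.pathway, pathway.names = pathway.names)
topKEGG(Kegg_adj)
```

For dog data, the two made-up tables would be dropped in favour of `species.KEGG = "cfa"`, as in the original call, with `universe` and `covariate` taken from the real set of tested genes and their lengths.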