Hi all,
I am working on a non-model species, and I want to use goseq to perform GO and KEGG enrichment analysis.
I identified a list of DEGs with DESeq2 (FDR <= 0.05, |log2FC| >= 1). However, some of those DEGs (say 50%) have no GO or KEGG annotation. Will that affect the result much?
Besides, the GO annotation file contains GO levels 1 to 15. Do I need to input all of those levels, or can I input only the relatively general ones (say levels 2-6)? Some GO terms have only one or two associated genes; is it meaningful to include them? I have tried GOEAST and GAGE for enrichment analysis, and both apply a cutoff on the number of genes associated with a GO term. I am not sure whether goseq has a gene set size option. I am also not sure whether testing a large number of GO terms affects the enrichment analysis and makes the p-values larger.
Many thanks,
Jack
I believe goseq will ignore unannotated genes by default.
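
To see why the handling of unannotated genes matters, here is a minimal sketch in Python. It uses a plain hypergeometric test, not goseq's length-bias-corrected Wallenius method, and every count in it is hypothetical. The helper name `hypergeom_sf` is made up for this illustration. It compares the p-value for one category when the background is restricted to annotated genes against the p-value when all tested genes are counted.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K belong to the category."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Hypothetical counts, for illustration only (not from any real data set).
all_genes = 20000        # every gene tested for differential expression
annotated_genes = 8000   # genes with at least one GO/KEGG annotation
degs = 1000              # all DEGs
annotated_degs = 500     # DEGs that carry an annotation (the "50%" case)
category_size = 100      # genes annotated to the category of interest
degs_in_category = 12    # DEGs annotated to that category

# Background restricted to annotated genes (unannotated genes ignored).
p_annotated_bg = hypergeom_sf(degs_in_category, annotated_genes,
                              category_size, annotated_degs)

# Background includes unannotated genes, counted as "not in the category".
p_full_bg = hypergeom_sf(degs_in_category, all_genes, category_size, degs)

print(f"annotated-only background: p = {p_annotated_bg:.4g}")
print(f"full background:           p = {p_full_bg:.4g}")
```

With these made-up numbers the DEGs are annotated more often than the average gene, so counting the unannotated genes in the background makes the category look more enriched. As far as I know, the relevant switch in goseq is the `use_genes_without_cat` argument of `goseq()`, which defaults to `FALSE`; please check the documentation of your installed version.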
As for separating up- and down-regulated genes, I would try both ways. There is some evidence (e.g. Guo et al, 2014) that analyzing the up- and down-regulated genes separately can be beneficial.
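
A rough sketch of the bookkeeping in Python, assuming the DESeq2 results have already been filtered to significant genes and loaded as a gene-to-log2FC mapping. The gene IDs, GO IDs, function names and size cutoffs below are all hypothetical. It splits the DEGs by direction and drops very small or very large categories before testing, similar in spirit to the gene set size cutoffs in GOEAST and GAGE.

```python
def split_by_direction(log2fc):
    """Split a {gene: log2FC} mapping of DEGs into up- and down-regulated sets."""
    up = {gene for gene, fc in log2fc.items() if fc > 0}
    down = {gene for gene, fc in log2fc.items() if fc < 0}
    return up, down

def filter_categories(cat2genes, min_size=5, max_size=500):
    """Keep only categories whose gene count lies within [min_size, max_size]."""
    return {cat: genes for cat, genes in cat2genes.items()
            if min_size <= len(genes) <= max_size}

# Hypothetical DEGs (already filtered by FDR and |log2FC|).
deg_log2fc = {"g1": 2.3, "g2": -1.7, "g3": 1.1, "g4": -3.0}

# Hypothetical category-to-gene annotation.
cat2genes = {
    "GO:A": {"g1", "g2", "g3", "g4", "g5", "g6"},
    "GO:B": {"g1"},          # too small to test meaningfully
    "GO:C": {"g2", "g7"},    # too small to test meaningfully
}

up, down = split_by_direction(deg_log2fc)
kept = filter_categories(cat2genes, min_size=5, max_size=500)

print("up:", sorted(up))
print("down:", sorted(down))
print("categories kept:", sorted(kept))
```

The up and down sets can then be run through the enrichment test separately, and the filtered annotation reduces the number of categories tested, which also reduces the multiple-testing burden. The cutoffs of 5 and 500 are arbitrary placeholders, not recommended values.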
Thanks Keith. Yes, I also read that paper yesterday. I will run the analysis again today.