Hi,
I somehow figured out from the manual how to reproduce the same plots for my data.
I'm working with a non-model organism; it doesn't have annotation information or a package for GO/KEGG analysis.
I had ~4 experimental setups. After the edgeR analysis, I selected the significantly differentially expressed genes (adjusted p-value less than 0.05), including both up- and down-regulated genes.
I chose the zebrafish annotation for my data and performed gene set enrichment analysis, which works very well.
At the KEGG enrichment step I'm getting "no term enriched under specific pvalueCutoff..."
Any suggestions? In this case, how do I get KEGG annotation?
> library(org.Dr.eg.db)
> kegg_organism = "dre"
> kk2 <- gseKEGG(geneList = KGgenelist, organism = "dre", nPerm = 10000, minGSSize = 3, maxGSSize = 800, pAdjustMethod = "none", keyType = "ncbi-geneid")
> preparing geneSet collections...
> GSEA analysis...
> no term enriched under specific pvalueCutoff...
> Warning messages:
> 1: In .GSEA(geneList = geneList, exponent = exponent, minGSSize = minGSSize, :
> We do not recommend using nPerm parameter incurrent and future releases
> 2: In fgsea(pathways = geneSets, stats = geneList, nperm = nPerm, minSize = minGSSize, :
> You are trying to run fgseaSimple. It is recommended to use fgseaMultilevel. To run fgseaMultilevel, you need to remove the nperm
> argument in the fgsea function call.
When I run it with my gene list (pvalueCutoff 0.05), this is the output I get.
And what happens when using pvalueCutoff = 1 (thus what I suggested above)?
In addition, the likely culprit: your input list seems NOT to be suitable for a GSEA analysis! For that you need to use as input a metric (e.g. t-value or signed log(p-value)) for ALL genes analyzed, not a subset! You used as input only 157 genes!
An over-representation analysis should rather be used when selecting a subset of the analyzed genes!
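To make the difference between the two inputs concrete, here is a minimal base-R sketch. Everything in it is an assumption for illustration: the data frame `res`, its column names (modelled on an edgeR-style results table), the placeholder gene IDs, and the choice of sign(logFC) * -log10(p-value) as the ranking metric. It is not the original poster's code.

```r
# Hypothetical results table for ALL tested genes (toy values, placeholder IDs).
res <- data.frame(
  entrez = c("1001", "1002", "1003", "1004", "1005", "1006"),
  logFC  = c(2.1, -1.7, 0.3, -0.2, 1.2, -2.5),
  PValue = c(1e-6, 1e-4, 0.40, 0.75, 0.01, 1e-8),
  FDR    = c(6e-6, 3e-4, 0.48, 0.75, 0.02, 6e-8),
  stringsAsFactors = FALSE
)

# GSEA-style input: one signed statistic per gene, for EVERY gene tested,
# stored as a named numeric vector sorted in decreasing order.
rank_stat <- sign(res$logFC) * -log10(res$PValue)
names(rank_stat) <- res$entrez
gene_list <- sort(rank_stat, decreasing = TRUE)

# ORA-style input: only the IDs of the significant subset,
# plus the full set of tested genes as background.
sig_genes <- res$entrez[res$FDR < 0.05]
universe  <- res$entrez

print(gene_list)
print(sig_genes)
```

Under these assumptions, `gene_list` is the kind of object to pass as `geneList` to gseKEGG, while `sig_genes` (with `universe` as background) is what an over-representation function such as enrichKEGG expects.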
It works with pvalueCutoff = 1. Thanks. Could you please explain this more clearly? I'm new to R and couldn't understand what you're saying.
If it works without filtering on significance (thus by setting pvalueCutoff = 1), this basically tells you that from a coding point of view you did things OK.
The next question is of course whether the results make sense. You are responsible for both things. Apparently you are struggling with the basics of the various methodologies. I would recommend you dive into the literature to fill that knowledge gap. Some suggestions:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002375 (the differences between ORA and FCS (= GSEA) are highlighted)
https://www.pathwaycommons.org/guide/primers/data_analysis/gsea/ (a nice primer on GSEA)
Also: the people at the Broad Institute (who were the first to publish on GSEA) implicitly recommend very limited filtering of a transcriptomics dataset before running GSEA, hence my comment.
For the first sample, I generated some good plots that make sense. For the second sample, I tried the same steps; gene set enrichment analysis (gseGO) works fine.
For the gseKEGG analysis it shows "no term enriched under specific pvalueCutoff" (even with pvalueCutoff = 1) and I'm not getting any results. How do I plot this data?
Suggestions please.
Well, there is nothing to plot...!
The reason for this is that you only used 30 genes as input!! First of all, and as said before, you should not use a subset of genes for GSEA analysis!
Also note that of these 30 input genes, only 9 are annotated to a KEGG pathway!! And at most 2 of these 9 genes are in the same pathway. Since you set minGSSize = 3, no gene set (pathway) was included in the analysis... and since there are no pathways to analyze, there won't be any result!
If you set minGSSize = 1 you will get results...
The one and only positive aspect of such an analysis is that it shows your code is working, but other than that.... ????
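The size filter described above can be illustrated with a small base-R sketch. The pathway names, gene IDs and the helper `kept_at` are all made up for this example, and the code only mimics the idea of dropping gene sets whose overlap with the input is smaller than minGSSize; it is not clusterProfiler's internal implementation.

```r
# Hypothetical pathway annotation (toy IDs, not real KEGG data).
pathways <- list(
  path_A = c("g1", "g2", "x1", "x2"),
  path_B = c("g3", "g4", "x3"),
  path_C = c("g5", "x4", "x5", "x6")
)

# Hypothetical input: a handful of genes, two of them without any annotation.
input_genes <- c("g1", "g2", "g3", "g4", "g5", "u1", "u2")

# Restrict each pathway to the genes actually present in the input.
overlap <- lapply(pathways, intersect, input_genes)
sizes   <- lengths(overlap)

# Illustrative helper: names of the pathways that survive the size limits.
kept_at <- function(min_size, max_size = 800) {
  names(sizes)[sizes >= min_size & sizes <= max_size]
}

print(sizes)
print(kept_at(3))  # no pathway has 3 or more input genes
print(kept_at(1))  # every pathway with at least one input gene is kept
```

In this toy setup no pathway contains more than 2 input genes, so nothing is left to test at a minimum size of 3, which is the same situation as with the 30-gene input.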
Thanks for the nice explanation. I'm doing the analysis only for the significantly expressed genes below the 0.05 cutoff, to understand the genes' roles in pathways.
Hi, I have around 100 genes that start with the LOC* prefix; they don't have any orthologs. I searched NCBI to find their gene IDs and then tried to convert them to Ensembl IDs using the bitr function, but groupGO and gseKEGG then show an error. Do I have to neglect these genes?
Kevin, please, you have already asked this question on Biostars: https://www.biostars.org/p/448859/
Always mention the other web-sites where you have asked questions.
Nothing passes this threshold, so nothing can be plotted. You will have to go back a few steps in your analysis to determine why nothing passes the 0.05 threshold. Thanks! Guido has already answered the main question.