Hi all,
New poster and relatively new to R, please be kind!
I've been using clusterProfiler to find enriched GO terms in my NanoString dataset. However, I'm not having any success finding downregulated terms. Can this be done? The code I'm using to get the enriched terms is as follows (up_gene_names is the output of a filtered DESeq2 analysis):
GO_BP_up <- enrichGO(gene          = up_gene_names,
                     OrgDb         = org.Hs.eg.db,
                     keyType       = 'SYMBOL',
                     ont           = "BP",
                     pAdjustMethod = "BH",
                     universe      = all_gene_names,
                     pvalueCutoff  = 0.01,
                     qvalueCutoff  = 0.05)
I have a similar list, down_gene_names, also from DESeq2; however, this does not return any results with the following code:
GO_BP_down <- enrichGO(gene          = down_gene_names,
                       OrgDb         = org.Hs.eg.db,
                       keyType       = 'SYMBOL',
                       ont           = "BP",
                       pAdjustMethod = "BH",
                       universe      = all_gene_names,
                       pvalueCutoff  = 0.01,
                       qvalueCutoff  = 0.05)
returning:
#
# over-representation test
#
#...@organism Homo sapiens
#...@ontology BP
#...@keytype SYMBOL
#...@gene chr [1:294] "SORBS1" "PIK3C3" "TBC1D4" "SKP2" "PELI1" "SCIN" "LPL" "SPOPL" "BRAF" "SIRT1" ...
#...pvalues adjusted by 'BH' with cutoff <0.01
#...0 enriched terms found
#...Citation
T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
The Innovation. 2021, 2(3):100141
There are significantly more downregulated genes than upregulated genes, so this result is surprising. I note that enrichGO is an "over-representation test". What should I be using instead?
Thanks!
Just my two cents: since you get results when using up_gene_names, it suggests that from a coding perspective your analysis worked. It rather seems that the down-regulated genes have very diverse functions, and as a result no specific GO category is significantly enriched (using a BH (FDR) corrected cutoff of 0.01). You may want to relax this cutoff value.
Please note I don't have any hands-on experience with NanoString data, but isn't a NanoString dataset relatively small and very focused, and therefore biased? Since only a relatively small number (a few hundred or so) of, let's say, cancer-related genes are included, could it be that finding even 'further enrichment' within this already focused dataset is not possible?
Thank you, Guido. Yes, you are correct that NanoString is a focused panel (in my case only 774 genes), which certainly introduces bias. It's more that I was surprised not to find any downregulated GO terms; as you said, I believe my code is working because I get meaningful output for upregulated terms. Diverse gene functions could well be the reason why no terms come up, but it's still surprising because there are significantly more downregulated genes.
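To put rough numbers on the panel-size point, here is a minimal base R sketch of the one-sided hypergeometric calculation that an over-representation test is based on. The 774-gene universe and the 294 down-regulated genes come from this thread; the term size of 40 is an assumed value chosen only for illustration, and the helper ora_pvalue is hypothetical, not part of clusterProfiler.

```r
# Numbers from the thread: panel size and length of the down-regulated list.
universe_size <- 774
n_down        <- 294
# ASSUMPTION: number of panel genes annotated to one GO term (illustrative only).
term_size     <- 40

# Overlap expected by chance between the gene list and the term.
expected_overlap <- n_down * term_size / universe_size

# Hypothetical helper: P(X >= k) under the hypergeometric distribution,
# i.e. the raw p-value of a one-sided over-representation test.
ora_pvalue <- function(k, term_size, universe_size, list_size) {
  phyper(k - 1, term_size, universe_size - term_size, list_size,
         lower.tail = FALSE)
}

# Raw p-value for every possible overlap with this term.
overlaps <- 0:term_size
pvals    <- ora_pvalue(overlaps, term_size, universe_size, n_down)

# Smallest overlap whose raw p-value is below 0.01 (before BH adjustment).
min_overlap_needed <- min(overlaps[pvals < 0.01])

print(expected_overlap)
print(min_overlap_needed)
```

Comparing the two printed values shows how far above the chance expectation a term has to land before even the raw p-value passes 0.01; the BH adjustment across all tested terms then raises the bar further. With different assumed term sizes the gap changes, so this is only a way to get a feel for the scale.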
I'm also wondering if the function might be meant for enriched terms only, seeing as it's called enrichGO. The help file also specifically refers to enrichment analysis: "GO Enrichment Analysis of a gene set. Given a vector of genes, this function will return the enrichment GO categories after FDR control."
However, I can't seem to find a function dedicated to 'depletion' analysis, or an argument for enrichGO that selects analysis for depletion.
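On the enrichment versus depletion wording: as far as I understand it, running enrichGO on down_gene_names already asks which GO terms are over-represented among the down-regulated genes, which is usually what 'downregulated terms' means. Depletion (a term having fewer hits in the list than expected by chance) is a different question, answered by the other tail of the same hypergeometric distribution. A minimal base R sketch of the two tails follows; the term size of 40 and the overlap of 8 are made-up values, and nothing here is a clusterProfiler function.

```r
# Numbers from the thread: panel size and length of the down-regulated list.
universe_size <- 774
list_size     <- 294
# ASSUMPTIONS (illustrative only): one GO term with 40 genes on the panel,
# 8 of which appear in the down-regulated list.
term_size <- 40
overlap   <- 8

# Over-representation: probability of seeing at least `overlap` term genes.
p_over <- phyper(overlap - 1, term_size, universe_size - term_size, list_size,
                 lower.tail = FALSE)

# Depletion (under-representation): probability of seeing at most `overlap`.
p_under <- phyper(overlap, term_size, universe_size - term_size, list_size,
                  lower.tail = TRUE)

print(c(over = p_over, under = p_under))
```

A small p_under would say the term is unusually rare among the down-regulated genes, which is not the same as the term being downregulated.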