Question

Using clusterProfiler for depletion analysis

samsiljee • 0

@fb510204

Last seen 19 months ago

New Zealand

Hi all,

New poster and relatively new to R, please be kind!

I've been using clusterProfiler to find enriched GO terms in my NanoString dataset. However, I'm not having any success finding downregulated terms. Can this be done? The code I'm using to get the enriched terms is as follows (up_gene_names is the output of a filtered DESeq2 analysis):

library(clusterProfiler)
library(org.Hs.eg.db)  # human annotation database passed as OrgDb

GO_BP_up <- enrichGO(gene = up_gene_names,
                     OrgDb = org.Hs.eg.db,
                     keyType = 'SYMBOL',
                     ont = "BP",
                     pAdjustMethod = "BH",
                     universe = all_gene_names,
                     pvalueCutoff = 0.01,
                     qvalueCutoff = 0.05)

I have a similar list, down_gene_names, also from DESeq2. However, the following code does not return any results:

GO_BP_down <- enrichGO(gene = down_gene_names,
                       OrgDb = org.Hs.eg.db,
                       keyType = 'SYMBOL',
                       ont = "BP",
                       pAdjustMethod = "BH",
                       universe = all_gene_names,
                       pvalueCutoff = 0.01,
                       qvalueCutoff = 0.05)

returning:

#
# over-representation test
#
#...@organism    Homo sapiens 
#...@ontology    BP 
#...@keytype     SYMBOL 
#...@gene    chr [1:294] "SORBS1" "PIK3C3" "TBC1D4" "SKP2" "PELI1" "SCIN" "LPL" "SPOPL" "BRAF" "SIRT1" ...
#...pvalues adjusted by 'BH' with cutoff <0.01 
#...0 enriched terms found
#...Citation
 T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L Zhan, X Fu, S Liu, X Bo, and G Yu.
 clusterProfiler 4.0: A universal enrichment tool for interpreting omics data.
 The Innovation. 2021, 2(3):100141

There are significantly more downregulated genes than upregulated genes, so this result is surprising. I note that enrichGO is an "over-representation test"; what should I be using instead?

Thanks!

clusterProfiler enrichGO • 1.6k views

Posted 3.4 years ago by samsiljee

Just my two cents: since you get results when using up_gene_names, it suggests that your analysis worked from a coding perspective. It rather seems that the downregulated genes have very diverse functions, and as a result no (specific) GO category is significantly enriched at a BH (FDR) corrected cutoff of 0.01. You may want to relax this cutoff value.

Please note I don't have any hands-on experience with NanoString data, but isn't a NanoString dataset relatively small and very focused, and therefore biased? Since only a relatively small number (a few hundred or so) of, let's say, cancer-related genes are included, could it be that finding even 'further enrichment' within this already focused dataset is not possible?

Reply posted 3.4 years ago by Guido Hooiveld
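To make the two points in the reply above concrete (the effect of the adjusted cutoff and of a small, focused background), here is a minimal base-R sketch of the kind of hypergeometric over-representation test that enrichGO reports. Only the panel size (774) and the length of the downregulated list (294) come from this thread; the per-term counts and term names are invented for illustration, and the effective background that enrichGO uses may be smaller, because genes without a GO annotation are typically dropped.

```r
# Numbers taken from the thread
N <- 774   # genes in the background (the NanoString panel)
n <- 294   # genes in the downregulated list

# Hypothetical GO terms (made-up counts, for illustration only):
# M = panel genes annotated to the term,
# k = how many of those M genes are in the downregulated list.
terms <- data.frame(
  term = c("term_A", "term_B", "term_C", "term_D"),
  M    = c(40, 60, 25, 100),
  k    = c(24, 30, 10, 38)
)

# Over-representation p-value: P(X >= k) under the hypergeometric distribution
terms$p_over <- phyper(terms$k - 1, terms$M, N - terms$M, n, lower.tail = FALSE)

# Benjamini-Hochberg adjustment across the tested terms
terms$p_adj <- p.adjust(terms$p_over, method = "BH")

# Compare a strict and a relaxed adjusted cutoff
terms$pass_strict  <- terms$p_adj < 0.01
terms$pass_relaxed <- terms$p_adj < 0.05

print(terms)
```

With a list that already makes up about 38% of the background (294 / 774), a term needs a clearly higher share of its genes in the list before it stands out, which is one way a larger gene list can still yield no significant terms.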

Thank you Guido. Yes, you are correct that NanoString is a focused panel, in my case only 774 genes, which certainly introduces bias. I was mostly surprised that I didn't find any downregulated GO terms. Like you said, I believe my code is working because I get meaningful output for upregulated terms. A diversity of gene functions could well be the reason why no terms are downregulated; however, it's still surprising because there are significantly more downregulated genes.

I'm also wondering if the function might be intended for enriched terms only, seeing as it's called enrichGO. The help file also specifically refers to enrichment analysis:

"GO Enrichment Analysis of a gene set. Given a vector of genes, this function will return the enrichment GO categories after FDR control."

However I can't seem to find a function dedicated to 'depletion' analysis, or an argument for enrichGO which suggests analysis for depletion.

Reply posted 3.4 years ago by samsiljee
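On the 'depletion' question, it may help to separate two things. Running enrichGO on down_gene_names already asks which terms are over-represented among the downregulated genes, which is usually what "downregulated terms" means. A depletion (under-representation) test asks something different: which terms have fewer genes in the list than expected by chance. The sketch below shows that lower-tail test in base R for a single term; it is not a clusterProfiler feature, and the counts M and k are hypothetical (only 774 and 294 come from this thread).

```r
N <- 774   # background size (panel), from the thread
n <- 294   # size of the gene list, from the thread
M <- 50    # hypothetical: background genes annotated to the term
k <- 8     # hypothetical: annotated genes that are also in the list

# Over-representation (the upper tail): P(X >= k)
p_over <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# Depletion / under-representation (the lower tail): P(X <= k)
p_deplete <- phyper(k, M, N - M, n, lower.tail = TRUE)

# The same lower-tail test expressed as a one-sided Fisher's exact test.
# Rows: in term / not in term; columns: in list / not in list.
tab <- matrix(c(k,     n - k,
                M - k, N - M - (n - k)),
              nrow = 2,
              dimnames = list(term = c("in_term", "not_in_term"),
                              list = c("in_list", "not_in_list")))
p_fisher_less <- fisher.test(tab, alternative = "less")$p.value

print(c(over = p_over, depleted = p_deplete, fisher_less = p_fisher_less))
```

To apply this across many terms, the same calculation would be repeated per term and the resulting p-values adjusted with p.adjust, as in any multiple-testing setting.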