piano runGSA with input from DESeq2
@simonpsnoeck-11800

Hi,

To perform a gene-set enrichment analysis, we used the following settings for the R function runGSA (piano package):

gsaRes_xxx<-runGSA(pval_xxx, geneSetStat="fisher", directions=fc_xxx, signifMethod="nullDist", adjMethod="BH", gsc=gsc, gsSizeLim=c(5,Inf))

with:

fc_xxx = log2 fold changes of the genes (DESeq2 output)

pval_xxx = the p-values (DESeq2 output). Or should we use the adjusted p-values from DESeq2 instead?

This seemed to work. Can anyone confirm our settings?

Kind regards,

Simon

deseq2 gene ontology piano • 2.0k views
Leif Väremo ▴ 70
@leif-varemo-5897
Sweden

Note that Fisher's (combined probability) test tends to give low p-values to a very large number of gene sets. There is also a tendency for this method to return gene-set p-values that correlate with gene-set size (see e.g. Fig 3B in Väremo et al. (2013)).
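The size effect can be illustrated with a toy calculation in base R. This is only a sketch of Fisher's combined probability test itself (statistic -2 * sum(log(p)), compared against a chi-squared distribution with 2k degrees of freedom), not a reproduction of the figure cited above. The helper name `fisher_set_p` and the gene-level p-value of 0.2 are made up for illustration.

```r
# Fisher's combined probability test for k gene-level p-values:
# statistic = -2 * sum(log(p)), referred to a chi-squared distribution with 2k df.
# `fisher_set_p` is a hypothetical helper, not a piano function.
fisher_set_p <- function(p) {
  stat <- -2 * sum(log(p))
  pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# Every gene has the same, individually non-significant p-value (0.2, made up).
p_small <- fisher_set_p(rep(0.2, 5))   # gene set with 5 genes
p_large <- fisher_set_p(rep(0.2, 50))  # gene set with 50 genes

# The larger set gets a much smaller gene-set p-value from the same per-gene evidence.
print(c(size_5 = p_small, size_50 = p_large))
```

With identical per-gene evidence, the larger set ends up with the smaller combined p-value, which is one way a dependence on gene-set size can arise.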

Raw (unadjusted) p-values sometimes have a higher resolution (more unique values) than adjusted p-values, so in that sense they can be a good choice as input. The gene-set p-values should, however, be adjusted for multiple testing. One could also use the adjusted p-values as input. Maybe someone with a more solid statistical background could add a comment on this?
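The point about resolution can be seen with base R's `p.adjust`. The p-values below are made up. The Benjamini-Hochberg adjustment enforces monotonicity, so several distinct raw p-values can collapse onto the same adjusted value.

```r
# Made-up raw p-values, all distinct
p_raw <- c(0.01, 0.011, 0.012, 0.5, 0.9)

# Benjamini-Hochberg adjustment (the same method as adjMethod = "BH")
p_bh <- p.adjust(p_raw, method = "BH")

# Compare the number of unique values before and after adjustment
n_unique_raw <- length(unique(p_raw))
n_unique_bh  <- length(unique(p_bh))
print(data.frame(p_raw, p_bh))
```

Here the three smallest raw p-values share one adjusted value, so a gene-level ranking based on adjusted p-values contains more ties.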

Apart from those notes, the syntax of your command looks correct to me.
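For completeness, here is a minimal sketch of how the two input vectors might be prepared. It assumes the DESeq2 results have been converted to a data frame with the standard `pvalue`, `log2FoldChange` and `padj` columns. The `res` object and its values are a made-up stand-in, and dropping genes with missing values is an assumption on my part (DESeq2 can report NA for some genes, and missing values are likely to cause problems downstream).

```r
# Mock stand-in for as.data.frame(results(dds)); all values are made up.
res <- data.frame(
  log2FoldChange = c(2.1, -1.3, 0.4, NA, -0.2),
  pvalue         = c(1e-6, 0.003, 0.4, NA, 0.8),
  padj           = c(5e-6, 0.0075, 0.5, NA, NA),
  row.names      = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

# Drop genes without a usable p-value or fold change
keep <- !is.na(res$pvalue) & !is.na(res$log2FoldChange)
res  <- res[keep, ]

# Genes are matched to gene sets by name, so keep the gene IDs as vector names
pval_xxx <- setNames(res$pvalue, rownames(res))
fc_xxx   <- setNames(res$log2FoldChange, rownames(res))

# The call from the question; only run if piano is installed and a
# gene-set collection `gsc` already exists in the session.
if (requireNamespace("piano", quietly = TRUE) && exists("gsc")) {
  gsaRes_xxx <- piano::runGSA(pval_xxx, geneSetStat = "fisher", directions = fc_xxx,
                              signifMethod = "nullDist", adjMethod = "BH",
                              gsc = gsc, gsSizeLim = c(5, Inf))
}
```

Keeping both vectors in the same gene order, with the same names, avoids a mismatch between p-values and directions.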

And a recommendation: once you have your gene-set results and conclusions, go back to the gene-level data for the specific gene sets and spot-check that your results are sensible given the input data.

Kind regards

Leif

@simonpsnoeck-11800

Thanks Leif,

About those low p-values, how should we interpret the following case:

| Genes (up) | Stat (mix.dir.up) | p (mix.dir.up) | p adj (mix.dir.up) | Genes (down) | Stat (mix.dir.dn) | p (mix.dir.dn) | p adj (mix.dir.dn) |
|---|---|---|---|---|---|---|---|
| 13 | 1714.4 | 0 | 0 | 1 | 16.757 | 0.00022976 | 0.00022976 |
| 13 | 1714.4 | 0 | 0 | 1 | 16.757 | 0.00022976 | 0.00022976 |

In both cases only one gene is down (compared with 13 up). Yet the statistics for that single down-regulated gene still give a p-value < 0.05, which would mean a significant effect on the GO term in question from just one gene. Or are we interpreting this the wrong way?

Kind regards,

Simon

Yes, that looks a bit weird of course. Note that the mixed-directional score is calculated by essentially splitting the gene set into two parts, one with the up-regulated genes and one with the down-regulated genes. The two parts are "unaware" of each other. In this case it means that a subset consisting of a single (down-regulated) gene came out as fairly significant, probably because that single gene was itself quite significant.
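The numbers in the table are consistent with this. Assuming the theoretical null distribution for Fisher's statistic is a chi-squared distribution with 2k degrees of freedom (k being the number of genes in the subset), a subset of one gene simply returns that gene's own p-value. The helper `fisher_combine` below is a hypothetical base-R sketch, not piano code.

```r
# Fisher's combined probability test: statistic and chi-squared p-value (2k df).
# `fisher_combine` is a hypothetical helper for illustration only.
fisher_combine <- function(p) {
  stat  <- -2 * sum(log(p))
  p_set <- pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
  c(stat = stat, p = p_set)
}

# Back out the single gene's p-value from the reported statistic:
# 16.757 = -2 * log(p)  =>  p = exp(-16.757 / 2), roughly 2.3e-4
p_gene <- exp(-16.757 / 2)

# Combining that one p-value returns the same statistic and the same p-value
single <- fisher_combine(p_gene)
print(single)
```

So the gene-set p-value of about 0.00023 for the down-regulated part carries no more information than the p-value of that one gene.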

I would take the number of genes into account (as you do) when you interpret these results. 

An alternative would be to choose a method that also returns the distinct-directional score, which for your example gene set would definitely mark it as affected by up-regulation, but not down-regulation (since it does not do the subsetting in that case).
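To show the idea behind a distinct-directional score, here is a rough base-R sketch. It uses one common approach (converting two-sided p-values to one-sided ones using the sign of the fold change, then combining all genes with Stouffer's method), which is not necessarily how piano implements it. The gene-level values are hypothetical and only loosely modelled on the example (13 up, 1 down).

```r
# Hypothetical gene-level values: 13 up-regulated genes and 1 down-regulated gene
p  <- c(rep(1e-10, 13), 2.3e-4)
fc <- c(rep(1.5, 13), -2)

# One-sided p-values: evidence for up-regulation and for down-regulation
p_up <- ifelse(fc > 0, p / 2, 1 - p / 2)
p_dn <- ifelse(fc < 0, p / 2, 1 - p / 2)

# Stouffer's method on one-sided p-values (a swapped-in technique for illustration)
stouffer <- function(p_one_sided) {
  z <- qnorm(p_one_sided, lower.tail = FALSE)
  pnorm(sum(z) / sqrt(length(z)), lower.tail = FALSE)
}

# All 14 genes contribute to both scores, so the directions can cancel out
set_p_up <- stouffer(p_up)
set_p_dn <- stouffer(p_dn)
print(c(up = set_p_up, down = set_p_dn))
```

Because the 13 up-regulated genes count against down-regulation, the single down-regulated gene no longer produces a significant score on its own.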
