Statistical tests to show the specificity of a phenomenon (e.g. an increase in the H3K27me3 mark)
Bogdan ▴ 670
@bogdan-2367
Palo Alto, CA, USA

Dear all, 

Although this may not be a question specifically for BioC, I thought I could still post it (if you do not mind), in case any packages for ChIP-seq analysis or statistical analysis are available to address it.

The question concerns statistical tests to show the specificity of a phenomenon. Consider an example: someone ran ChIP-seq for H3K27me3 and wants to show that, after cell treatment, the mark increases on the genes involved in a particular biological process (e.g. 300 autophagy-related genes out of a total of 1000 genes with increased H3K27me3).

What type of analysis would you recommend to show that the phenomenon (i.e. the increase in H3K27me3) is specific to a set of genes (i.e. the autophagy genes):

-- taking random sets of non-autophagy genes (in practice, drawn from the rest of the genes in the genome) and using parametric and non-parametric tests to compare SET 1 (autophagy genes) with SET 2 (non-autophagy genes)

or

-- using hypergeometric / Fisher tests on a 2x2 matrix (autophagy / non-autophagy genes vs. increase / no increase in H3K27me3)?

thanks a lot, and happy weekend ;) !

bogdan

@wolfgang-huber-3550
EMBL European Molecular Biology Laborat…

Bogdan

The main point is: don't use a test, or the language and concepts of testing, here. Rejecting a null hypothesis of non-specificity is close to uninformative (boring, beside the point, ridiculous, ...) with regard to the strength of specificity, since such a hypothesis test confounds effect size and sample size: the same enrichment gives a smaller p-value simply because more genes were counted.
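
To make the confounding concrete, here is a minimal sketch (not part of the original answer). It compares two 2x2 tables with the same odds ratio, one ten times larger than the other, and computes a one-sided Fisher / hypergeometric p-value for each. The counts are invented for illustration, and the helper names `odds_ratio` and `fisher_upper_tail` are hypothetical.

```python
from math import comb

def odds_ratio(a, b, c, d):
    """Odds ratio of the 2x2 table [[a, b], [c, d]].

    Rows: gene in set / not in set. Columns: mark increased / not increased.
    """
    return (a * d) / (b * c)

def fisher_upper_tail(a, b, c, d):
    """One-sided Fisher exact p-value, P(X >= a), with all margins fixed."""
    in_set = a + b          # genes in the set of interest
    increased = a + c       # genes with an increased mark
    n = a + b + c + d       # all genes
    upper = min(in_set, increased)
    favourable = sum(
        comb(in_set, k) * comb(n - in_set, increased - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(n, increased)

# Invented counts: identical enrichment, tenfold difference in size.
small = (6, 4, 14, 36)
large = tuple(10 * x for x in small)

or_small, or_large = odds_ratio(*small), odds_ratio(*large)
p_small, p_large = fisher_upper_tail(*small), fisher_upper_tail(*large)

print(f"small table: OR = {or_small:.2f}, p = {p_small:.3g}")
print(f"large table: OR = {or_large:.2f}, p = {p_large:.3g}")
```

The odds ratio is the same in both tables, yet the p-value of the larger one is smaller by many orders of magnitude, so the p-value by itself says little about how strong the specificity is.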

Instead, choose a reasonable quantitative summary statistic (e.g. the odds ratio, or another measure of enrichment) and, in addition to its point estimate, get information about the associated distribution or confidence region by resampling, e.g. the bootstrap. The choice of which summary statistic to use is less a statistical question than a biological one, and presumably you can consider several.
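
As a rough illustration of this suggestion (a sketch under stated assumptions, not Wolfgang's own code), the snippet below computes an odds ratio and a percentile bootstrap interval from a 2x2 table of gene counts. Only the 300 and the 1000 come from the question; the number of autophagy genes without an increase (200) and the number of remaining genes (8800) are made up. The function name `bootstrap_ci` and the 0.5 continuity correction are likewise choices made for this example.

```python
import random
from collections import Counter

def odds_ratio(a, b, c, d):
    """Odds ratio of [[a, b], [c, d]]; 0.5 is added to each cell to avoid division by zero."""
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))

def bootstrap_ci(a, b, c, d, n_boot=200, alpha=0.05, seed=1):
    """Point estimate and percentile bootstrap interval for the odds ratio.

    Resampling genes with replacement is equivalent to drawing n genes
    from the four cells with probabilities proportional to the counts.
    """
    rng = random.Random(seed)
    counts = [a, b, c, d]
    n = sum(counts)
    stats = []
    for _ in range(n_boot):
        draw = Counter(rng.choices(range(4), weights=counts, k=n))
        stats.append(odds_ratio(draw[0], draw[1], draw[2], draw[3]))
    stats.sort()
    lo = stats[int(n_boot * alpha / 2)]
    hi = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return odds_ratio(a, b, c, d), lo, hi

# Cells: autophagy & increased, autophagy & not increased,
#        other & increased,     other & not increased.
# 300 and 700 (= 1000 - 300) follow the question; 200 and 8800 are ASSUMED.
point, lo, hi = bootstrap_ci(300, 200, 700, 8800)
print(f"odds ratio = {point:.1f}, 95% bootstrap interval = ({lo:.1f}, {hi:.1f})")
```

The same loop works for any other enrichment measure: replace `odds_ratio` with the statistic that makes biological sense and report the estimate together with its interval.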

Wolfgang
 


Dear Wolfgang, greetings, very glad to hear from you, and thank you for your comments and suggestions ;)

 
