Kareem,
That is a very nice analogy you made with marbles in a jar!
The white marbles represent peaks in your TF1 list and the black ones
are non-TF1 peaks. The TF2 peaks would be the sample drawn from the jar,
which includes both white and black marbles. The overlap would be the
white marbles in that TF2 sample. Is this what you are referring to as
well? Thanks!
Best regards,
Julie
On 11/7/11 10:22 AM, "Carr, Kareem" <kareemcarr@fas.harvard.edu>
wrote:
Dear Dr. Zhu,
I have been working with your ChIPpeakAnno package and I had a
question about the p-value for makeVennDiagram. I have read the posts
on stat.ethz.ch and gmane.org including the comments by Noah Dowell
where he suggests picking one of the transcription factors and
estimating its number of possible binding sites.
When I relate the p-value computed by the hypergeometric distribution
to the idea of having a jar of marbles which are both white and black
and taking a sample, it seems to me that the marbles are all the possible
binding sites of my first transcription factor, TF1. The black ones
represent sites with no peak and the white ones represent sites with
peaks. My second TF2 should represent taking a sample of marbles
where some will be white and some will be black.
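
As a rough illustration of the marble model described above, here is a minimal Python sketch of the hypergeometric upper-tail probability. It is not ChIPpeakAnno's own code. The function name, the argument names, and the numbers in the usage lines are all hypothetical, and the total number of candidate sites in the jar is an assumption the analyst has to supply.

```python
from math import comb

def hypergeom_upper_tail(total_sites, tf1_peaks, tf2_peaks, overlap):
    """P(X >= overlap) for the marble model (hypothetical helper).

    total_sites: marbles in the jar (assumed number of candidate sites)
    tf1_peaks:   white marbles (sites with a TF1 peak)
    tf2_peaks:   marbles drawn without replacement (TF2 peaks)
    overlap:     white marbles observed in the draw
    """
    black = total_sites - tf1_peaks
    max_overlap = min(tf1_peaks, tf2_peaks)
    # Count the draws that contain exactly k white marbles, for k >= overlap.
    favourable = sum(
        comb(tf1_peaks, k) * comb(black, tf2_peaks - k)
        for k in range(overlap, max_overlap + 1)
    )
    return favourable / comb(total_sites, tf2_peaks)

# Hypothetical numbers, for illustration only.
p_value = hypergeom_upper_tail(total_sites=10000, tf1_peaks=500,
                               tf2_peaks=300, overlap=40)
print(p_value)
```

The result depends strongly on total_sites, so the choice of the jar's contents is part of the model rather than a detail.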
My problem is that the random variable represented by TF2 doesn't only
sample sites from all the binding sites of TF1. It can also be said
to be sampling sites where TF2 could bind and TF1 could not.
Therefore, in order to make this model of overlapping peaks work, we
actually want the random variable represented by all binding sites of
TF2 given that they are also binding sites of TF1.
Do you agree with this analysis?
I would appreciate any insight that you can give.
Thanks.
Kareem
--------------------------------------------------------
Kareem Carr
Research Fellow
Department of Molecular and Cellular Biology
Harvard University
Website: http://www.people.fas.harvard.edu/~kareemcarr/