Dear all,
I am new to differential ChIP-seq analysis. One condition has 6 biological replicates and the other has 3. I did not perform the experiment; I am only doing the analysis. I started by loading the data:
prol.10 = dba(sampleSheet="macs10_30.csv")
This resulted in a total of 7602 peaks, of which 47 are present in at least 2 samples. This suggests that the experiment was not very reproducible. I then looked at the peaks shared between biological replicates, and the result was very disappointing:
dba.overlap(prol.10, prol.10$masks$elo, mode=DBA_OLAP_RATE)
resulted in 1408 0 0
and
dba.overlap(prol.10, prol.10$masks$pro, mode=DBA_OLAP_RATE)
resulted in 6206 35 2 0 0 0
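For reference, the overlap-rate vector is read position by position: element i is the number of merged peaks found in at least i of the selected samples, so a vector ending in zeros means that no peak was called in all replicates. A minimal base R sketch of that calculation, using a made-up presence/absence matrix (the object `called` is hypothetical and not derived from the samples above):

```r
# Toy presence/absence matrix: rows are merged peaks, columns are replicates.
# TRUE means the peak caller reported that peak in that replicate (made-up data).
called <- matrix(
  c(TRUE,  FALSE, FALSE,
    TRUE,  TRUE,  FALSE,
    FALSE, FALSE, TRUE,
    TRUE,  TRUE,  TRUE),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("peak", 1:4), paste0("rep", 1:3))
)

# Number of replicates in which each peak was called
support <- rowSums(called)

# Overlap rate: element i = number of peaks called in at least i replicates
overlap_rate <- sapply(seq_len(ncol(called)), function(i) sum(support >= i))
print(overlap_rate)
```

In this toy case four peaks are seen at least once, two at least twice, and one in all three replicates.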
My question is: how can 16 of the 45 peaks that remain after peak merging still be differentially bound when no peak was found in all biological replicates? I read in another post that dba.count re-counts the overlapping reads for every consensus peak in every sample, whether or not that peak was identified in that sample. When I extract the read counts, I find 28 peaks with counts above 5, even though 0 overlapping peaks were found. I would have expected reads in one biological replicate and counts below 5 in the other replicates. This is my code:
prol.10 = dba.count(prol.10, summits=250)
contrast.10 = dba.contrast(prol.10, categories=DBA_CONDITION)
de.10 = dba.analyze(contrast.10)
reads.10 = dba.count(prol.10, peaks=NULL, score=DBA_SCORE_READS)
bindingMatrix.10 = dba.peakset(reads.10, bRetrieve=TRUE, DataType=DBA_DATA_FRAME)
counts.10 = bindingMatrix.10[,4:ncol(bindingMatrix.10)]
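The re-counting behaviour mentioned above can be illustrated without DiffBind. The base R sketch below uses entirely made-up coordinates, read positions, and peak calls (`consensus`, `reads`, and `called` are hypothetical names) to show that a replicate can have a clearly non-zero read count in a consensus interval even though the peak caller did not report a peak there. As far as I understand it, the consensus set itself is governed by the minOverlap argument of dba.count (default 2), which would be consistent with the "present in at least 2 samples" figure reported when loading the data.

```r
# Hypothetical consensus peak on one chromosome (coordinates are made up)
consensus <- c(start = 1000, end = 1500)

# Made-up read start positions for three replicates
reads <- list(
  rep1 = c(1010, 1100, 1200, 1250, 1300, 1400, 1450, 1490, 2000),
  rep2 = c(1050, 1150, 1350, 1420, 1480, 1495, 3000),
  rep3 = c(900, 1020, 1480, 5000)
)

# Whether the peak caller reported a peak in this interval (hypothetical calls)
called <- c(rep1 = TRUE, rep2 = FALSE, rep3 = FALSE)

# Count reads inside the consensus interval for every replicate,
# regardless of whether a peak was called there
counts <- sapply(reads, function(pos) {
  sum(pos >= consensus["start"] & pos <= consensus["end"])
})

print(data.frame(called = called, reads = counts))
```

Here rep2 has no called peak but still contributes several reads to the interval, which is the kind of pattern that a count-based differential analysis works from.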
Should I change the summits option?
I would very much value your input,
Thanks, Veronique
Dear Rory,
I emailed the DBA objects to you last Friday. I hope I used the correct email address. Thank you for your time.
Veronique
Hi Veronique-
No, I didn't get any email.
rory.stark @ cruk.cam.ac.uk