Hello,
I have 2 samples (3 replicates of each) of MeDIP, and I have questions regarding their analysis with the MEDIPS package:
1. I have one input DNA for both conditions. I read section 6.6 in the manual, and as far as I understand it, since I have only one input, there is no point in supplying the input as both ISet1 and ISet2. Is that correct? Will the calculations be different if I add the input as ISet1 and ISet2?
2. I followed the commands in the manual for comparing the 2 sets. I also called peaks with MACS for each of these 2 samples. Do you also recommend calling peaks in each sample separately within MEDIPS?
If yes, this should be done with the MEDIPS.meth command, right?
3. Do you recommend integrating peaks with differentially methylated regions? If a differentially methylated region was also called as a peak, is that an indication of a stronger finding?
Dear GFM,
abt 1) Since you have only one Input data set, please simply assign it to ISet1.
abt 2) MEDIPS does not call peaks. Instead, MEDIPS calculates differential enrichment between conditions. In fact, the Input data set will not even be considered in that calculation.
abt 3) It's a valid approach to restrict differentially enriched regions (DERs) to regions identified as peaks in the respective condition. In case there is no peak at a DER, you might have a) used a loose significance threshold in MEDIPS, b) used a strict significance threshold in MACS, or c) identified a region with valid differences in enrichment, but below the detectable threshold for the peak caller.
All the best,
Lukas
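Restricting DERs to peak regions, as described under 3), amounts to an interval overlap per chromosome. Below is a minimal base-R sketch of that idea. It assumes the DER table carries `chr`, `start` and `stop` columns and the peak table carries `chr`, `start` and `end`; the coordinates are invented and the helper name `overlaps_peak` is hypothetical. For real data, a dedicated package such as GenomicRanges is probably the better tool.

```r
# Toy DER windows (as might be exported from a MEDIPS result table)
# and toy MACS peaks; all coordinates are invented for illustration.
ders <- data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  start = c(100, 1000, 500),
  stop  = c(199, 1099, 599),
  stringsAsFactors = FALSE
)
peaks <- data.frame(
  chr   = c("chr1", "chr2"),
  start = c(150, 2000),
  end   = c(400, 2500),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for each DER that overlaps at least one peak
# on the same chromosome (intervals treated as closed).
overlaps_peak <- function(ders, peaks) {
  vapply(seq_len(nrow(ders)), function(i) {
    any(peaks$chr == ders$chr[i] &
        peaks$start <= ders$stop[i] &
        peaks$end >= ders$start[i])
  }, logical(1))
}

ders$in_peak <- overlaps_peak(ders, peaks)
ders_in_peaks <- ders[ders$in_peak, ]  # DERs supported by a peak
print(ders_in_peaks)
```

DERs with `in_peak == FALSE` are not necessarily wrong; they fall under cases a) to c) above and may deserve a separate look.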
Dear Lukas,
Thank you very much for your detailed reply and for the MEDIPS package.
I have 3 additional questions:
1. How is the input used for the calculations?
I have performed the analysis with and without the input, and the values in all columns are exactly the same.
2. I have not managed to run the enrichment analysis.
Here is the command I used:
er_bam.Control1 = MEDIPS.CpGenrich(file = bam.Control1, BSgenome = BSgenome, extend = extend, shift = shift, uniq = uniq)
It ran for hours without giving any result. Any suggestions?
I am using MEDIPS_1.16.0.
I tried to add the sessionInfo, but I get an error that the post is too long.
3. Is it possible to use parallel processing with MEDIPS?
Dear GFM,
abt 1) Input is not considered when calculating differential ChIP enrichment between conditions. Please see also the last two paragraphs of section 6.7 of the MEDIPS vignette.
abt 2) Have you tried running the example given at ?MEDIPS.CpGenrich? It terminates in ~8 minutes on my system:
library(MEDIPS)
library(MEDIPSData)
bam.file.hESCs.Rep1.MeDIP = system.file("extdata", "hESCs.MeDIP.Rep1.chr22.bam", package="MEDIPSData")
er = MEDIPS.CpGenrich(file=bam.file.hESCs.Rep1.MeDIP, BSgenome="BSgenome.Hsapiens.UCSC.hg19", chr.select="chr22", extend=0, shift=0, uniq=1e-3)
In case this successfully terminates for you, it will be necessary to figure out why it does not terminate in your case. My best guess is that the combination of the size of your bam file, the size of your reference genome and your computer's RAM prevents successful termination in a reasonable time.
abt 3) No, there is no parallelisation implemented.
Thank you and all the best,
Lukas
Dear Lukas,
Thank you very much for your answers.
I will try to run the enrichment on the example file. My bam files are 2.5-9 GB.
I have another question regarding the output of the differential analysis.
I got ~2000 windows which are significantly down-regulated and ~40 which are up-regulated (the samples are not very different biologically). The number of mapped reads differs between the samples (some have ~12M mapped reads and some ~30M reads). I am worried that this is an artifact (maybe due to the different coverage, low coverage, or some other reason).
What do you think about such a difference between the number of up- and down-regulated windows?
Thank you very much for all the support.
Dear GFM,
MEDIPS applies edgeR’s TMM for library size normalisation.
It could be that your adjusted p-value threshold is too relaxed and in fact there are no relevant differences at all. It could also be that your results indicate important biological events. It could also be that you have a batch effect in your data which needs to be corrected for (e.g. by quantile normalisation). It’s almost impossible for me to judge whether the analysis of your data results in any biologically meaningful outcome.
All the best,
Lukas
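One way to probe the first possibility (a threshold that is too relaxed) is to tabulate up and down windows at several adjusted p-value cut-offs and see whether the imbalance persists. Below is a minimal base-R sketch on invented data. The column names `edgeR.logFC` and `edgeR.adj.p.value` are an assumption about the MEDIPS result table and should be checked against `colnames()` of your own result; `tally_direction` is a hypothetical helper, and which condition counts as "up" depends on how the two sets were passed to the comparison.

```r
# Invented stand-in for a differential-coverage result table.
# Column names are assumed; verify them against your own output.
res <- data.frame(
  edgeR.logFC       = c(-2.1, -1.5, -0.8, 1.9, -1.2, 0.4),
  edgeR.adj.p.value = c(0.001, 0.03, 0.08, 0.004, 0.2, NA)
)

# Hypothetical helper: count windows with negative / positive log fold
# change that pass each adjusted p-value cut-off (NA values never pass).
tally_direction <- function(res, cutoffs = c(0.1, 0.05, 0.01)) {
  rows <- lapply(cutoffs, function(a) {
    sig <- !is.na(res$edgeR.adj.p.value) & res$edgeR.adj.p.value <= a
    data.frame(cutoff = a,
               down   = sum(sig & res$edgeR.logFC < 0),
               up     = sum(sig & res$edgeR.logFC > 0))
  })
  do.call(rbind, rows)
}

counts <- tally_direction(res)
print(counts)
```

If the down/up ratio collapses as the cut-off tightens, the asymmetry is more likely driven by borderline windows than by a robust signal. This check alone cannot distinguish a batch effect from real biology.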