Paolo Kunderfranco
Dear Lukas Chavez,
Should repeat masking of the reference genome assembly be considered
before aligning my samples? Or might I lose information by doing so?
Thanks,
Paolo
2013/2/19 Lukas Chavez <lukas.chavez.mailings@googlemail.com>
>
> Dear Paolo,
>
> > why in the example referred to in the manual there is only one
> > INPUT.SET for two conditions?
>
> Currently, MEDIPS allows for only one control, one treatment, and one
> combined Input data set. However, there is obviously a desperate need
> to consider replicates per group as well as individual Input data sets.
> Therefore (and because of many other issues), I have extensively
> revised the MEDIPS package, which will allow for processing replicates
> per condition as well as two groups of Input data. I intend to update
> the MEDIPS package as soon as possible, especially in advance of the
> next Bioconductor release. Nevertheless, it is not clear to me how you
> designed your experiments and your analysis strategy. The MEDIPS update
> will be helpful, e.g., in case you are comparing two groups of IP-seq
> samples and want to consider two corresponding groups of Input samples
> in order to identify genomic variants that influence the IP
> enrichments.
>
> > I followed MEDUSA protocol (...)
>
> I greatly appreciate that MEDIPS has been incorporated into other
> analysis pipelines. However, please excuse that I can only comment on
> issues and functionalities of the MEDIPS package.
>
> > (...) when I filter out for non-unique reads. Roughly 90 % are
> > discarded (...)
>
> This issue may be related to amplification and oversequencing
> problems, and there are different opinions about unique reads. However,
> the fraction of non-unique reads in your sequencing data is an issue
> that goes beyond what I can discuss here. Currently, MEDIPS allows for
> considering all reads or for replacing all non-unique reads (or maybe
> better: reads that map to the same genomic position) by one
> representative. However, you can pre-filter your input files by any
> estimate of global or local thresholds for non-unique reads and
> continue using MEDIPS by considering all given mapping results.
>
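As a rough illustration of the pre-filtering option described above, the following Python sketch caps the number of reads kept per mapping position before the data would be handed to MEDIPS. It is not MEDIPS code. The function name `cap_reads_per_position`, the `(chromosome, start, strand)` tuple used to represent a read, and the `max_per_position` threshold are all assumptions made for this example.

```python
from collections import Counter


def cap_reads_per_position(reads, max_per_position=1):
    """Keep at most `max_per_position` reads per (chromosome, start, strand).

    `reads` is a list of tuples whose first three fields are chromosome,
    start coordinate, and strand (a hypothetical, simplified read record).
    """
    seen = Counter()
    kept = []
    for read in reads:
        key = (read[0], read[1], read[2])
        if seen[key] < max_per_position:
            seen[key] += 1
            kept.append(read)
    return kept


# Toy input: three identical reads, one on the opposite strand, two others.
reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 100, "-"),
    ("chr1", 250, "+"),
    ("chr2", 100, "+"),
]

# One representative per position (strict de-duplication).
collapsed = cap_reads_per_position(reads, max_per_position=1)
# A more permissive global threshold of two reads per position.
capped = cap_reads_per_position(reads, max_per_position=2)

print(len(reads), len(collapsed), len(capped))
```

With a threshold of 1 this mimics replacing reads at the same position by one representative. A larger threshold keeps some stacked reads, which may be preferable when strict de-duplication discards most of the data.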
>
> > Is it possible that such a low number of reads is sufficient to
> > generate a saturated and reproducible methylation profile?
>
> This depends on the methylation status of your reference genome. In
> case you are studying the methylation status of a small and only barely
> methylated genome, your results might be reasonable.
>
> All the best,
> Lukas
>
>
>
> Dear All,
>
> I am about to start analyzing some MeDIP-seq data with the MEDIPS
> Bioconductor package.
>
> I have read through the whole MEDIPS manual.
>
> I have to compare the methylation profiles of two cell lines, and I
> have the Input for both of them. Why, in the example referred to in the
> manual, is there only one INPUT.SET for two conditions?
>
> CONTROL.SET, TREAT.SET, and INPUT.SET
>
>
> Any suggestions?
>
> Thanks,
> Paolo
>
>
> On Fri, Feb 15, 2013 at 5:33 AM, Paolo Kunderfranco <
> paolo.kunderfranco@gmail.com> wrote:
>
>> Dear Lukas Chavez,
>>
>> I followed the MEDUSA protocol to filter out not properly paired,
>> low-quality mapping, and non-unique sequences from my alignment files,
>> in order to use MEDIPS for further analysis of DMRs.
>>
>> For example, one mC sample started with 100 million reads. 80 %
>> mapped, and 70 % of those mapped properly with high quality
>> (mapQ>40). The problem arises when I filter out non-unique reads:
>> roughly 90 % are discarded, leading to a final number of 2-4 million
>> reads. All my mC samples behave in the same way.
>>
>> Maybe the DNA starting material was not properly quantified (2-3 ng
>> instead of 5 ng were used for the generation of the libraries).
>> We didn't observe the same problem for the Input DNA (correctly
>> quantified) or for 2 out of 4 samples for 5-hydroxy-mC.
>>
>> Could the high number of non-unique reads be due to a technical
>> problem or a biological one? Have you ever experienced a similar
>> problem? How do you think I should proceed with the analysis? Is it
>> absolutely necessary to remove non-unique reads for MEDIPS analysis?
>>
>> This is the first time I have dealt with this kind of analysis, and I
>> would like to understand which is the best approach to follow.
>>
>> I tried to run MEDIPS.saturationAnalysis with the following samples,
>> and the correlation looks fine:
>>
>> $numberReads
>> [1] 1890528
>>
>> $maxEstCor
>> [1] 1.890528e+06 9.997250e-01
>>
>> $maxTruCor
>> [1] 9.452640e+05 9.994605e-01
>>
>> Is it possible that such a low number of reads is sufficient to
>> generate a saturated and reproducible methylation profile?
>>
>> Thank you very much for your time,
>> Paolo
>>
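To make the saturation question more concrete, here is a minimal Python sketch of the general idea behind a saturation analysis: split the reads into two disjoint random halves, bin each half along the genome, and track the Pearson correlation between the two binned profiles as more reads are included. The output quoted above reports the true correlation at 945,264 reads, exactly half of the 1,890,528 total, which is consistent with comparing two halves. This is only a sketch under stated assumptions and not the MEDIPS implementation: the function names, the window size, the number of steps, and the simulated read positions are all hypothetical.

```python
import random


def bin_counts(positions, genome_length, window):
    """Count read start positions per fixed-size genomic window."""
    counts = [0] * ((genome_length + window - 1) // window)
    for pos in positions:
        counts[pos // window] += 1
    return counts


def pearson(x, y):
    """Pearson correlation of two equal-length lists (NaN if constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def saturation_curve(positions, genome_length, window=100, steps=5, seed=0):
    """Correlate two disjoint random halves at increasing read depth."""
    rng = random.Random(seed)
    shuffled = positions[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    set_a, set_b = shuffled[:half], shuffled[half:2 * half]
    curve = []
    for step in range(1, steps + 1):
        n = half * step // steps  # reads used from each half at this step
        cor = pearson(
            bin_counts(set_a[:n], genome_length, window),
            bin_counts(set_b[:n], genome_length, window),
        )
        curve.append((n, cor))
    return curve


# Simulated data (an assumption): 80 % of reads fall in one enriched
# region, the rest are spread uniformly over a 10 kb toy genome.
rng = random.Random(42)
genome_length = 10_000
positions = [
    rng.randrange(2_000, 3_000) if rng.random() < 0.8
    else rng.randrange(genome_length)
    for _ in range(20_000)
]

curve = saturation_curve(positions, genome_length)
for n, cor in curve:
    print(n, round(cor, 4))
```

A curve that is already flat and close to 1 at low depth suggests that adding reads would change the binned profile very little. Whether that reflects true saturation or a small, sparsely enriched signal depends on the genome and the data, as noted in the reply above.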