Dear All,
I am about to start analyzing some MeDIP-seq data with the MEDIPS
Bioconductor package, and I have read through the whole MEDIPS manual.
I have to compare the methylation profiles of two cell lines, and I have
the Input for both of them. Why is there only one INPUT.SET for two
conditions in the example referred to in the manual (CONTROL.SET,
TREAT.SET, and INPUT.SET)?
Any suggestions?
Thanks,
Paolo
Dear Lukas Chavez,
I followed the MEDUSA protocol to filter out reads that were not
properly paired, had low mapping quality, or were non-unique from my
alignment files, so that I can use MEDIPS for further analysis of DMRs.
For example, one mC sample started with 100 million reads. 80% mapped,
and 70% of those mapped properly with high quality (mapQ>40).
The problem arises when I filter out non-unique reads. Roughly 90% are
discarded, leading to a final number of 2-4 million reads.
All my mC samples behave in the same way.
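As a quick sanity check of the figures above, the filtering steps can be chained numerically. This is a minimal sketch in plain Python, not part of MEDIPS or MEDUSA; the function name `remaining_reads` is hypothetical, and it assumes the quoted percentages apply one after the other.

```python
def remaining_reads(total, mapped_pct, high_quality_pct, discarded_pct):
    """Apply the three filtering steps in sequence (integer percentages)."""
    mapped = total * mapped_pct // 100
    high_quality = mapped * high_quality_pct // 100
    kept = high_quality * (100 - discarded_pct) // 100
    return mapped, high_quality, kept

# Figures quoted in the message: 100 million reads, 80% mapped,
# 70% of those high quality, roughly 90% discarded as non-unique.
mapped, high_quality, kept = remaining_reads(100_000_000, 80, 70, 90)
print(mapped, high_quality, kept)

# Discard rate implied by the reported final counts of 2-4 million reads.
for final in (2_000_000, 4_000_000):
    discarded = 100 * (1 - final / high_quality)
    print(f"{final} reads kept -> {discarded:.1f}% discarded")
```

Under these assumptions a 90% discard rate would leave about 5.6 million reads, so a final count of 2-4 million corresponds to a discard rate of roughly 93-96%.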
Maybe the starting DNA material was not properly quantified (2-3 ng
instead of 5 ng was used to generate the libraries).
We didn't observe the same problem for the Input DNA (correctly
quantified) or for 2 out of 4 of the 5-hydroxy-mC samples.
Could the high number of non-unique reads be due to a technical problem
or a biological one? Have you ever experienced a similar problem?
How do you think I should proceed with the analysis? Is it absolutely
necessary to remove non-unique reads for MEDIPS analysis?
This is the first time I have dealt with this kind of analysis, and I
would like to understand which is the best approach to follow.
I tried to run MEDIPS.saturationAnalysis with the following sample, and
the correlation looks fine:
$numberReads
[1] 1890528
$maxEstCor
[1] 1.890528e+06 9.997250e-01
$maxTruCor
[1] 9.452640e+05 9.994605e-01
Is it possible that such a low number of reads is sufficient to generate
a saturated and reproducible methylation profile?
Thank you very much for your time,
Paolo
Dear Paolo,
> why in the example referred to in the manual is there only one
> INPUT.SET for two conditions?
Currently, MEDIPS allows for only one control, one treatment, and one
combined Input data set. However, there is obviously a desperate need to
consider replicates per group as well as individual Input data sets.
Therefore (and because of many other issues), I have extensively revised
the MEDIPS package so that it will allow for processing replicates per
condition as well as two groups of Input data. I intend to update the
MEDIPS package as soon as possible, in particular before the next
Bioconductor release. Nevertheless, it is not clear to me how you
designed your experiments and your analysis strategy. The MEDIPS update
will be helpful, e.g., if you are comparing two groups of IP-seq samples
and want to consider two corresponding groups of Input samples in order
to identify genomic variants that influence the IP enrichments.
> I followed MEDUSA protocol (...)
I greatly appreciate that MEDIPS has been incorporated into other
analysis pipelines. However, please understand that I can only comment
on issues and functionalities of the MEDIPS package itself.
> (...) when I filter out for non-unique reads. Roughly 90 % are
> discarded (...)
This issue may point to amplification and oversequencing problems, and
there are different opinions about unique reads. However, the fraction
of non-unique reads in your sequencing data is an issue that goes beyond
what I can discuss here. Currently, MEDIPS allows for either considering
all reads or replacing all non-unique reads (or, maybe better, reads
that map to the same genomic position) by one representative. However,
you can pre-filter your input files using any global or local threshold
for non-unique reads and then continue with MEDIPS, considering all
given mapping results.
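The idea of keeping one representative per genomic position, or applying a looser threshold before handing the data to MEDIPS, can be sketched as follows. This is an illustration in plain Python and not MEDIPS code; the function `collapse_duplicates`, the parameter `max_per_position`, and the representation of a read as a `(chromosome, start, strand)` tuple are all assumptions made for the example.

```python
from collections import Counter

def collapse_duplicates(reads, max_per_position=1):
    """Keep at most `max_per_position` reads per (chromosome, start, strand).

    With max_per_position=1, every stack of reads that map to the same
    position is replaced by a single representative; a larger value
    acts as a simple global threshold.
    """
    seen = Counter()
    kept = []
    for read in reads:
        key = (read[0], read[1], read[2])
        if seen[key] < max_per_position:
            kept.append(read)
            seen[key] += 1
    return kept

# Hypothetical toy input: three reads stack on chr1:100 (+ strand).
reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 100, "-"),
    ("chr2", 5, "+"),
    ("chr1", 100, "+"),
]
unique = collapse_duplicates(reads)
print(len(unique))  # 3 reads remain
```

A local threshold (for example, one that depends on the coverage around each position) would need a different rule in place of the fixed `max_per_position`.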
> Is it possible that such a low number of reads is sufficient to
> generate a saturated and reproducible methylation profile?
This depends on the methylation status of your reference genome. If you
are studying the methylation status of a small and only barely
methylated genome, your results might be reasonable.
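For orientation, the reported $maxTruCor read count (945264) is exactly half of $numberReads (1890528), which is what a split-half comparison would produce. The sketch below illustrates that general idea in plain Python: split the reads into two random halves, count reads per genomic window, and correlate the two coverage profiles. It is not the MEDIPS implementation; all function names, the single-chromosome layout, and the toy data are assumptions.

```python
import random
from math import sqrt

def window_counts(positions, genome_length, window_size):
    """Count read start positions per fixed-size window."""
    n_windows = (genome_length + window_size - 1) // window_size
    counts = [0] * n_windows
    for pos in positions:
        counts[pos // window_size] += 1
    return counts

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_correlation(positions, genome_length, window_size, seed=0):
    """Correlate window coverage between two random halves of the reads."""
    shuffled = list(positions)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    cov_a = window_counts(shuffled[:half], genome_length, window_size)
    cov_b = window_counts(shuffled[half:2 * half], genome_length, window_size)
    return half, pearson(cov_a, cov_b)

# Hypothetical toy data: reads concentrated in two enriched regions.
rng = random.Random(1)
positions = [rng.randrange(0, 1000) for _ in range(5000)]
positions += [rng.randrange(5000, 6000) for _ in range(5000)]
half, r = split_half_correlation(positions, genome_length=10000, window_size=100)
print(half, round(r, 3))
```

In a toy case like this, where reads fall into a few strongly enriched windows, the two halves correlate highly even with few reads, which matches the point above that a small number of reads may suffice when only a small part of the genome is enriched.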
All the best,
Lukas
On Fri, Feb 15, 2013 at 5:33 AM, Paolo Kunderfranco <
paolo.kunderfranco@gmail.com> wrote:
> Dear Lukas Chavez,
>
> I followed the MEDUSA protocol to filter out reads that were not
> properly paired, had low mapping quality, or were non-unique from my
> alignment files, so that I can use MEDIPS for further analysis of DMRs.
>
> For example, one mC sample started with 100 million reads. 80% mapped,
> and 70% of those mapped properly with high quality (mapQ>40).
> The problem arises when I filter out non-unique reads. Roughly 90% are
> discarded, leading to a final number of 2-4 million reads.
> All my mC samples behave in the same way.
>
> Maybe the starting DNA material was not properly quantified (2-3 ng
> instead of 5 ng was used to generate the libraries).
> We didn't observe the same problem for the Input DNA (correctly
> quantified) or for 2 out of 4 of the 5-hydroxy-mC samples.
>
> Could the high number of non-unique reads be due to a technical problem
> or a biological one? Have you ever experienced a similar problem?
> How do you think I should proceed with the analysis? Is it absolutely
> necessary to remove non-unique reads for MEDIPS analysis?
>
> This is the first time I have dealt with this kind of analysis, and I
> would like to understand which is the best approach to follow.
>
> I tried to run MEDIPS.saturationAnalysis with the following sample, and
> the correlation looks fine:
>
> $numberReads
> [1] 1890528
>
> $maxEstCor
> [1] 1.890528e+06 9.997250e-01
>
> $maxTruCor
> [1] 9.452640e+05 9.994605e-01
>
> Is it possible that such a low number of reads is sufficient to
> generate a saturated and reproducible methylation profile?
>
> Thank you very much for your time,
> Paolo