Within the last couple of weeks, the new bsseq package was added to
the Bioconductor devel repository:
http://bioconductor.org/packages/2.11/bioc/html/bsseq.html
This package is for the analysis of whole-genome bisulfite
sequencing data, which is used to study DNA methylation.
The package contains the reference implementation of the non-alignment
portion of the BSmooth algorithm, first used in
Hansen, K. D. et al. Increased methylation variation in epigenetic
domains across cancer types. Nat Genet 43, 768-775 (2011).
and better described in
Hansen, K.D. et al. BSmooth: from whole genome bisulfite sequencing
reads to differentially methylated regions. Genome Biology, in press
(2012).
The remaining part of BSmooth (the alignment suite) can be obtained
from
http://rafalab.jhsph.edu/bsmooth/
In addition to the implementation of the main statistical algorithm,
the package contains tools for handling and manipulating whole-genome
bisulfite sequencing data. We have also used these tools to handle
capture (targeted) bisulfite sequencing, although the statistical
algorithm (BSmooth) does not handle such data out of the box. The
parallel package is used to support parallel computing across multiple
cores in a single node. To handle human genome-wide data, you will
need access to a machine with a reasonable amount of RAM (say
32GB).
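
To give a feel for multi-core processing with the parallel package,
here is a minimal sketch that hands one sample to each core. It is an
illustration only and not the package's internal code: the toy count
matrices (M, Cov), the sample names, and the per-sample task (a raw
methylation proportion rather than the BSmooth smoother) are all
made up for the example.

```r
library(parallel)

# Toy data (hypothetical): coverage (Cov) and methylated read counts (M)
# for 5 CpG sites in 3 samples.
set.seed(1)
Cov <- matrix(rpois(15, lambda = 10) + 1L, nrow = 5,
              dimnames = list(NULL, c("s1", "s2", "s3")))
M <- matrix(rbinom(15, size = Cov, prob = 0.7), nrow = 5,
            dimnames = dimnames(Cov))

# mclapply() relies on forking, which is not available on Windows,
# so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# One task per sample: the raw per-site methylation proportion.
meth <- mclapply(colnames(M), function(s) M[, s] / Cov[, s],
                 mc.cores = n_cores)
names(meth) <- colnames(M)
```

Because the forked workers all live on one node, total memory use
grows with the number of cores in use, which is part of why a machine
with plenty of RAM helps.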
The package contains two vignettes, including one that contains a
rough analysis of a 3 vs 3 colon cancer data set on chr21 and chr22
from the first publication above. This data set is contained in the
experiment data package bsseqData. The vignettes contain more detail
on limitations of the package.
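
A quick way to get at the vignettes and the example data, assuming
both bsseq and bsseqData are already installed, might look like the
following. The object name BS.cancer.ex is the example data set I
expect to find in bsseqData; check the package index if your version
differs. The guard simply skips everything when the packages are
missing.

```r
# Check that both packages are available before trying to load them.
have_pkgs <- requireNamespace("bsseq", quietly = TRUE) &&
  requireNamespace("bsseqData", quietly = TRUE)

if (have_pkgs) {
  library(bsseq)
  library(bsseqData)

  # List the vignettes shipped with bsseq.
  print(vignette(package = "bsseq"))

  # Load and summarize the chr21/chr22 colon cancer example data
  # (object name assumed, see the note above).
  data(BS.cancer.ex)
  print(BS.cancer.ex)
} else {
  message("bsseq and/or bsseqData not installed; skipping example")
}
```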
While the package is new on Bioconductor, we have been using earlier
versions internally for quite a while. The most recent changes (for
people who may have had access to earlier versions) include vastly
improved documentation.
Best,
Kasper D Hansen