Hi Daniel,
On Mon, Apr 1, 2013 at 6:41 PM, daniel.aguirre <daniel.aguirre@cbm.uam.es> wrote:
> Hi,
>
> I'm a little puzzled about your 'Differential analysis of count data -
> the DESeq2 package' protocol.
>
> I was trying it with two samples and got the DE results, then I tried the
> suggested transformations:
>
> (with 'dse' being my previous results, just as it appears in the 'manual')
>
> dseBlind <- dse
> design(dseBlind) <- formula(~ 1)
> dseBlind <- estimateDispersions(dseBlind)
>
> rld <- rlogTransformation(dseBlind)
> vsd <- varianceStabilizingTransformation(dseBlind)
>
> At this point I had assumed that the 'rld' and 'vsd' objects are like
> 'dse' but with the transformations; however, when I try to retrieve the
> results I get:
>
> Prueba.rld.res <- results(rld)
>
> Error in tail(all.vars(design(object)), 1) :
>   error in evaluating the argument 'x' in selecting a method for function
>   'tail': Error in (function (classes, fdef, mtable) :
>   unable to find an inherited method for function design for signature
>   "SummarizedExperiment"
>
> Am I missing something? Should I instead use the 'rld' or 'vsd' objects
> with my DE analysis somehow?
>
>
Both varianceStabilizingTransformation and rlogTransformation return
SummarizedExperiment objects, which is why results(rld) fails: as the error
says, there is no design method for that class. See the Value section of the
man pages for these functions. The transformed values are accessed with the
assay() accessor; see the GenomicRanges manual pages on SummarizedExperiment.
(You can run class(dse) or class(rld) to see what kind of object you have.)

Sections 7 and 8 of the vignette no longer have to do with DE analysis;
maybe we should make this clearer in the vignette. There we describe
optional transformations of the data that can be useful for other
applications, such as clustering, which may give nicer results when the
variance is relatively constant across the range of values. For example, we
show a hierarchical clustering of the samples by transformed values in
Figure 8 of the vignette.
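
To make the clustering use case concrete, here is a minimal base-R sketch of
hierarchical clustering of samples on a transformed matrix. It is only an
illustration under stated assumptions: the counts are simulated, and
log2(count + 1) is a stand-in for the transformed values that you would
obtain with assay(rld) or assay(vsd) in a real analysis. The names counts,
mat, sample_dist and hc are made up for this example.

```r
# Simulated count matrix (hypothetical data): 200 genes x 4 samples.
set.seed(42)
counts <- matrix(rnbinom(200 * 4, mu = 50, size = 5), ncol = 4,
                 dimnames = list(paste0("gene", 1:200),
                                 paste0("sample", 1:4)))

# Stand-in transformation. With DESeq2 you would use mat <- assay(rld)
# or mat <- assay(vsd) here instead of a plain log2.
mat <- log2(counts + 1)

# Samples are columns, so transpose before computing Euclidean distances.
sample_dist <- dist(t(mat))

# Hierarchical clustering of the samples (complete linkage by default).
hc <- hclust(sample_dist)

# Sample labels in dendrogram order; plot(hc) would draw the tree.
print(hc$labels[hc$order])
```

The point is only that the transformed object is treated as a plain matrix
of values for exploratory work, not passed to results().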
> many many thanks!!
>
> (also, I assume that the analysis takes into account differences in
> library depth and hence normalizes in this regard?)
>
> If I have several conditions (only one sample each though) should I
> conduct pairwise analyses or would it be better to pool them together so
> that the dispersion model is better? How would the formula be written in
> that case?
> cheers!
If you have several conditions for one factor, we address this in Section G
of the vignette on multi-level conditions. You just need to specify which
level is the base level. Then, in the DE analysis, each of the other levels
(two of them, for a three-level factor) is compared against the base level.
We are working on implementing contrasts between all three.
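
As a small illustration of what "specifying the base level" means in R, the
base-R sketch below puts "ctrl" first among the factor levels and shows the
coefficient names that a design of ~ condition would then produce. The
vector name condition is just for this example, and model.matrix() is used
only to show the naming, not as a description of what DESeq2 does
internally.

```r
# Without explicit levels, R orders factor levels alphabetically.
condition <- factor(c("ctrl", "A", "B"))
print(levels(condition))   # "A" comes first, so "A" would be the base level

# Make "ctrl" the base (reference) level.
condition <- relevel(condition, ref = "ctrl")
print(levels(condition))

# Each non-base level gets a coefficient comparing it against "ctrl".
coef_names <- colnames(model.matrix(~ condition))
print(coef_names)
```

Equivalently, the levels can be given in the desired order when the factor
is created, with factor(..., levels = c("ctrl", "A", "B")).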
If you have only one replicate per condition, you can treat the samples as
replicates in order to calculate dispersion. In the original DESeq paper,
they advise, "While one may not want to draw strong conclusions from such
an analysis, it may still be useful for exploration and hypothesis
generation." This is done automatically for the two-sample case, but I
still need to generalize this code. You can use the code below in the
meantime.
The recommended pipeline then, for three samples with something like

colData(dse)$condition <- factor(c("ctrl","A","B"), levels=c("ctrl","A","B"))

would be:

# estimate dispersion with an intercept-only design, i.e. treating the
# samples as replicates
design(dse) <- ~ 1
dse <- estimateSizeFactors(dse)
dse <- estimateDispersions(dse)
# switch to the real design before testing
design(dse) <- ~ condition
dse <- nbinomWaldTest(dse)
# print the names of the variables in the final model
resultsNames(dse)
# get the table of logFC, p-values and FDRs for a single variable
results(dse,"conditionA")
results(dse,"conditionB")
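
On the library depth question: the estimateSizeFactors step in the pipeline
is the part that accounts for differences in sequencing depth. As a rough
illustration of the idea, here is a base-R sketch of the median-of-ratios
size factor from the original DESeq paper. It is a simplified illustration
on simulated data, not DESeq2's actual code, and the function name
median_ratio_size_factors is hypothetical.

```r
# Simulated counts (hypothetical data): 100 genes x 3 samples,
# with sample "B" sequenced roughly twice as deeply as the others.
set.seed(1)
counts <- matrix(rnbinom(100 * 3, mu = 100, size = 10), ncol = 3,
                 dimnames = list(NULL, c("ctrl", "A", "B")))
counts[, "B"] <- counts[, "B"] * 2

# Median-of-ratios sketch: compare each sample to a per-gene geometric
# mean across samples, then take the median ratio per sample.
median_ratio_size_factors <- function(counts) {
  log_geo_means <- rowMeans(log(counts))   # -Inf if a gene has any zero
  keep <- is.finite(log_geo_means)         # drop genes with zero counts
  apply(counts, 2, function(col) {
    exp(median((log(col) - log_geo_means)[keep]))
  })
}

sf <- median_ratio_size_factors(counts)
print(sf)

# Dividing each column by its size factor puts samples on a common scale.
normalized <- sweep(counts, 2, sf, "/")
```

In this simulation the size factor for "B" comes out larger than the other
two, reflecting its greater depth.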
Mike