DESeq: GLM for multi-factor and mulit-level designs
1
0
Entering edit mode
@dorota-herman-5627
Last seen 10.1 years ago
Hello everyone, I am currently working on GLM for RNA-seq data in DESeq for a design with two factors and three factor levels, such as: type experiments A_1 A exper1 A_2 A exper2 A_3 A exper3 B_1 B exper2 B_2 B exper3 B_3 B exper4 C_1 C exper4 C_2 C exper2 C_3 C exper1 In the recent example of the GLM application there is a fixme note: ?Can we add a paragraph on what to do if we only want to make specific comparisons, e.g. let fac be a factor with three levels A, B and C, and we want to test pairwise for differences between A and B (without C), between A and C (without B) etc.? .... and between B and C (without A) Is it possible to conduct such a analysis with the current DESeq version? Thanks a lot Dorota -- ================================================================== Dorota Herman, PhD VIB Department of Plant Systems Biology, Ghent University Technologiepark 927 9052 Gent, Belgium Tel: +32 (0)9 3313692 Email:dorota.herman at psb.vib-ugent.be Web: http://www.psb.ugent.be
DESeq DESeq • 1.6k views
ADD COMMENT
0
Entering edit mode
@ryan-c-thompson-5618
Last seen 2 days ago
Icahn School of Medicine at Mount Sinaiā€¦
If you are more familiar with edgeR, I have written a version of edgeR's glmFit that takes a DESeq CountDataSet and returns an edgeR DGEGLM object. Then you can use edgeR's GLM functionality, which directly supports the kind of comparisions you are seeking. I have tested these functions and found that they produce identical results to DESeq's GLM methods when running the same test on the same CountDataSet object, so I believe that they will also produce valid results for tests that DESeq's methods do not directly support. If people find these useful, I'd like to get them included in Bioconductor, but I'm not sure which package they belong in. Suggestions welcome. ## The following two functions allow the use of edgeR's infrastructure ## to execute the DESeq method. Instead of glmFit(dge, design), use ## glmFit.CountDataSet(cds, design), then continue as with a normal ## edgeR analysis. getOffset.CountDataSet <- function(y) { if anyis.na(sizeFactors(y)))) stop("Call estimateDispersions first") log(sizeFactors(y)) - mean(log(sizeFactors(y) / colSums(counts(y)))) } glmFit.CountDataSet <- function (y, design = NULL, dispersion = NULL, offset = NULL, weights = NULL, lib.size = NULL, start = NULL, method = "auto", ...) { stopifnot(is(y, "CountDataSet")) if (is.null(dispersion)) { if ("disp_pooled" %in% colnames(fData(y))) dispersion <- fData(y)$disp_pooled else if ("disp_blind" %in% colnames(fData(y))) { if (fitInfo(y, "blind")$sharingMode != "fit-only") warning("You have used 'method=\"blind\"' in estimateDispersion without also setting 'sharingMode=\"fit-only\"'. This will not yield useful results.") dispersion <- fData(y)$disp_blind } else stop("Call 'estimateDispersions' with 'method=\"pooled\"' (or 'blind') first.") } if (is.null(offset) && is.null(lib.size)) offset <- getOffset.CountDataSet(y) ## UGLY HACK: DESeq can produce infinite dispersion estimates, which ## cause errors in glmFit. To fix this, we replace infinite ## dispersion estimates with the maximum representable floating ## point value, which should always result in a PValue of 1. infdisp <- !is.finite(dispersion) dispersion[infdisp] <- .Machine$double.xmax fit <- glmFit.default(y = counts(y), design = design, dispersion = dispersion, offset = offset, weights = weights, lib.size = lib.size, start = start, method = method, ...) ## Now set the dispersions back to infinite in the resulting fit object. fit$dispersion[infdisp] <- +Inf dimnames(fit$coefficients) <- list(rownames(counts(y)), colnames(design)) fit$counts <- counts(y) fit$samples <- pData(y)[-1] fit$genes <- fData(y)[setdiff(names(fData(y)), c("disp_blind", "disp_pooled"))] new("DGEGLM", fit) } On 11/26/2012 08:34 AM, Dorota Herman wrote: > Hello everyone, > > I am currently working on GLM for RNA-seq data in DESeq for a design > with two factors and three factor levels, such as: > > type experiments > A_1 A exper1 > A_2 A exper2 > A_3 A exper3 > B_1 B exper2 > B_2 B exper3 > B_3 B exper4 > C_1 C exper4 > C_2 C exper2 > C_3 C exper1 > > In the recent example of the GLM application there is a fixme note: > > ?Can we add a paragraph on what to do if we only want to make specific > comparisons, e.g. let fac be a factor with three levels A, B and C, > and we want to test pairwise for differences between A and B (without > C), between A and C (without B) etc.? .... and between B and C > (without A) > > Is it possible to conduct such a analysis with the current DESeq version? > > Thanks a lot > Dorota >
ADD COMMENT

Login before adding your answer.

Traffic: 472 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6