Hi all,
I am working with the publicly available RNA-seq data from the GTEx database at https://storage.googleapis.com/gtexanalysisv7/rnaseqdata/GTExAnalysis2016-01-15v7RNASeQCv1.1.8genereads.gct.gz
I have normalized the count data using edgeR's calcNormFactors() and cpm(x, log=TRUE) functions, and I am trying to run my differential analysis with DESeq2. After I pass the normalized counts in, DESeqDataSetFromMatrix() returns "some values in assay are negative", and I am not sure how to resolve this error.
Is it possible to use the normalized data with DESeq2? If so, I would love to see how. The full pipeline is below, and I greatly appreciate any help!
countdata <- unname(t(data.frame))
head(countdata, 10)
coldata["condition"] = condition
coldata["color"] = condition.color
coldata["cluster"] = condition.cluster
head(coldata, 10)
head(condition, 10)
y <- DGEList(counts=countdata)
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes=FALSE]
y <- calcNormFactors(y)
data.scaled <- cpm(y, log=TRUE)
fviz_pca_ind(df.pca, label="none", habillage = condition.color, geom.ind="point")
dds <- DESeqDataSetFromMatrix(countData = t(data.scaled), colData = coldata, design= ~ condition)
dds.color <- DESeqDataSetFromMatrix(countData = t(data.scaled), colData = coldata, design= ~ color)
dds.cluster <- DESeqDataSetFromMatrix(countData = t(data.scaled), colData = coldata, design= ~ cluster)
res <- results(dds, name = "results")
summary(res)
res.color <- results(dds.color, name = "results.color")
summary(res.color)
res.cluster <- results(dds.cluster, name = "results.cluster")
summary(res.cluster)
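For context on the error: cpm(y, log=TRUE) returns log2 values, which are negative for any gene whose CPM falls below 1, whereas DESeqDataSetFromMatrix() expects raw, non-negative integer counts with genes in rows and samples in columns. Below is a minimal sketch of the raw-count route, not the original pipeline. It assumes a small simulated count matrix in place of the GTEx data; the names raw and the two-level condition factor are hypothetical.

```r
library(DESeq2)

# Hypothetical toy data standing in for the GTEx matrix:
# 200 genes (rows) x 6 samples (columns) of raw integer counts.
set.seed(1)
raw <- matrix(rnbinom(200 * 6, mu = 50, size = 2), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))

# Sample table; row names must line up with the columns of the count matrix.
coldata <- data.frame(condition = factor(rep(c("A", "B"), each = 3)),
                      row.names = colnames(raw))

# Raw counts go in directly; no cpm() or log transform beforehand.
dds <- DESeqDataSetFromMatrix(countData = raw, colData = coldata,
                              design = ~ condition)

# DESeq() estimates size factors and dispersions and fits the model.
# It has to run before results() can be called.
dds <- DESeq(dds)

resultsNames(dds)    # lists the coefficient names that results(name = ...) accepts
res <- results(dds)  # defaults to the last variable in the design
summary(res)
```

The log-CPM values remain useful for exploratory steps such as the PCA plot; only the DESeq2 input needs to stay on the raw count scale.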
That is great to know, thanks for the speedy answer! The reason I wanted to mix the two packages is that I am comparing the edgeR and DESeq2 pipelines, and I wanted to keep the normalization step constant in the comparison. However, since I am comparing them on performance, I think I will stick with the recommended DESeq2 implementation. I would still hugely appreciate it if you could tell me how to use the norm factors from edgeR in DESeq2. It's not a huge deal, but I am curious. Thanks again for your advice!
-Neekon
If you want to truly compare pipeline performance, you should let each method do the normalization its own way. Each method has been designed with its own philosophy and underlying assumptions, so mixing parts of one method with another is likely to give you sub-optimal performance. (Or, if you have enough time, you could test both mixed and unmixed and see for yourself.)
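For the mixed variant, one way to carry edgeR's TMM factors into DESeq2 is to convert them to effective library sizes and assign those as size factors. This is a minimal sketch under stated assumptions, not a recommended default: the data are simulated with DESeq2's makeExampleDESeqDataSet(), and rescaling to a geometric mean of 1 is a convention chosen here to keep the factors on the scale DESeq2 normally uses.

```r
library(DESeq2)
library(edgeR)

# Simulated raw counts stand in for the real data (assumption);
# the example dataset comes with design ~ condition.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# TMM normalization factors from edgeR, computed on the raw counts.
raw <- counts(dds)
lib.size <- colSums(raw)
norm.factors <- calcNormFactors(raw, method = "TMM")

# Effective library sizes, rescaled so that their geometric mean is 1.
eff.lib <- lib.size * norm.factors
sizeFactors(dds) <- eff.lib / exp(mean(log(eff.lib)))

# DESeq() keeps size factors that are already set instead of estimating its own.
dds <- DESeq(dds)
res <- results(dds)
summary(res)
```

The counts passed to DESeq2 stay raw in this setup; only the per-sample scaling changes, so the rest of the DESeq2 model is fitted as usual.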