Getting batch corrected counts after RUVSeq
1
0
Entering edit mode
Shashank • 0
@fdf47975
Last seen 14 months ago
Germany

I have an RNASeq dataset with data from multiple sources and I have decided to use RUVSeq along with DESeq2 for the batch correction and DEG analysis. Following the information on the vignette here, I have the following code

#generating dds object and result
dds <- DESeq2::DESeqDataSetFromMatrix (countData = as.data.frame(final_merged_counts), colData = col_data , design = ~ Type )
dds$Type <- relevel(dds$Type,ref = "Control")
dds <- DESeq(dds,test = "LRT",reduced = ~1)
res <- results(dds)
#batch correction 
set <- newSeqExpressionSet(counts(dds))
idx <- rowSums(counts(set) >5) >=2
set <- set[idx,]
set <- betweenLaneNormalization(set,which = "upper")
not.sig <- rownames(set)[which(res$pvalue>0.1)]
empirical <- rownames(set)[rownames(set) %in% not.sig]
set <- RUVg(set,empirical, k=7)
pData(set)

#results
dds$W1 <- set$W_1
dds$W2 <- set$W_2
dds$W3 <- set$W_3
dds$W4 <- set$W_4
dds$W5 <- set$W_5
dds$W6 <- set$W_6
dds$W7 <- set$W_7
design(dds) <- ~ W1 + W2 + W3 + W4+ W5+W6+W7+ Type 
dds <- DESeq(dds)
dds <- dds[which(mcols(dds)$betaConv),]
result_after_ruv <- results(dds, contrast = c("Type", "Tumor", "Control"))

I would now like to use limma::removeBatchEffect on the matrix assay(vsd) however I am not sure how to incorporate the factors from RUVSeq with the function. I would like to get final batch corrected counts for downstream analysis. Does it make sense to use counts(dds,normalized=TRUE) I also tried

vsd <- vst(dds,blind=FALSE)
mat <- assay(vsd)
mat2 <- limma::removeBatchEffect(mat,batch = vsd$W1+vsd$W2+vsd$W3+vsd$W4+vsd$W5+vsd$W6+vsd$W7)

However the final results are not correct since all the values in a row are same

                TCGA-AB-2955-03A-01T-0734-13 TCGA-AB-2936-03A-01T-0740-13 TCGA-AB-2913-03A-01T-0734-13 TCGA-AB-2928-03A-01T-0740-13
ENSG00000000003                     4.380496                     4.380496                     4.380496                     4.380496
ENSG00000000005                     2.511648                     2.511648                     2.511648                     2.511648
ENSG00000000419                     9.187214                     9.187214                     9.187214                     9.187214
ENSG00000000457                     8.844177                     8.844177                     8.844177                     8.844177
ENSG00000000460                     8.529430                     8.529430                     8.529430                     8.529430
                TCGA-AB-2878-03A-01T-0734-13
ENSG00000000003                     4.380496
ENSG00000000005                     2.511648
ENSG00000000419                     9.187214
ENSG00000000457                     8.844177
ENSG00000000460                     8.529430

Would really appreciate any help/guidance on this issue. Thank you

BatchEffect RUVSeq RNASeq DESeq2 • 1.6k views
ADD COMMENT
1
Entering edit mode
ATpoint ★ 4.6k
@atpoint-13662
Last seen 2 days ago
Germany

You're using removeBatchEffect incorrectly. The batch argument takes factors, but you have numeric values. You can feed the matrix with the RUVg factors (I know, name collision, RUVg "factors" are the numeric values, but batch= takes factor()) into the covariate= argument of the limma function.

ADD COMMENT
0
Entering edit mode

Thank you for the answer. I understand the issue now, do you think this would be right way to obtain the counts then:

mat2 <- limma::removeBatchEffect(mat,covariates = vsd$W1+vsd$W2+vsd$W3+vsd$W4+vsd$W5+vsd$W6+vsd$W7)

Sorry I dont fully understand what you mean by "the matrix with the RUVg factors "

ADD REPLY
1
Entering edit mode

pData(set) should contain a matrix (or data.frame, not sure) with the RUVseq factors that can go into removeBatchEffect. Use this matrix with covariates argument, just the matrix, not with + or any formula.

ADD REPLY
0
Entering edit mode

Ah ok, sorry if this out of scope of this question but can you explain why this works? What I mean is that the dataframe, pData(set) only contains information about the RUVg factors and not the initial covariate from the experiment. Also since we do not specify any value for the batch argument, does the function still work as intended?

Thank you for your help.

ADD REPLY
1
Entering edit mode

you do not pass any group information to set so pDats only contains the factors.

ADD REPLY

Login before adding your answer.

Traffic: 577 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6