Hello, I'm doing some bulk RNA-Seq analysis to identify different genes in two condition (Braccio) but before that I want to identify some not known variable and delete the batch effect (RIN) of my variable. So I'm moving in this way:
dds_campione <- DESeqDataSetFromMatrix(countData=raw,
colData=condition_breakfast,
design=~ 1)
smallestGroupSize <- 3
keep <- rowSums(counts(dds_campione) >= 10) >= smallestGroupSize
dds_campione <- dds_campione[keep,]
dds_campione <- estimateSizeFactors(dds_campione)
matriceFiltrata_campione <- counts(dds_campione,normalized=TRUE)
mm_campione <- model.matrix(~ Braccio, colData(dds_campione))
mm0_campione <- model.matrix(~1, colData(dds_campione))
fit_campione <- svaseq(matriceFiltrata_campione, mod=mm_campione, mod0=mm0_campione)
dds_campione$SV1 <- fit_campione$sv[,1]
dds_campione$SV2 <- fit_campione$sv[,2]
dds_campione$SV3 <- fit_campione$sv[,3]
dds_campione$SV4 <- fit_campione$sv[,4]
dds_campione$SV5 <- fit_campione$sv[,5]
dds_campione$SV6 <- fit_campione$sv[,6]
dds_campione$SV7 <- fit_campione$sv[,7]
dds_campione$SV8 <- fit_campione$sv[,8]
design(dds_campione) <- ~ SV1 + SV2 + SV3 + SV4 + SV5 + SV6 + SV7 + SV8 + Braccio
vsd_campione <- vst(dds_campione, blind=FALSE)
assay(vsd_campione) <- limma::removeBatchEffect(assay(vsd_campione),
batch = condition_breakfast$RIN, design=mm_campione)
plotPCA(vsd_campione,intgroup=c( "Braccio"))
dds_campione <- DESeq(dds_campione)
res_campione <- results(dds_campione)
data_campione <- as.data.frame(res_campione)
data_campione <- tibble::rownames_to_column(data_campione, "geneID")
My question is:
When I look at the pca, my variance is high so removing the batch effect and using the sva seems to work but than, when I go one with the analysis in DESeq2, I do not see so much variabilites and that's because in "dds_campione <- DESeq(dds_campione)" I do not have the matix adjusted but the original one, how i can do? Because this matrix is with integers so I can not use it in the command. How I can use the adjusted matrix that I plot with the pca for the analysis?
I also try to insert the batch in the design
dds_campione <- DESeqDataSetFromMatrix(countData=raw,
colData=condition_breakfast,
design=~ RIN + Braccio)
But nothing really changes
Thank you!
You're not using the surrogate variables at all in your batch effect removal. Since it's continuous values you would need to provide it to the
covariates
argument inremoveBatchEffect
.