Hello,
I have a question on using RUVSeq for removing batch effects for 8 samples of RNA-Seq data - 2 batches with 4 samples each.
The example data in the tutorial shows 2 conditions with 3 replicates each:
https://www.bioconductor.org/packages/release/bioc/vignettes/RUVSeq/inst/doc/RUVSeq.pdf
Following the steps in section 3 to estimate factors of unwanted variation using replicate samples, I am trying to define the matrix as:
differences <- matrix(data=c(1:2,3:4,5:6,7:8), byrow=TRUE, nrow=2)
The PCA plot for set3 doesn't group the replicates.
However, if I define the matrix separately for each pair of conditions as:
differences1 <- matrix(data=c(1:2,3:4), byrow=TRUE, nrow=2)
set3_1 <- RUVs(set, genes, k=1, differences1)
plotPCA(set3_1, col=colors[x], cex=1.2)
and
differences2 <- matrix(data=c(5:6,7:8), byrow=TRUE, nrow=2)
set3_2 <- RUVs(set, genes, k=1, differences2)
plotPCA(set3_2, col=colors[x], cex=1.2)
Each of these PCA plots groups the replicates for each condition.
Could you please let me know the correct approach to defining the matrix and doing the DE analysis for pairs of conditions?
Thanks
Sharvari
Yes, you should set the contrast coefficient for W_1 to zero, because that's not the comparison of interest. The W_1 variable is used only to adjust for unwanted variation.
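As a rough sketch of what a zero coefficient for W_1 looks like in practice: the layout below assumes four conditions with two replicates each, and the labels A to D, the random W_1 values, and the name contrast_BvsA are all placeholders, not taken from your data. In a real analysis W_1 would come from pData(set3) after running RUVs.

```r
# Hypothetical layout: 4 conditions (A-D), 2 replicates each.
x <- factor(rep(c("A", "B", "C", "D"), each = 2))

# Placeholder for the factor estimated by RUVs (pData(set3)$W_1 in practice).
set.seed(1)
W_1 <- rnorm(8)

# No-intercept design: one column per condition, plus the nuisance factor.
design <- model.matrix(~ 0 + x + W_1)
print(colnames(design))

# Contrast for B vs A: only the two conditions of interest are non-zero;
# W_1 stays in the model as an adjustment but gets a zero coefficient.
contrast_BvsA <- c(xA = -1, xB = 1, xC = 0, xD = 0, W_1 = 0)

# The contrast must line up with the design columns.
stopifnot(identical(names(contrast_BvsA), colnames(design)))

# With an edgeR GLM fit on this design, the test would then be something like:
# lrt <- glmLRT(fit, contrast = contrast_BvsA)
```

Fitting all samples in one model and testing each pair through a contrast like this keeps W_1 estimated from every replicate set, rather than re-estimating it separately per pair.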
There is a very nice section on contrasts in the limma vignette that I suggest you read.
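One more thing that may be worth checking is the shape of the replicate matrix in the question. RUVs treats each row of that matrix as one set of replicate samples. Assuming the intended design is four replicate pairs (samples 1-2, 3-4, 5-6, 7-8), which is only my reading of the pairwise matrices in the question, nrow=2 would instead declare two groups of four. The variable names below are made up for illustration:

```r
# Same data vector, two different shapes.
idx <- c(1:2, 3:4, 5:6, 7:8)

# nrow = 2, as in the question: each row lists FOUR samples,
# i.e. samples 1-4 are declared replicates of each other, and likewise 5-8.
two_rows <- matrix(data = idx, byrow = TRUE, nrow = 2)
print(two_rows)

# nrow = 4: each row lists one pair of replicates.
four_rows <- matrix(data = idx, byrow = TRUE, nrow = 4)
print(four_rows)

# Sanity checks on the shapes.
stopifnot(identical(dim(two_rows), c(2L, 4L)))
stopifnot(identical(dim(four_rows), c(4L, 2L)))
```

If the four-pair reading is right, passing the 4 x 2 matrix to RUVs on the full set may be closer to what was intended than running RUVs separately on each pair of conditions.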