Hi there,
I'm trying to remove batch effects from some high-throughput data using the ComBat function in the sva package. The current tutorial guide (https://bioconductor.org/packages/release/bioc/vignettes/sva/inst/doc/sva.pdf) doesn't make it clear whether the variable of interest should be included in the model matrix.
In Section 7 it currently reads "Just as with sva, we then need to create a model matrix for the adjustment variables, including the variable of interest." However, the code given to set up the model matrix is:
> modcombat = model.matrix(~1, data=pheno)
It should be something like this if the variable of interest is to be included:
> modcombat = model.matrix(~as.factor(cancer), data=pheno)
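To make the difference concrete, here is a minimal sketch that builds both model matrices in base R. The `pheno` table below is made up for illustration (six samples with `batch` and `cancer` columns); it is not the dataset used in the vignette, and the names `mod_null` and `mod_cancer` are mine.

```r
# Hypothetical phenotype table (an assumption for illustration only)
pheno <- data.frame(
  batch  = c(1, 1, 1, 2, 2, 2),
  cancer = c("Normal", "Cancer", "Cancer", "Normal", "Normal", "Cancer")
)

# Intercept-only design, as in the vignette code:
# no variable of interest is handed to the adjustment
mod_null <- model.matrix(~1, data = pheno)

# Design that includes the variable of interest, as the vignette text describes
mod_cancer <- model.matrix(~as.factor(cancer), data = pheno)

# Compare the two designs: same rows (samples), different columns
print(dim(mod_null))
print(dim(mod_cancer))
print(colnames(mod_cancer))
```

Either matrix would then go in as the `mod` argument of ComBat, with the batch labels passed separately through `batch`; the question is which of the two is the intended one.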
So can anyone clarify which one it is? I know some similar questions were raised around a year and a half ago (see https://support.bioconductor.org/p/63007/) after the ComBat developers started recommending not to include any variables. So I'm wondering why the tutorial text has been updated again to say the variables should be included, while the code that sets up the model matrix has not been updated to match. Has the old issue with including the variables been fixed?
Cheers,
John
Thanks for the quick reply, Evan! That helps a lot.