Hi,
I have some 450k data for ~1000 samples, and I want to use ComBat to adjust for two known batch variables: Age_Group and Sentrix_ID (cf. the example below). My variable of interest is Sample_Group. I don't really know how to proceed (run ComBat twice? According to older posts, the answer is yes) or what to put in the model.matrix. I was thinking of doing:
First:
batch <- pheno$Age_Group
mod <- model.matrix(~as.factor(Sample_Group) + as.factor(Sentrix_ID) + as.factor(Sentrix_Position), pheno)
And then:
batch <- pheno$Sentrix_ID
mod <- model.matrix(~as.factor(Sample_Group) + as.factor(Age_Group) + as.factor(Sentrix_Position), pheno)
>data_table()
probe | sample1 | sample2 | sample3 | sample4 | sample5 |
---|---|---|---|---|---|
probe1 | 2.705917 | 2.741391 | 1.9946831 | 2.685013 | 3.176680 |
probe2 | 3.257425 | 2.031391 | -3.5303723 | 2.474620 | 1.859015 |
probe3 | 1.725112 | -5.941922 | 0.8883048 | 5.792727 | -5.632866 |
probe4 | 3.594785 | -6.1409508 | 3.047706 | 1.9946831 | 3.479367 |
>pheno_table()
Sample_Name | Age_Group | Sample_Group | Sentrix_ID | Sentrix_Position | Disease_Group |
---|---|---|---|---|---|
sample1 | A | A | 9376537155 | R01C01 | Patient |
sample2 | B | B | 9376537256 | R02C02 | Patient |
sample3 | A | D | 9376537155 | R02C02 | Control |
sample4 | C | A | 9376537155 | R01C06 | Control |
sample5 | D | D | 9376537256 | R02C05 | Patient |
sample6 | B | C | 9376537100 | R05C02 | Patient |
Thank you in advance for your assistance.
Hi ben.run974,
Did you find a solution yet, or find out whether this is the correct way?
Thanks!
Sebastian
I also want to use ComBat to adjust my data for multiple confounders, but I am not sure this approach is right. What if the potential confounders interact, for example age and medication?
A more detailed explanation of why one shouldn't run ComBat like this: https://support.bioconductor.org/p/93457/#93467
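For what it's worth, a commonly suggested alternative to running ComBat twice is a single pass: treat only the technical variable (here Sentrix_ID) as the batch, and put the variables whose effects should be preserved (Sample_Group, and Age_Group as a covariate) in the model matrix. Below is a minimal sketch under those assumptions. The toy data, the level names ("young", "old", "chip1", ...) and the choice of Sentrix_ID as the only batch are illustrative and not taken from the original post, and the ComBat call runs only if the sva package is installed.

```r
# Toy data only: column names follow the pheno table above, values are invented.
set.seed(1)
n <- 24
pheno <- data.frame(
  Sample_Group = factor(rep(c("A", "B"), times = n / 2)),
  Age_Group    = factor(rep(c("young", "old"), each = 2, length.out = n)),
  Sentrix_ID   = factor(rep(c("chip1", "chip2", "chip3"), each = n / 3))
)

# Simulated M-values (probes in rows, samples in columns) with a per-chip shift.
dat <- matrix(rnorm(50 * n), nrow = 50,
              dimnames = list(paste0("probe", 1:50), paste0("sample", 1:n)))
chip_shift <- c(chip1 = 0, chip2 = 1, chip3 = -1)[as.character(pheno$Sentrix_ID)]
dat <- sweep(dat, 2, chip_shift, "+")

# Variables to protect go in `mod`; the batch variable is passed separately.
mod <- model.matrix(~ Sample_Group + Age_Group, data = pheno)

# Sanity check: batch must not be confounded with the covariates.
full <- cbind(model.matrix(~ Sentrix_ID, data = pheno), mod[, -1, drop = FALSE])
stopifnot(qr(full)$rank == ncol(full))

# One ComBat pass, run only if the sva package is installed.
if (requireNamespace("sva", quietly = TRUE)) {
  adjusted <- sva::ComBat(dat = dat, batch = pheno$Sentrix_ID, mod = mod)
}
```

Further covariates such as Sentrix_Position can be added to the formula in the same way. If two covariates are suspected to interact, the formula can carry an interaction term (for example `~ Sample_Group + Age_Group * Medication`, where `Medication` is a hypothetical column), provided the rank check above still passes.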