Hi!
I'm working on a dataset of 100 samples run on the Illumina Infinium HumanMethylation450 BeadChip array.
I've performed PCA on all samples, using all QC-passed probes, before and after normalization. Using linear regression, I then tested each PC for association with the independent experimental variables: BCD batch (2 batches), Experiment batch (3 batches), Sentrix ID (9 chips), Sentrix position, cell components and sample groups.
I find that a large amount of the variation in my data is due to BCD batch, Experiment batch and Sentrix ID, even after normalization, and I wish to adjust for these batch effects before I proceed to differential methylation analysis.
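The PC-versus-covariate check described above can be sketched in base R roughly as follows. This is only an illustration on simulated data, not the actual analysis: `beta_sim`, `pheno`, `chip`, `group` and `pc_pvalue` are made-up names, and the chip shift is injected artificially so that there is a batch effect to find.

```r
# Simulated stand-in data: 200 probes x 24 samples (not the real 450K data)
set.seed(1)
n_samples <- 24
pheno <- data.frame(
  chip  = factor(rep(paste0("chip", 1:4), each = 6)),    # hypothetical Sentrix ID
  group = factor(rep(c("case", "control"), times = 12))  # hypothetical sample group
)
beta_sim <- matrix(rnorm(200 * n_samples, mean = 0.5, sd = 0.05), nrow = 200)

# Add a chip-specific shift to half of the probes so that a batch effect exists
chip_shift <- c(-0.10, -0.03, 0.03, 0.10)[as.integer(pheno$chip)]
beta_sim[1:100, ] <- sweep(beta_sim[1:100, ], 2, chip_shift, "+")

# PCA with samples as rows
pca <- prcomp(t(beta_sim), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Regress one PC on one covariate and return the overall F-test p-value
pc_pvalue <- function(pc_scores, covariate) {
  fit <- lm(pc_scores ~ covariate)
  f <- summary(fit)$fstatistic
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
}

# Rows: first five PCs; columns: covariates in pheno
assoc <- sapply(pheno, function(v) {
  sapply(1:5, function(i) pc_pvalue(pca$x[, i], v))
})
rownames(assoc) <- paste0("PC", 1:5)

print(round(var_explained[1:5], 3))
print(signif(assoc, 3))
```

Running the same table before and after any adjustment gives a direct view of which covariates are still associated with the leading PCs.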
I have a few questions about applying ComBat.mc (ENmix package) for this batch adjustment.
1) Can I combine the three batch variables into one factor as follows:
library(minfi)
library(ENmix)
baseDir <- "E:/IDPP3_450K_PILOT/Pilot_Analysis_12-2-16/450KPilot_Edit"
targets <- read.metharray.sheet(baseDir)
batchcom <- factor(paste(targets$BCD_Batch, targets$Experiment_Batch, targets$Sentrix_ID, sep="."))
beta <- read.csv("beta.csv", row.names=1, check.names=FALSE)
beta <- as.matrix(beta)
beta_combat <- ComBat.mc(beta, batchcom, nCores=8, mod=NULL)
2) Or should I use ComBat.mc multiple times, adjusting for the first batch, then for the second, and so on? If so, will the order affect the batch adjustment?
batch1 <- factor(targets$BCD_Batch)
batch2 <- factor(targets$Experiment_Batch)
batch3 <- factor(targets$Sentrix_ID)
beta_combat1 <- ComBat.mc(beta, batch1, nCores=8, mod=NULL)
beta_combat2 <- ComBat.mc(beta_combat1, batch2, nCores=8, mod=NULL)
beta_combat3 <- ComBat.mc(beta_combat2, batch3, nCores=8, mod=NULL)
Should I be using ComBat from the sva package instead of ComBat.mc from the ENmix package?
3) Is using M-values preferred over beta values for these adjustments?
Thanks in advance for your help.
Regards,
Priyanka
You can combine the batches into one factor and adjust that way; just check the results with a PCA or PVCA analysis. You can also do it sequentially. I don't know whether the order matters; the point is to use PVCA to check that your batch effect is gone, whichever method you use. I'd prefer to do it in one step because I don't like transforming my data many times. I would use ComBat from the sva package because I have not heard of the other package. Not sure about your third question. ATB
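For reference on the third question, beta values and M-values are related by a base-2 logit transform, so moving between the two scales is cheap. A commonly cited reason for adjusting on the M-value scale is that ComBat's model assumes roughly normally distributed data, while beta values are bounded between 0 and 1. The sketch below only shows the conversion; the helper names `beta_to_m` and `m_to_beta` and the clamp `eps` are illustrative choices, not part of ENmix or sva.

```r
# Base-2 logit transform from beta values to M-values.
# eps is an assumed clamp that keeps betas of exactly 0 or 1 from giving +/-Inf.
beta_to_m <- function(beta, eps = 1e-6) {
  beta <- pmin(pmax(beta, eps), 1 - eps)
  log2(beta / (1 - beta))
}

# Inverse transform from M-values back to beta values
m_to_beta <- function(m) 2^m / (2^m + 1)

# Small made-up matrix (2 probes x 2 samples)
beta_example <- matrix(c(0.1, 0.5, 0.9, 0.25), nrow = 2)
m_example <- beta_to_m(beta_example)

# ... a batch adjustment would be applied to m_example at this point ...

beta_back <- m_to_beta(m_example)
print(m_example)
print(beta_back)
```

With this round trip, the adjustment can be run on M-values and the result converted back to beta values for reporting.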
Hi Chris,
Thanks for your inputs.
I have done the batch adjustment with the ComBat.mc function from ENmix using the individual batches, and also sequentially in every possible order. At least with my data, the final results (beta/M values after ComBat) did not differ between methods, i.e., adjustment with all batches in one factor versus sequential adjustment. The order didn't matter either. I ran a PCA and checked the percentage of variance explained by the top principal components after the different batch adjustments, and they are identical.
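A comparison like this can be made more direct by looking at the adjusted matrices themselves as well as at the variance explained, since two matrices can differ slightly and still give near-identical PCA summaries. Here is a rough sketch with simulated stand-ins; `adj_a`, `adj_b` and `pct_var` are hypothetical names, and the matrices are random numbers, not ComBat output.

```r
set.seed(2)
# Stand-ins for two adjusted matrices (50 probes x 12 samples);
# adj_b differs from adj_a only by small noise
adj_a <- matrix(rnorm(50 * 12), nrow = 50)
adj_b <- adj_a + matrix(rnorm(50 * 12, sd = 1e-3), nrow = 50)

# Percentage of variance explained by the first k principal components
pct_var <- function(x, k = 5) {
  p <- prcomp(t(x), center = TRUE, scale. = FALSE)
  round(100 * p$sdev[1:k]^2 / sum(p$sdev^2), 2)
}

# Element-wise comparison of the two matrices
max_abs_diff <- max(abs(adj_a - adj_b))
identical_within_tol <- isTRUE(all.equal(adj_a, adj_b, tolerance = 1e-8))

print(max_abs_diff)
print(identical_within_tol)
print(rbind(a = pct_var(adj_a), b = pct_var(adj_b)))
```

If `all.equal` reports the matrices as equal within tolerance, the methods really did give the same values; if only the PCA percentages agree, they merely look the same at that summary level.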
P.S.: ComBat.mc lets you take advantage of multiple processors; the core function remains the same as ComBat from sva.
Also, with my data, adjustment for Sentrix ID alone removed the batch effects coming from Sentrix ID, Experiment batch and BCD batch! So I've adjusted my data just for Sentrix ID.
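One possible explanation, offered as an assumption and not something confirmed here: if every chip was processed within a single BCD batch and a single Experiment batch, then those two variables are nested within Sentrix ID. In that case the combined factor has exactly the same levels as Sentrix ID, and adjusting for Sentrix ID alone amounts to the same correction. A quick way to check this on a sample sheet might look like the sketch below; `targets_sim` is a made-up sample sheet and `constant_within` is a hypothetical helper.

```r
# Hypothetical sample sheet: 3 chips, each processed within a single
# BCD batch and a single experiment batch (a nested design)
targets_sim <- data.frame(
  Sentrix_ID       = rep(c("chipA", "chipB", "chipC"), each = 4),
  BCD_Batch        = rep(c("b1", "b1", "b2"), each = 4),
  Experiment_Batch = rep(c("e1", "e2", "e3"), each = 4)
)

# TRUE when x takes exactly one value within every level of `by`
constant_within <- function(x, by) {
  all(rowSums(table(by, x) > 0) == 1)
}

bcd_nested <- constant_within(targets_sim$BCD_Batch, targets_sim$Sentrix_ID)
exp_nested <- constant_within(targets_sim$Experiment_Batch, targets_sim$Sentrix_ID)

# If both are nested, the combined factor is just a relabelling of Sentrix ID
combined <- factor(paste(targets_sim$BCD_Batch, targets_sim$Experiment_Batch,
                         targets_sim$Sentrix_ID, sep = "."))
same_levels <- nlevels(combined) == nlevels(factor(targets_sim$Sentrix_ID))

print(table(targets_sim$Sentrix_ID, targets_sim$BCD_Batch))
print(c(bcd_nested = bcd_nested, exp_nested = exp_nested, same_levels = same_levels))
```

If the cross-tabulation shows a chip spanning more than one batch, the variables are not nested and this explanation would not apply.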
Cool, are you using the PVCA package to check your principal components? I only found out about it recently and it is super useful.