I am trying to adjust my RNA-seq data using ComBat-Seq (from the sva R package), since I realised that there are 3 batch variables that I need to adjust for:
- Place (2 levels: place 1, place 2)
- Library Preparation Date (16 levels - different dates)
- Type of tube (2 levels: A, B)
I have 960 samples and around 62000 genes.
My biological covariate matrix contains Age, Sex, Group (cases, controls, ...) and WBC counts (LYMPH, MONO, NEUT):
biol_mat = model.matrix(~Age + as.factor(Sex) + as.factor(Group) + LYMPH + MONO + NEUT, data=phenotype)
The ComBat-Seq tutorial shows how to adjust for 1 batch variable, but it doesn't say how to adjust for more than 1.
I have seen a lot of posts about ComBat saying that the only way is to adjust for 1 variable and then, using those results, adjust again for the 2nd variable, and so on.
That would be:
Adjust by library prep.
raw.cts_adjustedLibPrep <- ComBat_seq(counts = raw_cts_matrix, batch=batch_libraryprep, group=NULL, covar_mod = biol_mat)
Adjust by library prep + type of tube.
raw.cts_adjusted_LibPrep_TypeTube <- ComBat_seq(counts = raw.cts_adjustedLibPrep, batch=batch_type_tube, group=NULL, covar_mod = biol_mat)
Adjust by library prep + type of tube + place.
raw.cts_adjusted_LibPrep_TypeTube_Place <- ComBat_seq(counts = raw.cts_adjusted_LibPrep_TypeTube, batch=batch_place, group=NULL, covar_mod = biol_mat)
The first adjustment (library prep) takes around 15 min, but the second has been running for more than 2 days. I stopped it and launched it again with a different adjustment, but I am not sure whether it will work.
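In case it helps with diagnosing this, here is a minimal sketch of a check that can be run before the sequential adjustment: cross-tabulate the batch variables to see whether one is nested inside another, and compare the rank of a combined batch + covariate design with its number of columns. It uses toy data only; the data frame `pheno`, the columns `libprep`, `place` and `tube`, and the helper `nested_in_libprep` are hypothetical names for illustration, not my real phenotype table. I do not know whether nesting is related to the run time; this only shows how the batch variables overlap.

```r
# Toy phenotype table (hypothetical values; column names are assumptions)
set.seed(1)
n <- 48
pheno <- data.frame(
  libprep = factor(rep(paste0("d", 1:8), each = 6)),
  Age     = round(runif(n, 20, 70))
)

# In this toy example each library prep date belongs to exactly one place,
# so place is fully nested within libprep.
pheno$place <- factor(ifelse(as.integer(pheno$libprep) <= 4, "place1", "place2"))

# Tube type alternates within each date, so it is crossed with libprep.
pheno$tube <- factor(rep(c("A", "B"), times = n / 2))

# Cross-tabulations show which batch levels occur together.
tab_place <- table(pheno$libprep, pheno$place)
tab_tube  <- table(pheno$libprep, pheno$tube)
print(tab_place)
print(tab_tube)

# A variable is nested in libprep if every date maps to a single level of it.
nested_in_libprep <- function(tab) all(rowSums(tab > 0) == 1)
place_nested <- nested_in_libprep(tab_place)
tube_nested  <- nested_in_libprep(tab_tube)

# Rank check: if the rank is below the number of columns, some batch or
# covariate columns are linear combinations of the others.
design <- model.matrix(~ libprep + place + tube + Age, data = pheno)
design_rank <- qr(design)$rank
rank_deficient <- design_rank < ncol(design)

cat("place nested in libprep:", place_nested, "\n")
cat("tube nested in libprep:", tube_nested, "\n")
cat("design rank", design_rank, "of", ncol(design), "columns\n")
```

On the real data, `pheno` would be replaced by `phenotype` and the three toy columns by the actual batch columns, with the biological covariates added to the formula.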
Does anybody have an idea about how to fix the problem?
Thanks in advance