Guest User
Hi!
I'm writing with a few questions about applying ComBat (sva package)
to a set of ~180 samples run on the Illumina Infinium
HumanMethylation450 BeadChip array (~450,000 DNA methylation data
points).
There is a large amount of variation in my data due to the plate the
samples were run on (3 different plates), the chip they were run on
(24 different chips), and their position on the chip, specifically
the row (6 different rows). The chips are laid out in a 6-row by
2-column format like this:
sample 01 sample 02
sample 03 sample 04
sample 05 sample 06
sample 07 sample 08
sample 09 sample 10
sample 11 sample 12
I read Dr. Evan Johnson's suggestions to someone else with this
"multiple-batch-effect-variable" problem in the ComBat google group
(https://groups.google.com/forum/#!topic/combat-user-forum/PcTxNlaUmAI).
He had 2 suggestions:
- Combine the two batch variables into one, if 3-4 reps are left in
each batch
- Use ComBat multiple times, adjusting for the first batch using the
other batch variables as covariates, and then adjust for the second
batch, and so on
I cannot go with the first suggestion because combining the batch
variables would create too many categories and I would not have enough
replicates per batch category.
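To make the replicate problem concrete, here is a rough count. It is a sketch that assumes each chip sits on exactly one plate, so that combining the batch variables amounts to combining chip and row; the variable names are illustrative only.

```r
# Figures taken from the description above: ~180 samples, 24 chips, 6 rows.
n_samples <- 180
n_chips   <- 24
n_rows    <- 6

# Assumption: chips are nested within plates, so a combined batch variable
# has one level per chip-by-row combination.
n_categories     <- n_chips * n_rows
avg_per_category <- n_samples / n_categories

n_categories      # number of combined batch categories
avg_per_category  # average number of samples per combined category
```

With these numbers there are 144 combined categories and an average of 1.25 samples in each, well short of the 3-4 replicates per batch suggested above.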
I am seeking advice on the following points:
- The google group post is now a few years old. Is the step-wise
correction still considered a valid approach?
- The google group post asked about adjusting for 2 batch variables,
not 3. Is there any additional concern if I apply ComBat 3 times?
- Row would be better treated as a continuous adjustment variable than
as a factor. In the version of sva that I am using (3.0.2), I believe
that only factor adjustment variables are supported. I have seen
mention in a few forums that there might be an update to ComBat to
adjust for a numeric batch variable. Is one available?
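For reference, the step-wise approach asked about here might look roughly like the sketch below. It assumes a version of sva whose ComBat accepts the `dat`, `batch` and `mod` arguments, uses simulated data in place of the real methylation matrix, and covers only plate and row. Chip is left out because, if each chip belongs to a single plate, it would be confounded with plate when used as a covariate. `stepwise_combat`, `chip_row` and the simulated design are hypothetical names for illustration, not part of sva.

```r
# Sketch only: sequential ComBat runs, one per batch variable, with the
# other batch variable passed as a covariate each time.
# `stepwise_combat` is a hypothetical helper, not an sva function.
stepwise_combat <- function(dat, plate, chip_row) {
  plate    <- factor(plate)
  chip_row <- factor(chip_row)
  # Step 1: adjust for plate, keeping row in the model as a covariate.
  step1 <- sva::ComBat(dat = dat, batch = plate,
                       mod = model.matrix(~ chip_row))
  # Step 2: adjust the plate-corrected data for row, with plate as covariate.
  sva::ComBat(dat = step1, batch = chip_row,
              mod = model.matrix(~ plate))
}

# Simulated stand-in for the real data: 200 probes x 180 samples,
# with plate and row fully crossed (10 samples per plate-by-row cell).
set.seed(1)
plate    <- rep(1:3, each = 60)
chip_row <- rep(1:6, times = 30)
dat      <- matrix(rnorm(200 * 180), nrow = 200, ncol = 180)

# Run only when sva is installed.
if (requireNamespace("sva", quietly = TRUE)) {
  adjusted <- stepwise_combat(dat, plate, chip_row)
  dim(adjusted)
}
```

Both batch variables are treated as factors here; whether row can instead enter as a numeric variable is exactly the open question in the last point.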
Thank you in advance for your help!
Magda Price, UBC
-- output of sessionInfo():
R version 2.14.0 (2011-10-31)
Platform: x86_64-pc-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] sva_3.0.2 mgcv_1.7-22 corpcor_1.6.4 wateRmelon_1.2.2
[5] IlluminaHumanMethylation450k.db_1.4.6 org.Hs.eg.db_2.6.4 RSQLite_0.11.2 DBI_0.2-5
[9] AnnotationDbi_1.16.19 matrixStats_0.6.2 ROC_1.30.0 limma_3.10.3
[13] RColorBrewer_1.0-5 gplots_2.11.0 MASS_7.3-16 KernSmooth_2.23-6
[17] caTools_1.14 gdata_2.12.0 gtools_2.7.1 compare_0.2-3
[21] lattice_0.20-10 lumi_2.6.0 nleqslv_2.0 methylumi_2.0.13
[25] Biobase_2.14.0
loaded via a namespace (and not attached):
[1] affy_1.32.1 affyio_1.22.0 annotate_1.32.3 BiocInstaller_1.2.1
[5] bitops_1.0-5 hdrcde_2.15 IRanges_1.12.6 Matrix_1.0-5
[9] nlme_3.1-108 preprocessCore_1.16.0 R.methodsS3_1.4.2 tools_2.14.0
[13] xtable_1.7-1 zlibbioc_1.0.1
--
Sent via the guest posting facility at bioconductor.org.