Hi,
I am using minfi to analyze an Illumina 450K dataset, but due to an
experimental design that comprises more than 2 grps I am a lost on how
to properly go for 'bumphunting'... In addition, I do have 3 queries
regarding minfi's workflow. See below. Any suggestions would be
appreciated!
Thanks,
Guido
Design of my study: parallel design, before + after measurement for
each subject (baseline vs treatment, = paired), young and aged
subjects.
We included blood cells from 10 subjects in our study. Subjects were
divided into 2 grps, "young" (5x) and "old" (5x). At baseline (before
treatment) samles were taken (10x), and after 48hr of treatment (10x).
In total thus 20 samples were analysed on the 450k array. Each
treatment sample can be compared with it's respective baseline (paired
analysis).
Of interest are:
1) difference between young vs old at baseline (unpaired analysis, 10
vs 10),
2) effect of treatment in young (paired analysis, 5 vs 5),
3) effect of treatment in old (paired analysis, 5 vs 5),
4) difference in response due to age, thus CpGs that respond
differently in 3 vs 2 ["delta of the delta's"].
To identify the individual regulated CpGs for each comparison, I used
limma (on the M-values) using the multilevel model (thus taking into
account the paired structure using the duplicateCorrelation()
function).
However, considering the design of my experiment, how to optimally
analyse for enriched regions using the bumphunter() function?
Since I have multiple groups + paired structure, the design matrix
doesn't contain a column (coefficient) that corresponds to each of the
contrasts of interest (1-4). This is also returned as warning by the
bumphunter() function, and referred is to the vignette [which one I
don't know, but I guess it's from the library bumphunter]?
** Thus how to setup the design matrix that all 4 contrasts are
analysed with bumphunter?
Code:
> norm.swan
MethylSet (storageMode: lockedEnvironment)
assayData: 485512 features, 20 samples
element names: Meth, Unmeth
phenoData
sampleNames: 7766130001_R01C01 7766130001_R02C01 ...
7766130037_R05C02 (20 total)
varLabels: Sample_ID Sample_Plate ... filenames (15 total)
varMetadata: labelDescription
Annotation
array: IlluminaHumanMethylation450k
annotation: ilmn12.hg19
Preprocessing
Method: SWAN (based on a MethylSet preprocesses as 'Raw (no
normalization or bg correction)'
minfi version: 1.10.1
Manifest version: 0.4.0
>
> # convert MethylSet (norm.swan) into GenomicRatioSet through
RatioSet
> norm.swan2 <- ratioConvert(norm.swan, what = "both", keepCN = TRUE)
> norm.swan2 <- mapToGenome(norm.swan2)
Loading required package: IlluminaHumanMethylation450kanno.ilmn12.hg19
>
> # experimental factors
> treatment <- pd$TrtmntGrps
> subject <- as.factor(pd$SubjectID)
>
> design <- model.matrix(~0+ treatment + subject)
>
> design
treatmentoldctrl treatmentoldWY treatmentyoungctrl treatmentyoungWY
subject31 subject34 subject35
1 1 0 0 0
0 0 0
2 0 0 0 1
0 0 0
3 0 1 0 0
0 0 0
4 1 0 0 0
0 0 0
5 1 0 0 0
0 0 0
6 0 0 0 1
0 0 1
7 0 1 0 0
0 0 0
8 0 0 1 0
0 0 0
9 0 0 0 1
0 0 0
10 1 0 0 0
0 0 0
11 0 1 0 0
0 0 0
12 1 0 0 0
0 0 0
13 0 0 1 0
0 0 0
14 0 0 0 1
0 1 0
15 0 0 1 0
0 1 0
16 0 1 0 0
0 0 0
17 0 1 0 0
0 0 0
18 0 0 1 0
0 0 1
19 0 0 1 0
1 0 0
20 0 0 0 1
1 0 0
subject43 subject52 subject62 subject64 subject65 subject66
1 0 1 0 0 0 0
2 1 0 0 0 0 0
3
<<snip>>
> #sample (nonsense) bumphunter run to check everything worked
> bumps <- bumphunter(norm.swan2, design = design, coef=1, B = 1, type
= "Beta", cutoff = 0.10)
[bumphunterEngine] Using a single core (backend: doSEQ, version:
1.4.2).
[bumphunterEngine] Computing coefficients.
[bumphunterEngine] Performing 1 permutations.
[bumphunterEngine] Computing marginal permutation p-values.
[bumphunterEngine] cutoff: 0.1
[bumphunterEngine] Finding regions.
[bumphunterEngine] Found 205534 bumps.
[bumphunterEngine] Computing regions for each permutation.
Loading required package: rngtools
Loading required package: pkgmaker
Loading required package: registry
Attaching package: 'pkgmaker'
The following object is masked from 'package:IRanges':
new2
[bumphunterEngine] Estimating p-values and FWER.
Warning message:
In bumphunterEngine(getMethSignal(object, type), design = design, :
The use of the permutation test (B>0), is not recommended with
multiple covariates, (ncol(design)>2). See vignette for more
information.
>
I also have three small queries on the workflow:
** I normalized my data using SWAN. To use these data as input for
bumphunter(), I had to do a 2-step conversion [ratioConvert() followed
by mapToGenome()]. Is this the way to do this, or is there another
(direct) way?
> norm.swan2 <- ratioConvert(norm.swan, what = "both", keepCN = TRUE)
> norm.swan2 <- mapToGenome(norm.swan2)
** To confirm, in the bumphunter() function, when 'type=beta' a
'cutoff=0.10' means that the absolute difference between the beta
values of the 2 grps should at least be 0.10? Thus at least a
difference of 10% in methylation status of one of the CpGs in the
region (bump)?
** What is the current way of annotating the 450k data? Making use of
"FDb.InfiniumMethylation.hg19" from Tim,
"IlluminaHumanMethylation450kanno.ilmn12.hg19" from Kasper, or an
allternative method?
> sessionInfo()
R Under development (unstable) (2013-11-19 r64265)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United
States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United
States.1252
attached base packages:
[1] parallel stats graphics grDevices utils datasets
methods base
other attached packages:
[1] doRNG_1.6 rngtools_1.2.4
[3] pkgmaker_0.22 registry_0.2
[5] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.2.1
IlluminaHumanMethylation450kmanifest_0.4.0
[7] minfi_1.10.1
bumphunter_1.4.1
[9] locfit_1.5-9.1
iterators_1.0.7
[11] foreach_1.4.2
Biostrings_2.32.0
[13] XVector_0.4.0
GenomicRanges_1.16.3
[15] GenomeInfoDb_1.0.2 IRanges_1.22.6
[17] lattice_0.20-29 Biobase_2.24.0
[19] BiocGenerics_0.10.0
loaded via a namespace (and not attached):
[1] annotate_1.42.0 AnnotationDbi_1.26.0 base64_1.1
beanplot_1.1 codetools_0.2-8 compiler_3.1.0
[7] DBI_0.2-7 digest_0.6.4 genefilter_1.46.1
grid_3.1.0 illuminaio_0.6.0 limma_3.20.2
[13] MASS_7.3-33 matrixStats_0.8.14 mclust_4.3
multtest_2.20.0 nlme_3.1-117 nor1mix_1.1-4
[19] plyr_1.8.1 preprocessCore_1.26.1 R.methodsS3_1.6.1
RColorBrewer_1.0-5 Rcpp_0.11.1 reshape_0.8.5
[25] RSQLite_0.11.4 siggenes_1.38.0 splines_3.1.0
stats4_3.1.0 stringr_0.6.2 survival_2.37-7
[31] tools_3.1.0 XML_3.98-1.1 xtable_1.7-3
zlibbioc_1.10.0
>
---------------------------------------------------------
Guido Hooiveld, PhD
Nutrition, Metabolism & Genomics Group
Division of Human Nutrition
Wageningen University
Biotechnion, Bomenweg 2
NL-6703 HD Wageningen
the Netherlands
tel: (+)31 317 485788
fax: (+)31 317 483342
email: guido.hooiveld@wur.nl
internet:
http://nutrigene.4t.com
http://scholar.google.com/citations?user=qFHaMnoAAAAJ
http://www.researcherid.com/rid/F-4912-2010
[[alternative HTML version deleted]]