Question

MeDIPS: warning when using "ttest" settings in MEDIPS.meth

stb • 0

@stb-11175

Hi,

I'm currently analysing my MeDIP-seq data with MEDIPS in R. I have 18 IP samples (no Input) in 3 groups, sequenced as PE50. For mapping, I used bwa for Illumina with default settings in Galaxy.

Arguments for creating MEDIPS SET:

BSgenome <- 'BSgenome.Mmusculus.UCSC.mm9'

uniq <- 1e-3

ws <- 500

chr.select <- c("chr1","chr2","chr3","chr4","chr5","chr6","chr7","chr8","chr9","chr10", "chr11", "chr12", "chr13", "chr14", "chr15", "chr16", "chr17", "chr18", "chr19")

paired = TRUE

However, when using MEDIPS.meth to calculate differential coverage between two groups (Control_MeDIP and LPS_MeDIP), I get the following warning:

> mr.edgeR_Control_LPS <- MEDIPS.meth(MSet1 = Control_MeDIP, MSet2 = LPS_MeDIP, chr = chr.select, p.adj = "BH", diff.method = "ttest", MeDIP = F, CNV = F, minRowSum = 10, diffnorm = "rpkm")

Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Preprocessing MEDIPS SET 1 in MSet1...
Preprocessing MEDIPS SET 2 in MSet1...
Preprocessing MEDIPS SET 3 in MSet1...
Preprocessing MEDIPS SET 4 in MSet1...
Preprocessing MEDIPS SET 5 in MSet1...
Preprocessing MEDIPS SET 6 in MSet1...
Preprocessing MEDIPS SET 1 in MSet2...
Preprocessing MEDIPS SET 2 in MSet2...
Preprocessing MEDIPS SET 3 in MSet2...
Preprocessing MEDIPS SET 4 in MSet2...
Preprocessing MEDIPS SET 5 in MSet2...
Preprocessing MEDIPS SET 6 in MSet2...
Extracting data for chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 ...
4944695 windows on chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19
Differential coverage analysis...
Extracting count windows with at least 10 reads...
Calculating score for 573068 windows...
Adjusting p.values for multiple testing...
Please note, log2 ratios are reported as log2(MSet1/MSet2).
Creating results table...
Adding differential coverage results...

Warning messages:
1: In err0^4/(na0 - 1) :
longer object length is not a multiple of shorter object length
2: In err1^4/(na1 - 1) :
longer object length is not a multiple of shorter object length
3: In (err0^2 + err1^2)^2/(err0^4/(na0 - 1) + err1^4/(na1 - 1)) :
longer object length is not a multiple of shorter object length
4: In df[fi] <- (err0^2 + err1^2)^2/(err0^4/(na0 - 1) + err1^4/(na1 - :
number of items to replace is not a multiple of replacement length

If I increase minRowSum to 30 or change diff.method to "edgeR", the warning disappears. However, since I have six replicates in each MEDIPS set, I would like to use "ttest", and I would like to keep minRowSum below 30.

Any suggestions why I get this warning, and how I can use "ttest" without having to increase minRowSum to 30?
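
For readers unfamiliar with this class of warning: the expressions quoted in the messages have the form of the Welch-Satterthwaite degrees-of-freedom formula, and R reports "longer object length is not a multiple of shorter object length" when vectors of unequal length are combined element-wise. The sketch below reproduces the same kind of warning with made-up numbers. The variable names mirror those in the warning text, but the values and lengths are assumptions, and this is not a reproduction of the MEDIPS internals.

```r
# Hypothetical per-window standard errors (values are made up).
err0 <- c(0.5, 0.8, 1.1)   # group 0, three windows
err1 <- c(0.4, 0.9, 1.0)   # group 1, three windows

# Per-window replicate counts. na0 deliberately has length 2 to mimic
# a length mismatch between the inputs (an assumption for illustration).
na0 <- c(6, 6)
na1 <- c(6, 6, 6)

# Welch-Satterthwaite degrees of freedom, written as in the warning text.
welch_df <- function(err0, err1, na0, na1) {
  (err0^2 + err1^2)^2 / (err0^4 / (na0 - 1) + err1^4 / (na1 - 1))
}

# Collect warnings instead of printing them.
msgs <- character(0)
df_bad <- withCallingHandlers(
  welch_df(err0, err1, na0, na1),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(msgs)

# With matching lengths the same formula runs without warnings.
df_ok <- welch_df(err0, err1, rep(6, 3), na1)
print(df_ok)
```

If the mismatch in the real run has a similar origin, the degrees of freedom for some windows would be computed from recycled values, so the affected p-values should be treated with caution until the cause is known.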

Thanks!

Stine

Output from sessionInfo():

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=Danish_Denmark.1252 LC_CTYPE=Danish_Denmark.1252 LC_MONETARY=Danish_Denmark.1252 LC_NUMERIC=C
[5] LC_TIME=Danish_Denmark.1252

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] BSgenome.Mmusculus.UCSC.mm9_1.4.0 MEDIPS_1.23.2 Rsamtools_1.25.1 BiocInstaller_1.23.6
[5] BSgenome_1.41.2 rtracklayer_1.33.11 Biostrings_2.41.4 XVector_0.13.7
[9] GenomicRanges_1.25.93 GenomeInfoDb_1.9.4 IRanges_2.7.12 S4Vectors_0.11.10
[13] BiocGenerics_0.19.2

loaded via a namespace (and not attached):
[1] AnnotationDbi_1.35.4 DNAcopy_1.47.1 edgeR_3.15.2 zlibbioc_1.19.0
[5] GenomicAlignments_1.9.6 BiocParallel_1.7.6 lattice_0.20-33 tools_3.3.1
[9] SummarizedExperiment_1.3.81 grid_3.3.1 Biobase_2.33.0 DBI_0.5
[13] gtools_3.5.0 preprocessCore_1.35.0 Matrix_1.2-6 bitops_1.0-6
[17] biomaRt_2.29.2 RCurl_1.95-4.8 RSQLite_1.0.0 limma_3.29.17
[21] locfit_1.5-9.1 XML_3.98-1.4

Output from traceback():

9: q2qnbinom(y, input.mean = input.mean, output.mean = output.mean,
dispersion = dispersion)
8: equalizeLibSizes.default(y, group = group, dispersion = disp,
lib.size = lib.size)
7: equalizeLibSizes(y, group = group, dispersion = disp, lib.size = lib.size)
6: estimateCommonDisp.default(y$counts, group = group, lib.size = lib.size,
tol = tol, rowsum.filter = rowsum.filter, verbose = verbose,
...)
5: estimateCommonDisp(y$counts, group = group, lib.size = lib.size,
tol = tol, rowsum.filter = rowsum.filter, verbose = verbose,
...)
4: estimateCommonDisp.DGEList(d)
3: edgeR::estimateCommonDisp(d)
2: MEDIPS.diffMeth(base = base, values = counts.medip, diff.method = "edgeR",
nMSets1 = nMSets1, nMSets2 = nMSets2, p.adj = p.adj, n.r.M1 = n.r.M1,
n.r.M2 = n.r.M2, MeDIP = MeDIP, minRowSum = minRowSum, diffnorm = diffnorm)
1: MEDIPS.meth(MSet1 = Control_MeDIP, MSet2 = LPS_MeDIP, chr = chr.select,
p.adj = "BH", diff.method = "edgeR", MeDIP = F, CNV = F,
minRowSum = 30, diffnorm = "none")

Tags: R, medips, medip-seq


score 0 · Answer 1 · 2016-08-16

Dear Stine,

Thank you for reporting the warning in "ttest" mode. I am actually puzzled why the "ttest" mode depends on the minRowSum parameter, and I cannot immediately reproduce the warning.

Do you receive a result table in spite of the warning? If yes, can you figure out if there are specific columns missing? It would help me a lot to know at which stage of the workflow the warning comes up: does it appear after the "Adding differential coverage results..." message, and does the workflow then crash without returning any result table?

All the best,

Lukas
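
The questions about the result table can be answered with a few lines of base R. The sketch below uses a small mock data frame in place of the object returned by MEDIPS.meth; the column names are placeholders chosen for illustration and are not the real MEDIPS column names.

```r
# Mock result table standing in for the MEDIPS.meth() return value.
# Column names are placeholders, not the real MEDIPS column names.
res <- data.frame(
  chr = c("chr1", "chr1", "chr2"),
  start = c(1, 501, 1),
  stop = c(500, 1000, 500),
  example.p.value = c(0.01, NA, 0.20),
  example.adj.p.value = c(0.03, NA, 0.20),
  stringsAsFactors = FALSE
)

# 1. Was a table returned at all, and how large is it?
stopifnot(is.data.frame(res))
print(dim(res))

# 2. Which columns are present?
print(colnames(res))

# 3. How many missing values does each column contain?
na_per_column <- colSums(is.na(res))
print(na_per_column)

# 4. Which columns, if any, are missing for every window?
all_na_cols <- names(na_per_column)[na_per_column == nrow(res)]
print(all_na_cols)
```

Running the same checks on the real result object, once for the "ttest" run and once for the "edgeR" run, would show whether any test-related columns are absent or filled with NA.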

score 0 · Answer 2 · 2017-06-16

Dear Lukas,

We have made a new observation concerning the 'ttest' functionality.

I have two groups with 5 and 6 replicates. When I compare runs of MEDIPS.meth with diff.method set to 'ttest' (diffnorm 'rpkm') and 'edgeR' (diffnorm 'tmm'), there is a huge difference in the number of windows included in the score calculation. Moreover, this difference depends on minRowSum. The analyses were run with the mm9 reference genome, a window size of 500 bp, and the uniq parameter set to 1e-5.

When minRowSum is set to 1, 4,485,740 and 4,614,550 windows are included for ttest and edgeR, respectively.

When minRowSum is set to 10, only 383,265 windows are included for ttest, whereas edgeR includes 4,553,980, which makes more sense to me given the many replicates included.

Does this have anything to do with the normalization method used, or is there another good explanation?
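
One possible explanation, offered as an assumption and not confirmed anywhere in this thread: if the minRowSum filter is applied to the values after normalization, then RPKM-scaled values and raw counts are compared against the same threshold on very different scales. The sketch below uses the textbook RPKM definition, made-up counts, and a hypothetical library size of 20 million reads per sample; `rpkm_value()` is a local helper written for this example, not a MEDIPS function.

```r
# Textbook RPKM: reads per kilobase of window per million mapped reads.
rpkm_value <- function(counts, window_bp, library_size) {
  counts / (window_bp / 1e3) / (library_size / 1e6)
}

window_bp    <- 500    # window size used in this thread
library_size <- 20e6   # hypothetical: 20 million mapped reads per sample
min_row_sum  <- 10

# Made-up raw counts: 4 windows (rows) x 11 samples (columns).
counts <- matrix(
  c(rep(0, 11),     # row sum 0
    rep(1, 11),     # row sum 11
    rep(5, 11),     # row sum 55
    rep(20, 11)),   # row sum 220
  nrow = 4, byrow = TRUE
)

scaled <- rpkm_value(counts, window_bp, library_size)

# Under these assumptions every count is multiplied by 1 / (0.5 * 20) = 0.1.
keep_raw  <- rowSums(counts) >= min_row_sum
keep_rpkm <- rowSums(scaled) >= min_row_sum

print(sum(keep_raw))
print(sum(keep_rpkm))
```

With these numbers the same threshold keeps three windows on raw counts but only one on RPKM values. Whether MEDIPS applies the filter in this way would need to be checked against the package source.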

Thanks.

Best regards,

Stine

Example of output

> MEDIPS.meth(MSet1 = Control_MeDIP, MSet2 = Treatment_MeDIP, chr = chr.select, p.adj = "fdr",
+ diff.method = "ttest", MeDIP = F, CNV = F, minRowSum = 10, diffnorm = "rpkm")
Calculating genomic coordinates...
Creating Granges object for genome wide windows...
Preprocessing MEDIPS SET 1 in MSet1...
Preprocessing MEDIPS SET 2 in MSet1...
Preprocessing MEDIPS SET 3 in MSet1...
Preprocessing MEDIPS SET 4 in MSet1...
Preprocessing MEDIPS SET 5 in MSet1...
Preprocessing MEDIPS SET 6 in MSet1...
Preprocessing MEDIPS SET 1 in MSet2...
Preprocessing MEDIPS SET 2 in MSet2...
Preprocessing MEDIPS SET 3 in MSet2...
Preprocessing MEDIPS SET 4 in MSet2...
Preprocessing MEDIPS SET 5 in MSet2...
Extracting data for chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 ...
4944695 windows on chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19
Differential coverage analysis...
Extracting count windows with at least 10 reads...
Calculating score for 383265 windows...
Adjusting p.values for multiple testing...
Please note, log2 ratios are reported as log2(MSet1/MSet2).
Creating results table...
Adding differential coverage results...