Hello everyone,
I have H3K4me3 and H3K27me3 ChIP-seq data from Drosophila, with two replicates for each histone modification. I aligned the reads with bowtie2 and called peaks with the macs2 callpeak function. I want to run a quality check on the data using ChIPQC.
When I run ChIPQC, a BiocParallel error occurs. The H3K4me3 ChIPQC run is shown below; the H3K27me3 run fails in the same way.
> K4QC <- ChIPQC(K4_samples,chromosomes = NULL,annotation = 'dm3')
K4_F_L1_rep1 female L1 1 bed
K4_F_L2_rep1 female L2 1 bed
K4_F_L3_rep1 female L3 1 bed
K4_F_WP_rep1 female WP 1 bed
K4_F_BP_rep1 female BP 1 bed
K4_M_L1_rep1 male L1 1 bed
K4_M_L2_rep1 male L2 1 bed
K4_M_L3_rep1 male L3 1 bed
K4_M_WP_rep1 male WP 1 bed
K4_M_BP_rep1 male BP 1 bed
K4_F_L1_rep4 female L1 4 bed
K4_F_L2_rep4 female L2 4 bed
K4_F_L3_rep4 female L3 4 bed
K4_F_WP_rep4 female WP 4 bed
K4_F_BP_rep4 female BP 4 bed
K4_M_L1_rep4 male L1 4 bed
K4_M_L2_rep4 male L2 4 bed
K4_M_L3_rep4 male L3 4 bed
K4_M_WP_rep4 male WP 4 bed
K4_M_BP_rep4 male BP 4 bed
Compiling annotation...
Computing metrics for 40 samples...
list
Bam file has 8 contigs
Error: BiocParallel errors
element index: 1, 2, 3, 4, 5, 6, ...
first error: could not find function "seqlevels<-"
In addition: Warning messages:
1: In serialize(data, node$con) :
'package:stats' may not be available when loading
2: In serialize(data, node$con) :
'package:stats' may not be available when loading
3: In serialize(data, node$con) :
'package:stats' may not be available when loading
4: In serialize(data, node$con) :
'package:stats' may not be available when loading
5: In serialize(data, node$con) :
'package:stats' may not be available when loading
6: In serialize(data, node$con) :
'package:stats' may not be available when loading
7: In serialize(data, node$con) :
'package:stats' may not be available when loading
8: In serialize(data, node$con) :
'package:stats' may not be available when loading
9: In serialize(data, node$con) :
'package:stats' may not be available when loading
10: In serialize(data, node$con) :
'package:stats' may not be available when loading
I managed to fix the problem by following the suggestion of @jared.andrews07 in a Biostars post (https://www.biostars.org/p/357154/), using the following code:
library(BiocParallel)
register(DoparParam())
Although this fixed the problem (I think by changing how the parallel operations are dispatched), I still want to understand why it happens and how I can run ChIPQC in parallel.
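For anyone trying to reproduce this, here is a minimal sketch of how the registered BiocParallel backend can be inspected and switched. It assumes, without having confirmed it, that the error comes from worker processes that do not have the needed packages attached. SerialParam() is used here only as an illustration of a non-parallel backend; it is not the code from the Biostars post.

```r
library(BiocParallel)

# List the registered backends; the first entry is the default one
# that functions calling bplapply() will use.
print(names(registered()))

# Class of the current default backend (on Windows this is normally SnowParam,
# which starts separate R worker processes).
default_before <- class(bpparam())[1]
print(default_before)

# Illustration only: make a serial backend the default, so that all work
# runs in the main R session where every attached package is visible.
register(SerialParam())
default_after <- class(bpparam())[1]
print(default_after)
```

After the register() call, any later ChIPQC() call in the same session should pick up the new default backend.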
Also, the ChIPQC help page describes several options for the PeakCaller column in the sample metadata table. I wonder which value is best for broad peaks called with the macs2 '--broad' parameter. As the macs2 documentation says:
NAME_peaks.broadPeak
is in BED6+3 format which is similar to narrowPeak file, except for missing the 10th column for annotating peak summits.
As I understand it, the 'narrow' and 'macs' options suit "narrowPeak" files but not the "broadPeak" files produced by macs2, and the first six columns of both "narrowPeak" and "broadPeak" files are in standard BED format, so I selected 'bed'.
But what is the most suitable PeakCaller value for macs2 callpeak results, for broad peaks as well as narrow peaks?
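To make the column layout concrete, here is a small base R sketch that checks it on made-up rows. The coordinates, peak names and scores are hypothetical, and the names for columns 7 to 9 follow the ENCODE broadPeak convention. It only illustrates why the first six columns can be read as plain BED; it does not settle which PeakCaller value ChIPQC handles best.

```r
# Two hypothetical broadPeak rows (tab-separated, 9 columns: BED6+3).
broad_lines <- c(
  "chr2L\t1000\t2500\tK4_peak_1\t250\t.\t5.2\t12.1\t9.8",
  "chr3R\t40000\t43000\tK4_peak_2\t410\t.\t7.9\t20.4\t17.3"
)
tmp <- tempfile(fileext = ".broadPeak")
writeLines(broad_lines, tmp)

# Read the file without a header, as macs2 writes none for this format.
peaks <- read.table(tmp, sep = "\t", header = FALSE, stringsAsFactors = FALSE)
names(peaks) <- c("chrom", "start", "end", "name", "score", "strand",
                  "signalValue", "pValue", "qValue")

# broadPeak has 9 columns; narrowPeak would have a 10th (summit offset).
n_cols <- ncol(peaks)

# The first six columns are the standard BED6 fields.
bed6 <- peaks[, 1:6]
print(n_cols)
print(bed6)

unlink(tmp)
```

If a file read this way reports 10 columns instead of 9, it is a narrowPeak file rather than a broadPeak file.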
Session info:
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936
[2] LC_CTYPE=Chinese (Simplified)_China.936
[3] LC_MONETARY=Chinese (Simplified)_China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_China.936
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods
[9] base
other attached packages:
[1] BiocParallel_1.22.0 ChIPQC_1.24.1
[3] DiffBind_2.16.0 SummarizedExperiment_1.18.2
[5] DelayedArray_0.14.0 matrixStats_0.56.0
[7] Biobase_2.48.0 GenomicRanges_1.40.0
[9] GenomeInfoDb_1.24.2 IRanges_2.22.2
[11] S4Vectors_0.26.1 BiocGenerics_0.34.0
[13] ggplot2_3.3.2
Thanks in advance, lida