ChIPQC BiocParallel problems and how to select the "PeakCaller" for broad histone modification

Hello everyone,

I have H3K4me3 and H3K27me3 ChIP-seq data from Drosophila, with two replicates for each histone modification. I aligned the reads with bowtie2 and called peaks with the macs2 callpeak function. I want to run a quality check on the data using ChIPQC.

When I run ChIPQC, a BiocParallel error occurs. The H3K4me3 output is shown below; the H3K27me3 run fails in the same way.

> K4QC <- ChIPQC(K4_samples,chromosomes = NULL,annotation = 'dm3')
K4_F_L1_rep1  female L1  1 bed
K4_F_L2_rep1  female L2  1 bed
K4_F_L3_rep1  female L3  1 bed
K4_F_WP_rep1  female WP  1 bed
K4_F_BP_rep1  female BP  1 bed
K4_M_L1_rep1  male L1  1 bed
K4_M_L2_rep1  male L2  1 bed
K4_M_L3_rep1  male L3  1 bed
K4_M_WP_rep1  male WP  1 bed
K4_M_BP_rep1  male BP  1 bed
K4_F_L1_rep4  female L1  4 bed
K4_F_L2_rep4  female L2  4 bed
K4_F_L3_rep4  female L3  4 bed
K4_F_WP_rep4  female WP  4 bed
K4_F_BP_rep4  female BP  4 bed
K4_M_L1_rep4  male L1  4 bed
K4_M_L2_rep4  male L2  4 bed
K4_M_L3_rep4  male L3  4 bed
K4_M_WP_rep4  male WP  4 bed
K4_M_BP_rep4  male BP  4 bed
Compiling annotation...
Computing metrics for 40 samples...
list
Bam file has 8 contigs
Error: BiocParallel errors
  element index: 1, 2, 3, 4, 5, 6, ...
  first error: could not find function "seqlevels<-"
In addition: Warning messages:
1: In serialize(data, node$con) :
  'package:stats' may not be available when loading
2: In serialize(data, node$con) :
  'package:stats' may not be available when loading
3: In serialize(data, node$con) :
  'package:stats' may not be available when loading
4: In serialize(data, node$con) :
  'package:stats' may not be available when loading
5: In serialize(data, node$con) :
  'package:stats' may not be available when loading
6: In serialize(data, node$con) :
  'package:stats' may not be available when loading
7: In serialize(data, node$con) :
  'package:stats' may not be available when loading
8: In serialize(data, node$con) :
  'package:stats' may not be available when loading
9: In serialize(data, node$con) :
  'package:stats' may not be available when loading
10: In serialize(data, node$con) :
  'package:stats' may not be available when loading

I managed to work around the problem by following the suggestion of @jared.andrews07 in a Biostars post (https://www.biostars.org/p/357154/), using the following code.

library(BiocParallel)
register(DoparParam())

Although this fixes the problem by changing how the parallel operations are dispatched (I think), I still want to understand why the error happens and how I can run ChIPQC in parallel.
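My current guess, which is an assumption and not something confirmed by the ChIPQC or BiocParallel documentation, is that on Windows the parallel workers are fresh R sessions that do not have the packages from my main session attached, so a function such as "seqlevels<-" (from GenomeInfoDb) is not found there. The sketch below illustrates only that general behaviour. It uses the base parallel package instead of BiocParallel, and stats4 as a stand-in for GenomeInfoDb, so it does not reproduce the ChIPQC error itself.

```r
library(parallel)  # base R socket clusters, used here as a stand-in for BiocParallel
library(stats4)    # attached in the main session only (stand-in for GenomeInfoDb)

# Start two fresh worker R sessions.
cl <- makeCluster(2)

# Ask each worker whether stats4 is on its search path.
attached_before <- parSapply(cl, 1:2, function(i) "package:stats4" %in% search())

# Attach the package on every worker, then ask again.
invisible(clusterEvalQ(cl, library(stats4)))
attached_after <- parSapply(cl, 1:2, function(i) "package:stats4" %in% search())

stopCluster(cl)

print(attached_before)
print(attached_after)
```

If this guess is right, the workers only see a package after it has been loaded inside them, which user code cannot easily arrange for functions that ChIPQC calls internally.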

Also, the ChIPQC help page describes several options for the PeakCaller column of the sample metadata table. I wonder which value is best for broad peaks called with the macs2 '--broad' parameter. As the macs2 documentation notes:

NAME_peaks.broadPeak is in BED6+3 format which is similar to narrowPeak file, except for missing the 10th column for annotating peak summits.
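To make the BED6+3 layout concrete, here is a small sketch that reads a broadPeak file and keeps the first six (plain BED) columns. The two rows are made-up toy values written to a temporary file, not my data, and the column names are labels I chose for the nine fields.

```r
# Labels for the nine broadPeak fields (BED6 + signalValue, pValue, qValue).
broad_cols <- c("chrom", "start", "end", "name", "score", "strand",
                "signalValue", "pValue", "qValue")

# Made-up toy rows, for illustration only.
toy_rows <- c(
  "chr2L\t100\t900\ttoy_peak_1\t50\t.\t4.2\t6.1\t3.9",
  "chr3R\t5000\t7500\ttoy_peak_2\t80\t.\t6.8\t9.4\t7.0"
)
toy_path <- tempfile(fileext = ".broadPeak")
writeLines(toy_rows, toy_path)

# Read the tab-separated file and name the columns.
peaks <- read.table(toy_path, sep = "\t", col.names = broad_cols,
                    stringsAsFactors = FALSE)

# The first six columns form a standard BED6 record.
bed6 <- peaks[, 1:6]
print(bed6)
```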

As I understand it, the 'narrow' and 'macs' options suit "narrowPeak" files but not the "broadPeak" files produced by macs2, whereas the first six columns of both "narrowPeak" and "broadPeak" files are in standard BED format. I therefore selected the 'bed' option.

So, what is the most suitable PeakCaller value for macs2 callpeak results, for broad peaks as well as narrow peaks?
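For reference, the choice described above looks roughly like this in a sample table. This is a trimmed sketch: the file paths are hypothetical placeholders, and the other metadata columns of my real table are left out.

```r
# Trimmed sample table; paths are hypothetical placeholders.
samples <- data.frame(
  SampleID   = c("K4_F_L1_rep1", "K4_F_L1_rep4"),
  Replicate  = c(1, 4),
  bamReads   = c("path/to/K4_F_L1_rep1.bam", "path/to/K4_F_L1_rep4.bam"),
  Peaks      = c("path/to/K4_F_L1_rep1_peaks.broadPeak",
                 "path/to/K4_F_L1_rep4_peaks.broadPeak"),
  PeakCaller = "bed",  # treat the macs2 output as plain BED intervals
  stringsAsFactors = FALSE
)
print(samples[, c("SampleID", "Replicate", "PeakCaller")])
```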

Session info:

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936 
[2] LC_CTYPE=Chinese (Simplified)_China.936   
[3] LC_MONETARY=Chinese (Simplified)_China.936
[4] LC_NUMERIC=C                              
[5] LC_TIME=Chinese (Simplified)_China.936    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
 [1] BiocParallel_1.22.0         ChIPQC_1.24.1              
 [3] DiffBind_2.16.0             SummarizedExperiment_1.18.2
 [5] DelayedArray_0.14.0         matrixStats_0.56.0         
 [7] Biobase_2.48.0              GenomicRanges_1.40.0       
 [9] GenomeInfoDb_1.24.2         IRanges_2.22.2             
[11] S4Vectors_0.26.1            BiocGenerics_0.34.0        
[13] ggplot2_3.3.2              

Thanks in advance, lida
