I tried to use ATAC-QC to generate the ATAC-seq QC plots , but I met some problem with the shiftGAlignmentsList function. Thank you for your help in advance.
bamfile <- "G:/project/ATAC-seq_QC/rdrmbam/my.rdrm.bam"
possibleTag <- combn(LETTERS, 2)
possibleTag <- c(paste0(possibleTag[1, ], possibleTag[2, ]),
paste0(possibleTag[2, ], possibleTag[1, ]))
library(Rsamtools)
bamTop100 <- scanBam(BamFile(bamfilde, yieldSize = 100),
param = ScanBamParam(tag=possibleTag))[[1]]$tag
tags <- names(bamTop100)[lengths(bamTop100)>0]
> tags
[1] "AS" "MD" "XG" "NM" "XM" "XN" "XO" "XS" "YS" "YT"
library(BSgenome.Mmusculus.UCSC.mm10)
seqlev <- "chr1" ## subsample data for quick run
which <- as(seqinfo(Mmusculus)[seqlev], "GRanges")
> which
GRanges object with 1 range and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
chr1 chr1 1-195471971 *
seqinfo: 1 sequence from mm10 genome
mygal <- readBamFile(bamfile, tag=tags, which=which, asMates= TRUE, bigFile=TRUE)
>mygal
GAlignmentsList object of length 0:
<0 elements>
seqinfo: no sequences
mygal1 <- shiftGAlignmentsList(mygal#,
+ #outbam=shiftedBamfile
+ )
>Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function 'readGAlignments' for signature '"NULL"'
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936 LC_CTYPE=Chinese (Simplified)_China.936
[3] LC_MONETARY=Chinese (Simplified)_China.936 LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_China.936
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.34.8
[3] AnnotationDbi_1.44.0 BSgenome.Hsapiens.UCSC.hg19_1.4.0
[5] BSgenome.Mmusculus.UCSC.mm10_1.4.0 BSgenome_1.50.0
[7] rtracklayer_1.42.0 RColorBrewer_1.1-2
[9] ATACseqQC_1.6.4 GenomicAlignments_1.18.1
[11] Rsamtools_1.34.0 Biostrings_2.50.1
[13] XVector_0.22.0 SummarizedExperiment_1.12.0
[15] DelayedArray_0.8.0 BiocParallel_1.16.0
[17] matrixStats_0.54.0 Biobase_2.42.0
[19] GenomicRanges_1.34.0 GenomeInfoDb_1.18.0
[21] IRanges_2.16.0 S4Vectors_0.20.0
[23] BiocGenerics_0.28.0
loaded via a namespace (and not attached):
[1] ProtGenerics_1.14.0 bitops_1.0-6 bit64_0.9-7
[4] progress_1.2.0 httr_1.3.1 tools_3.5.1
[7] R6_2.3.0 rGADEM_2.30.0 KernSmooth_2.23-15
[10] seqLogo_1.48.0 DBI_1.0.0 lazyeval_0.2.1
[13] colorspace_1.3-2 ade4_1.7-13 motifStack_1.26.0
[16] prettyunits_1.0.2 bit_1.1-14 curl_3.2
[19] compiler_3.5.1 VennDiagram_1.6.20 graph_1.60.0
[22] formatR_1.5 grImport_0.9-2 scales_1.0.0
[25] randomForest_4.6-14 RBGL_1.58.0 stringr_1.3.1
[28] digest_0.6.18 pkgconfig_2.0.2 htmltools_0.3.6
[31] ensembldb_2.6.8 limma_3.38.2 regioneR_1.14.0
[34] htmlwidgets_1.3 rlang_0.3.0.1 rstudioapi_0.8
[37] RSQLite_2.1.1 shiny_1.2.0 RCurl_1.95-4.11
[40] magrittr_1.5 polynom_1.3-9 GO.db_3.7.0
[43] GenomeInfoDbData_1.2.0 futile.logger_1.4.3 Matrix_1.2-14
[46] Rcpp_1.0.0 munsell_0.5.0 stringi_1.2.4
[49] yaml_2.2.0 MASS_7.3-50 zlibbioc_1.28.0
[52] AnnotationHub_2.14.1 grid_3.5.1 blob_1.1.1
[55] promises_1.0.1 crayon_1.3.4 lattice_0.20-35
[58] splines_3.5.1 multtest_2.38.0 hms_0.4.2
[61] GenomicScores_1.6.0 MotIV_1.38.0 seqinr_3.4-5
[64] biomaRt_2.38.0 futile.options_1.0.1 XML_3.98-1.16
[67] lambda.r_1.2.3 BiocManager_1.30.4 idr_1.2
[70] httpuv_1.4.5 assertthat_0.2.0 mime_0.6
[73] preseqR_4.0.0 xtable_1.8-3 AnnotationFilter_1.6.0
[76] later_0.7.5 survival_2.42-6 ChIPpeakAnno_3.16.1
[79] memoise_1.1.0 interactiveDisplayBase_1.20.0
I got exact the same errors for mouse data.
ATACSeqQC is version 1.8.1.
I confirmed that my bam file exist and all parameters are OK.
It seems that readBamFile does not work for bam files mapped on UCSC mm10 genome.
Are there mouse bam files that I can test to run ATACSeqQC?
seqinfo: 1 sequence from mm10 genome
seqinfo: no sequences
Shuang, did you use the mm10 reference genome from the same source, such as NCBI, UCSC Genome Browser or Ensembl, to do alignment and the ATACseqQC analysis?
I don't think ATACseqQC is species-aware.
Haibo
Hi all,
I am using ATAC-seqQC and I dont understand why when I use my bam files and run the following code I always get the same tags across all my samples.
While I noticed that in the preloaded data they of course changes base on the BAM files (GL1.bam and GL2.bam). Then PG and YT tags are character while I expect some integer.
Do you guys think my .bai files have some issue? Here how I generated it:
As input in ATAC-seqQC i am using sort BAM files.
Thanks in advance for your kind help.
Giuseppe
Could you use samtools to show the top 10 lines of your bam file like this:
Shuang, did you use the mm10 reference genome from the same source, such as NCBI, UCSC Genome Browser or Ensembl, to do alignment and the ATACseqQC analysis?
I don't think ATACseqQC is species-aware.
Haibo
First, If you set bigFile as TRUE, gal will only keep a link to the file, and will load data when need.
Try to subset your bam file by samtools view to a small file and set the bigFile to false to ensure the reads are properly pored end reads.
Let me know what you get. Thank you.
It seems that shiftGAlignmentsList cannot handle bam files aligned by bwa.
all(elementNROWS(gal) < 3) is not TRUE
I generated the bam files using bowtie2, and the problems are solved.
you can update to the development version. Or set flag=scanBamFlag(isSecondaryAlignment = FALSE, isUnmappedQuery=FALSE, isNotPassingQualityControls = FALSE, isSupplementaryAlignment = FALSE) for readBamFile in released version to overcome the issue of all(elementNROWS(gal) < 3) is not TRUE