Hello,

I have BAM files of paired-end reads. I have checked their statistics (the samtools statistics for one example are at the end of this message): the reads are properly paired and mapped with good mapping quality. The BAM files were mapped with bowtie2 and coordinate-sorted, the index file is up to date, and there are no secondary alignments. The maximum fragment size was investigated and confirmed with samtools on the Galaxy platform.

However, when I run the windowCounts command, the counts are empty (or a matrix of zeros if I do not filter). I also tried the getPESizes command and, surprisingly, csaw reports that all my reads are "mate.unmapped", which does not agree with my samtools statistics.

Is there any information in my BAM file other than the FLAG field that csaw uses and that could explain this strange behaviour? Do you have any idea that could help me?

Here are my commands and my results:
paramsQ1<-readParam(pe="both", max.frag=500,minq=1)
datasetCount<-windowCounts("H3K27me3.bam",width=2500,spacing=1000,bin=FALSE,filter=1,param=paramsQ1)
datasetCount
class: RangedSummarizedExperiment
dim: 0 1
metadata(6): spacing width ... param final.ext
assays(1): counts
rownames: NULL
rowData names(0):
colnames: NULL
colData names(4): bam.files totals ext rlen
assays(datasetCount)$count
[,1]
getPESizes("H3K27me3.bam",param=readParam(pe="both",max.frag=500,minq=1))
$sizes
integer(0)
$diagnostics
  total.reads  mapped.reads        single mate.unmapped    unoriented
     77259091      73697251             0      73697251             0
    inter.chr
            0
sessionInfo( )
R version 4.3.0 (2023-04-21)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 16.04.7 LTS
Matrix products: default
BLAS/LAPACK: /home/galaxy/anaconda2/envs/R_LudivineProject/lib/libopenblasp-r0.3.23.so; LAPACK version 3.11.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=C
[7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
time zone: Europe/Paris
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] csaw_1.34.0 SummarizedExperiment_1.30.2
[3] Biobase_2.60.0 MatrixGenerics_1.12.2
[5] matrixStats_1.0.0 GenomicRanges_1.52.0
[7] GenomeInfoDb_1.36.0 IRanges_2.34.0
[9] S4Vectors_0.38.1 BiocGenerics_0.46.0
loaded via a namespace (and not attached):
[1] crayon_1.5.2 DelayedArray_0.26.3 RCurl_1.98-1.12
[4] Biostrings_2.68.1 locfit_1.5-9.8 grid_4.3.0
[7] bitops_1.0-7 compiler_4.3.0 codetools_0.2-19
[10] limma_3.56.2 Rcpp_1.0.10 edgeR_3.42.4
[13] XVector_0.40.0 BiocParallel_1.34.2 lattice_0.21-8
[16] metapod_1.8.0 parallel_4.3.0 GenomeInfoDbData_1.2.10
[19] Matrix_1.5-4.1 tools_4.3.0 Rsamtools_2.16.0
[22] zlibbioc_1.46.0 S4Arrays_1.0.4
Below are the statistics for my dataset, obtained with samtools:
samtools view -c -f 4 H3K27me3.bam
2336397 # unmapped reads
samtools view -c -f 256 H3K27me3.bam
0 # secondary (not primary) alignments
samtools view -c -f 2048 H3K27me3.bam
0 # supplementary alignments
samtools view -c -f 1 H3K27me3.bam
77259100 # paired reads
samtools view -c -f 8 H3K27me3.bam
2336397 # reads whose mate is unmapped
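For reference, this is how I read the FLAG bits behind the `-f` filters above. It is only a minimal sketch: `decode_flag` is a hypothetical helper I wrote for illustration, not part of samtools or csaw, and it only names the bits defined in the SAM specification that matter for pairing.

```shell
# decode_flag: hypothetical helper (not a samtools or csaw command).
# Prints the names of the pairing-related bits set in a SAM FLAG value.
decode_flag() {
    flag=$1
    out=""
    # Each entry is "bit:name", using the bit values from the SAM specification.
    for entry in 1:paired 2:proper_pair 4:unmapped 8:mate_unmapped \
                 16:reverse 32:mate_reverse 64:first_in_pair \
                 128:second_in_pair 256:secondary 2048:supplementary; do
        bit=${entry%%:*}
        name=${entry#*:}
        # Append the name when the corresponding bit is set in the flag.
        if [ $(( $flag & $bit )) -ne 0 ]; then
            out="$out $name"
        fi
    done
    # Strip the leading space before printing.
    echo "${out# }"
}

decode_flag 99   # prints: paired proper_pair mate_reverse first_in_pair
decode_flag 73   # prints: paired mate_unmapped first_in_pair
```

A read counted by `samtools view -c -f 8` is one whose FLAG has the `mate_unmapped` bit set, as in the second call.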
Thanks for your help.
Julie