Entering edit mode
Hi,
I am getting the following error (code below) when running the getCTSS function from the CAGEr package using bam files:
cage_rep1_bam <- "/n/projects/sga/analysis/pipeline_test/cage/kc167_cage_1.bam"
cage_rep2_bam <- "/n/projects/sga/analysis/pipeline_test/cage/kc167_cage_2.bam"
ce <- CAGEexp( genomeName = "BSgenome.Dmelanogaster.UCSC.dm6"
, inputFiles = c(cage_rep1_bam, cage_rep2_bam)
, inputFilesType = "bam"
, sampleLabels = c("cage_rep1", "cage_rep2"))
getCTSS(ce, removeFirstG = F, correctSystematicG = F, nrCores = 10)
Reading in file: /n/projects/sga/analysis/pipeline_test/cage/kc167_cage_1.bam...
-> Filtering out low quality reads...
Error in `[[<-`(`*tmp*`, name, value = new("Rle", values = c(1L, 38L, :
2084199 elements in value to replace 104203438 elements
Interestingly, I have not had this error before when running the same code with the same samples. Does anyone have an idea of what may be causing this?
Thanks a lot!
Sergio
sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS: /n/apps/CentOS7/install/r-4.1.0/lib64/R/lib/libRblas.so
LAPACK: /n/apps/CentOS7/install/r-4.1.0/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomicAlignments_1.28.0 Rsamtools_2.8.0 data.table_1.14.0 lattice_0.20-44
[5] cowplot_1.1.1 gridExtra_2.3 ggseqlogo_0.1 CAGEr_1.34.0
[9] MultiAssayExperiment_1.18.0 SummarizedExperiment_1.22.0 Biobase_2.52.0 MatrixGenerics_1.4.0
[13] matrixStats_0.59.0 plyranges_1.12.0 reshape2_1.4.4 dplyr_1.0.6
[17] ggplot2_3.3.3 BSgenome.Dmelanogaster.UCSC.dm6_1.4.1 BSgenome_1.60.0 Biostrings_2.60.1
[21] XVector_0.32.0 rtracklayer_1.52.0 magrittr_2.0.1 GenomicRanges_1.44.0
[25] GenomeInfoDb_1.28.0 IRanges_2.26.0 S4Vectors_0.30.0 BiocGenerics_0.38.0
loaded via a namespace (and not attached):
[1] nlme_3.1-152 bitops_1.0-7 tools_4.1.0 utf8_1.2.1 R6_2.5.0 vegan_2.5-7 KernSmooth_2.23-20
[8] DBI_1.1.1 mgcv_1.8-36 colorspace_2.0-1 permute_0.9-5 withr_2.4.2 tidyselect_1.1.1 compiler_4.1.0
[15] DelayedArray_0.18.0 scales_1.1.1 stringr_1.4.0 digest_0.6.27 rmarkdown_2.8 stringdist_0.9.6.3 pkgconfig_2.0.3
[22] htmltools_0.5.1.1 fastmap_1.1.0 rlang_0.4.11 rstudioapi_0.13 VGAM_1.1-5 BiocIO_1.2.0 generics_0.1.0
[29] BiocParallel_1.26.0 gtools_3.9.2 RCurl_1.98-1.3 GenomeInfoDbData_1.2.6 Matrix_1.3-4 Rcpp_1.0.6 munsell_0.5.0
[36] fansi_0.5.0 lifecycle_1.0.0 stringi_1.6.2 yaml_2.2.1 MASS_7.3-54 zlibbioc_1.38.0 plyr_1.8.6
[43] grid_4.1.0 formula.tools_1.7.1 crayon_1.4.1 splines_4.1.0 knitr_1.33 beanplot_1.2 pillar_1.6.1
[50] rjson_0.2.20 XML_3.99-0.6 glue_1.4.2 evaluate_0.14 operator.tools_1.6.3 vctrs_0.3.8 gtable_0.3.0
[57] purrr_0.3.4 reshape_0.8.8 assertthat_0.2.1 cachem_1.0.5 xfun_0.23 restfulr_0.0.13 tibble_3.1.2
I actually found where the error is occurring within the getCTSS function, but still unsure how to fix it. There seems to be a difference in length when trying to assign a score to the CTSS granges object generated by the CTSS function.
This what I tried using the code within the getCTSS function I got from Github:
1) Filter low quality reads from bam file
2) Make sure information from aligned data and BSgenome match
3) Create a CTSS object and assign scores (here is the problem)
Also, while I am able to run the CTSS function and obtain the gp object, I get this error when printing it:
Thank you!
Sergio
Hi, we are preparing a version 2.0 of CAGEr. Can you open an issue in our GitHub repository ? https://github.com/charles-plessy/CAGEr/issues