Hey folks. I am curious if the rsubread is ready for pacbio long read sequencing? I find the sublong() function, however, when i try to use it for one fastq file, the R session crashes. I needed to go down from 10million reads to 100 reads and still over 18 GB were occupied.
What I realised it the "wrong_rebuild", which I get numerous warnings of when using more reads.
Originally, I got hifi reads in bam format. I used a standard workflow with isoseq including the refine step and then pbtk's bam2fastq.
What could be the issue? The fastq reads look good.
Thanks for the input. Let's see if I can make this work. Else I also appreciate other tools to quantify RNA expression, not focusing on RNA variants, which is the main output with the isoseq/pigeon workflow.
Cheers
# command line
# skera deconcatenates reads into constituent cDNA molecules
mkdir -p "bam/split"
skera split -j 21 bam/hifi_raw/control.bam add_files/mas8_primers.fasta bam/split/control_split.bam
# lima step removes barcoded Iso-Seq primers
mkdir -p "bam/lima"
lima -j 21 --ignore-biosamples --isoseq --peek-guess --overwrite-biosample-names bam/split/control_split.bam add_files/IsoSeq_v2_primers_12.fasta bam/lima/control_lima.bam
# refine removes polyA tails reads removes polyA tails artificial concatemers
mkdir -p "bam/isoseq_refined"
isoseq refine -j 21 --require-polya bam/lima/control_lima.IsoSeqX_bc01_5p--IsoSeqX_3p.bam add_files/IsoSeq_v2_primers_12.fasta bam/isoseq_refined/control_refined.bam
# convert into fastq format
bam2fastq -o ../../fastq/control.fastq control_refined.bam
################################
## R session
> sublong(index = "/home/chuddy/bioinformatics/database_references/human_38/genome/subread_gencode_45/hGencode45",
+ nthreads = 6,
+ readFiles = fastq_files_abs[2],
+ outputFiles = "new/bam/sublong/control_mapped.bam")
====== Subread long read mapping ======
Threads: 6
Input file: /home/chuddy/bioinformatics/johannes/pacbio/new/fastq/control_n1000.fastq.gz
Output file: /home/chuddy/bioinformatics/johannes/pacbio/new/bam/sublong/control_mapped.bam (BAM)
Index: /home/chuddy/bioinformatics/database_references/human_38/genome/subread_gencode_45/hGencode45
Index was loaded; the gap bewteen subreads is 1 bases
Processing 0-th read for task 10; used 0.2 minutes
WRONG_REBUILD : m84149R1_240917_080353_s1/238949300/ccs/69_2551 : 2480 != 2482 ; //318M/MMMMMMMMMMMMMMMMMM/51M/MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM/30M/MM/10M/MMMMMMMMMMMMMMMMMMMMMMMMMMM/1523M//118M/MM......339S.............................................................................................................................................................................................................................................................................................................................................................................................................................................................../
WRONG_REBUILD : m84149R1_240917_080353_s1/265425060/ccs/4263_6902 : 2637 != 2639 ; /SMM/1454M//112M/MMMXMMMMDDMMM.....1057S.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................../
WRONG_REBUILD : m84149R1_240917_080353_s1/252974806/ccs/8179_10496 : 2315 != 2317 ; /SMMMMM/699M//224M/DDDDDDDDDDDDDDDDDDDDDDDDDMMMM.....1382S.........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................../
WRONG_REBUILD : m84149R1_240917_080353_s1/239734526/ccs/11291_12729 : 1435 != 1438 ; /SMM/18M/MMMD/874M//21M/DDDDDDDDDDDDDDDDDDDDDDXMXM......512S.........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................../
WRONG_REBUILD : m84149R1_240917_080353_s1/238032088/ccs/69_1771 : 1699 != 1702 ; //288M/MM/852M//24M/DDDDDDDDDDDDDDDDDDDDDDXMXMDM......528S....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................../
WRONG_REBUILD : m84149R1_240917_080353_s1/263066353/ccs/10544_12392 : 1846 != 1848 ; /SMM/1258M//82M/DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDMIMMMMMXM......494S.........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................../
All finished.
Total processed reads : 1000
Mapped reads: 999 (99.9%)
Time: 0.2 minutes
> sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 20.04.6 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=de_CH.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_CH.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=de_CH.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=de_CH.UTF-8 LC_IDENTIFICATION=C
time zone: Europe/Zurich
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rsubread_2.18.0 lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2 readr_2.1.5
[8] tidyr_1.3.1 tibble_3.2.1 ggplot2_3.5.1 tidyverse_2.0.0
loaded via a namespace (and not attached):
[1] vctrs_0.6.5 cli_3.6.3 knitr_1.48 rlang_1.1.4 xfun_0.46 stringi_1.8.4
[7] generics_0.1.3 glue_1.7.0 colorspace_2.1-0 hms_1.1.3 scales_1.3.0 fansi_1.0.6
[13] grid_4.4.1 munsell_0.5.1 tzdb_0.4.0 lifecycle_1.0.4 compiler_4.4.1 timechange_0.3.0
[19] pkgconfig_2.0.3 rstudioapi_0.16.0 lattice_0.22-5 R6_2.5.1 tidyselect_1.2.1 utf8_1.2.4
[25] pillar_1.9.0 magrittr_2.0.3 Matrix_1.7-0 tools_4.4.1 withr_3.0.0 gtable_0.3.5