Rsubread - sublong function
0
0
Entering edit mode
@adde7aac
Last seen 7 weeks ago
Switzerland

Hey folks. I am curious if the rsubread is ready for pacbio long read sequencing? I find the sublong() function, however, when i try to use it for one fastq file, the R session crashes. I needed to go down from 10million reads to 100 reads and still over 18 GB were occupied.

What I realised it the "wrong_rebuild", which I get numerous warnings of when using more reads.

Originally, I got hifi reads in bam format. I used a standard workflow with isoseq including the refine step and then pbtk's bam2fastq.

What could be the issue? The fastq reads look good.

Thanks for the input. Let's see if I can make this work. Else I also appreciate other tools to quantify RNA expression, not focusing on RNA variants, which is the main output with the isoseq/pigeon workflow.

Cheers

# command line

# skera deconcatenates reads into constituent cDNA molecules
mkdir -p "bam/split"
skera split -j 21 bam/hifi_raw/control.bam add_files/mas8_primers.fasta bam/split/control_split.bam


# lima step removes barcoded Iso-Seq primers
mkdir -p "bam/lima"
lima -j 21 --ignore-biosamples --isoseq --peek-guess --overwrite-biosample-names bam/split/control_split.bam add_files/IsoSeq_v2_primers_12.fasta bam/lima/control_lima.bam


# refine removes polyA tails reads removes polyA tails artificial concatemers
mkdir -p "bam/isoseq_refined"
isoseq refine -j 21 --require-polya bam/lima/control_lima.IsoSeqX_bc01_5p--IsoSeqX_3p.bam add_files/IsoSeq_v2_primers_12.fasta bam/isoseq_refined/control_refined.bam

# convert into fastq format
bam2fastq -o ../../fastq/control.fastq control_refined.bam


################################

## R session
> sublong(index = "/home/chuddy/bioinformatics/database_references/human_38/genome/subread_gencode_45/hGencode45",  
+       nthreads = 6, 
+       readFiles = fastq_files_abs[2],
+       outputFiles = "new/bam/sublong/control_mapped.bam")

 ====== Subread long read mapping ======

Threads: 6
Input file: /home/chuddy/bioinformatics/johannes/pacbio/new/fastq/control_n1000.fastq.gz
Output file: /home/chuddy/bioinformatics/johannes/pacbio/new/bam/sublong/control_mapped.bam (BAM)
Index: /home/chuddy/bioinformatics/database_references/human_38/genome/subread_gencode_45/hGencode45

Index was loaded; the gap bewteen subreads is 1 bases
Processing 0-th read for task 10; used 0.2 minutes
WRONG_REBUILD : m84149R1_240917_080353_s1/238949300/ccs/69_2551 : 2480 != 2482 ; //318M/MMMMMMMMMMMMMMMMMM/51M/MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM/30M/MM/10M/MMMMMMMMMMMMMMMMMMMMMMMMMMM/1523M//118M/MM......339S.............................................................................................................................................................................................................................................................................................................................................................................................................................................................../
WRONG_REBUILD : m84149R1_240917_080353_s1/265425060/ccs/4263_6902 : 2637 != 2639 ; /SMM/1454M//112M/MMMXMMMMDDMMM.....1057S.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................../
WRONG_REBUILD : m84149R1_240917_080353_s1/252974806/ccs/8179_10496 : 2315 != 2317 ; /SMMMMM/699M//224M/DDDDDDDDDDDDDDDDDDDDDDDDDMMMM.....1382S.........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................../
WRONG_REBUILD : m84149R1_240917_080353_s1/239734526/ccs/11291_12729 : 1435 != 1438 ; /SMM/18M/MMMD/874M//21M/DDDDDDDDDDDDDDDDDDDDDDXMXM......512S.........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................../
WRONG_REBUILD : m84149R1_240917_080353_s1/238032088/ccs/69_1771 : 1699 != 1702 ; //288M/MM/852M//24M/DDDDDDDDDDDDDDDDDDDDDDXMXMDM......528S....................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................../
WRONG_REBUILD : m84149R1_240917_080353_s1/263066353/ccs/10544_12392 : 1846 != 1848 ; /SMM/1258M//82M/DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDMIMMMMMXM......494S.........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................../


All finished.

Total processed reads : 1000
Mapped reads: 999 (99.9%)
Time: 0.2 minutes 

> sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=de_CH.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_CH.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=de_CH.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=de_CH.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Zurich
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Rsubread_2.18.0 lubridate_1.9.3 forcats_1.0.0   stringr_1.5.1   dplyr_1.1.4     purrr_1.0.2     readr_2.1.5    
 [8] tidyr_1.3.1     tibble_3.2.1    ggplot2_3.5.1   tidyverse_2.0.0

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5       cli_3.6.3         knitr_1.48        rlang_1.1.4       xfun_0.46         stringi_1.8.4    
 [7] generics_0.1.3    glue_1.7.0        colorspace_2.1-0  hms_1.1.3         scales_1.3.0      fansi_1.0.6      
[13] grid_4.4.1        munsell_0.5.1     tzdb_0.4.0        lifecycle_1.0.4   compiler_4.4.1    timechange_0.3.0 
[19] pkgconfig_2.0.3   rstudioapi_0.16.0 lattice_0.22-5    R6_2.5.1          tidyselect_1.2.1  utf8_1.2.4       
[25] pillar_1.9.0      magrittr_2.0.3    Matrix_1.7-0      tools_4.4.1       withr_3.0.0       gtable_0.3.5
pacbio sublong Rsubread • 135 views
ADD COMMENT

Login before adding your answer.

Traffic: 1037 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6