Question

CleanUpRNAseq with a GTF that is missing mitochondria annotation data

rcreed • 0

I am running CleanUpRNAseq on tomato RNAseq data prepared through ribodepletion; however, the Ensembl tomato GTF file does not include mitochondrial or chloroplast annotations.

The following code works fine for me, except the resulting saf_list does not include a data frame for the organelle annotations:

bam_file <- file.path("../star_salmon/32_13_1_2.markdup.sorted.bam")

saf_list <- get_saf(
    ensdb_sqlite = sl_ensdb,
    bamfile = bam_file,
    ##mitochondrial_genome = "MT" ##commented out because there are no annotations for this in the GTF file
)

This causes an issue in generating the counts list in the following chunk:

## featurecounts 
 capture.output({counts_list <- summarize_reads(
     SummarizedCounts = sc,
     saf_list = saf_list,
     gtf = gtf,
     threads = 16,
     verbose = TRUE
 )}, file = tempfile())

Error in summarize_reads(SummarizedCounts = sc, saf_list = saf_list, gtf = gtf,  : 
  A valid SAF list for gene, exon, intergenic region,intronic region, rRNA genes, and mitochondrion is needed

Is it possible to run CleanUpRNAseq without annotation data associated with organelles?

# Session info
sessionInfo( )
  [1] RColorBrewer_1.1-3          rstudioapi_0.17.1           jsonlite_1.8.9              tximport_1.34.0            
  [5] magrittr_2.0.3              farver_2.1.2                rmarkdown_2.29              fs_1.6.5                   
  [9] BiocIO_1.16.0               zlibbioc_1.52.0             vctrs_0.6.5                 memoise_2.0.1              
 [13] Rsamtools_2.22.0            RCurl_1.98-1.16             base64enc_0.1-3             htmltools_0.5.8.1          
 [17] S4Arrays_1.6.0              curl_5.2.1                  gridGraphics_0.5-1          SparseArray_1.6.0          
 [21] Formula_1.2-5               KernSmooth_2.23-24          htmlwidgets_1.6.4           plyr_1.8.9                 
 [25] cachem_1.1.0                GenomicAlignments_1.42.0    lifecycle_1.0.4             pkgconfig_2.0.3            
 [29] Matrix_1.7-1                R6_2.5.1                    fastmap_1.2.0               GenomeInfoDbData_1.2.13    
 [33] MatrixGenerics_1.18.0       digest_0.6.37               colorspace_2.1-1            DESeq2_1.46.0              
 [37] Hmisc_5.2-0                 RSQLite_2.3.7               fansi_1.0.6                 httr_1.4.7                 
 [41] abind_1.4-8                 mgcv_1.9-1                  compiler_4.4.1              bit64_4.5.2                
 [45] htmlTable_2.4.3             backports_1.5.0             BiocParallel_1.40.0         DBI_1.2.3                  
 [49] DelayedArray_0.32.0         rjson_0.2.23                tools_4.4.1                 foreign_0.8-87             
 [53] nnet_7.3-19                 glue_1.8.0                  restfulr_0.0.15             nlme_3.1-166               
 [57] grid_4.4.1                  checkmate_2.3.2             cluster_2.1.6               reshape2_1.4.4             
 [61] generics_0.1.3              sva_3.54.0                  gtable_0.3.6                BSgenome_1.74.0            
 [65] qsmooth_1.22.0              data.table_1.16.2           utf8_1.2.4                  XVector_0.46.0             
 [69] ggrepel_0.9.6               pillar_1.9.0                stringr_1.5.1               yulab.utils_0.1.8          
 [73] limma_3.62.1                genefilter_1.88.0           splines_4.4.1               dplyr_1.1.4                
 [77] lattice_0.22-6              survival_3.7-0              rtracklayer_1.66.0          bit_4.5.0                  
 [81] annotate_1.84.0             tidyselect_1.2.1            locfit_1.5-9.10             Biostrings_2.74.0          
 [85] knitr_1.49                  gridExtra_2.3               ProtGenerics_1.38.0         edgeR_4.4.0                
 [89] SummarizedExperiment_1.36.0 xfun_0.49                   statmod_1.5.0               matrixStats_1.4.1          
 [93] pheatmap_1.0.12             stringi_1.8.4               UCSC.utils_1.2.0            lazyeval_0.2.2             
 [97] yaml_2.3.10                 evaluate_1.0.1              codetools_0.2-20            tibble_3.2.1               
[101] cli_3.6.3                   rpart_4.1.23                xtable_1.8-4                munsell_0.5.1              
[105] Rsubread_2.20.0             Rcpp_1.0.13-1               png_0.1-8                   XML_3.99-0.17              
[109] parallel_4.4.1              ggplot2_3.5.1               blob_1.2.4                  bitops_1.0-9               
[113] scales_1.3.0                crayon_1.5.3                rlang_1.1.4                 KEGGREST_1.46.0

score 2 · Accepted Answer · 2024-11-19

Haibo Liu ▴ 20

Thank you for using our CleanUpRNAseq package.

I just updated the code in the package to accommodate a GTF without mitochondrial annotations. Please remove your current installation and reinstall the package as follows:

install.packages("devtools")
devtools::install_github("haibol2016/CleanUpRNAseq")

It will take some time for the Bioconductor site to update the package.

If you encounter any problems, please let me know.

@haibol2017-23658 Make sure you have a valid version bump in the DESCRIPTION file on our git repository so that the new version can propagate.

shepherl 4.1k • 11 weeks ago

Thank you for your reminder. I did forget that, but I just bumped the version to 1.1.1.
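
After reinstalling, one way to confirm that the updated package is the one R actually loads is to compare the installed version against 1.1.1. The following is a minimal base R sketch, assuming the fix ships in 1.1.1 or later; `has_min_version` is a hypothetical helper written for illustration and is not part of CleanUpRNAseq.

```r
# Hypothetical helper (not part of CleanUpRNAseq):
# returns TRUE if `pkg` is installed at version >= `min_version`, FALSE otherwise.
has_min_version <- function(pkg, min_version) {
    # A package that cannot be loaded counts as "not new enough".
    if (!requireNamespace(pkg, quietly = TRUE)) {
        return(FALSE)
    }
    utils::packageVersion(pkg) >= package_version(min_version)
}

# Assumption: 1.1.1 (the version mentioned above) or later contains the fix.
has_min_version("CleanUpRNAseq", "1.1.1")
```

If this returns FALSE, the old installation is probably still on the library path; restarting the R session after reinstalling is usually worth trying before anything else.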