tximeta on STAR-aligned Salmon-quantified output
josephbrown ▴ 10
@josephbrown-21888
United States

Note: I haven't updated to R 4.0 or Bioconductor 3.11 yet, in case that matters.

I built a STAR index from the GRCm38 Ensembl release 98 DNA primary assembly and the release 98 GTF, then aligned with --quantMode TranscriptomeSAM to get output in transcriptome coordinates. I then ran gffread -w transcripts.fa -g $DNA $GTF to make a reference for salmon's alignment mode (salmon quant -t transcripts.fa -l A -a Sample.toTranscriptome.out.bam -o $OUTDIR), which seems to have worked.

Unfortunately, although I used unmodified references from Ensembl and the import itself succeeds, tximeta does not recognize the transcriptome and returns only a non-ranged SummarizedExperiment.

In the tximeta vignette, the sections on creating linked transcriptomes all seem to require a salmon index, unless I'm missing something.

Can someone help me find a way to use tximeta's automatic metadata gathering with this output?

> library(tximeta)
> meta <- tibble::tribble(
+   ~names,        ~files,
+   "sample1", "results/quant/salmon/STAR/quant.sf"
+ )
> se <- tximeta(meta)
importing quantifications
reading in files with read_tsv
1 
couldn't find matching transcriptome, returning non-ranged SummarizedExperiment


> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] tximeta_1.4.5

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6                lattice_0.20-41             prettyunits_1.1.1           Rsamtools_2.2.3             Biostrings_2.54.0          
 [6] assertthat_0.2.1            digest_0.6.25               BiocFileCache_1.10.2        R6_2.4.1                    GenomeInfoDb_1.22.1        
[11] stats4_3.6.3                RSQLite_2.2.0               httr_1.4.1                  pillar_1.4.4                zlibbioc_1.32.0            
[16] rlang_0.4.6                 GenomicFeatures_1.38.2      progress_1.2.2              lazyeval_0.2.2              curl_4.3                   
[21] rstudioapi_0.11             blob_1.2.1                  S4Vectors_0.24.4            Matrix_1.2-18               BiocParallel_1.20.1        
[26] readr_1.3.1                 stringr_1.4.0               ProtGenerics_1.18.0         RCurl_1.98-1.2              bit_1.1-15.2               
[31] biomaRt_2.42.1              DelayedArray_0.12.3         compiler_3.6.3              rtracklayer_1.46.0          pkgconfig_2.0.3            
[36] askpass_1.1                 BiocGenerics_0.32.0         tximport_1.14.2             openssl_1.4.1               tidyselect_1.0.0           
[41] SummarizedExperiment_1.16.1 tibble_3.0.1                GenomeInfoDbData_1.2.2      IRanges_2.20.2              matrixStats_0.56.0         
[46] XML_3.99-0.3                fansi_0.4.1                 crayon_1.3.4                dplyr_0.8.5                 dbplyr_1.4.3               
[51] GenomicAlignments_1.22.1    bitops_1.0-6                rappdirs_0.3.1              grid_3.6.3                  jsonlite_1.6.1             
[56] lifecycle_0.2.0             DBI_1.1.0                   AnnotationFilter_1.10.0     magrittr_1.5                cli_2.0.2                  
[61] stringi_1.4.6               XVector_0.26.0              ellipsis_0.3.0              vctrs_0.2.4                 ensembldb_2.10.2           
[66] tools_3.6.3                 bit64_0.9-7                 Biobase_2.46.0              glue_1.4.0                  purrr_0.3.4                
[71] hms_0.5.3                   parallel_3.6.3              AnnotationDbi_1.48.0        GenomicRanges_1.38.0        memoise_1.1.0
tximeta salmon STAR • 1.8k views
@mikelove
United States

tximeta works by reading the hash value that salmon index produces, but you can obtain this value without running salmon index. I do this, for example, when hashing the GENCODE and Ensembl transcriptomes. Rob has a Python package called fasta-digest, installable with pip, that produces the hash value as standalone output.


Thanks for the response. I installed that tool and ran compute_fasta_digest --reference transcripts.fa --out results/digest, renamed the output file to info.json so that it looks like the output of salmon index, and then ran

makeLinkedTxome(indexDir = "results",
            source = "Ensembl", organism = "Mus musculus",
            release = "98", genome = "GRCm38",
            fasta = "transcripts.fa", gtf = "Mus_musculus.GRCm38.98.gtf",
            jsonFile = "results/index.json")

but it still doesn't work. Here's the output from my process:

> library(tximeta)
> makeLinkedTxome(indexDir = "results",
+                 source = "Ensembl", organism = "Mus musculus",
+                 release = "98", genome = "GRCm38",
+                 fasta = "transcripts.fa", gtf = "Mus_musculus.GRCm38.98.gtf",
+                 jsonFile = "results/index.json")
writing linkedTxome to results/index.json
C:\Users\JOSEPH~1.BRO\AppData\Local\BiocFileCache\BiocFileCache\Cache
  does not exist, create directory? (yes/no): yes
saving linkedTxome in bfc (first time)
> meta <- tibble::tribble(
+   ~names,        ~files,
+   "sample1", "results/quant/salmon/STAR/sample1/quant.sf"
+ )
> se <- tximeta(meta)
importing quantifications
reading in files with read_tsv
1 
couldn't find matching transcriptome, returning non-ranged SummarizedExperiment

Because Ensembl 98 for mouse is supported, you can skip making a linkedTxome.

You just need to insert this hash value into a file of your choosing in each sample directory (details are in the vignette, but you'll need to use the current release). The relevant argument is described here:

http://bioconductor.org/packages/release/bioc/vignettes/tximeta/inst/doc/tximeta.html#other_quantifiers
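
As a rough sketch of that step (not copied from the vignette), the shell function below writes a small JSON file holding the hash into each sample directory, next to quant.sf. The file name meta_info.json and the index_seq_hash field reflect my reading of the linked vignette section and should be checked against the tximeta release you have installed. The helper name write_meta_info, the example path, and the placeholder hash are assumptions for illustration only.

```shell
# write_meta_info HASH DIR [DIR...]
# Hypothetical helper: writes meta_info.json containing the transcriptome
# hash into each directory given. Each directory is expected to be a
# sample directory that also holds a quant.sf file.
write_meta_info() {
  local hash="$1"
  shift
  local dir
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      # Warn and move on rather than creating directories by accident.
      echo "skipping $dir: not a directory" >&2
      continue
    fi
    # Field name index_seq_hash is an assumption based on the vignette.
    printf '{"index_seq_hash": "%s"}\n' "$hash" > "$dir/meta_info.json"
  done
}

# Example call (placeholder hash; substitute the value reported by the
# digest tool for transcripts.fa):
# write_meta_info "<hash-of-transcripts.fa>" results/quant/salmon/STAR/sample1
```

On the R side, the import would then presumably be called as tximeta(meta, customMetaInfo = "meta_info.json"), with the path taken relative to each quant.sf; treat that argument name as something to confirm in the current documentation.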
