tximeta doesn't match gencode M23 transcripts to gtf
jtobias ▴ 10
@jtobias-22448
Last seen 21 months ago
United States

I used salmon 1.0 to count samples against gencode M23 transcripts.

Using tximeta v1.5.2, I get the error below which tells me that my tx IDs in my quant.sf files aren't in the gtf.

I believe the problem might be that the names in the gencode transcript FASTA file (which match those in the quant.sf files) look like:

ENSMUST00000193812.1|ENSMUSG00000102693.1|OTTMUSG00000049935.1|OTTMUST00000127109.1|4933401J01Rik-201|4933401J01Rik|1070|TEC|

but the corresponding line in the gtf is like:

$ grep ENSMUST00000193812.1 gencode.vM23.annotation.gtf
chr1 HAVANA transcript 3073253 3074322 . + . gene_id "ENSMUSG00000102693.1"; transcript_id "ENSMUST00000193812.1"; gene_type "TEC"; gene_name "4933401J01Rik"; transcript_type "TEC"; transcript_name "4933401J01Rik-201"; level 2; transcript_support_level "NA"; mgi_id "MGI:1918292"; tag "basic"; havana_gene "OTTMUSG00000049935.1"; havana_transcript "OTTMUST00000127109.1";

Thanks for any advice!

  • John Tobias

Error text:

```
se <- tximeta(coldata)
importing quantifications
reading in files with read_tsv
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
found matching transcriptome:
[ GENCODE - Mus musculus - release M23 ]
loading existing TxDb created: 2019-11-25 21:57:58
Loading required package: GenomicFeatures
loading existing transcript ranges created: 2019-11-25 21:58:44
Error in checkAssays2Txps(assays, txps) :
  none of the transcripts in the quantification files are in the GTF
In addition: Warning message:
In class(object) <- "environment" :
  Setting class(x) to "environment" sets attribute to NULL; result will no longer be an S4 object

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] GenomicFeatures_1.36.4      org.Mm.eg.db_3.8.2
 [3] AnnotationDbi_1.46.1        openxlsx_4.1.3
 [5] webshot_0.5.2               BiocStyle_2.12.0
 [7] biomaRt_2.40.5              pcaExplorer_2.10.1
 [9] tximportData_1.12.0         tximport_1.12.3
[11] tximeta_1.5.2               DESeq2_1.24.0
[13] SummarizedExperiment_1.14.1 DelayedArray_0.10.0
[15] BiocParallel_1.18.1         matrixStats_0.55.0
[17] Biobase_2.44.0              GenomicRanges_1.36.1
[19] GenomeInfoDb_1.20.0         IRanges_2.18.3
[21] S4Vectors_0.22.1            BiocGenerics_0.30.0
[23] readr_1.3.1

loaded via a namespace (and not attached):
  [1] GOstats_2.50.0           backports_1.1.5          Hmisc_4.3-0
  [4] BiocFileCache_1.8.0      NMF_0.21.0               plyr_1.8.4
  [7] igraph_1.2.4.1           lazyeval_0.2.2           GSEABase_1.46.0
 [10] shinydashboard_0.7.1     splines_3.6.1            crosstalk_1.0.0
 [13] ggplot2_3.2.1            gridBase_0.4-7           digest_0.6.23
 [16] foreach_1.4.7            ensembldb_2.8.1          htmltools_0.4.0
 [19] GO.db_3.8.2              magrittr_1.5             checkmate_1.9.4
 [22] memoise_1.1.0            cluster_2.1.0            doParallel_1.0.15
 [25] limma_3.40.6             Biostrings_2.52.0        annotate_1.62.0
 [28] prettyunits_1.0.2        colorspace_1.4-1         blob_1.2.0
 [31] rappdirs_0.3.1           ggrepel_0.8.1            xfun_0.11
 [34] dplyr_0.8.3              crayon_1.3.4             RCurl_1.95-4.12
 [37] jsonlite_1.6             graph_1.62.0             genefilter_1.66.0
 [40] zeallot_0.1.0            survival_3.1-7           iterators_1.0.12
 [43] glue_1.3.1               registry_0.5-1           gtable_0.3.0
 [46] zlibbioc_1.30.0          XVector_0.24.0           Rgraphviz_2.28.0
 [49] SparseM_1.77             scales_1.1.0             pheatmap_1.0.12
 [52] DBI_1.0.0                rngtools_1.4             bibtex_0.4.2
 [55] Rcpp_1.0.3               xtable_1.8-4             progress_1.2.2
 [58] htmlTable_1.13.2         foreign_0.8-72           bit_1.1-14
 [61] Formula_1.2-3            DT_0.10                  AnnotationForge_1.26.0
 [64] htmlwidgets_1.5.1        httr_1.4.1               threejs_0.3.1
 [67] RColorBrewer_1.1-2       shinyAce_0.4.1           acepack_1.4.1
 [70] pkgconfig_2.0.3          XML_3.98-1.20            nnet_7.3-12
 [73] dbplyr_1.4.2             locfit_1.5-9.1           tidyselect_0.2.5
 [76] rlang_0.4.2              reshape2_1.4.3           later_1.0.0
 [79] munsell_0.5.0            tools_3.6.1              RSQLite_2.1.2
 [82] shinyBS_0.61             evaluate_0.14            stringr_1.4.0
 [85] fastmap_1.0.1            yaml_2.2.0               knitr_1.26
 [88] bit64_0.9-7              zip_2.0.4                purrr_0.3.3
 [91] AnnotationFilter_1.8.0   RBGL_1.60.0              mime_0.7
 [94] compiler_3.6.1           rstudioapi_0.10          curl_4.2
 [97] png_0.1-7                tibble_2.1.3             geneplotter_1.62.0
[100] stringi_1.4.3            lattice_0.20-38          ProtGenerics_1.16.0
[103] Matrix_1.2-17            vctrs_0.2.0              pillar_1.4.2
[106] lifecycle_0.1.0          BiocManager_1.30.10      d3heatmap_0.6.1.2
[109] data.table_1.12.6        bitops_1.0-6             httpuv_1.5.2
[112] rtracklayer_1.44.4       R6_2.4.1                 latticeExtra_0.6-28
[115] promises_1.1.0           topGO_2.36.0             gridExtra_2.3
[118] codetools_0.2-16         assertthat_0.2.1         Category_2.50.0
[121] pkgmaker_0.27            withr_2.1.2              GenomicAlignments_1.20.1
[124] Rsamtools_2.0.3          GenomeInfoDbData_1.2.1   hms_0.5.2
[127] grid_3.6.1               rpart_4.1-15             tidyr_1.0.0
[130] rmarkdown_1.17           shiny_1.4.0              base64enc_0.1-3
```

@mikelove
Last seen 23 hours ago
United States

This error is from tximport and can be solved in two ways: when you index, you can use --gencode to fix the names at that point, or, when importing, you can still fix them with ignoreAfterBar. The tximport arguments can be passed directly to tximeta().
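
To illustrate what trimming after the bar accomplishes, here is a minimal base-R sketch using the transcript name and GTF ID quoted in the question. It is not the internal code of tximport or tximeta; the variable names (quant_names, gtf_ids, trimmed) are made up for the example.

```r
# Transcript name as it appears in the GENCODE FASTA header and in quant.sf
quant_names <- paste0(
  "ENSMUST00000193812.1|ENSMUSG00000102693.1|OTTMUSG00000049935.1|",
  "OTTMUST00000127109.1|4933401J01Rik-201|4933401J01Rik|1070|TEC|"
)

# transcript_id as it appears in the GTF
gtf_ids <- "ENSMUST00000193812.1"

# The untrimmed name does not match any GTF ID
any(quant_names %in% gtf_ids)   # FALSE

# Keep only the text before the first "|"
trimmed <- sub("\\|.*$", "", quant_names)

# After trimming, the name matches the GTF ID
all(trimmed %in% gtf_ids)       # TRUE
```

With real data, quant_names would be the Name column of a quant.sf file and gtf_ids the transcript IDs from the annotation.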


Thanks for the quick reply!

So I tried this, but I get the same error:

se <- tximeta(coldata, ignoreAfterBar = TRUE)

Should I be doing something else to pass this argument to tximport?

Could it be that I need to update tximport to the devel version as well?


I take it back. It's tximeta that needs to handle ignoreAfterBar, and I don't have code for this yet. I can try to fix this tomorrow in the devel branch.

In the meantime, the other solution would be to index with --gencode, which is recommended for Gencode transcripts.fa files (note that it also makes the quant files smaller, because the transcript IDs lose a lot of unnecessary characters).
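
As a rough sketch of the re-indexing route, the R snippet below assembles a salmon index call with --gencode. The FASTA file name and the index directory are assumptions for illustration, not paths from this thread, and the call itself is left commented out so nothing runs by accident. The samples would then need to be re-quantified against the new index.

```r
# Assumed input and output locations (hypothetical, adjust to your setup)
fasta     <- "gencode.vM23.transcripts.fa"
index_dir <- "salmon_index_gencode_M23"

# -t: transcript FASTA, -i: index directory,
# --gencode: keep only the part of each name before the first "|"
args <- c("index", "-t", fasta, "-i", index_dir, "--gencode")

# Show the command that would be run
cat("salmon", args, "\n")

# Uncomment to run it, provided salmon is on the PATH:
# if (nzchar(Sys.which("salmon"))) system2("salmon", args)
```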


John,

Thanks for the bug report. I just pushed version 1.5.3 to Bioconductor and GitHub. Would you mind checking whether it solves the problem on your end? Just run tximeta(coldata); no other arguments are needed.
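
One way to confirm that the installed version includes the fix is to compare version numbers before re-running. This is a small sketch: has_fix is a made-up helper, and the commented-out last line assumes tximeta is installed.

```r
# First version reported in this thread to contain the fix
required <- package_version("1.5.3")

# Hypothetical helper: TRUE if the given version string is new enough
has_fix <- function(installed) package_version(installed) >= required

has_fix("1.5.2")   # FALSE: the version from the original report
has_fix("1.5.3")   # TRUE

# With tximeta installed, check the real version:
# has_fix(as.character(packageVersion("tximeta")))
```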
