I am trying to import kallisto results into R.
I used Ensembl version 87 for the quantification. On Bioconductor I could only find version 86 (EnsDb.Hsapiens.v86).
$ cat abundance.tsv | head
target_id           length  eff_length  est_counts  tpm
ENST00000448914.1   13      12          0           0
ENST00000631435.1   12      11          0           0
ENST00000632684.1   12      11          0           0
ENST00000434970.2   9       8           0           0
ENST00000415118.1   8       7           0           0
ENST00000633010.1   16      15          0           0
Tx <- transcripts(txdb, return.type = "DataFrame")
tx2gene <- as.data.frame(Tx[, c("tx_id", "gene_id")])
> txi <- tximport(files, type = "kallisto", tx2gene = tx2gene, reader = read_tsv)
reading in files
1
Parsed with column specification:
cols(
  target_id = col_character(),
  length = col_integer(),
  eff_length = col_double(),
  est_counts = col_double(),
  tpm = col_double()
)
Error in summarizeToGene(txi, tx2gene, ignoreTxVersion, countsFromAbundance) :
  None of the transcripts in the quantification files are present
  in the first column of tx2gene. Check to see that you are using
  the same annotation for both.

sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.2 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=it_IT.UTF-8
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=it_IT.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=it_IT.UTF-8       LC_NAME=C                  LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
 [1] readr_1.1.0                tximport_1.0.3         EnsDb.Hsapiens.v86_2.1.0
 [4] BiocInstaller_1.22.3       ensembldb_1.4.7        GenomicFeatures_1.24.5
 [7] AnnotationDbi_1.34.4       tximportData_1.0.2     VariantAnnotation_1.18.7
[10] Rsamtools_1.24.0           Biostrings_2.40.2      XVector_0.12.1
[13] SummarizedExperiment_1.2.3 Biobase_2.32.0         GenomicRanges_1.24.3
[16] GenomeInfoDb_1.8.7         IRanges_2.6.1          S4Vectors_0.10.3
[19] BiocGenerics_0.18.0        circlize_0.3.9

loaded via a namespace (and not attached):
 [1] shape_1.4.2                   colorspace_1.3-2   htmltools_0.3.5
 [4] rtracklayer_1.32.2            yaml_2.1.14        base64enc_0.1-3
 [7] interactiveDisplayBase_1.10.3 XML_3.98-1.5       DBI_0.6-1
[10] BiocParallel_1.6.6            plyr_1.8.4         stringr_1.2.0
[13] zlibbioc_1.18.0               munsell_0.4.3      gtable_0.2.0
[16] GlobalOptions_0.0.10          QoRTs_1.1.8        memoise_1.0.0
[19] evaluate_0.10                 knitr_1.15.1       biomaRt_2.28.0
[22] httpuv_1.3.3                  Rcpp_0.12.9        xtable_1.8-2
[25] backports_1.0.5               scales_0.4.1       BSgenome_1.40.1
[28] jsonlite_1.2                  mime_0.5           AnnotationHub_2.4.2
[31] hms_0.3                       digest_0.6.12      stringi_1.1.2
[34] dplyr_0.5.0                   shiny_1.0.0        grid_3.3.2
[37] rprojroot_1.2                 tools_3.3.2        bitops_1.0-6
[40] magrittr_1.5                  tibble_1.2         RCurl_1.95-4.8
[43] lazyeval_0.2.0                RSQLite_1.1-2      assertthat_0.1
[46] rmarkdown_1.3                 httr_1.2.1         R6_2.2.0
[49] GenomicAlignments_1.8.4
Note that tximport has an argument, ignoreTxVersion (it appears in the summarizeToGene call in your error message), that strips the transcript version number for you. Jarod, see ?tximport
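
To illustrate why the lookup fails, here is a minimal base-R sketch. It assumes the tx_id column taken from the EnsDb package holds unversioned IDs (e.g. ENST00000448914), while kallisto reports versioned ones (ENST00000448914.1). The tx2gene table and its gene IDs below are made-up placeholders, not real annotation.

```r
# Transcript IDs as they appear in abundance.tsv (with a ".N" version suffix)
kallisto_ids <- c("ENST00000448914.1", "ENST00000631435.1", "ENST00000434970.2")

# Hypothetical tx2gene table in the assumed EnsDb style: no version suffix.
# The gene IDs are placeholders for illustration only.
tx2gene <- data.frame(
  tx_id   = c("ENST00000448914", "ENST00000631435", "ENST00000434970"),
  gene_id = c("GENE_A", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# Versioned IDs never match the unversioned first column of tx2gene
n_before <- sum(kallisto_ids %in% tx2gene$tx_id)

# Strip a trailing ".<digits>" version suffix, then match again
stripped_ids <- sub("\\.[0-9]+$", "", kallisto_ids)
n_after <- sum(stripped_ids %in% tx2gene$tx_id)

cat("matches before stripping:", n_before, "\n")
cat("matches after stripping: ", n_after, "\n")
```

With tximport itself the same effect should come from adding ignoreTxVersion = TRUE to the original call, e.g. tximport(files, type = "kallisto", tx2gene = tx2gene, reader = read_tsv, ignoreTxVersion = TRUE). Check ?tximport for how your installed version handles it. Also keep in mind that a release 86 annotation may lack a few transcripts that are present in release 87.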