Hi All,
I am struggling to import kallisto output with tximport so that I can then use DESeq2 for gene-level differential expression. I just reinstalled R, Bioconductor, tximport, and tximportData, because some previous posts reported that this solved similar issues. I can't figure out why the code below works for salmon but not for kallisto. For my own data I only have kallisto output, and I get the same result there: file.exists() returns FALSE, as if the files were not visible. Please see the code and output below.
sessionInfo()
library(tximportData)
library(tximport)
dir <- system.file("extdata", package = "tximportData")
list.files(dir)
[1] "alevin" "cufflinks" "kallisto"
[4] "kallisto_boot" "refseq" "rsem"
[7] "sailfish" "salmon" "salmon_dm"
[10] "salmon_ec" "salmon_gibbs" "samples.txt"
[13] "samples_extended.txt" "tx2gene.csv" "tx2gene.ensembl.v87.csv"
[16] "tx2gene.gencode.v27.csv" "tx2gene_alevin.tsv"
samples <- read.table(file.path(dir, "samples.txt"), header = TRUE)
samples
  pop center                assay    sample experiment       run
1 TSI  UNIGE NA20503.1.M_111124_5 ERS185497  ERX163094 ERR188297
2 TSI  UNIGE NA20504.1.M_111124_7 ERS185242  ERX162972 ERR188088
3 TSI  UNIGE NA20505.1.M_111124_6 ERS185048  ERX163009 ERR188329
4 TSI  UNIGE NA20507.1.M_111124_7 ERS185412  ERX163158 ERR188288
5 TSI  UNIGE NA20508.1.M_111124_2 ERS185362  ERX163159 ERR188021
6 TSI  UNIGE NA20514.1.M_111124_4 ERS185217  ERX163062 ERR188356
files <- file.path(dir, "salmon", samples$run, "quant.sf.gz")
names(files) <- paste0("sample", 1:6)
all(file.exists(files))
[1] TRUE
files <- file.path(dir, "kallisto", samples$run, "abundance.tsv")
names(files) <- paste0("sample", 1:6)
all(file.exists(files))
[1] FALSE
list.files(dir, recursive = TRUE) shows all the kallisto files are present.
Any help is highly appreciated. Thank you.
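When file.exists() returns FALSE even though the directory listing looks right, one way to narrow it down is to compare each expected path with what is actually in its parent folder. The sketch below is only an illustration: list_actual_files is a hypothetical helper (not part of tximport), and the temporary folder is a stand-in built on the assumption that the abundance table is stored gzip-compressed, which should be checked against the real listing.

```r
# Hypothetical helper, not part of tximport: for every expected path that is
# missing, return the file names actually present in its parent directory.
list_actual_files <- function(files) {
  missing <- files[!file.exists(files)]
  lapply(missing, function(f) list.files(dirname(f)))
}

# Stand-in directory tree (an assumption for illustration, built in tempdir()):
# one sample folder holding a gzip-compressed abundance table.
demo_dir <- file.path(tempdir(), "kallisto_demo", "ERR188297")
dir.create(demo_dir, recursive = TRUE, showWarnings = FALSE)
con <- gzfile(file.path(demo_dir, "abundance.tsv.gz"), "w")
writeLines("target_id\tlength\teff_length\test_counts\ttpm", con)
close(con)

# The path as it would be built with the uncompressed file name.
expected <- c(sample1 = file.path(demo_dir, "abundance.tsv"))
file.exists(expected)      # the uncompressed name is not on disk

# List what really sits next to the missing path.
found <- list_actual_files(expected)
found$sample1
```

With the objects from the post, the equivalent call would be list_actual_files(files) right after building files for the kallisto folder. If the listing shows a different name or extension than the one passed to file.path(), using that exact name is the likely fix.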
Hi James,
Thanks a million!!
I very much appreciate your time and help. The issue was indeed the "don't do that" part. Your code fixed it: it works perfectly, and I have just managed to adapt it to process my own data.
Thank you,
Anna