Hello,
in the tximport vignette https://bioconductor.org/packages/devel/bioc/vignettes/tximport/inst/doc/tximport.html#Downstream_DGE_in_Bioconductor, there are instructions on how to combine Salmon transcript quantifications with tximport to generate suitable gene-level count input for DESeq2. I'm wondering whether the gene counts obtained directly from Salmon with the -g option are suitable input as well. Namely, the quant.genes.sf files that Salmon generates with the -g option contain, besides a TPM column, a NumReads column. Are the values in the latter (after casting to integer) appropriate for direct input to DESeq2?
Thanks
Not exactly, because you don't have the offset, which we recommend (see the tximport publication for details on why). That's the point of tximport: to easily and quickly provide the counts plus the appropriate offset. By the way, the offset also accounts for technical biases like GC content (if you use --gcBias when running Salmon).
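For reference, here is a minimal sketch of the tximport route. It assumes one Salmon output directory per sample, a two-column tx2gene table (transcript ID, gene ID), and a sample table with a condition column. The function name import_salmon_for_deseq2 and its arguments are placeholders I made up for illustration, so adapt them to your own setup:

```r
# Sketch only: import_salmon_for_deseq2() and its arguments are hypothetical names.
import_salmon_for_deseq2 <- function(dirs, tx2gene, coldata, design = ~ condition) {
  # dirs: named character vector of Salmon output directories, one per sample
  files <- file.path(dirs, "quant.sf")
  names(files) <- names(dirs)
  stopifnot(all(file.exists(files)))

  # Summarize transcript-level estimates to the gene level. The result keeps
  # the average transcript lengths that DESeq2 uses to build the offset.
  txi <- tximport::tximport(files, type = "salmon", tx2gene = tx2gene)

  # Build the DESeq2 object from the tximport list (counts plus lengths).
  DESeq2::DESeqDataSetFromTximport(txi, colData = coldata, design = design)
}
```

The returned object can then be passed to DESeq() as usual.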
If you want to ignore the offset, which corrects for changes in average transcript length and for technical biases, then yes, you would use that column as the "counts".
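If you do go that way, a rough sketch of collecting the NumReads column into an integer matrix could look like this. It assumes each quant.genes.sf file has already been read into a data frame (for example with read.delim) and has Name and NumReads columns. The helper name numreads_matrix is hypothetical:

```r
# Sketch only: numreads_matrix() is a hypothetical helper, not part of any package.
numreads_matrix <- function(quant_list) {
  # quant_list: named list of data frames, one per sample, each read from a
  # quant.genes.sf file and containing at least Name and NumReads columns
  genes <- quant_list[[1]]$Name
  counts <- sapply(quant_list, function(q) {
    # Align rows on gene name, then round the estimated counts
    round(q$NumReads[match(genes, q$Name)])
  })
  # Force an integer matrix with genes as rows and samples as columns
  matrix(as.integer(counts), nrow = length(genes),
         dimnames = list(genes, names(quant_list)))
}
```

The resulting matrix could then go into DESeq2::DESeqDataSetFromMatrix(countData = ..., colData = ..., design = ...), keeping in mind that this route carries no offset.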
Thanks for the clarification!