Hi,
I have to analyze some TnSeq data. Since I am quite new to this kind of data, I would like to post a couple of questions here, one more “conceptual” than the other. I hope this is the right forum for my questions.
1- Why do we tend to model TnSeq data as zero-inflated negative binomial (ZINB)? What makes it so different from RNASeq? I understand that in TnSeq there are loads of zero counts and, crucially, we do not know where those zeros come from, i.e. we do not know whether a zero arises because (i) a given TA site has no insertion in the library or (ii) the gene is “essential” and, therefore, mutants with insertions in that gene have not survived. In RNASeq we also have loads of zeros, and we do not know whether that is because the gene is not expressed or because the sample has not been sequenced deeply enough. So, to be honest, I cannot see the difference.
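To make question 1 concrete, here is how I understand the ZINB model, as a minimal sketch in plain Python. The function and parameter names (nb_pmf, zinb_pmf, mu, r, pi) are my own and are not taken from TRANSIT or any other package. The idea is that a count is either a "structural" zero with probability pi, or a draw from an ordinary negative binomial with mean mu and size r, which can itself produce zeros.

```python
import math

def nb_pmf(k, mu, r):
    """Negative binomial pmf with mean mu > 0 and size (inverse dispersion) r > 0."""
    log_coef = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    log_prob = r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu))
    return math.exp(log_coef + log_prob)

def zinb_pmf(k, mu, r, pi):
    """Zero-inflated NB: extra point mass pi at zero, NB scaled by (1 - pi)."""
    structural_zero = pi if k == 0 else 0.0
    return structural_zero + (1.0 - pi) * nb_pmf(k, mu, r)

# Illustrative (made-up) parameters: mean 50, size 2, 30% structural zeros.
mu, r, pi = 50.0, 2.0, 0.3
print("P(count = 0) under NB:  ", nb_pmf(0, mu, r))
print("P(count = 0) under ZINB:", zinb_pmf(0, mu, r, pi))
```

With pi = 0 this reduces to the plain negative binomial, which (as far as I understand) is what DESeq2 and edgeR assume, so my question is really why pi > 0 is considered necessary for TnSeq but not for RNASeq.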
2- Regarding practical issues: I know there is TRANSIT in Python to analyze TnSeq data, including multifactorial designs (which is my case). However, I have not seen similar tools in R. Am I wrong? Could DESeq2 or edgeR be used? Do they provide, for instance, appropriate normalization methods for this type of data?
Thanks a lot for any help or hints on those two questions. Best regards, David R.