Hello again,
We know that you have not recommended filtering the data before running the DESeq function, on the grounds that filtering only affects how fast the function runs. Interestingly, when we applied a filtering strategy based on the percentage of samples with zero read counts in our data, we found that the homogeneity of the data, the distribution, and the normalization all improved. The relationship between the number of genes removed by filtering and the number of DEGs obtained was not linear: the number of DEGs peaked at 71% filtration and then decreased. It seems to us that this strategy has at least removed the experimental error coming from low-count genes that sit at the threshold of detection in mRNA-seq. What would you suggest, and how can we explain these results? Is it really better to proceed without filtering the data?
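
To make the kind of filter we mean concrete, here is a minimal sketch in Python. It is only an illustration, not our exact code: the function name `filter_by_zero_fraction`, the `max_zero_fraction` parameter, and the toy counts are all hypothetical, and it assumes the zero percentage is computed per gene across samples.

```python
def filter_by_zero_fraction(counts, max_zero_fraction):
    """Keep genes whose fraction of zero-count samples is <= max_zero_fraction.

    counts: dict mapping gene name -> list of raw read counts, one per sample.
    (Hypothetical helper for illustration; not part of DESeq.)
    """
    kept = {}
    for gene, row in counts.items():
        # Fraction of samples in which this gene has zero reads.
        zero_fraction = sum(1 for c in row if c == 0) / len(row)
        if zero_fraction <= max_zero_fraction:
            kept[gene] = row
    return kept


# Toy counts, made up for illustration: 4 genes x 4 samples.
toy_counts = {
    "geneA": [10, 12, 9, 15],  # 0% zeros
    "geneB": [0, 3, 0, 1],     # 50% zeros
    "geneC": [0, 0, 0, 2],     # 75% zeros
    "geneD": [0, 0, 0, 0],     # 100% zeros
}

# Keep only genes with zeros in at most half of the samples.
filtered = filter_by_zero_fraction(toy_counts, max_zero_fraction=0.5)
print(sorted(filtered))
```

With these toy counts, only geneA and geneB pass a 50% cutoff. Tightening or loosening `max_zero_fraction` changes how many genes go on to the differential expression step, which is the quantity we varied.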
Thank you