Hello there, this question is a bit off topic.
I am currently using DESeq2 to normalize 16S microbiome data, as advised several times in the recent literature. I have 16S data from animals and from water filters, and I want to "subtract" the water-filter signal from the animal data as background noise reduction. As a first step, I do this on relative-abundance-transformed data by simply selecting ASVs that are present in the animals but not in the water, and so on.
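To make the presence/absence step concrete, here is a minimal sketch in Python (standard library only). The function names, the `min_rel_abundance` threshold, and the toy counts are all assumptions made up for illustration; they are not taken from DESeq2 or any other package.

```python
def relative_abundance(counts):
    """Convert one sample's raw counts {asv: count} to proportions."""
    total = sum(counts.values())
    if total == 0:
        return {asv: 0.0 for asv in counts}
    return {asv: c / total for asv, c in counts.items()}


def partition_asvs(animal_samples, water_samples, min_rel_abundance=0.0):
    """Split ASVs into animal-only, water-only, and shared sets.

    An ASV counts as "present" in a group if its relative abundance
    exceeds min_rel_abundance in at least one sample of that group.
    (This presence rule is an assumption; other rules are possible.)
    """
    def present(samples):
        found = set()
        for counts in samples:
            for asv, rel in relative_abundance(counts).items():
                if rel > min_rel_abundance:
                    found.add(asv)
        return found

    animal = present(animal_samples)
    water = present(water_samples)
    return {
        "animal_only": animal - water,
        "water_only": water - animal,
        "shared": animal & water,
    }


# Toy counts, invented purely for illustration.
animal_samples = [
    {"ASV1": 50, "ASV2": 30, "ASV3": 0, "ASV4": 20},
    {"ASV1": 10, "ASV2": 80, "ASV3": 0, "ASV4": 10},
]
water_samples = [
    {"ASV1": 0, "ASV2": 5, "ASV3": 90, "ASV4": 5},
    {"ASV1": 0, "ASV2": 0, "ASV3": 70, "ASV4": 30},
]

groups = partition_asvs(animal_samples, water_samples)
```

In this toy example `ASV1` ends up animal-only, `ASV3` water-only, and `ASV2` and `ASV4` are shared, so only the shared ASVs would go on to the testing step.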
In the next step, many ASVs remain that occur in both groups. For these I want to use significance tests to assign each ASV to either the water filters or the animals, using non-parametric Wilcoxon rank-sum tests.
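For a single shared ASV, the rank-sum test could be sketched as follows. This is a hand-written illustration using the normal approximation with midranks for ties and no continuity correction; the function name `rank_sum_test` and the example values are assumptions. With only a handful of samples per group the approximation is rough, and an exact test (as statistical packages typically offer for small samples without ties) would be preferable.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Normal approximation with a tie-corrected variance.
    Returns (U statistic for x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    values = list(x) + list(y)
    order = sorted(range(n), key=lambda i: values[i])

    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the block of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all values identical: no evidence of a difference
        return u1, 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, p_value


# Invented normalized abundances of one ASV in each group.
animal_values = [0.30, 0.25, 0.40, 0.35, 0.28]
water_values = [0.05, 0.10, 0.02, 0.08, 0.04]

u_stat, p_value = rank_sum_test(animal_values, water_values)
```

Running this row-wise means calling the function once per shared ASV and collecting the p-values.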
My concern with this whole approach is that the compositional data should be reasonably normalized before such tests.
My question is how DESeq2 and the VST normalization react when the dataset is artificially reduced. In RNA-seq, of course, you wouldn't just start "cutting out" genes you don't like, apart from filtering by counts at the beginning. Can I first subtract from the dataset the ASVs that are obviously present only in water or only in animals, then apply the VST to the remaining dataset and test for significance row-wise, then also remove everything that is not assigned to animals, and finally test the remaining dataset in DESeq2 for differential abundance between the animal groups? Is this approach justifiable?
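One way to see why the reduction matters: median-of-ratios size factors, the normalization idea DESeq2 is built on, are computed across whichever features are in the table, so removing ASVs can change them. The sketch below is a simplified re-implementation of that idea in plain Python, not DESeq2's own code. The function name `size_factors` and the counts are invented for illustration, and the VST's additional dependence on the fitted dispersion trend is not shown.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a features x samples count table.

    counts maps feature name -> list of counts, one entry per sample.
    Features containing a zero are skipped, because their geometric mean
    is zero. Raises statistics.StatisticsError if no feature is usable.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        if min(row) <= 0:
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


# Invented counts: two samples, five ASVs.
full_table = {
    "ASV1": [100, 200],
    "ASV2": [50, 100],
    "ASV3": [10, 80],
    "ASV4": [30, 240],
    "ASV5": [20, 160],
}
# Hypothetical reduced table after removing ASV3 to ASV5.
reduced_table = {k: full_table[k] for k in ("ASV1", "ASV2")}

sf_full = size_factors(full_table)
sf_reduced = size_factors(reduced_table)
```

Here the second sample's size factor drops from about 2.83 on the full table to about 1.41 on the reduced one, so the same raw counts would be scaled differently depending on which ASVs were kept.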
I can't find any literature that describes this procedure.
Dear Michael, thank you very much for your assessment and super fast answer :)