Hello All,
I have RNA-seq data in which a couple of samples are either low quality or look dissimilar to the rest in the sample distance analysis, and they need to be removed. The samples were collected before, during and after a treatment. My question: for the DESeq2 analysis, should I exclude only the problematic samples, or remove all data from the affected subjects? Thanks.
Regards, Guan
It looks like you have 3 time points, but how many samples do you have for each time point? As long as you have at least 3 samples per time point, I think you should be fine.
Please post your entire colData (masked if necessary) so we can better answer your question.
I have posted the entire masked colData below for further evaluation. It covers 10 subjects with 5 time points each. Sample 40 (marked in bold) in S8 is dissimilar to all the other samples in the sample distance analysis. This is in line with what we expected, given that a smaller volume of Sample 40 was carried over during RNA-seq library prep. In this case, should only Sample 40 be excluded from the DESeq2 analysis, or all S8 samples (i.e. 5 samples)? Thanks.
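To make the two options concrete, here is a minimal sketch in plain Python of what each exclusion leaves behind. It does not say which option is statistically preferable; it only counts what remains. The table is a hypothetical stand-in for the masked colData: the time labels `T1`..`T5`, the column names, and the assumption that samples are numbered 1 to 50 in subject order (which happens to place sample 40 in S8) are all illustrative, not taken from the real data.

```python
from collections import Counter

# Assumed time-point labels; the real colData may name them differently.
TIME_POINTS = ["T1", "T2", "T3", "T4", "T5"]


def build_coldata():
    """Build a hypothetical colData: 10 subjects x 5 time points, samples 1..50."""
    rows = []
    sample_id = 1
    for subject_index in range(1, 11):
        for time_point in TIME_POINTS:
            rows.append(
                {"sample": sample_id, "subject": f"S{subject_index}", "time": time_point}
            )
            sample_id += 1
    return rows


def drop_sample(rows, sample_id):
    """Option 1: exclude only the problematic sample."""
    return [row for row in rows if row["sample"] != sample_id]


def drop_subject(rows, subject):
    """Option 2: exclude every sample from the affected subject."""
    return [row for row in rows if row["subject"] != subject]


def samples_per_subject(rows):
    """Count how many time points remain for each subject."""
    return Counter(row["subject"] for row in rows)


coldata = build_coldata()
only_sample_removed = drop_sample(coldata, 40)
whole_subject_removed = drop_subject(coldata, "S8")

print(len(only_sample_removed), samples_per_subject(only_sample_removed)["S8"])
print(len(whole_subject_removed), "S8" in samples_per_subject(whole_subject_removed))
```

Under these assumptions, dropping only Sample 40 keeps 49 samples and leaves S8 with 4 of its 5 time points, while dropping S8 keeps 45 samples and 9 complete subjects. In R, the corresponding step would be subsetting the columns of the dataset object before running the DESeq2 analysis.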