We have the results of an RNA-seq experiment with 4 time points and 3 biological replicates per time point, and we are doing pairwise comparisons between the time points. We notice that the number of outliers is the same for every comparison. This, together with the fact that there is a single matrix of Cook's distances, leads us to understand that the Cook's distance is calculated only once for each sample and gene, and that a gene is treated as an outlier in all comparisons if its Cook's distance is too large for any one of the samples. This means that a gene may be flagged as an outlier in a specific comparison even if none of the outlying samples is part of that comparison. It also means that a gene may be called differentially expressed, with a passing padj, when the difference is caused entirely by a single extreme data point, as long as similar values are found at other time points.
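To make our understanding concrete, here is a toy sketch in Python of the flagging logic as we read it. It is not DESeq2 code: the gene names, sample labels, Cook's distance values, and the cutoff of 7.0 are all made up for illustration. It also reuses the same distances for the per-pair check, whereas refitting on a subset of samples would recompute them.

```python
# Toy illustration of our understanding; this is NOT DESeq2 code.
# Gene names, sample labels, distances, and the cutoff are all hypothetical.

TIME_OF_SAMPLE = {
    "t1_a": "T1", "t1_b": "T1", "t1_c": "T1",
    "t2_a": "T2", "t2_b": "T2", "t2_c": "T2",
    "t3_a": "T3", "t3_b": "T3", "t3_c": "T3",
    "t4_a": "T4", "t4_b": "T4", "t4_c": "T4",
}
SAMPLES = list(TIME_OF_SAMPLE)

# A single matrix of Cook's distances: one row per gene, one value per sample.
COOKS = {
    # geneA has one extreme sample, and it belongs to T4.
    "geneA": [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.2, 0.1, 0.3, 0.2, 9.5, 0.1],
    # geneB has no extreme sample at all.
    "geneB": [0.2, 0.1, 0.3, 0.2, 0.1, 0.2, 0.3, 0.1, 0.2, 0.1, 0.2, 0.3],
}

CUTOFF = 7.0  # hypothetical threshold, chosen only for this example


def flagged_globally(gene):
    """Flag the gene if any sample exceeds the cutoff, whatever the comparison."""
    return max(COOKS[gene]) > CUTOFF


def flagged_within_pair(gene, pair):
    """Flag the gene only if a sample from one of the two compared time points exceeds the cutoff."""
    values = [
        distance
        for sample, distance in zip(SAMPLES, COOKS[gene])
        if TIME_OF_SAMPLE[sample] in pair
    ]
    return max(values) > CUTOFF


if __name__ == "__main__":
    for pair in [("T1", "T2"), ("T3", "T4")]:
        for gene in COOKS:
            print(gene, pair, flagged_globally(gene), flagged_within_pair(gene, pair))
```

In this sketch the global rule flags geneA for the T1 vs T2 comparison even though its only extreme sample is in T4, which is the behaviour we think we are seeing.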
We assume that the way to work around this would be to split the input data into a separate object for each comparison, so that each comparison is analysed using only the samples from its two time points. Would this make sense? Are there any drawbacks to this that we should be aware of?