Hi,
With count data from a single-cell RNA seq experiment (even after filtering to exclude genes with low and very high count numbers) the data range returned from a DESeq2 rlog transformation appears dependent upon the number of samples:
Presumably this is not the expected behaviour of the transformation? (I expect the range to mimic that of a log2(n+1) transform.)
The effect on a subsequent PCA is obvious:
The VST function in DESeq2 does behave as expected (and the transformed data perform reasonably in downstream analyses) but it would be great to be able to use rlog as in this case the size factor DR > 4 (it's ~12),
Thanks for any help,
Steve
P.S. in the plots "log2" indicates a log2(n+1) transform.
Was the log2 transformation performed on normalized or raw counts?