I was hoping to use the quantro and qsmooth methods to normalize my RNA-seq data. My problem is that the sequencing depth (total read counts) is not consistent across groups (e.g. the clean base counts of two of the groups are about 5G and 7G, respectively), and the quantro results show global differences between these groups. After qsmooth, the global differences between groups are preserved, and they still reflect the differences in total read counts. So my question is: in this case, does an extra total read count normalization across groups need to be done before using quantro or qsmooth in the RNA-seq data analysis? Thanks a lot!
Do you think differences in the sequencing depth are related to biological or technical variation?
If it's technical, `quantro` actually performs an ANOVA to test if the medians of the distributions are different across groups. There is a parameter in `quantro` (`useMedianNormalized = TRUE`) to make the medians the same if the medians are different (https://github.com/stephaniehicks/quantro/blob/3fe77cae1c7f8ede0b1a1dda52e17981fd79fbcf/R/AllClasses.R#L196). The default is `TRUE`, but you can set this to `FALSE` if you suspect the difference is due to biological variation.

If it is biological, you can also try using control genes (e.g. https://support.bioconductor.org/p/113059/#113061).
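To make the idea concrete, here is a rough Python sketch (standard library only) of the two steps described above: an ANOVA-style F statistic on the per-sample medians, and a median-centering step that removes median differences. This is not quantro's code. The function names `anova_f_on_medians` and `median_center`, the toy values, and the choice to center by subtracting each sample's median are all assumptions made for illustration.

```python
from statistics import mean, median


def anova_f_on_medians(samples, groups):
    """One-way ANOVA F statistic computed on per-sample medians (illustrative)."""
    meds = [median(s) for s in samples]
    by_group = {}
    for m, g in zip(meds, groups):
        by_group.setdefault(g, []).append(m)
    grand = mean(meds)
    k, n = len(by_group), len(meds)
    # Variation of group means around the grand mean of the medians
    ss_between = sum(len(v) * (mean(v) - grand) ** 2 for v in by_group.values())
    # Variation of sample medians around their own group mean
    ss_within = sum((x - mean(v)) ** 2 for v in by_group.values() for x in v)
    if ss_within == 0:
        return 0.0 if ss_between == 0 else float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def median_center(samples):
    """Shift every sample so all medians equal the mean of the original medians.

    Assumption: centering by subtraction, as one might do on log-scale values.
    """
    meds = [median(s) for s in samples]
    target = mean(meds)
    return [[x - m + target for x in s] for s, m in zip(samples, meds)]


# Toy log-scale expression values: two samples per group (hypothetical data)
samples = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.2, 2.1, 3.1, 4.0, 5.2],
    [3.0, 4.0, 5.0, 6.0, 7.0],
    [3.1, 4.2, 5.0, 6.1, 7.3],
]
groups = ["A", "A", "B", "B"]

f_before = anova_f_on_medians(samples, groups)
centered = median_center(samples)
f_after = anova_f_on_medians(centered, groups)
print("F on medians before centering:", f_before)
print("F on medians after centering:", f_after)
```

In this toy case the group medians differ a lot before centering and are identical afterwards, so any remaining difference between groups would come from the shape of the distributions rather than from a global shift. If the shift is technical (e.g. sequencing depth), removing it first is reasonable; if it is biological, centering would hide a real signal.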