Hi,
I'm looking through the RUVSeq manual, and it seems that the set
object used for RUVg normalization has first been normalized with betweenLaneNormalization
from EDASeq. I tried RUVg normalization both with and without running betweenLaneNormalization
first, and I get different results. So I just wanted to confirm: is it recommended to run betweenLaneNormalization
before RUVg normalization?
Thanks!
Jon
Thanks for the clarification!
If I may ask a related question: our spike-in set (ERCC genes) has a lot of zero counts, and this causes infinite and missing values after betweenLaneNormalization. We can work around this by adding 1 to every count, but I am not sure how this will affect the results, especially for the genes that were zero in the first place. Would you recommend adding 1 to every count?
Error message after betweenLaneNormalization and RUVg normalization:
I would perhaps consider filtering out the spike-ins with a lot of zeros and/or choosing a normalization other than upper-quartile that is more robust to zeros, e.g., TMM or even scran (developed specifically for data with lots of zeros).
Alternatively, you can use RUVg without normalizing the data first. In our experience, it performs slightly worse, but it's still OK. Remember that the first factor usually picks up sequencing depth, so you will probably need to increase your k by 1.
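To make the zero problem concrete, here is a minimal sketch in plain Python. It is not the EDASeq implementation, only a simplified version of upper-quartile scaling under the assumption that each sample is divided by its 75th percentile relative to the mean across samples. The toy count matrix, the function names (quantile, upper_quartile_factors, filter_sparse) and the min_nonzero threshold are all made up for illustration.

```python
def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def upper_quartile_factors(counts):
    """Per-sample upper quartile divided by the mean upper quartile."""
    n_samples = len(counts[0])
    uq = [quantile([row[j] for row in counts], 0.75) for j in range(n_samples)]
    mean_uq = sum(uq) / n_samples
    # If every sample has a zero upper quartile, no factor is defined.
    return [u / mean_uq if mean_uq > 0 else float("nan") for u in uq]


def filter_sparse(counts, min_nonzero):
    """Keep only features with at least min_nonzero non-zero counts."""
    return [row for row in counts if sum(c > 0 for c in row) >= min_nonzero]


# Hypothetical toy data: rows are spike-ins, columns are samples.
spikes = [
    [0, 0, 0, 5],
    [0, 0, 1, 0],
    [12, 0, 9, 20],
    [0, 0, 0, 3],
    [30, 2, 25, 41],
]

factors = upper_quartile_factors(spikes)

# A factor of exactly zero means every count in that sample would be
# divided by zero: 0/0 gives a missing value, anything else gives infinity.
zero_factor_samples = [j for j, f in enumerate(factors) if f == 0]

# Filtering on a minimum number of non-zero samples can remove most rows.
kept = filter_sparse(spikes, min_nonzero=3)

print("scaling factors:", factors)
print("samples with zero factor:", zero_factor_samples)
print("spike-ins kept:", len(kept), "of", len(spikes))
```

In this toy example one sample ends up with a scaling factor of zero, which is the kind of situation that would produce infinite and missing values, and the filter keeps only a small fraction of the spike-ins. Where to set the threshold depends on the data.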
We tried filtering out those spike-ins, but we were left with very few... Thanks for the advice, we will check these options out!