Question

Converting scater-normalized UMI data to Monocle CellDataSet

I had a question regarding the appropriate distribution for modeling scater-normalized UMI data. I normalized the UMI dataset (a SingleCellExperiment) to the ERCC spike-ins to capture total differences in RNA content. I then used the convertTo() function in scran to export to Monocle CellDataSet so I could perform some additional analysis there. As I understand it, scater returns counts divided by calculated size factors (not log-transformed). I wasn't sure if the convertTo() function specified a Monocle expressionFamily to use as the distribution (it looked like it was defaulting to negative binomial) - but should I be treating this data as a tobit()? Additionally, I assumed that once in Monocle I shouldn't be recalculating size factors but rather using the size factors calculated by scater?

Thanks!

Tags: scater, scran, umi, monocle

Updated 6.9 years ago by Aaron Lun • written 6.9 years ago by supremerulersuraj

score 0 · Answer 1 · 2018-04-24

I haven't used monocle for a while, so I can only comment on what convertTo does.

Yes, it returns non-log-transformed normalized expression values.
No, it does not specify an expression family. From reading the documentation, my best guess is to use negbinomial(), as the normalized counts are on the same scale as the raw counts. It would be better, though, if we could pass the counts and size factors directly to the CellDataSet; after perusing the function code in monocle, this may be possible, and would make my life easier, actually.
There is no need to recompute size factors if you've already computed them.
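To make the "same scale" point concrete, here is a minimal base R sketch of what dividing counts by size factors does. The toy matrix and the size factor values are invented for illustration and are not taken from the poster's data; the only thing assumed from the thread is that the normalized values are counts divided by per-cell size factors, without a log transform.

```r
# Toy UMI counts: 3 genes x 3 cells (made-up numbers, for illustration only).
counts <- matrix(
  c(10, 0, 4,
    20, 2, 8,
     5, 1, 2),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
)

# Hypothetical per-cell size factors, e.g. as derived from spike-ins.
sf <- c(cell1 = 1.0, cell2 = 2.0, cell3 = 0.5)

# Normalized (not log-transformed) values: divide each column by its size factor.
norm <- sweep(counts, 2, sf, "/")

# Multiplying back by the size factors recovers the raw counts, so keeping
# the raw counts plus the size factors loses no information.
recovered <- sweep(norm, 2, sf, "*")
```

Note that the normalized values are generally no longer integers, which is one reason passing raw counts together with size factors is the more natural input for a count-based family.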

Edit: I have now modified the code so that in the next release, convertTo will return a CellDataSet containing raw counts with size factors. This is probably the appropriate input for negbinomial().
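For anyone on an older release, a rough sketch of doing the same thing by hand is below: build the CellDataSet from raw counts and then assign the existing size factors rather than re-estimating them. This assumes monocle 2 is installed; the toy counts, the size factor values, and the use of negbinomial.size() (the fixed-variance negative binomial family; negbinomial() is the variant named above) are my own choices for illustration, not what convertTo does internally.

```r
library(monocle)  # monocle 2; also attaches Biobase and VGAM

# Toy raw UMI counts: 3 genes x 3 cells (made-up numbers).
counts <- matrix(
  c(10, 0, 4, 20, 2, 8, 5, 1, 2),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
)

# Hypothetical size factors computed beforehand (e.g. from spike-ins).
sf <- c(cell1 = 1.0, cell2 = 2.0, cell3 = 0.5)

# Minimal cell and gene annotation; row names must match the count matrix.
pd <- new("AnnotatedDataFrame",
          data = data.frame(cell_id = colnames(counts),
                            row.names = colnames(counts)))
fd <- new("AnnotatedDataFrame",
          data = data.frame(gene_short_name = rownames(counts),
                            row.names = rownames(counts)))

# Raw counts go in; a negative binomial family is specified explicitly.
cds <- newCellDataSet(counts, phenoData = pd, featureData = fd,
                      expressionFamily = negbinomial.size())

# Reuse the precomputed size factors instead of calling estimateSizeFactors().
sizeFactors(cds) <- sf
```

After this, downstream monocle steps that rely on size factors should pick up the stored values, so there is no need to call estimateSizeFactors() again.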