Hi,
I want to use `mfuzz` for clustering of RNA-seq results. The problem is that I can't find an easy way to average biological replicates. I assumed there would be a simple function for averaging replicates by condition, using the info from `phenoData` for an `ExpressionSet` or the sample info in the case of a `DESeqDataSet`.
I use `DESeq2`, so I have a `DESeqDataSet` object. I've prepared an `ExpressionSet` object with standardised FPKM values for `mfuzz`, but I've realised that I need to average the values for each condition.
I know I could use `aggregate` to average the data in a data frame, but that is error-prone when used on an object with many columns.
Thank you,
I've used limma's method and it works. However, I had to change `assay` to `exprs`.
Why have you used `ntf <- normTransform(dds)` before averaging? I assumed that I'd standardize the data in `mfuzz` using the `standardise` function.

Use whatever you feel is correct, I just put together an example for the sake of demonstration. It's a random example. Note that before standardization, you would still typically normalize and log2-transform your data.
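
For readers who want to see the averaging step itself, here is a minimal base-R sketch. It is not the limma-based code referred to above; the matrix values, sample names and the `condition` vector are made up for illustration, and the input is assumed to be already normalized and log2-transformed.

```r
# Toy log2-scale expression matrix: 3 genes x 4 samples (made-up values)
mat <- matrix(
  c(1, 3, 5, 7,
    2, 2, 6, 6,
    0, 4, 8, 8),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3),
                  c("ctrl_1", "ctrl_2", "trt_1", "trt_2"))
)

# Condition label for each column, in column order
# (for a DESeqDataSet this would typically come from colData)
condition <- factor(c("ctrl", "ctrl", "trt", "trt"))

# One averaged column per condition level
avg <- sapply(levels(condition), function(lv) {
  rowMeans(mat[, condition == lv, drop = FALSE])
})

avg
```

Because the columns are selected by matching `condition`, the result does not depend on how many samples there are or on the order of the columns, which avoids the hand-indexing that makes `aggregate` on a wide data frame error-prone.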
ATpoint, thank you for directing my attention to normalization. I assumed that using FPKM is OK, as it was mentioned on the `Mfuzz` page. Do you think I can use FPKM? Or do you think the approach presented at sthda in "Normalization using DESeq2 (size factors)" is OK for clustering?
What you usually do is first normalize the data with respect to library size and composition (that is what the size factors do), then log2-transform, and then Z-scale (aka standardize). In DESeq2, the `vst` function does the first two steps plus some magic extra that is beneficial for downstream analysis, so I would go with that. Alternatively, log2-transformed normalized counts work as well. That is what `normTransform` does.

Thank you very much for this clear explanation. Since my last post I've read several posts about normalization but haven't found such clear information.
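
To make the three steps above concrete (normalize, log2-transform, Z-scale), here is a rough base-R sketch on a made-up count matrix. It is only an illustration of the idea: the size factors are computed with a simplified median-of-ratios calculation, the pseudocount of 1 is an assumption chosen to mimic log2(n + 1), and none of this replaces `vst` or `normTransform` on a real `DESeqDataSet`.

```r
# Toy raw count matrix: 4 genes x 4 samples (made-up values);
# s2 and s4 are sequenced twice as deep as s1 and s3
counts <- matrix(
  c( 10,  20,  40,  80,
    100, 200, 300, 600,
     50, 100,  25,  50,
      8,  16,   4,   8),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("s", 1:4))
)

# 1. Size factors: median ratio of each sample to the per-gene
#    geometric mean (simplified; assumes no zero counts)
log_geo_mean <- rowMeans(log(counts))
sf <- apply(counts, 2, function(cnt) {
  exp(median(log(cnt) - log_geo_mean))
})

# Divide each sample (column) by its size factor
norm_counts <- sweep(counts, 2, sf, "/")

# 2. log2-transform with a pseudocount of 1
log_norm <- log2(norm_counts + 1)

# 3. Z-scale each gene (row): mean 0, standard deviation 1
z <- t(scale(t(log_norm)))
```

In this toy example the doubled sequencing depth of `s2` and `s4` is absorbed by the size factors, so replicates end up with the same normalized values. On real data, replicates would be averaged after step 2 and before the Z-scaling, which `mfuzz` can do with `standardise`.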