My samples have a pronounced, sample-specific length bias, which I am correcting with EDASeq. As shown in the EDASeq vignette (sections 5.1 and 6.1), I leave the count data unchanged and instead add the offset to the DGEList object created with edgeR. I would like to pass this DGEList to fry() for gene set testing, but I cannot figure out whether fry() takes the offset into account.
Would fry() use the offset to normalize the counts in the following code?
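library(EDASeq)
library(edgeR)
# EDASeq normalization with offset (counts are left unchanged):
dataOffset <- withinLaneNormalization(data, "length", which = "full", offset = TRUE)
dataOffset <- betweenLaneNormalization(dataOffset, which = "full", offset = TRUE)
# edgeR & fry (design, index and contrast are assumed to be defined beforehand):
y <- DGEList(counts = counts(dataOffset), group = pData(dataOffset)$conditions)
# EDASeq and edgeR use opposite sign conventions for the offset, hence the minus
y$offset <- -offst(dataOffset)
y <- estimateDisp(y, design)
fr <- fry(y, index, design, contrast)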
Or should I rather do as follows?
# EDASeq normalization with changed counts:
dataWithin <- withinLaneNormalization(data, "length", which = "full", offset = FALSE)
dataNorm <- betweenLaneNormalization(dataWithin, which = "full", offset = FALSE)
# edgeR & fry:
y <- DGEList(counts = counts(dataNorm), group = pData(dataNorm)$conditions)
y <- estimateDisp(y, design)
fr <- fry(y, index, design, contrast)
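One way to probe the question empirically, independent of my real data, might be to run fry() on the same counts with and without an offset matrix and compare the results. The following is only a rough sketch on simulated data: the counts, the offset matrix (offset_mat) and the gene set (setA) are all made up for illustration and are not meant to resemble EDASeq output.

```r
library(edgeR)

set.seed(42)
ngenes <- 500
nsamples <- 6

# Simulated negative binomial counts (hypothetical data, no real signal)
counts <- matrix(rnbinom(ngenes * nsamples, mu = 100, size = 5),
                 nrow = ngenes, ncol = nsamples)
rownames(counts) <- paste0("gene", seq_len(ngenes))

group <- factor(rep(c("ctrl", "trt"), each = 3))
design <- model.matrix(~ group)
index <- list(setA = 1:40)  # hypothetical gene set

# Object without a user-supplied offset
y_plain <- DGEList(counts = counts, group = group)

# Same counts, plus a made-up gene- and sample-specific offset matrix
# (log library size with random noise added on the log scale)
log_lib <- log(y_plain$samples$lib.size)
offset_mat <- matrix(log_lib, nrow = ngenes, ncol = nsamples, byrow = TRUE) +
  matrix(rnorm(ngenes * nsamples, sd = 0.3), nrow = ngenes, ncol = nsamples)
y_offset <- y_plain
y_offset$offset <- offset_mat

y_plain <- estimateDisp(y_plain, design)
y_offset <- estimateDisp(y_offset, design)

fry_plain <- fry(y_plain, index = index, design = design, contrast = 2)
fry_offset <- fry(y_offset, index = index, design = design, contrast = 2)

# If the two p-values differ, the offset influenced the test
print(c(plain = fry_plain$PValue, offset = fry_offset$PValue))
```

If the two p-values came out identical, that would suggest the offset is being ignored somewhere along the way; a difference would suggest it is used, although part of that difference could also come from estimateDisp() seeing the offset.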
I see, thanks a lot!