Question

Roast (fry) gene set testing with offset

0


gprezza • 0

@611515a7


Germany

My samples have a pronounced (sample-specific) length bias, which I am correcting by normalizing with EDASeq. As shown in the EDASeq vignette (sections 5.1 and 6.1), I leave the count data unchanged and later add the offsets to the DGEList object created with edgeR. I would like to use this DGEList as input for fry() gene set testing. However, I cannot figure out whether fry takes the offset into account.

Would fry use the offset data to normalize the counts in the following code?


# EDASeq normalization with offset (counts are left unchanged):

dataOffset <- withinLaneNormalization(data, "length", which="full", offset=TRUE)
dataOffset <- betweenLaneNormalization(dataOffset, which="full", offset=TRUE)

# edgeR & fry (design, index and contrast are defined elsewhere):
y <- DGEList(counts=counts(dataOffset), group=pData(dataOffset)$conditions)
# sign flipped, as in the EDASeq vignette
y$offset <- -offst(dataOffset)
y <- estimateDisp(y, design)
fr <- fry(y, index, design, contrast)

Or should I rather do as follows?


# EDASeq normalization with changed counts (no offset stored):

dataWithin <- withinLaneNormalization(data, "length", which="full", offset=FALSE)
dataNorm <- betweenLaneNormalization(dataWithin, which="full", offset=FALSE)

# edgeR & fry (design, index and contrast are defined elsewhere):
y <- DGEList(counts=counts(dataNorm), group=pData(dataNorm)$conditions)
y <- estimateDisp(y, design)
fr <- fry(y, index, design, contrast)

EDASeq edgeR limma • 1.2k views

ADD COMMENT • link 4.0 years ago gprezza • 0

score 1 · Answer 1 · 2020-11-25

1


Gordon Smyth 51k

@gordon-smyth


WEHI, Melbourne, Australia

All edgeR functions that are part of the glm or glmQL pipelines use offsets, including estimateDisp and fry. So your first code block will work as intended.

You can be sure any edgeR function that works on a DGEList and has a design argument will use offsets. The offset matrix is a documented component of a DGEList and hence will be used.
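
One way to check this for yourself is to run fry on the same counts with and without an offset matrix and compare the results; if the offset were ignored, the two outputs would be identical. The following is only a minimal sketch on simulated data: the counts, the random offset matrix `off`, the gene set `set1` and the two-group design are all made up for illustration and are not taken from the question.

```r
library(edgeR)

set.seed(1)
ngenes <- 200
nsamples <- 6

# Simulated counts (illustrative only)
counts <- matrix(rnbinom(ngenes * nsamples, mu = 50, size = 10),
                 nrow = ngenes, ncol = nsamples)
rownames(counts) <- paste0("g", seq_len(ngenes))

group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~group)

# Hypothetical gene set: the first 20 genes
index <- list(set1 = 1:20)

# DGEList without an explicit offset (library sizes are used instead)
y_plain <- DGEList(counts = counts, group = group)

# Same data with a made-up gene- and sample-specific offset matrix
# (natural log scale, same dimensions as the count matrix)
off <- matrix(log(y_plain$samples$lib.size), ngenes, nsamples, byrow = TRUE) +
  matrix(rnorm(ngenes * nsamples, sd = 0.3), ngenes, nsamples)
y_off <- y_plain
y_off$offset <- off

y_plain <- estimateDisp(y_plain, design)
y_off <- estimateDisp(y_off, design)

# contrast = 2 tests the second coefficient of the design (group B vs A)
fr_plain <- fry(y_plain, index, design, contrast = 2)
fr_offset <- fry(y_off, index, design, contrast = 2)

print(fr_plain)
print(fr_offset)
```

If the two printed tables differ, the offset stored in the DGEList has been picked up by fry, in line with the answer above.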