Question

VST Failing Error in estimateDispersionsFit(object, quiet = TRUE, fitType)

Mia ▴ 10

@mia-24145

Last seen 4.4 years ago

Hi all!

First post here ever! Never thought I'd be confused enough and not find an answer on the internet for my problem, lol. Anyway!

I am using a public dataset of brain cancer samples and want to create a heatmap of sample-to-sample distances, following the DESeq2 vignette: http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#heatmap-of-the-sample-to-sample-distances

My code is the following:


de = DESeqDataSetFromMatrix(countData = exprs(recountDataCombat), 
                            colData = pData(recountDataCombat),
                            design = formula)

de <- estimateSizeFactors(de)
de <- estimateDispersionsGeneEst(de)
dispersions(de) <- mcols(de)$dispGeneEst
glm_all_nb_combat <- nbinomWaldTest(de)

res <- results(glm_all_nb_combat, name=resultsNames(glm_all_nb_combat)[2]) 
.myMAPlot(res, name=title_figure_2)

# Sample Distances
dds <- glm_all_nb_combat
vsd <- vst(dds, blind=FALSE)

But I get the following error:

Error in estimateDispersionsFit(object, quiet = TRUE, fitType) : 
  all gene-wise dispersion estimates are within 2 orders of magnitude
  from the minimum value, and so the standard curve fitting techniques will not work.
  One can instead use the gene-wise estimates as final estimates:
  dds <- estimateDispersionsGeneEst(dds)
  dispersions(dds) <- mcols(dds)$dispGeneEst
  ...then continue with testing using nbinomWaldTest or nbinomLRT

I am confused by this, since I did follow this advice in my code:

de <- estimateSizeFactors(de)
de <- estimateDispersionsGeneEst(de)
dispersions(de) <- mcols(de)$dispGeneEst
glm_all_nb_combat <- nbinomWaldTest(de)

I want to avoid rlog at all costs since I have around 600 samples. My questions are:

Does anyone know a way around this?

If not, does anyone know where I can find the source code?

Should I not be using VST to normalize these samples, given that the dispersion estimates are so low?

Thank you in advance! :D Mia Altieri

vst varianceStabilizingTransformation DESeq2 • 2.3k views

ADD COMMENT • link 4.4 years ago Mia ▴ 10

score 3 · Accepted Answer · 2020-11-13

Michael Love 43k

@mikelove

Last seen 1 day ago

United States

The message above is saying that the data you are looking at is close to Poisson (it's not so easy to interpret, but basically all the dispersion estimates are close to 1e-8).
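To see whether your own object is in this situation, a quick check along these lines may help. This is only a sketch: `summarize_dispersions` is a hypothetical helper, and the 1e-8 floor and the factor of 100 are assumptions read off the error message ("within 2 orders of magnitude from the minimum value"), not taken from the DESeq2 source. With a real dataset you would pass `mcols(dds)$dispGeneEst` in place of the toy vector.

```r
# Hypothetical helper: counts how many gene-wise dispersion estimates sit
# clearly above the floor. The floor (1e-8) and the factor of 100
# ("2 orders of magnitude") are assumptions based on the error message.
summarize_dispersions <- function(disp_gene_est, min_disp = 1e-8) {
  disp <- disp_gene_est[!is.na(disp_gene_est)]  # drop genes with no estimate
  usable <- disp > 100 * min_disp               # estimates well above the floor
  list(
    n_genes = length(disp),
    n_usable = sum(usable),
    range = range(disp),
    near_poisson = sum(usable) == 0             # nothing left to fit a trend on
  )
}

# Toy values standing in for mcols(dds)$dispGeneEst
toy_disp <- c(1e-8, 2e-8, 5e-8, NA, 1e-7)
check <- summarize_dispersions(toy_disp)
print(check$near_poisson)
```

If `n_usable` comes back as zero, there are no estimates left to fit a dispersion trend on, which would be consistent with the error above.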

I'm not sure why you have near-Poisson data, but in that case the shifted logarithm is a good approach:

ldat <- normTransform(dds)
plotPCA(ldat)
...
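For anyone curious what the shifted logarithm amounts to, here is a rough base R sketch on simulated data. It assumes the transform is log2 of size-factor-normalized counts plus a pseudocount of 1, which is the default behaviour of `normTransform`; the toy Poisson counts and the `size_factors` helper are made up for illustration, and the helper is a simplified median-of-ratios estimate, not DESeq2's own implementation.

```r
set.seed(1)

# Toy near-Poisson counts: 200 genes x 6 samples (hypothetical data)
counts <- matrix(rpois(200 * 6, lambda = 50), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("sample", 1:6)))

# Simplified median-of-ratios size factors (illustrative helper).
# Genes with any zero count get a non-finite log geometric mean and are skipped.
size_factors <- function(counts) {
  log_counts <- log(counts)
  log_geo_means <- rowMeans(log_counts)
  keep <- is.finite(log_geo_means)
  apply(log_counts[keep, , drop = FALSE], 2,
        function(col) exp(median(col - log_geo_means[keep])))
}

sf <- size_factors(counts)

# Shifted logarithm: log2(normalized count + 1)
norm_counts <- sweep(counts, 2, sf, "/")
shifted_log <- log2(norm_counts + 1)

# Euclidean sample-to-sample distances (samples are columns, so transpose)
sample_dist <- dist(t(shifted_log))
print(round(as.matrix(sample_dist), 2))
```

With a real DESeq2 object, the same distances should be obtainable via `dist(t(assay(ldat)))`, the pattern the vignette uses for its sample distance heatmap.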

ADD COMMENT • link 4.4 years ago Michael Love 43k

Yes! That worked! I see your posts all the time and I think the world of you, thank you so much for helping me!

And yeah, I am not sure why this is happening either. It's odd, because it doesn't happen until after I run ComBat, so I want to compare with other batch correction methods.

ADD REPLY • link 4.4 years ago Mia ▴ 10