Surprising behavior for DEXSeq dispersion estimate plot?
Elijah • 0

Hello,

I am running DEXSeq, which I have adapted to look at relative polyA site usage rather than relative exon usage. Upstream of running DEXSeq, I called polyA sites in my dataset and obtained counts for each site.

My concern is that my dispersion estimate plot shows a secondary "cloud" of points around 1e-04:

[image: dispersion estimate plot]

Generally, what I'm wondering is: is this type of shape expected for DEXSeq dispersion estimate plots, or is it indicative of some type of artifact in my upstream data handling?

For a bit more context, I've tried different thresholds for the minimum number of counts for each feature in my dataset, and that doesn't seem to affect the shape of the dispersion estimate plot. There doesn't appear to be any correlation with the lengths of the features (some pA sites near each other are clustered together), with adjusted p-values, or with log fold change values.
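For readers who want to reproduce the thresholding step, here is a minimal base R sketch of filtering features by a minimum total count. The count matrix and the `min_total` cutoff are made-up illustrations, not the values used in this analysis.

```r
# Toy count matrix (made-up numbers): rows = polyA sites, columns = samples
counts <- matrix(c( 0,  1,  0,  2,
                   50, 60, 55, 58,
                    5,  4,  6,  5),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("site1", "site2", "site3"),
                                 paste0("s", 1:4)))

# Hypothetical threshold: keep sites with at least this many reads in total
min_total <- 10

keep     <- rowSums(counts) >= min_total
filtered <- counts[keep, , drop = FALSE]
print(filtered)
```

Re-running the dispersion estimation on `filtered` for a few values of `min_total` is one way to check whether the low cloud depends on low-count features.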

Some specific sites also appear (to my eyes) to have very similar counts but different dispersion estimates, such as:

A: [image], where the dispGeneEst is 2.330017e-05

and

B: [image], where the dispGeneEst is 5.466213e-02.

In fact, the only discernible difference between the two is that the dispGeneIter is 71 for A and 2 for B, and this trend of dispGeneIter being much higher for the sites with dispGeneEsts ~1e-04 seems true for many cases.
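The comparison of iteration counts between the two groups can be sketched in base R as follows. The two rows use the values quoted above for sites A and B; the data frame layout and the `cloud_cutoff` threshold are assumptions made for illustration.

```r
# Values for sites A and B are taken from the post above;
# the data frame layout itself is an assumption.
disp <- data.frame(
  site         = c("A", "B"),
  dispGeneEst  = c(2.330017e-05, 5.466213e-02),
  dispGeneIter = c(71, 2)
)

# Hypothetical cutoff separating the low "cloud" from the main body of points
cloud_cutoff <- 1e-3
disp$in_low_cloud <- disp$dispGeneEst < cloud_cutoff

# Median number of iterations inside and outside the low cloud
iter_by_group <- tapply(disp$dispGeneIter, disp$in_low_cloud, median)
print(iter_by_group)
```

With a real object, I would expect something like `as.data.frame(mcols(dxd))` to provide the `dispGeneEst` and `dispGeneIter` columns, but that is an assumption about where the values are stored.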

EDIT: Here is the dispersion estimate plot colored by the number of iterations run, with red = higher number of iterations and blue = lower number of iterations:

[image: dispersion estimate plot colored by number of iterations]

Is this behavior expected in DEXSeq?

Changing parameters that I thought might affect the MLE, such as setting niter to 20 or maxit to 500, does not have any significant effect on the plot either.

Thank you for any advice!

Code run pasted below:


  library(DEXSeq)

  # Full and reduced model formulas; "exon" here stands for a polyA site
  formulaFullModel    <- ~ sample + exon + type:exon + condition:exon
  formulaReducedModel <- ~ sample + exon + type:exon  # not used in the steps shown

  print('Set up for DEXSeq analysis')
  dxd <- DEXSeqDataSet(countData, sampleInfo,
                       design = formulaFullModel,
                       featureID = featureID, groupID = GeneID)

  print('Carry out DEXSeq analysis main steps')
  dxd <- estimateSizeFactors(dxd)
  dxd <- estimateDispersions(dxd, formula = formulaFullModel)

  # Plot per-feature dispersion estimates against mean normalized counts
  plotDispEsts(dxd)

R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
DESeq2 DEXSeq
@mikelove
The red points are just slowly moving toward -Inf (on the log-dispersion scale, i.e. toward a dispersion of zero). They tend to have variance < mean (if it were a simple design and you could just look at the marginal variance of counts). It's not a concern that those points are there.
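To illustrate the variance < mean point, here is a small base R sketch. It is a simplification and not what DEXSeq or DESeq2 actually computes: it assumes a single group with the mean fixed at the sample mean and uses the plain negative binomial log-likelihood, and the counts are made up to be deliberately tight.

```r
# Toy counts for one feature across six samples (made-up, variance < mean)
y  <- c(98, 101, 100, 99, 102, 100)
mu <- mean(y)

# Method-of-moments dispersion from var = mu + alpha * mu^2;
# this comes out negative when the variance is below the mean
alpha_mom <- (var(y) - mu) / mu^2

# Negative binomial log-likelihood with the mean held fixed,
# as a function of the dispersion alpha (size = 1 / alpha)
nb_loglik <- function(alpha) {
  sum(dnbinom(y, mu = mu, size = 1 / alpha, log = TRUE))
}

# Evaluate on a grid of decreasing dispersions: 1, 0.1, ..., 1e-06
alphas <- 10^seq(0, -6)
loglik <- sapply(alphas, nb_loglik)
print(data.frame(alpha = alphas, loglik = loglik))
```

On this grid the log-likelihood keeps rising as alpha shrinks, so an optimizer working on log(alpha) has no interior maximum to stop at and just keeps stepping toward -Inf until it hits an iteration limit or a lower bound, which would fit the high dispGeneIter values seen for the low cloud.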

Thank you so much for your quick reply! I was worried that some underlying technical artifact or unusual behavior was causing the pattern seen with the red points. I appreciate your clarification that it's not concerning.
