DESeq2: Question regarding changes between Bioconductor 3.2 and 3.4 or recent version changes
@daniel-elsner-11729
University of Freiburg, Evolutionary Bi…

Dear Dr. Love, dear DESeq2 users,

Yesterday (24 Oct. 2016) I repeated a differential expression analysis that I had run before. This was after an update from Bioconductor 3.2 to 3.4. I was surprised that the resulting set of differentially expressed genes differed slightly from the previous results, which were generated at the end of August (with a presumably older version of DESeq2).

I am using a multifactor comparison with 24 samples and two states. In one example, a gene that was barely significant (one of two) in a state, with a p-value of about 0.04, is now no longer significant, with a p-value of 0.06. Not all states changed, however.

The input data are identical and the scripts are identical, yet the old outcome is no longer reproducible. I am reasonably certain that I have excluded potential errors on my side by checking all my input files. Besides the computer with the "new" results, I have a laptop that still produces the "old" expression results. Both run Kubuntu 16.04 with R 3.3.1. The only difference is that on the computer with the changed results, I updated Bioconductor to 3.4 (from 3.2) and updated all packages. 3.2 seems to have DESeq2 1.22.1.

Therefore, I suspect there could have been a change in a calculation in either DESeq2 or one of its dependencies. When investigating potential changes in DESeq2, I found that the changelog (the "NEWS" on the Bioconductor page) doesn't seem to contain information on what changed between the last version in the file (1.13.8) and the current 1.14.0.

Could this bug fix be what affected the outcome of the calculation?

    Fixed bug: normalization factors and VST.

I would like to ask whether there has been any change between version 1.22.1 and now that could affect results, and if so, which results are correct.

 

Thanks in advance

Daniel

@mikelove
United States

From 1.12 to 1.14, there was a change in the routine for estimating gene-wise dispersion values, which could have made small changes to dispersion estimates and p-values. This is in the NEWS note for 1.13.8.
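As a rough sketch of how to read that note from within R: the base function `utils::news()` returns the parsed NEWS entries of an installed package. The helper name `news_for_version` below is made up for illustration, and the call only returns something if the installed DESeq2 ships a NEWS file that contains that version.

```r
# Hypothetical helper (not part of DESeq2 or Bioconductor):
# return the NEWS entries of an installed package for one version string.
news_for_version <- function(pkg, version) {
  # system.file() returns "" when the package is not installed
  if (!nzchar(system.file(package = pkg))) {
    return(NULL)
  }
  db <- utils::news(package = pkg)
  if (is.null(db)) {
    return(NULL)  # the package has no NEWS file that R can parse
  }
  # news() returns a data frame with a character column 'Version'
  db[db$Version == version, , drop = FALSE]
}

# Entries recorded for the development version mentioned above
news_for_version("DESeq2", "1.13.8")
```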

"yet the old outcome is no longer reproducible"

Just to make a point here about "reproducible": the old outcome is in fact reproducible using the same versions of software. In this way Bioconductor supports reproducible workflows by hosting all of the final release versions of software back to the initial release of Bioconductor.

If you need to maintain an identical set of p-values you should not change the software versions. As a maintainer, I do not promise that the estimated values will be equal to all decimal places across versions. I can't promise this and at the same time improve the software. 
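One way to make this practical is to store the package versions next to the results, so that a later run can be matched to the software that produced it. The following is only a sketch using base R; the helper name `record_versions` and the output file name are assumptions chosen for illustration.

```r
# Hypothetical helper: installed version of each package as a string,
# NA if the package is not installed in the current library paths.
record_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    if (nzchar(system.file(package = p))) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  }, character(1))
}

# Versions of the packages that matter for this analysis
print(record_versions(c("DESeq2", "stats")))

# Full record of R, the platform and all attached packages.
# A temporary file is used here; in practice, write it next to the results.
out_file <- file.path(tempdir(), "sessionInfo.txt")
writeLines(utils::capture.output(utils::sessionInfo()), out_file)
```

Comparing the output of `record_versions()` on the two machines would show directly which packages differ between the "old" and the "new" results.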

You can keep multiple versions of R, and therefore of Bioconductor, on a single machine. For example, at any time, I have one branch of R which is the release version and one which is the devel branch. There's no limit to the number of versions you can have installed, and you choose which one to use. Each of these will have its own package destination and therefore its own versions of Bioconductor packages.
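To check which R installation and which package library a given session is actually using, something like the following base R sketch can be run in each installation. The helper name `pkg_location` is an assumption for illustration.

```r
# Which R is running, and where does it look for packages?
cat("R version:", R.version.string, "\n")
cat("Library paths:\n")
print(.libPaths())

# Hypothetical helper: the library directory a package would be loaded from,
# or NA if the package is not installed for this R.
pkg_location <- function(pkg) {
  path <- system.file(package = pkg)
  if (nzchar(path)) dirname(path) else NA_character_
}

pkg_location("DESeq2")
```

Running this in both the release and the devel installation should report different library directories, which is what keeps their package versions separate.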

@daniel-elsner-11729
University of Freiburg, Evolutionary Bi…

Thank you for your response.
