Analyzing technical replicates with DESeq2
@michael-muratet-3076
Hi Simon,

I've been using DESeq2 to analyze RNA-seq data I was given for a multi-factor experiment: three factors with two, three and three levels, with three 'replicates' in each cell. I recently learned that the replicates are actually technical, not biological, and I'm looking for the best way to set up the design matrix to take the correlation into account.

I don't see anything in the manual or on the list, so I'd appreciate your input. Should I use a weighting scheme via normalization factors?

Thanks,
Mike

Michael Muratet, Ph.D.
Senior Scientist
HudsonAlpha Institute for Biotechnology
Normalization DESeq2 • 3.0k views
@ryan-c-thompson-5618
Icahn School of Medicine at Mount Sinai…
I believe that the simplest way to deal with technical replicates is to add their counts together, so that you have one column for each biological replicate.
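To make the suggestion concrete, here is a minimal sketch of the summing step. DESeq2 itself is an R package; this plain-Python version only illustrates the arithmetic. The function name `collapse_technical_replicates`, the run names and the toy counts are all hypothetical and not taken from the thread.

```python
def collapse_technical_replicates(counts, run_to_sample):
    """Sum per-gene raw counts over runs of the same biological sample.

    counts: dict mapping run name -> list of per-gene integer counts
    run_to_sample: dict mapping run name -> biological sample name
    Returns a dict mapping biological sample name -> summed counts.
    """
    collapsed = {}
    for run, gene_counts in counts.items():
        sample = run_to_sample[run]
        if sample not in collapsed:
            collapsed[sample] = [0] * len(gene_counts)
        if len(gene_counts) != len(collapsed[sample]):
            raise ValueError("all runs must cover the same genes")
        # Element-wise addition: gene i of this run is added to gene i of the sample.
        collapsed[sample] = [a + b for a, b in zip(collapsed[sample], gene_counts)]
    return collapsed


# Hypothetical toy data: three genes, two biological samples, three runs each.
counts = {
    "A_run1": [10, 0, 250], "A_run2": [12, 1, 240], "A_run3": [9, 0, 260],
    "B_run1": [30, 2, 100], "B_run2": [28, 3, 110], "B_run3": [33, 1, 95],
}
# Assumption for this toy example only: the sample name is the run-name prefix.
run_to_sample = {run: run.split("_")[0] for run in counts}

collapsed = collapse_technical_replicates(counts, run_to_sample)
print(collapsed)
```

The sums are taken over raw counts, before any normalization, so the result is again a count matrix with one column per biological sample. Later DESeq2 releases ship a `collapseReplicates` function for this purpose; check the documentation of your installed version before relying on it.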
Dear Michael,

I second Ryan's advice. Rationale: unless there is a catastrophic data quality problem, the variability between technical replicates reflects only the (more or less Poissonian) counting noise, and adding the counts takes care of it appropriately. The counting noise is small compared to the biological variability for all but the genes with the lowest counts, say below ~200.

OTOH, if there is a catastrophic data quality problem, then it's better to just drop the affected sample / lane / library.

Best wishes,
Wolfgang
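As a rough illustration of the counting-noise argument, the sketch below computes the relative noise of an idealized Poisson count, whose standard deviation is the square root of its mean. It assumes pure Poisson noise between technical replicates, which is an idealization; the function name `poisson_cv` and the example means are hypothetical.

```python
import math


def poisson_cv(mean_count):
    """Coefficient of variation (sd / mean) of a Poisson count: 1 / sqrt(mean)."""
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    return 1.0 / math.sqrt(mean_count)


# Relative counting noise at a few illustrative expected counts.
for mean_count in (10, 200, 5000):
    print(mean_count, round(poisson_cv(mean_count), 3))

# A sum of independent Poisson counts is again Poisson, with the means added,
# so pooling three runs of mean 200 behaves like one run of mean 600.
pooled_cv = poisson_cv(3 * 200)
print("pooled:", round(pooled_cv, 3))
```

At an expected count of 200 the relative counting noise is about 7%, and it shrinks further once the technical replicates are pooled. That is why, under this idealized model, the summed counts lose nothing that a separate treatment of the replicates would have preserved.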
Ryan is correct: adding up the technical replicates is the way to go. As your design is two-way, you will (hopefully) still have enough degrees of freedom left to estimate the dispersion.

Simon
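The degrees-of-freedom remark can be checked with simple counting. The sketch below uses the three factors with 2, 3 and 3 levels described in the question and assumes, as the question implies, that each cell holds exactly one biological sample once the technical replicates are summed. It also assumes a fully crossed design with a full-rank design matrix. The function name `residual_df` is hypothetical.

```python
def residual_df(n_samples, factor_levels, include_interactions=False):
    """Residual degrees of freedom for a fully crossed factorial design.

    n_samples: number of columns left after collapsing technical replicates
    factor_levels: number of levels of each factor, e.g. [2, 3, 3]
    include_interactions: if True, fit all interactions (one coefficient per cell)
    """
    if include_interactions:
        # A saturated model has as many coefficients as there are cells.
        n_coef = 1
        for levels in factor_levels:
            n_coef *= levels
    else:
        # Intercept plus (levels - 1) coefficients per factor.
        n_coef = 1 + sum(levels - 1 for levels in factor_levels)
    return n_samples - n_coef


factor_levels = [2, 3, 3]
n_cells = 2 * 3 * 3  # one collapsed sample per cell (assumption)

print("main effects only:", residual_df(n_cells, factor_levels))
print("all interactions: ", residual_df(n_cells, factor_levels, include_interactions=True))
```

Under these assumptions a main-effects model leaves residual degrees of freedom for dispersion estimation, whereas a model with every interaction is saturated and leaves none, so the choice of design formula matters once the replicates are collapsed.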