DESeq vs DEXSeq


Robert M. Flight ▴ 280

@robert-m-flight-4158

Last seen 6 months ago

United States

We have a biological system in which we are interested in finding differentially expressed exons (on a genome-wide basis) between conditions, and we are wondering whether it is more appropriate to use DESeq or DEXSeq for the analysis. We have RNA-Seq data on three conditions.

From my understanding of the two packages, DESeq (and alternatively edgeR) allows testing for differential expression of any object one can define counts for, whereas DEXSeq looks for genes (however defined) in which only one or a few exons show differential expression.

My initial belief was that DEXSeq was the best choice. However, we are working with data from rat, which has rather poorly annotated exons, especially in non-coding regions (i.e. UTRs). Therefore, I am thinking of defining exons based on a combination of the current annotation, known UTRs, and exons assembled by Cufflinks. I am not sure how this set of exons would fit into DEXSeq, and it seems to me that DESeq would be more appropriate, with exon location (CDS, UTR, etc.) determined after the DE analysis.

I would appreciate any insights or experiences others have had.

Regards,

-Robert

Robert M. Flight, Ph.D.
University of Louisville Bioinformatics Laboratory
University of Louisville
Louisville, KY
PH 502-852-1809 (HSC)
PH 502-852-0467 (Belknap)
EM robert.flight at louisville.edu
EM rflight79 at gmail.com
robertmflight.blogspot.com
bioinformatics.louisville.edu/lab

The most exciting phrase to hear in science, the one that heralds new discoveries, is not "Eureka!" (I found it!) but "That's funny ..." - Isaac Asimov

RNASeq Annotation DESeq DEXSeq • 2.9k views

updated 12.9 years ago by Simon Anders ★ 3.8k • written 12.9 years ago by Robert M. Flight ▴ 280


Simon Anders ★ 3.8k

@simon-anders-3855

Last seen 4.5 years ago

Zentrum für Molekularbiologie, Universi…

Dear Robert,

On 03/26/2012 04:26 PM, Robert M. Flight wrote:

> From my understanding of the two packages, DESeq (and alternatively edgeR) allow testing for diff. expression of any object one can define counts for, whereas DEXSeq looks for genes (however defined) where there are only one or a few exons that show differential expression.

The crucial difference between DESeq and DEXSeq is that the latter aims to tease apart changes to the overall expression strength of a gene and changes to only some of its exons. Conceptually, we consider, for each sample, the fraction "number of reads overlapping with the exon (or: counting bin) under consideration" over "number of reads mapping to any exon of the gene". If the gene's overall expression changes but the relative abundances of the different transcripts stay the same, these fractions do not change, and DEXSeq will not call this counting bin significant even if its absolute count does change significantly.

(Note that this is a simplified explanation of what DEXSeq does conceptually. To see what it actually does, please see our preprint on Nature Precedings.)

> My initial belief was that DEXSeq was the best choice, however we are working with data from Rat, which has rather poorly annotated exons, especially in non-coding regions (i.e. UTRs). Therefore, I am thinking of defining exons based on a combination of the current annotation, known UTRs, and exons assembled by CuffLinks. I am not sure how this set of exons would fit into DEXSeq, and it seems to me that DESeq would be more appropriate, with determination after DE analysis to determine exon location (CDS, UTR, etc).

Once you have defined exons based on a combination of information you trust, you can use DEXSeq. All you need is a table of counts, with one column for each sample and one row for each exon (or for whatever counting bins you want to define). It may be useful, for example, to keep the UTR and the coding part of outer exons separate. Then, define a factor to indicate which rows belong to the same gene and use this to call 'createExonCountSet'.

Simon
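As a rough illustration of the simplified picture above, here is a small Python sketch (not DEXSeq itself, and not how DEXSeq does its testing). It builds a toy count table with one row per counting bin and one column per sample, plus a mapping that plays the role of the gene factor, and computes each bin's share of its gene's reads per sample. All sample names, bin names, and counts are made up for the example, and `bin_fractions` is a hypothetical helper.

```python
from collections import defaultdict

# Toy data, invented for illustration: one row per counting bin,
# one column per sample (two controls, two treated).
samples = ["ctrl_1", "ctrl_2", "treat_1", "treat_2"]
counts = {
    "geneA:E1": [100, 110, 200, 220],  # geneA doubles as a whole
    "geneA:E2": [50, 55, 100, 110],
    "geneA:E3": [50, 55, 100, 110],
    "geneB:E1": [100, 100, 100, 100],  # only geneB:E2 drops
    "geneB:E2": [100, 100, 20, 20],
}

# The "factor": which gene each row of the table belongs to.
gene_of = {bin_id: bin_id.split(":")[0] for bin_id in counts}


def bin_fractions(counts, gene_of):
    """Per sample, reads in a bin divided by reads in all bins of its gene."""
    n_samples = len(next(iter(counts.values())))
    gene_totals = defaultdict(lambda: [0] * n_samples)
    for bin_id, row in counts.items():
        totals = gene_totals[gene_of[bin_id]]
        for j, c in enumerate(row):
            totals[j] += c
    return {
        bin_id: [
            c / t if t else float("nan")
            for c, t in zip(row, gene_totals[gene_of[bin_id]])
        ]
        for bin_id, row in counts.items()
    }


fractions = bin_fractions(counts, gene_of)
print("bin", samples)
for bin_id, row in fractions.items():
    print(bin_id, [round(f, 3) for f in row])
```

In this toy table, every bin of geneA doubles in the treated samples, so its fractions stay constant even though the absolute counts change. For geneB, only one bin drops, so the fractions of both of its bins shift between conditions. This is the kind of pattern the fraction-based view is meant to separate from a change in overall gene expression.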

written 12.9 years ago by Simon Anders ★ 3.8k


Thanks Simon, that was a really useful explanation of how we might want to go about it.

-Robert

written 12.9 years ago by Robert M. Flight ▴ 280
