I'm confused about LIMMA statistics


Amy Johnson ▴ 40

@amy-johnson-3014

Last seen 10.4 years ago

Hi,

I'm new to microarray data analysis and have a quick and maybe "simple" question about LIMMA; I hope someone can help.

We have 6 Agilent data sets (3 controls and 3 treated samples). I'd like to use LIMMA to find differentially expressed genes. I'm confused about the LIMMA statistics (logFC, AveExpr, t, P.Value, adj.P.Val, and B). Which one do researchers typically use to select differentially expressed genes? Can I simply use adj.P.Val < 0.05 or P.Value < 0.05? Or should I use a combination of these statistics?

Thanks.
Amy

Microarray limma • 1.2k views

updated 13.8 years ago by Sean Davis 21k • written 13.8 years ago by Amy Johnson ▴ 40


Mete Civelek ▴ 180

@mete-civelek-4566

Last seen 10.4 years ago

Hi All,

I want countGenomicOverlaps to output a weighted hit count such that when a read maps to, for example, four loci, a feature at one of those loci would get 1/4th of a count from that read. At the moment, countGenomicOverlaps doesn't behave the way I expect it to.

Consider this example:

    subj = GRangesList(feature1=GRanges(seq='1', IRanges(10,30), strand='+'))
    qry = GRangesList(read1=GRanges(seq='1', IRanges(c(10,60,100),c(20,70,110)), strand='+'))
    countGenomicOverlaps(qry, subj, resolution='divide')

I would have expected the hit count to be 1/3 but instead it reports it as 1/2. Am I using this function correctly?

My sessionInfo is:

    R version 2.12.2 (2011-02-25)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] GenomicRanges_1.4.0 IRanges_1.10.0

13.8 years ago by Mete Civelek ▴ 180


Hi Mete,

Yes, you are using the function correctly and you have found a bug. I'll let you know as soon as it's fixed.

Thanks,
Valerie

On 04/25/2011 04:38 PM, Mete Civelek wrote:
> I would have expected the hit count to be 1/3 but instead it reports it as
> 1/2. Am I using this function correctly?

13.8 years ago by Valerie Obenchain ★ 6.8k
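The arithmetic behind the expected 1/3 can be sketched in base R. This is only an illustration of the weighting described in the question, not the GenomicRanges implementation: the variable names are made up for the sketch, and sequence name and strand are ignored because every range in the example sits on sequence '1', strand '+'.

```r
# Alignment coordinates of the single read from the question (three loci).
read_start <- c(10, 60, 100)
read_end   <- c(20, 70, 110)

# The single feature from the question.
feature_start <- 10
feature_end   <- 30

# Two closed intervals overlap when each one starts before the other ends.
hits <- read_start <= feature_end & read_end >= feature_start

# Each alignment carries an equal share of the read's single count.
weight <- 1 / length(read_start)

# The feature collects one share per alignment that overlaps it.
weighted_count <- sum(hits) * weight
print(weighted_count)
```

Only the first alignment (10 to 20) overlaps the feature (10 to 30), so the feature receives one of three equal shares.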


Mete,

The bug is now fixed in the devel trunk (version 1.5.5) and the release branch (version 1.4.2). It will be a day before the new package versions propagate through the build system and are available through biocLite. If you want to retrieve them directly, they are available via svn at

release : https://hedgehog.fhcrc.org/bioconductor/branches/RELEASE_2_8/madman/Rpacks/GenomicRanges
devel : https://hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/GenomicRanges

I've included an additional example on the man page for countGenomicOverlaps illustrating the handling of split reads.

Let me know if you run into other problems.

Take care,
Valerie

On 04/26/2011 03:56 PM, Valerie Obenchain wrote:
> Yes, you are using the function correctly and you have found a bug.
> I'll let you know as soon as it's fixed.

13.8 years ago by Valerie Obenchain ★ 6.8k


Sean Davis 21k

@sean-davis-490

Last seen 5 months ago

United States

On Mon, Apr 25, 2011 at 6:30 PM, Amy Johnson <a7johnson at gmail.com> wrote:
> I'm confused about the LIMMA statistics (logFC, AveExpr, t, P.Value,
> adj.P.Val, and B). Which one do researchers typically use to select
> differentially expressed genes? Can I simply use adj.P.Val < 0.05 or
> P.Value < 0.05? Or should I use a combination of these statistics?

Hi, Amy.

Your best bet is to thoroughly read the limma User's Guide and the help pages for ALL the commands you used to generate your topTable results. Also, it will help to get some basics of statistics under your belt.

There are no hard-and-fast rules about what should be used, but many folks will use adj.P.Val (adjusted to be a False Discovery Rate) with or without logFC cutoffs.

Sean

13.8 years ago by Sean Davis 21k
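As a rough sketch of the selection described in the answer above, the base R snippet below filters a small table shaped like topTable output. The gene names and numbers are invented purely for illustration, and the thresholds (0.05 and an absolute logFC of 1) are example choices rather than recommendations. The Benjamini-Hochberg adjustment via p.adjust is assumed here as the false discovery rate method.

```r
# Toy stand-in for topTable() output; all values are invented for illustration.
tt <- data.frame(
  logFC   = c(2.1, -1.6, 0.3, 0.7, -0.1),
  P.Value = c(0.0001, 0.004, 0.045, 0.02, 0.6),
  row.names = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

# Benjamini-Hochberg adjustment, which controls the false discovery rate.
tt$adj.P.Val <- p.adjust(tt$P.Value, method = "BH")

# Raw p-value cutoff: no correction for testing many genes at once.
sig_raw <- tt[tt$P.Value < 0.05, ]

# Adjusted p-value cutoff only.
sig_fdr <- tt[tt$adj.P.Val < 0.05, ]

# Adjusted p-value cutoff combined with a minimum absolute log fold change.
sig_both <- tt[tt$adj.P.Val < 0.05 & abs(tt$logFC) >= 1, ]

print(rownames(sig_raw))
print(rownames(sig_fdr))
print(rownames(sig_both))
```

In this toy table geneC passes the raw P.Value cutoff but not the adjusted one, and geneD passes the adjusted cutoff but falls below the fold-change threshold, which shows how the three filters can disagree.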
