Do my Limma results look "normal"?
@gordon-smyth
WEHI, Melbourne, Australia
Dear Paul,

> Date: Thu, 5 Jun 2008 13:42:40 +0100
> From: "Paul Geeleher" <paulgeeleher at gmail.com>
> Subject: [BioC] Do my Limma results look "normal"?
> To: Bioconductor <bioconductor at stat.math.ethz.ch>
>
> Hi,
>
> This is the first time I've ever analyzed a microarray experiment
> using Limma (or anything else for that matter) and I was hoping that
> somebody could look at my results and tell me if they look normal.

You're asking a question that doesn't really have an answer, because all
experiments are different and give different results. Your results suggest
a lot of probes are strongly DE, with a predominance of down over up
regulated results. You're the only one who knows the background to your
experiment, so you're the only one who knows whether this makes sense from a
biological point of view.

> The experiment is measuring differential expression between miRNAs of
> HER2+ and HER2- breast cancer tissue. There are 3 HER2+ arrays and 4
> HER2- arrays and each of the 399 miRNAs is replicated 4 times in each
> array.
>
> TopTable() reveals the following miRNAs with a fold change above 1.5,
> which I thought was a reasonable cutoff:

If you want a fold change of 1.5, you need lfc=log2(1.5) not lfc=1.5.

>     ID                   logFC         t      P.Value    adj.P.Val          B
> 273 hsa-miR-451      -4.645060 -8.226854 4.510441e-09 9.246404e-07 10.8484797
> 128 hsa-miR-205       3.551495  7.574564 2.370061e-08 3.239083e-06  9.2222865
> 13  hsa-miR-101      -2.310652 -6.569497 3.374177e-07 2.567796e-05  6.6146751
> 282 hsa-miR-486      -2.686910 -6.542808 3.626060e-07 2.567796e-05  6.5439656
> 55  hsa-miR-144      -2.890719 -5.889594 2.152998e-06 1.261042e-04  4.7952480
> 387 mmu-miR-463      -2.609257 -5.764143 3.042120e-06 1.559086e-04  4.4561920
> 388 mmu-miR-464      -2.080402 -5.696976 3.662006e-06 1.668247e-04  4.2743601
> 151 hsa-miR-223      -1.722956 -5.637290 4.318942e-06 1.770766e-04  4.1126276
> 51  hsa-miR-142-3p   -3.262824 -5.397809 8.386312e-06 3.125807e-04  3.4626378
> 14  hsa-miR-101_MM1  -1.922710 -5.224075 1.358743e-05 4.175776e-04  2.9905370
> 159 hsa-miR-26b_MM2  -2.221853 -5.206724 1.425875e-05 4.175776e-04  2.9433849
> 236 hsa-miR-376a_MM1 -1.633555 -4.653220 6.637043e-05 1.700742e-03  1.4433277
> 266 hsa-miR-432*      1.512622  4.627293 7.131510e-05 1.719952e-03  1.3734422
> 168 hsa-miR-29b      -1.954087 -4.198854 2.323860e-04 4.763912e-03  0.2280262
> 31  hsa-miR-126*_MM2 -1.537988 -3.209957 3.233842e-03 5.099520e-02 -2.2888897
> 52  hsa-miR-142-5p   -1.881192 -2.831493 8.332384e-03 9.002153e-02 -3.1731794
>
> Another person is sanity testing this data using GeneSpring and they
> are getting much higher p-values compared to mine.

This is not surprising considering that GeneSpring has chosen not to use any
statistical test invented since 1947. In particular, it has not taken any
advantage of the last 8 years' intensive research on differential expression
for microarray data.

> They are also taking the step of excluding quite a few of the miRNAs
> from the experiment based on their standard deviation across the arrays
> of each group. Should I be doing this also or is this taken into account
> by the eBayes() function or lmFit()?

You could choose to filter on raw standard deviation across all arrays. Some
authors recommend this. Or you could filter on mean intensity. With only
399 probes on your arrays, I doubt either of these things would make much
difference, but they might.

It does not make sense to filter on standard deviation computed within
groups.

Best wishes
Gordon

> If you are interested the script I wrote to do the analysis is here:
> http://article.gmane.org/gmane.science.biology.informatics.conductor/18032/match=miRNA
>
> Thanks for any advice,
>
> -Paul.
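To make the lfc point concrete, here is a minimal base-R sketch. The logFC values are copied from the table quoted in this thread; the object `fit` and the `coef = 2` choice in the commented-out call are assumptions for illustration only, since the actual analysis script is not shown here.

```r
# logFC values copied from the topTable() output quoted in this thread (log2 scale).
logFC <- c(-4.645060,  3.551495, -2.310652, -2.686910, -2.890719, -2.609257,
           -2.080402, -1.722956, -3.262824, -1.922710, -2.221853, -1.633555,
            1.512622, -1.954087, -1.537988, -1.881192)

# lfc is a cutoff on the absolute log2 fold change, not on the fold change itself.
lfc_used     <- 1.5         # cutoff the quoted table is consistent with
lfc_intended <- log2(1.5)   # cutoff for a 1.5-fold change, about 0.585

# Fold change that lfc = 1.5 actually demands: 2^1.5, about 2.83-fold.
fold_used <- 2^lfc_used

# Every quoted row clears the stricter cutoff of 1.5 on the log2 scale.
all_pass_used <- all(abs(logFC) > lfc_used)

print(c(lfc_intended = lfc_intended, fold_used = fold_used))
print(all_pass_used)

# With a fitted, eBayes-moderated limma object (hypothetical name `fit`,
# hypothetical coefficient index), the intended call would look roughly like:
#   topTable(fit, coef = 2, number = Inf, lfc = log2(1.5))
```

Using lfc=log2(1.5) would admit probes with smaller changes than the ones listed, so the table would probably be longer.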
Microarray Cancer Breast limma GeneSpring
Paul Geeleher ★ 1.3k
@paul-geeleher-2679
Gordon thanks for your advice again, excellent as always.

One more thing though, can you clear up for me if the standard
deviation across all arrays is taken into account by Limma when
calculating the p-values etc? I presume it is? The main reason I need
to be clear on this is that my colleague using GeneSpring is booting
my top miRNA (hsa-miR-451) out of his analysis completely based on
standard deviation across the arrays.

-Paul

On Sat, Jun 7, 2008 at 7:37 AM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
> [snip]

--
Paul Geeleher
Department of Mathematics
National University of Ireland
Galway
Ireland
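One way to see why filtering on overall standard deviation can throw out a strongly DE probe is the toy example below. It is only a sketch with made-up numbers (two hypothetical probes, not data from this experiment), and it uses an ordinary per-probe linear model in base R rather than limma. limma's moderated t-statistic likewise divides by a residual (within-group) standard deviation, after eBayes() has shrunk it towards a common value.

```r
# Hypothetical log2 expression values (NOT data from this thread):
# 3 HER2+ arrays followed by 4 HER2- arrays, two made-up probes.
group <- factor(c("pos", "pos", "pos", "neg", "neg", "neg", "neg"))
expr <- rbind(
  probe_DE   = c(10.1, 9.9, 10.0, 5.2, 4.9, 5.0, 4.9),  # large group difference
  probe_flat = c(7.3, 6.6, 7.1, 6.8, 7.2, 6.7, 7.0)     # no real group difference
)

# SD across all arrays, ignoring the group labels.
sd_overall <- apply(expr, 1, sd)

# Ordinary linear model per probe: residual (within-group) SD and the
# t-statistic for the pos-vs-neg difference.
per_probe <- t(apply(expr, 1, function(y) {
  s <- summary(lm(y ~ group))
  c(sd_within = s$sigma, t = s$coefficients[2, "t value"])
}))

print(cbind(sd_overall, per_probe))
```

In this toy case the DE probe has by far the larger overall SD, because the group difference itself inflates it. Its within-group SD is small, which is why its t-statistic is large. A rule that discards probes with high SD across all arrays would remove exactly that probe.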
Paul Geeleher wrote:
> Gordon thanks for your advice again, excellent as always.
>
> One more thing though, can you clear up for me if the standard
> deviation across all arrays is taken into account by Limma when
> calculating the p-values etc? I presume it is? The main reason I need
> to be clear on this is that my colleague using GeneSpring is booting
> my top miRNA (hsa-miR-451) out of his analysis completely based on
> standard deviation across the arrays.

You are not telling us quite enough here. Some folks filter because there
is too little variability and others might filter for too much. Which is it?
(A plot or some summary stats would not hurt, if you really want informed
opinion.)

Other potential differences that are likely to have a big impact with
microRNA arrays are in how the data were normalized (as I tried to point out
previously); almost all assumptions used to normalize mRNA expression arrays
are not valid for miRNA arrays. You seem to have used vsn for normalization
(I doubt that is an option in GeneSpring, but don't know as I don't have
access to it). So you might also want to see if you can run their pipeline
on data normalized by vsn, so you can more clearly see what the differences
are between the algorithms for assessing DE (you have too many factors in
the mix).

But really you and your colleague have all the data (and we don't), and a
few plots (as suggested earlier) should reveal what the real issues are here.

best wishes
Robert

> -Paul
>
> On Sat, Jun 7, 2008 at 7:37 AM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>> [snip]

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org
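For the summary stats asked for here, something along these lines might be a starting point. It is a rough base-R sketch on simulated values (399 probes by 7 arrays to match the design described in the thread, but none of the numbers are real). The lower-quartile cutoff and the choice to drop low-variability probes are arbitrary assumptions; as noted, some people filter in the other direction.

```r
set.seed(1)  # simulated data only, for illustration

n_probes <- 399
n_arrays <- 7
expr <- matrix(rnorm(n_probes * n_arrays, mean = 8, sd = 1),
               nrow = n_probes,
               dimnames = list(paste0("probe", seq_len(n_probes)),
                               paste0("array", seq_len(n_arrays))))

# Per-probe summaries computed across ALL arrays (group labels are not used).
probe_mean <- rowMeans(expr)
probe_sd   <- apply(expr, 1, sd)

print(summary(probe_mean))
print(summary(probe_sd))

# Non-specific filter: keep probes whose overall SD is at or above the
# lower quartile (arbitrary cutoff, chosen here only as an example).
sd_cutoff <- quantile(probe_sd, 0.25)
keep <- probe_sd >= sd_cutoff
expr_filtered <- expr[keep, , drop = FALSE]

print(dim(expr_filtered))

# In an interactive session, a quick look at SD against mean intensity:
#   plot(probe_mean, probe_sd, xlab = "mean log2 intensity", ylab = "SD across arrays")
```

Reporting how many probes each rule removes, and whether the top hit (hsa-miR-451 in this thread) is among them, would show where the two analyses diverge.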