Question on the cutoff for the limma package
@yi-ming-nihnci-c-4571
Dear list,

Recently I used the limma package to analyze some miRNA array data. For one of the contrasts in our limma model, I derived a differential list using P.Value < 0.01 combined with a fold-change cutoff. We noticed that in this particular contrast, all of the differential miRNAs have rather high adj.P.Val values: almost all are 1 or very close to 1 (e.g., 0.973), with adj="fdr" in topTable. The other contrasts in the same model do have "normal"-looking adj.P.Val values, ranging from 1 down to about 0.01.

From our previous experience, candidates with a decent raw P.Value (e.g., < 0.01) can sometimes validate well even when the adj.P.Val is very high. In this case, we validated two miRNAs from the list, both with P.Value < 0.01 but with rather high adj.P.Val (both around 0.97 or 1). One of them validated as a genuinely differential miRNA; the other did not.

I understand this is somewhat subjective, we only validated two chosen miRNAs here (and we have encountered a similar situation before when validating another dataset), and people commonly use FDR or adjusted p-value cutoffs ranging from 5% to 30%. My first question is: what kind of situation could lead to the adj.P.Val being as high as 0.97 to 1 for every gene in the list (there are about 6k features in the dataset)?

My second question is: what should the cutoffs for P.Value and adj.P.Val be in a situation like this? Should one consider both, or rely specifically on adj.P.Val? In our case, if we rely on adj.P.Val alone, which is high across the board, we cannot choose a single miRNA; yet our biological validation did confirm a good one (we validated only two, but that is still a much higher hit rate than expected, given that none of them has a decent adj.P.Val). If we rely on P.Value (e.g., < 0.01), we do get quite a few miRNAs in the list, but each with a sky-high adj.P.Val, and we could only validate one of the two chosen candidates.

Any insight or experience to share? Thanks a lot!

Ming
ABCC, NCI-Frederick, Frederick, MD
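For reference, the selection described above corresponds to something like the following, a minimal sketch assuming a fitted and eBayes-moderated limma model object named `fit` and a contrast named `contrast1` (both hypothetical names; the thresholds are only illustrative):

    library(limma)

    ## Full ranked table for one contrast; adjust.method = "BH" applies the
    ## same Benjamini-Hochberg correction selected by the "fdr" option above.
    tab <- topTable(fit, coef = "contrast1", number = Inf, adjust.method = "BH")

    ## Selection by raw p-value combined with a fold-change cutoff,
    ## as described in the post:
    raw_hits <- tab[tab$P.Value < 0.01 & abs(tab$logFC) > 1, ]

    ## Selection by FDR-adjusted p-value instead:
    fdr_hits <- tab[tab$adj.P.Val < 0.05, ]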
Tags: miRNA, limma
@sean-davis-490
Hi, Ming.

The problem with using raw p-values is that there is no control for multiple testing. There are many methods to control for multiple testing, of which the FDR is one. So I would tend to rely on a statistical measure that attempts to control for multiple testing (such as the FDR); the raw p-values from limma do not do so. Whether or not you include a further fold-change filter is a matter of experimental specifics. That is not to say that one cannot do what you have done and "rank" genes by some measure, even those that are not statistically significant, but one cannot easily conclude that there is evidence of differential expression without a multiple-testing-corrected statistical measure being significant.

As for your situation, there are multiple reasons that might lead to a lack of evidence of differential expression. First, there may truly be no difference for a contrast. Second, technical artifacts or noise may make such a difference difficult or impossible to detect. Third (and related to the second), the sample size may be too small to detect a difference. Remember that failing to reject the null hypothesis (of no differential expression) is not the same thing as proving the null hypothesis; typically, we cannot prove the null hypothesis.

Some of the more statistically minded might have clearer explanations for some of what I said above, but I think the rule of thumb is to rely on multiple-testing-corrected p-values, not on uncorrected p-values, for determining statistical significance.

Sean
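To illustrate the first of Sean's reasons: when the raw p-values for a contrast are roughly uniform (the pattern expected when nothing is truly differentially expressed), the Benjamini-Hochberg adjustment pushes nearly every adjusted value toward 1, even though some raw p-values fall below 0.01 by chance. A standalone simulation, not tied to the original data:

    set.seed(1)

    ## ~6k features with no true signal, as in the dataset from the question
    p <- runif(6000)

    ## Benjamini-Hochberg adjustment
    adj <- p.adjust(p, method = "BH")

    sum(p < 0.01)  ## dozens of features pass a raw 0.01 cutoff by chance alone
    min(adj)       ## yet the smallest adjusted p-value is far from significant

Under this null-only scenario, about 60 of the 6000 features are expected to pass P.Value < 0.01 purely by chance, which matches the pattern in the question: decent raw p-values alongside sky-high adjusted p-values.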
Hi, Sean,

Thanks a lot for your very nice comments and diagnosis of the issues. Yes, in general I would also rely on multiple-testing-based statistics such as the FDR or adjusted p-value. However, in this particular situation, doing so leaves us without a single candidate for further experiments, and the bench scientists would have nothing to pursue. Trickier still, as I mentioned, we did pick some candidates based on raw p-values, successfully validated some of them with a multi-sample qPCR approach, and are now actively pursuing them.

My concern with multiple-testing correction is that in some cases it may be too stringent (this may also depend on the method; I used the fdr, or BH, method, which is popular), especially in a case like ours, where it leaves no candidates at all. In fact, I have heard of similar cases from others, in which candidates selected on raw p-values sometimes validate at quite a good rate.

Yes, the third reason you mentioned, sample size, does apply to us (5 vs. 5 for this comparison; these are mouse primary tumor-derived cell-line clones, with much lower variability than among human samples). But for the biologists and the qPCR validation they use, that level of replication already seems enough to make them happy to pursue candidates further.

Thanks for sharing!

Best,
Ming
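One small note on the method names mentioned above: in p.adjust(), which limma's topTable() uses for the adjustment, "fdr" is simply an alias for "BH", so the two options are the same Benjamini-Hochberg procedure. A quick check with a hypothetical p-value vector:

    p <- c(0.001, 0.01, 0.04, 0.2, 0.9)  ## hypothetical raw p-values

    identical(p.adjust(p, method = "fdr"),
              p.adjust(p, method = "BH"))
    ## [1] TRUE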