decideTests with nestedF
@pedro-lopez-romero-1618
Last seen 10.2 years ago
An embedded and charset-unspecified text was scrubbed... Name: not available Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20060616/ bcd1e1a9/attachment.pl
@sean-davis-490
Last seen 3 months ago
United States
On 6/16/06 5:30 AM, "Pedro López Romero" <plopez at cnic.es> wrote:

> Dear list,
>
> I have an experiment with several contrasts associated to different
> experimental conditions for each gene, so I want to select genes in a first
> step by their moderated-F statistic in order to reduce the number of genes
> for multiple contrasts across genes.

Hi, Pedro. If I understand what you are trying to do, choosing genes using F-stats like this for later statistical testing is not really valid. If you want to reduce the number of genes for subsequent testing, you should be using an unsupervised approach (like looking for high-variance genes, as one example). Did I misunderstand the approach you are trying to take?

Sean
Hi Sean,

thanks a lot for the reply.

Yes, you understood my question. I have several experimental conditions, and I thought that filtering first by the moderated F-statistic would be the right approach, since this way I would be selecting the genes that are DE in any of the contrasts. From this reduced list, I thought that I could then select the genes for each individual contrast given by my experimental conditions.

I do not understand why this is not a valid strategy. In fact, I understood that what the limma user guide proposes on p. 51 (chapter 10) does this, but using decideTests() with nestedF.

Pedro
Hi Pedro,

Pedro López Romero wrote:

> Hi Sean,
>
> thanks a lot for the reply.
>
> Yes, you understood my doubt. I have several experimental conditions, and I
> thought that filtering first by the moderated F-statistic would be a right
> approach, since by this I would be selecting the genes that are DE in any of
> the contrasts. Now, from this reduced list, I thought that I could select
> the genes for every independent contrast given by my experimental
> conditions.

I don't think you understand what decideTests() is doing. This function fits all the contrasts in your contrasts matrix, using one of several different methods. So you cannot use decideTests() to filter your genes before doing the contrasts, because decideTests() _is_ doing the contrasts.

As Sean mentioned, if you want to pre-filter your genes, you should be filtering using an agnostic approach such as removing those genes that have a low variance across all samples.

To answer your original question, I think the problem arises in this line of your code:

    decideTests(fit2,method="nestedF",adjust.method="fdr",p.value=0,05)

If this is a direct copy-paste of your code (and not a re-typing), then you made a mistake in your p.value argument. It should read 0.05, not 0,05.

HTH,

Jim

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues.
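As a rough illustration of the "agnostic" pre-filter mentioned above (dropping genes with low variance across all samples), here is a minimal base-R sketch. The function name `filter_by_variance`, the `keep_fraction` argument, and the toy matrix are hypothetical and only meant to show the idea; the cutoff is an arbitrary assumption, not a recommendation.

```r
# Minimal sketch, assuming `expr` is a numeric matrix with genes in rows
# and samples in columns. The filter ignores the design and the contrasts.
filter_by_variance <- function(expr, keep_fraction = 0.5) {
  stopifnot(is.matrix(expr), keep_fraction > 0, keep_fraction <= 1)
  # Per-gene variance across all samples
  gene_var <- apply(expr, 1, var)
  # Keep genes whose variance is at or above the chosen quantile
  cutoff <- quantile(gene_var, probs = 1 - keep_fraction, names = FALSE)
  expr[gene_var >= cutoff, , drop = FALSE]
}

# Toy data (hypothetical): four genes, three samples
expr <- rbind(
  g1 = c(1, 1, 1),
  g2 = c(1, 2, 3),
  g3 = c(0, 5, 10),
  g4 = c(2, 2, 2.2)
)
filtered <- filter_by_variance(expr, keep_fraction = 0.5)
print(rownames(filtered))
```

The filtered matrix would then go into the usual model fit, so the later tests are adjusted only over the genes that survive the filter.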
Hi Jim,

thank you so much for the reply.

I am not using decideTests() to filter the genes. I am using p.adjust() instead:

    ord=which(p.adjust(fit2$F.p.value,method="fdr") < 0.05)

Then I apply decideTests(..., method="separate") to the list of genes that turn out to be significant. I thought that this would be conceptually similar to using decideTests(..., method="nestedF").

I understand that "nestedF" follows a step-wise procedure: first it selects genes using their moderated F-statistic, and second (for each selected gene) it selects the contrast that contributes to the significance of the F by the largest value of the moderated t-statistics. I think that this is clear to me. As a matter of fact, you explained this recently: https://stat.ethz.ch/pipermail/bioconductor/2006-March/012182.html

So now, instead of using "nestedF", why is it not possible to select genes in a two-step procedure: first using an F-test to select genes that are differentially expressed in at least one contrast, and then, from this list of selected genes, selecting the genes that are significant contrast by contrast? It would be similar to nestedF, but instead of selecting the contrast by the largest t-statistic, I perform a t-test on the whole set of F-selected genes for every contrast.

On the other hand, when using "nestedF" I get some genes with an adjusted p-value of 0.9, which probably should not be considered differentially expressed.

> To answer your original question, I think the problem arises in this
> line of your code:
>
> decideTests(fit2,method="nestedF",adjust.method="fdr",p.value=0,05)

Yes, I re-typed the code wrongly. In my R script it is 0.05.

Again, thanks for your time.

Pedro
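The two-step procedure described above can be sketched in base R on plain vectors and matrices of p-values. This is only an illustration of the logic (FDR on the F-test p-values, then FDR per contrast within the selected genes); it is not limma's "nestedF" algorithm. The function name `two_step_select` and the toy p-values are hypothetical. With a limma fit, one would presumably pass `fit2$F.p.value` and `fit2$p.value`.

```r
# Minimal sketch, assuming `f_p` holds one F-test p-value per gene and
# `contrast_p` is a genes x contrasts matrix of per-contrast p-values.
two_step_select <- function(f_p, contrast_p, alpha = 0.05) {
  stopifnot(is.matrix(contrast_p), length(f_p) == nrow(contrast_p))
  # Step 1: genes whose FDR-adjusted F p-value is below alpha
  keep <- which(p.adjust(f_p, method = "fdr") < alpha)
  # Step 2: adjust each contrast separately, over the selected genes only
  adj <- contrast_p[keep, , drop = FALSE]
  for (j in seq_len(ncol(adj))) {
    adj[, j] <- p.adjust(adj[, j], method = "fdr")
  }
  list(keep = keep, adj.p = adj, significant = adj < alpha)
}

# Toy p-values (hypothetical): four genes, two contrasts
f_p <- c(1e-6, 1e-5, 0.5, 0.9)
contrast_p <- rbind(c(1e-4, 0.6), c(0.3, 1e-3), c(0.01, 0.02), c(0.8, 0.7))
colnames(contrast_p) <- c("c1", "c2")
res <- two_step_select(f_p, contrast_p)
print(res$keep)
print(res$significant)
```

Note that the second adjustment runs over the selected genes only, so its adjusted p-values are not comparable to ones computed over the full gene list.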
Hi Pedro,

Pedro López Romero wrote:

> I am not using decideTests() to filter the genes. I am using p.adjust()
> instead:
>
> ord=which(p.adjust(fit2$F.p.value,method="fdr") < 0.05)
>
> Then, I use the list of genes that result to be significant to apply
> decideTests (..., method="separate"). I thought that this could be
> conceptually similar as using decideTests (..., method="nestedF").
>
> I understand that "nestedF" follows a step-wise procedure selecting first
> genes using their moderated F-statistic, and second (for the selected gene)
> selecting the contrast that contributes to the significance of the F by the
> largest value of their moderated t-statistics. I think that this is clear to
> me. As a matter of fact you explained this recently:
> https://stat.ethz.ch/pipermail/bioconductor/2006-March/012182.html
>
> So now, instead of using "nestedF", why is not possible to select genes in a
> two-step-wise procedure?, firstly using a F-test to select genes that are
> differentially expressed in at least one contrast and from this list of
> selected-genes, to select the genes that are significant contrast by
> contrast. It would be similar to nestedF, but instead of selecting the
> contrast by the largest t-statistic, I perform a t-statistic for the whole
> set of "selected-F-genes" for every contrast.

This is the conventional method for analyzing data with ANOVA. You first fit the ANOVA model, then if it is significant based on the F-statistic, you look at contrasts to see which contrast(s) contributed to the result. In other words, I don't think there is a reason you couldn't do things this way.

> On the other hand, when using "nestedF" I get some genes with a 0.9 adjusted
> p-value, that probably should not be considered as differentially expressed.

Not sure I understand your point. Are you saying that a particular contrast that appears to be significant using your method ends up having a very large p-value if you use nestedF?

Jim
Hi Jim,

> Not sure I understand your point. Are you saying that a particular
> contrast that appears to be significant using your method ends up having
> a very large p-value if you use nestedF?

Not exactly. The problem is that when I use decideTests(..., method="nestedF") with the whole set of genes (not filtered by any method), some of the selected genes have quite a large adjusted p-value and correspond to genes with a very small M value. These genes are in the list of differentially expressed genes selected with decideTests(..., method="nestedF").

For example, here are two of the genes for a particular contrast that would be selected using decideTests(..., method="nestedF"):

    M            A            t           P.Val        adj.P.Val
    1.064610409  13.0019494   4.18633292  0.000240894  0.239943977
    0.489386597  8.228402648  2.86816475  0.007621841  0.817992583

I am a bit confused by this result. Any clue?

Pedro
Hi Pedro,

Pedro López Romero wrote:

> Not exactly. The problem is that using decideTests(..., method="nestedF")
> with the whole set of genes (not filtered by any method), some of the
> selected genes have an adjusted p-value quite large, corresponds to genes
> with a very small M value. These genes are in the list of differentially
> expressed genes selected with decideTests(..., method="nestedF").
>
> For example, here I show you two of the genes for a particular contrast that
> would be selected using decideTests(...,method="nestedF" )
>
> M            A            t           P.Val        adj.P.Val
> 1.064610409  13.0019494   4.18633292  0.000240894  0.239943977
> 0.489386597  8.228402648  2.86816475  0.007621841  0.817992583
>
> I am a bit confused with this result?, Any clue?

Your above statement is a bit confusing, but if I assume that you are comparing decideTests() with method="nestedF" using all genes against the same call using only those genes that have a significant F-statistic, then yes, I think I know what is happening.

You have to understand that the adjusted p-value is a function of n, where n is the number of simultaneous comparisons you are doing. If you are filtering out a large number of genes, then the p-value adjustment will be less extreme because there are many fewer comparisons to adjust for. This is one of the main reasons people like to pre-filter their data - it increases their ability to detect differential expression because n is smaller.

Does that help?

Best,

Jim
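The dependence of the adjusted p-value on n can be seen with base R's `p.adjust()` alone. In the sketch below, the two "top" p-values are rounded from the table quoted above, while the evenly spaced background p-values (`null_all`, `null_few`) are hypothetical stand-ins for the remaining genes. The sketch only shows how the adjustment scales with the number of tests; it says nothing about whether a given filter is statistically valid.

```r
# Two raw p-values of interest (rounded from the quoted table)
p_top <- c(0.0002, 0.0076)

# Hypothetical background p-values: a large unfiltered set and a small
# filtered set, both evenly spaced on (0, 1)
null_all <- (1:20000) / 20001
null_few <- (1:200) / 201

# FDR ("fdr" is an alias for Benjamini-Hochberg) adjustment of the same
# two raw p-values inside each set
adj_all <- p.adjust(c(p_top, null_all), method = "fdr")[1:2]
adj_few <- p.adjust(c(p_top, null_few), method = "fdr")[1:2]

print(rbind(all_genes = adj_all, filtered = adj_few))
```

The raw p-values are identical in both calls; only the number of accompanying tests changes, and with it the adjusted values.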
Hi Pedro,

you can check whether something went wrong using the following code:

    # FDR-adjust the p-values of each contrast separately
    padj <- apply(limma.fit$p.value, 2, p.adjust, method = "fdr")
    # TRUE where decideTests classified a gene/contrast as +1 or -1
    sel <- abs(resDecideTest) == 1
    # report genes selected in at least one contrast
    cat("# DE genes:", sum(apply(sel, 1, any)), "\n")
    # report how many contrasts classified as +1 or -1 by decideTests
    # get a separate adjusted p-value > 0.05
    cat("# inconsistencies:", sum(sel & padj > 0.05), "\n")

Here limma.fit is your limma fit result, and resDecideTest is the output of decideTests() (with the "nestedF" option specified).

Hope this helps,

Ariel