On Mon, Feb 1, 2010 at 8:55 AM, Susan Bosco <susanbosco86@yahoo.com>
wrote:
> Dear Sean,
>
> Thanks for your reply.
>
> Before getting into ROC we went through many research papers on ROC,
> but our understanding of it was wrong. We applied a filter to the
> continuous data (log-ratio information) to classify the data as 0 and 1.
> From discussion with the list we learned that a cutoff should not be
> applied to the data, since the cutoff is what we are looking for with
> ROC. Hence we used the rbinom() function, as used in the ROC
> documentation.
>
> Later, as you suggested, we contacted a local biostatistician. The
> statistician is new to microarrays, and so explained ROC analysis to us
> using the example of diseased vs. normal patients in the context of
> blood pressure, and also mentioned that the classification should be
> based on categorical data. But our microarray data (MeDIP-enriched) do
> not contain any comparison groups and are in duplicate. So we are now
> quite confused about how to implement ROC.
>
> So, if you do not mind, could you please share your experience with ROC
> from scratch, and could you please provide suggestions on implementing
> ROC in our context?
>
> Thanking you in anticipation.
>
>
Hi, Susan.
I cannot think of a way to explain better than I have up to this point
(but others might be able to, I admit). The answer to your questions from
your local statistician sounds perfectly correct, and your reaction makes
me suspect that ROC analysis may not be what you need here. I would
suggest that, if you are still unclear about where to go from here, you
continue to work with a local statistician or find a collaborator willing
to work with you to complete your study. There is a limit to what can be
done by email, unfortunately.
Sean
>
> --- On *Mon, 25/1/10, Sean Davis <seandavi@gmail.com>* wrote:
>
>
> From: Sean Davis <seandavi@gmail.com>
> Subject: Re: [BioC] Seeking assistance on ROC
> To: "Susan Bosco" <susanbosco86@yahoo.com>
> Cc: bioconductor@stat.math.ethz.ch, "prashantha hebbar" <
> prashantha.hebbar@manipal.edu>
> Date: Monday, 25 January, 2010, 6:54 PM
>
>
> On Sat, Jan 23, 2010 at 6:28 AM, Susan Bosco <susanbosco86@yahoo.com>
> wrote:
> > Dear Sean,
> >
> > Thanks again.
> >
> > I corrected the script, changing the value of the 'truth' variable
> > with the rbinom() function. Since my data size is quite large (the
> > data is 244K), I tried with the first 200, for which I was able to get
> > a proper ROC curve. However, when I include the complete data, the plot
> > changes. For the whole data, I get a linear graph with small
> > variations.
> >
> > My script looks like this.
> > For the first 200 values of the data:
> > library(ROC)
> > load("RGKma.RData")
> > # 'state' needs one label per data value, so both use rows 1:200
> > state = rbinom(length(RGKma$M[1:200,3]),1,0.33)
> > data = RGKma$M[1:200,3]
> > R1<-rocdemo.sca(truth=state,data,dxrule.sca)
> > pdf("ROCk.pdf")
> > plot(R1, show.thresh=TRUE,col = "red")
> > dev.off()
> >
> > For the complete data:
> > library(ROC)
> > load("RGKma.RData")
> > state= rbinom(length(RGKma$M[,3]),1,0.33)
> > data = RGKma$M[,3]
> > R1<-rocdemo.sca(truth=state,data,dxrule.sca)
> > pdf("ROCallk.pdf")
> > plot(R1, show.thresh=TRUE,col = "red")
> > dev.off()
> >
> > I have attached the PDFs of the plots. I would appreciate it if you
> > could help me with this problem that I encountered with the large data
> > size.
>
> Hi, Susan. The problem is not the large data size, in particular.
> You need to know the TRUTH. You cannot assign the TRUTH using a
> random binomial. You need to KNOW which samples are of one class
> versus the other. Do you know that information? If not, then ROC
> analysis is not a useful thing to apply.
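
The point about TRUTH can be checked with a small simulation. The sketch
below uses base R and simulated scores, not the RGKma data; the helper
auc() and all variable names are assumptions introduced for illustration.
It compares labels drawn with rbinom() independently of the scores against
labels that are known first and actually shift the scores.

```r
# Sketch with simulated scores; nothing here comes from the RGKma data.
set.seed(1)
n <- 5000

# Hypothetical helper: area under the ROC curve via the rank-sum identity.
auc <- function(score, truth) {
  r  <- rank(score)
  n1 <- sum(truth == 1)
  n0 <- sum(truth == 0)
  (sum(r[truth == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Case 1: labels drawn at random, independent of the scores.
score        <- rnorm(n)
random_truth <- rbinom(n, 1, 0.33)
auc_random   <- auc(score, random_truth)

# Case 2: labels known first; scores are shifted upward for the positive class.
known_truth  <- rbinom(n, 1, 0.33)
score_signal <- rnorm(n, mean = 1.5 * known_truth)
auc_known    <- auc(score_signal, known_truth)

print(c(random = auc_random, known = auc_known))
```

With random labels the area under the curve should sit close to 0.5, which
corresponds to a curve lying along the diagonal. With only a couple of
hundred points, sampling noise can make such a curve look more structured
than it is, which would be consistent with the plot changing once the
complete data are included.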
>
> Sean
>
> > Thanking you sincerely,
> > Susan.
> >
> >
> > --- On Wed, 20/1/10, Sean Davis <seandavi@gmail.com> wrote:
> >
> > From: Sean Davis <seandavi@gmail.com>
> > Subject: Re: [BioC] Seeking assistance on ROC
> > To: "Susan Bosco" <susanbosco86@yahoo.com>
> > Cc: bioconductor@stat.math.ethz.ch, "prashantha hebbar" <prashantha.hebbar@manipal.edu>
> > Date: Wednesday, 20 January, 2010, 12:05 PM
> >
> >
> >
> > On Wed, Jan 20, 2010 at 12:39 AM, Susan Bosco <susanbosco86@yahoo.com> wrote:
> >
> >
> > Dear Sean,
> >
> > Thank you so much for the help.
> >
> >
> > I tried a range of thresholds from 0 to 0.9. As you had mentioned,
> > the true positive rates no doubt increased with thresholds below 0.9.
> > However, I did get some false positive rates even at a minimum
> > threshold of 0.1. Could you kindly explain the reason?
> >
> >
> >
> > Is there any method of finding the optimal threshold, maximizing the
> > true positive rates while minimizing the false positives, instead of
> > randomly choosing between 0 and 0.9?
> >
> >
> > Hi, Susan. The ROC curve IS that method. The ROC curve represents ALL
> > thresholds as applied to the data. If you plot with show.thresh=TRUE,
> > you will see the thresholds that were tried and where they are on the
> > curve.
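
What "ALL thresholds" means can be sketched in base R. This is a minimal
illustration on simulated data, assuming 'truth' is known independently of
'score'; the variable names are hypothetical, and the use of Youden's J
(TPR minus FPR) to pick a single threshold is one common convention, not
something prescribed in this thread or by the ROC package.

```r
# Simulated data only: 'truth' is assumed to be known independently of 'score'.
set.seed(2)
truth <- rbinom(500, 1, 0.4)
score <- rnorm(500, mean = 1.2 * truth)

# Every observed score is a candidate threshold;
# a sample is called positive when score >= threshold.
thresholds <- sort(unique(score), decreasing = TRUE)
tpr <- sapply(thresholds, function(t) mean(score[truth == 1] >= t))
fpr <- sapply(thresholds, function(t) mean(score[truth == 0] >= t))

# One common convention (an assumption here): maximize Youden's J = TPR - FPR.
best <- which.max(tpr - fpr)
roc_summary <- c(threshold = thresholds[best], tpr = tpr[best], fpr = fpr[best])
print(roc_summary)
```

Plotting fpr against tpr gives the ROC curve. Each point on it is one
threshold, and lowering the threshold can only raise both rates, so some
false positives are expected whenever the two classes overlap.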
> >
> >
> > If the threshold to which you are referring is the one that you used
> > to determine the variable you called "state", then we are talking
> > about two different things. The "truth" variable is meant to be
> > assigned by some source other than the data themselves. If you do not
> > know the true state of your samples and find yourself assigning the
> > state from the data, then ROC curve analysis will not be of any use.
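
Why assigning the state from the data is of no use can be seen in a short
base R sketch. The scores are simulated and the 0.5 cutoff is an arbitrary
illustrative choice; neither comes from the thread's data.

```r
# Simulated scores only; the 0.5 cutoff is an arbitrary illustrative choice.
set.seed(3)
score  <- rnorm(1000)
cutoff <- 0.5

# Circular labelling: the "truth" is derived from the very scores being evaluated.
circular_truth <- as.integer(score > cutoff)

# At that same cutoff the classifier simply reproduces its own labels.
tpr_at_cutoff <- mean(score[circular_truth == 1] > cutoff)
fpr_at_cutoff <- mean(score[circular_truth == 0] > cutoff)
print(c(tpr = tpr_at_cutoff, fpr = fpr_at_cutoff))
```

The apparently perfect separation says nothing about the samples; it only
restates the cutoff that produced the labels.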
> >
> >
> > Sean
> >
> >
> > Thanks in advance,
> >
> > Susan.
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor@stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >
>
>