Hi Adrian,
First, please don't take things off the BioC list.
Adrian Johnson wrote:
> Hi Jim,
> I am not sure I have put this in terms a statistician would consider
> correct (I am a biology student). Say I want to find the prognostic
> potential of the top 100 genes differentially expressed between normal
> and cancer tissues. I have the survival data, remission status, sex
> and other covariate information.
>
> In a tutorial by Drs. Gentleman and Dudoit and others, they used
> bootstrap-based MTP and Cox t-statistics from the multtest package to
> associate gene expression measures with survival data. (Website:
> www.stat.berkeley.edu/~sandrine/Docs/Talks/MBI04/mbi.html)
>
> If I am not mistaken, the aim there was to identify differentially
> expressed genes (using either "f" or "t" statistics) on a filtered
> expression matrix derived from RMA on an Affy study. The filter is:
> a. the coefficient of variation is between 0.7 and 10; b. at least 20%
> of the samples have a measured intensity of at least 100 (100 on the
> linear scale).
> (http://www.bioconductor.org/workshops/2006/BioC2006/labs/kdhansen/multtest.html
> at section 1: getting started)
That was the goal of that particular workshop, but they didn't mix the
two (t-tests and survival analysis).
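For reference, the filter quoted above can be sketched in a few lines of base R. This is only an illustration on simulated intensities: the matrix `expr` and its dimensions are made up, and it uses plain base R rather than whatever code the workshop itself used.

```r
set.seed(1)

# Hypothetical data: 50 genes x 10 samples on the linear (unlogged) scale.
expr <- matrix(2^rnorm(50 * 10, mean = 7, sd = 2), nrow = 50)

# a. coefficient of variation (sd / mean) between 0.7 and 10
cv <- apply(expr, 1, sd) / rowMeans(expr)
keep_cv <- cv >= 0.7 & cv <= 10

# b. at least 20% of the samples with an intensity of at least 100
keep_int <- rowMeans(expr >= 100) >= 0.2

# Keep only the genes that pass both criteria.
keep <- keep_cv & keep_int
filtered <- expr[keep, , drop = FALSE]
dim(filtered)
```

Note that neither criterion looks at the cancer/normal labels or at the survival data, so this is a non-specific filter.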
>
> The step above seems dated; instead, I wanted to test the
> prognostic potential of the top 100 genes filtered on adj.P.Val (be
> it the BH method) from the limma topTable function on an eBayes fit
> object, by applying the Cox proportional hazards model to those 100
> genes using ER or mutation status and survival data.
>
> The hypothesis is that the differentially expressed genes between
> cancer and normal samples are prognostic genes. Instead of applying
> a Cox model to every row of the gene expression matrix, I want to
> apply it only to the genes that I know are differentially expressed.
> I have no idea how this can be done.
The problem with your methodology, IMO, is that a gene may be
differentially expressed between cancer and normal yet have no
prognostic ability with respect to survival.
Two examples:

Normal - c(4.5, 4.1, 4.7, 4.5)
Cancer - c(6.8, 7.2, 7.3, 6.6)
Surv.time - c(3, 4.5, 15, 20) ## months

These are likely significantly different, but I doubt there would be any
significance for the cancer samples in a Cox model.

Normal - c(4.5, 4.1, 4.7, 4.5)
Cancer - c(8.3, 6.4, 5.1, 3.4)
Surv.time - c(3, 4.5, 15, 20) ## months

These might be significantly different between cancer and normal
(probably not), but the Cox model would likely have a very small
p-value.
Granted, these are probably extreme examples, but the point here is that
the t-test is probably not the best way to filter genes for a survival
analysis.
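
The two examples can be checked with a rough sketch along these lines. It assumes that all four cancer patients died (no censoring), which the examples do not state, and it uses the likelihood-ratio test from coxph() in the survival package; the names gene1, gene2 and compare are only for illustration.

```r
library(survival)

normal    <- c(4.5, 4.1, 4.7, 4.5)
surv_time <- c(3, 4.5, 15, 20)   # months, cancer samples only
status    <- rep(1, 4)           # ASSUMPTION: every patient died (no censoring)

gene1 <- c(6.8, 7.2, 7.3, 6.6)   # first example
gene2 <- c(8.3, 6.4, 5.1, 3.4)   # second example

compare <- function(cancer) {
  # Welch two-sample t-test, cancer vs normal
  t_p <- t.test(cancer, normal)$p.value

  # Cox model on the cancer samples only. For gene2 the expression
  # orders the survival times perfectly, so the coefficient diverges and
  # coxph() warns; the Wald test is unreliable there, but the
  # likelihood-ratio test is still usable.
  d <- data.frame(time = surv_time, status = status, expr = cancer)
  fit <- suppressWarnings(coxph(Surv(time, status) ~ expr, data = d))
  lrt <- 2 * diff(fit$loglik)    # 2 * (fitted - null partial log-likelihood)
  cox_p <- pchisq(lrt, df = 1, lower.tail = FALSE)

  c(t_test_p = t_p, cox_lrt_p = cox_p)
}

results <- rbind(gene1 = compare(gene1), gene2 = compare(gene2))
print(signif(results, 2))
```

With only four cancer samples any p-value here is rough at best; the sketch is just meant to show the two tests answering different questions.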
Best,
Jim
>
> Is my question still valid, or is it still a naive way of connecting two
> totally different things? I appreciate your suggestions and help.
>
> thank you.
>
> Adrian
>
> On Thu, Nov 13, 2008 at 9:01 AM, James W. MacDonald
> <jmacdon at med.umich.edu> wrote:
>> Hi Adrian,
>>
>> Adrian Johnson wrote:
>>> Dear group,
>>>
>>> I have two types of samples (cancer and normal) with covariate data
>>> including survival times.
>>>
>>> I applied limma and filtered genes that are significantly
>>> differentially expressed between cancer and normal. Say I have 500
>>> genes after filtering on adj.P.Val (using BH).
>>>
>>> Is it meaningful to apply coxfilter on those 500 genes (by supplying
>>> expression values for those 500 genes and survival times for all
>>> samples) instead of using the kOverA filter?
>> What is the hypothesis being tested here? A t-test and a Cox model test for
>> completely different things, so I don't see why you would follow one with
>> the other.
>>
>> Best,
>>
>> Jim
>>> Thanks
>>> Ad.
>>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> Hildebrandt Lab
>> 8220D MSRB III
>> 1150 W. Medical Center Drive
>> Ann Arbor MI 48109-0646
>> 734-936-8662
>>
--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-5646
734-936-8662