Hi,
This question regards how various gene set testing methods deal with
pre-filtered (non-specifically) data sets.

Don't genes in a gene set that did not pass a filter constitute
important evidence "against" that gene set? Not taking them into account
when calculating whatever gene set summary statistic seems wrong (e.g.
as recommended in chapters 13 & 14 of "Bioconductor Case Studies", if I
read correctly). To put it differently, excluding a gene in the summary
statistic calculation "because it is likely not to be interesting" seems
different than excluding it because it was not on the chip in the first
place.
Best,
François Lefebvre
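
To make the distinction raised above concrete, here is a minimal sketch of a one-sided hypergeometric (over-representation) test run under two gene universes: only the genes that passed the filter, versus every gene on the chip. All counts are hypothetical and chosen purely for illustration, the function name `hypergeom_upper_tail` is made up for this example, and this is not the procedure from "Bioconductor Case Studies".

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from a universe of N containing K set members."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical counts (assumptions, not data from any real experiment).
chip_genes = 10000     # genes on the chip
passed_filter = 4000   # genes surviving the non-specific filter
set_on_chip = 100      # gene-set members present on the chip
set_passed = 30        # gene-set members surviving the filter
selected = 200         # genes called differentially expressed
set_selected = 8       # selected genes that belong to the gene set

# Universe = filtered genes: set members that failed the filter are ignored.
p_filtered = hypergeom_upper_tail(set_selected, set_passed, selected, passed_filter)

# Universe = whole chip: filtered-out set members count as "not selected".
p_chip = hypergeom_upper_tail(set_selected, set_on_chip, selected, chip_genes)

print(f"filtered universe: p = {p_filtered:.3g}")
print(f"chip universe:     p = {p_chip:.3g}")
```

With these made-up counts the filtered universe yields the smaller p-value, because the 70 set members removed by the filter no longer count against the set. Whether that is appropriate is exactly the question being asked.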
Does that mean that there may be valid and invalid reasons for
excluding or filtering out genes prior to statistical filtering
(e.g. a t-test)? Some examples are low expression values throughout the
dataset, and low variation across the dataset (i.e. likely to be
'housekeeping-type' genes). I would say it's fine to exclude these if
you are fully aware of the risks and you're just looking for a few
genes for an informal study. Whether this is acceptable for
publication-standard data is a separate question, and I don't know the
answer. What do the MIAME chaps say about this? Are there any
guidelines there?
Matt
----------------------
Matthew Arno, Ph.D.
Genomics Centre Manager
King's College London
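
As a rough illustration of the two filtering criteria mentioned in this reply (low expression throughout, low variation across samples), the sketch below applies both to a toy expression table. The gene names, values, and cutoffs are hypothetical, the values are assumed to be on a log2 scale, and `nonspecific_filter` is a made-up helper, not a Bioconductor function.

```python
from statistics import pvariance


def nonspecific_filter(expr, min_expr, min_var):
    """Return gene IDs whose peak expression and variance both reach the cutoffs.

    No sample labels are used, which is what makes the filter non-specific.
    """
    kept = []
    for gene, values in expr.items():
        if max(values) >= min_expr and pvariance(values) >= min_var:
            kept.append(gene)
    return kept


# Toy data (assumed log2 intensities across four samples).
expr = {
    "geneA": [2.1, 2.0, 2.2, 2.1],  # low expression throughout
    "geneB": [9.0, 9.1, 9.0, 9.1],  # expressed but nearly constant
    "geneC": [6.0, 9.5, 6.2, 9.8],  # expressed and variable
}

kept = nonspecific_filter(expr, min_expr=5.0, min_var=0.5)
print(kept)
```

Genes dropped at this step never reach the later test, so the choice of cutoffs also determines which genes a downstream gene set method can see.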
>-----Original Message-----
>From: bioconductor-bounces at r-project.org
>[mailto:bioconductor-bounces at r-project.org] On Behalf Of François Lefebvre
>Sent: 14 June 2011 20:59
>To: bioconductor at r-project.org
>Subject: [BioC] Pre-filtering and the gene universe of gene set tests in
>microarray analysis
>
>Hi,
>
>This question regards how various gene set testing methods deal with
>pre-filtered (non-specifically) data sets.
>
>Don't genes in a gene set that did not pass a filter constitute
>important evidence "against" that gene set? Not taking them into account
>when calculating whatever gene set summary statistic seems wrong (e.g.
>as recommended in chapters 13 & 14 of "Bioconductor Case Studies", if I
>read correctly). To put it differently, excluding a gene in the summary
>statistic calculation "because it is likely not to be interesting" seems
>different than excluding it because it was not on the chip in the first
>place.
>Best,
>François Lefebvre