Hi Dhaarini,
What Peder's code shows, and the genefilter vignette does not, is what
to do with the output of genefilter(), which is a logical vector with
one element per gene. You can use this vector to subset your
expression data object like so:
> ans <- genefilter(tumor, flist)
> sum(ans) # tells you how many genes pass your filter
> tumor.filt <- tumor[ans,] # subsetting your expression object by the TRUE/FALSE vector
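To make the subsetting step concrete, here is a minimal sketch on a made-up matrix. The toy data, the gene and sample names, and the rowSums() line are assumptions for illustration: rowSums() is a base-R stand-in for kOverA(5, 10) (at least 5 values above 10), used so the example runs without the genefilter package. With the real package, ans would come from genefilter(tumor, flist) as above.

```r
# Toy expression matrix: 4 genes (rows) x 8 samples (columns).
# All values and names are invented for illustration.
tumor <- rbind(
  g1 = c(12, 12, 12, 12, 12, 12, 12, 12),
  g2 = c( 3,  3,  3,  3,  3,  3,  3,  3),
  g3 = c(11, 11, 11, 11, 11,  2,  2,  2),
  g4 = c(11, 11, 11, 11,  2,  2,  2,  2)
)
colnames(tumor) <- paste0("s", 1:8)

# Base-R stand-in for kOverA(5, 10): keep genes with at least
# k values strictly greater than A.
k <- 5
A <- 10
ans <- rowSums(tumor > A) >= k   # logical vector, one element per gene

sum(ans)                         # how many genes pass the filter

# Subset rows by the TRUE/FALSE vector; drop = FALSE keeps a matrix
# even if only one gene passes.
tumor.filt <- tumor[ans, , drop = FALSE]
rownames(tumor.filt)
```

The key point is that the filter itself never changes your data; it only tells you which rows to keep, and the row subsetting is what produces the filtered object.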
IMO, the vignette, genefilter/doc/howtogenefilter.pdf should also
give an example of how to use the output of genefilter() to subset
your expression object (hint, hint Biocore Team c/o BioC user list,
the maintainer of genefilter).
Cheers,
Jenny
At 06:12 AM 2/25/2009, Peder Worning wrote:
>Hi Dhaarini,
>
>Filtering genes is a delicate but important matter, and you can filter
>them on low values and low variance. Expression values that do not change
>across your samples are not very informative.
>
>I have made my own function that uses the genefilter package and
>combines filters on low values, NAs, low variation and range. I use it, but I
>always try it out with different parameters to see what happens to my
>data.
>I am working with microRNA arrays and my data are on a log scale, but the
>principles should be the same.
>
>Here is the code; beware of line breaks introduced by Outlook:
>
>library(genefilter)  # provides genefilter() and kOverA()
>
>Data.filter <- function(e.matrix, kk = as.integer(ncol(e.matrix)/8), aa = 7,
>                        na = 5, var = 0.1, er = 300) {
>  # Takes an expression matrix with genes in rows and samples in columns
>  # and filters out genes that do not meet the criteria:
>  #   kk   minimal number of values > aa
>  #   na   maximum number of NAs
>  #   var  minimal variance of the values
>  #   er   minimal range of 2^values
>  keep <- genefilter(e.matrix, kOverA(k = kk, A = aa, na.rm = TRUE))
>  e.matrix.f <- e.matrix[keep, , drop = FALSE]
>  # Count the NAs per gene and drop genes with too many
>  nna <- apply(e.matrix.f, 1, function(x) sum(is.na(x)))
>  e.matrix.f <- e.matrix.f[nna <= na, , drop = FALSE]
>  # Drop genes whose variance is below the threshold
>  rvar <- apply(e.matrix.f, 1, function(x) var(x, na.rm = TRUE))
>  e.matrix.f <- e.matrix.f[rvar >= var, , drop = FALSE]
>  # Drop genes whose range on the unlogged scale is too small
>  exp.range <- apply(e.matrix.f, 1,
>                     function(x) 2^max(x, na.rm = TRUE) - 2^min(x, na.rm = TRUE))
>  e.matrix.f <- e.matrix.f[exp.range > er, , drop = FALSE]
>  e.matrix.f
>}
>
>Good luck
>Peder
>
>Best regards
>
>Exiqon A/S
>
> Peder Worning, Ph.D.
>
>Senior Scientist, Biomarker Discovery
>
>-----Original Message-----
>From: bioconductor-bounces at stat.math.ethz.ch
>[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of dhaarini s
>Sent: Wednesday, February 25, 2009 9:40 AM
>To: bioconductor at stat.math.ethz.ch
>Subject: [BioC] genefilter displaying the expression set
>
>Hi all!
>I am new to R and Bioconductor. I have a dataset of 22283 genes and 190
>samples. Due to the huge size of the data, I want to filter out some
>irrelevant genes. I tried the "genefilter" package of BioC, but then
>understood that it does gene filtering by simply displaying whether each
>gene satisfies the filter condition or not, marking it as TRUE. This is
>how I proceeded:
> > library(genefilter)
> > f1 <- kOverA(5, 10)
> > flist <- filterfun(f1)
> > ans <- genefilter(tumor, flist)
>(The object "tumor" contains my expression dataset.) The output is
>something like this:
>"x"
>"1007_s_at" TRUE
>"1053_at" FALSE
>"117_at" FALSE
>"121_at" FALSE
>"200001_at" TRUE
>"200002_at" TRUE
>..........................
>But I would like to know whether genefilter will return me an expression
>set containing the filtered genes and their expression values for the
>samples. Please help me out!
>Thanks in advance.
>Regards,
>Dhaarini
>
>
>_______________________________________________
>Bioconductor mailing list
>Bioconductor at stat.math.ethz.ch
>https://stat.ethz.ch/mailman/listinfo/bioconductor
>Search the archives:
>http://news.gmane.org/gmane.science.biology.informatics.conductor
>
Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801
USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at illinois.edu