Hi Justin,
Some time ago I posted some code that you may find useful.
Basically it was a function to remove a list of probes and/or
probesets from CDF and Probe environments. Usual calls to any affy
processing method performed afterwards, will disregard
the 'offending' probes.
It was originally intended to handle cdf and probes environments at
the
same time. I slightly modified the code to allow you working only
with the cdf environment (as I think it's your interest). I am
including here
the updated R code that should be sourced.
Please refer also to previous posts
http://files.protsuggest.org/biocond/html/7350.html
http://files.protsuggest.org/biocond/html/7366.html
http://files.protsuggest.org/biocond/html/7367.html
Regards,
Ariel./
# removeProbes v 0.2
# Now it is not necessary to provide a probe environment name.
# If probepackagename=NULL only the cdf environment is (re)-loaded
# with ResetEnvir, and no attempt to modify a probe environment is
performed
# in RemoveProbes
# (of course, probe env SHOULD be modified if gcrma processing is
wanted)
#
# ResetEnvir
# goal:Detach, if necessary, cdf and probe environments,
# and (re)load them.
# in :
# cdfpackagename (e.g. "hgu95av2cdf")
# probepackagename (e.g. "hgu95av2probe")
# out : ---
#
ResetEnvir<-function(cdfpackagename,probepackagename){
ll<-search()
cdfpackagepos <- grep(cdfpackagename,ll)
if(length(cdfpackagepos)>0) detach(pos=cdfpackagepos)
library(cdfpackagename,character.only=T)
if(!is.null(probepackagename)){
ll<-search()
probepackagepos <- grep(probepackagename,ll)
if(length(probepackagepos)>0) detach(pos=probepackagepos)
library(probepackagename,character.only=T)
}
}
#
# RemoveProbes
# goal: Modification of CDF and PROBE environments
# in :
# listOutProbes: list of probes to be removed from cdf and probe
# environments (e.g. c("1001_at1","1032_at6").
# If NULL no probe will be taken out.
# listOutProbeSets: list of probesets to be removed from cdf and
probe
# environments (e.g. c("1006_at","1032_f_at").
# If NULL no probeset will be removed.
# cdfpackagename: (e.g. "hgu95av2cdf")
# probepackagename: (e.g. "hgu95av2probe")
# destructive: unimplemented option, see NOTE
#
# out : ---
#
# NOTE 1: After a call to RemoveProbes the probenames reported by
# pm and mm accessing functions implemented in the affy package,
will
# be in general differents from the original ones in probesets
where
# probes have been removed. This happens as in this functions
the probe
names
# are always assigned sequentially.
# RemoveProbes modifies the specified CDF and PROBE environments
# in a consistent BUT destructive way. Take this in
consideartion if
# your code relays on absolute references to probe names.
#
# A chunk of code to illustrate this
#
# library(affy)
# library(affydata)
# source("removeProbes.R")
#
# data(Dilution)
# Dilution at cdfName<-"hgu95av2" # fix cdf name
#
# cleancdf <- cleancdfname(Dilution at
cdfName,addcdf=FALSE)
# cdfpackagename <- paste(cleancdf,"cdf",sep="")
# probepackagename <- paste(cleancdf,"probe",sep="")
#
# ResetEnvir(cdfpackagename,probepackagename)
# pm(Dilution,"1000_at")
# as.data.frame(get(probepackagename))[1:16,1:4]
#
#
RemoveProbes(c("1000_at2"),NULL,cdfpackagename,probepackagename)
# pm(Dilution,"1000_at")
# as.data.frame(get(probepackagename))[1:15,1:4]
#
# NOTE2 : See my April 20 post to BioC mailing list pots (and
eventually its
# continuation) regarding differences reported
# between GCRMA 1.1.0 and GCRMA 1.1.3
#
RemoveProbes<-function(listOutProbes=NULL,
listOutProbeSets=NULL,
cdfpackagename,probepackagename,destructive=TRUE){
#default probe dataset values
if(!is.null(probepackagename)){
probe.env.orig <- get(probepackagename)
}
if(!is.null(listOutProbes)){
# taking probes out from CDF env
probes<- unlist(lapply(listOutProbes,function(x){
a<-strsplit(x,"at")
aux1<-paste(a[[1]][1],"at",sep="")
aux2<-as.integer(a[[1]][2])
c(aux1,aux2)
}))
n1<-as.character(probes[seq(1,(length(probes)/2))*2-1])
n2<-as.integer(probes[seq(1,(length(probes)/2))*2])
probes<-data.frame(I(n1),n2)
probes[,1]<-as.character(probes[,1])
probes[,2]<-as.integer(probes[,2])
pset<-unique(probes[,1])
for(i in seq(along=pset)){
ii <-grep(pset[i],probes[,1])
iout<-probes[ii,2]
a<-get(pset[i],env=get(cdfpackagename))
a<-a[-iout,]
assign(pset[i],a,env=get(cdfpackagename))
}
}
# taking probesets out from CDF env
if(!is.null(listOutProbeSets)){
rm(list=listOutProbeSets,envir=get(cdfpackagename))
}
# setting the PROBE env accordingly (idea from gcrma
compute.affinities.R)
if(!is.null(probepackagename)){
tmp <- get("xy2i",paste("package:",cdfpackagename,sep=""))
newAB <- new("AffyBatch",cdfName=cleancdf)
pmIndex <- unlist(indexProbes(newAB,"pm"))
subIndex<- match(tmp(probe.env.orig$x,probe.env.orig$y),pmIndex)
rm(newAB)
iNA <- whichis.na(subIndex))
if(length(iNA)>0){
ipos<-grep(probepackagename,search())
assign(probepackagename,probe.env.orig[-iNA,],pos=ipos)
}
}
}
On July 15, 2005 03:45 pm, Justin Borevitz wrote:
> Hi Ben et al.
>
> We're working with new tiling arrays for Arabidopsis and
drosophila. We
> can create a new cdfenv for the arrays and would like to read in
new
> .CEL files to execute bg.correct and normalize.quantiles. We would
like
> to mask many features that are not unique in the genome and are
> considering using the rm.extra option in read.affybatch. How should
this
> be specified? Can we feed a matrix of T/F of
dim(array.size,array.size)
> for which features to use? How are these setting specified
externally?
>
> rm.mask = FALSE, rm.outliers = FALSE, rm.extra = FALSE
>
> Another option we considered is to use bg.correct on a matrix of cel
> intensities rather than on an affy.batch, how could I call
> bg.correct(matrixofCelintensities)?
>
> Thanks,
> Justin
>
> -----
> Justin Borevitz
> Ecology and Evolution (CLSC 915E)
> University of Chicago
> 1101 E. 57th St.
> Chicago, IL 60637 USA
> (773) 702-5948 office
> (773) 834-4055 lab
> borevitz at uchicago.edu
>
http://naturalvariation.org
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
>
https://stat.ethz.ch/mailman/listinfo/bioconductor
--
Ariel Chernomoretz, Ph.D.
Centre de recherche du CHUL
2705 Blv Laurier, bloc T-367
Sainte-Foy, Qc
G1V 4G2
(418)-525-4444 ext 46339