Updated affy package? ReadAffy/ AffyBatch has changed and breaks my pipeline!
0
0
Entering edit mode
Benjamin Otto ▴ 830
@benjamin-otto-1519
Last seen 10.3 years ago
Hi Benjamin, Jim, Martin, Is this problem sorted out already? I seem to have the same/similar problem with a code that has been working under R 2.12 and now breaks under 2.14. However I do use the standard human HG-U133-Plus2 package, so I don't think it's a non-Bioc-package problem. Rather I suppose something has changed either in the current handling by R or in the HG-U133-Plus2 package. It's an old code originally posted by Ariel Chernomoretz here in the mailing list. Basically, as with Benjamin's code it removes whole probesets or single probes from the environment. The error code I get is (translated from german to english): >> cannot change value of locked binding for 'hgu133plus2probe' As, in contrast to the CDF 'hgu133plus2cdf', the probe package is has not environment class I cannot change/unlock any binding with unlockBinding(). Here is my sessionInfo() :: R version 2.14.1 (2011-12-22) Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) locale: [1] de_DE.UTF-8/de_DE.UTF-8/de_DE.UTF-8/C/de_DE.UTF-8/de_DE.UTF-8 attached base packages: [1] tcltk stats graphics grDevices utils datasets methods [8] base other attached packages: [1] hgu133plus2probe_2.9.0 hgu133plus2cdf_2.9.1 AnnotationDbi_1.16.11 [4] altcdfenvs_2.16.0 hypergraph_1.26.0 graph_1.32.0 [7] Biostrings_2.22.0 IRanges_1.12.5 makecdfenv_1.32.0 [10] simpleaffy_2.30.0 gcrma_2.26.0 genefilter_1.36.0 [13] affydata_1.11.15 affy_1.32.1 Biobase_2.14.0 loaded via a namespace (and not attached): [1] affyio_1.22.0 annotate_1.32.1 BiocInstaller_1.2.1 [4] DBI_0.2-5 preprocessCore_1.16.0 RSQLite_0.11.1 [7] splines_2.14.1 survival_2.36-10 tools_2.14.1 [10] xtable_1.6-0 zlibbioc_1.0.0 And here is Ariel's Code # # Ariel Chernomoretz, Ph.D. # Centre de recherche du CHUL # 2705 Blv Laurier, bloc T-367 # Sainte-Foy, Qc # G1V 4G2 # (418)-525-4444 ext 46339 # ###################################################################### ######### # # RemoveProbes # goal: Modification of CDF and PROBE environments # in : # listOutProbes: list of probes to be removed from cdf and probe # environments (e.g. c("1001_at1","1032_at6"). # If NULL no probe will be taken out. # listOutProbeSets: list of probesets to be removed from cdf and probe # environments (e.g. c("1006_at","1032_f_at"). # If NULL no probeset will be removed. # cdfpackagename: (e.g. "hgu95av2cdf") # probepackagename: (e.g. "hgu95av2probe") # destructive: unimplemented option, see NOTE # # out : --- # # NOTE 1: After a call to RemoveProbes the probenames reported by # pm and mm accessing functions implemented in the affy package, will # be in general differents from the original ones in probesets where # probes have been removed. This happens as in this functions the probe names # are always assigned sequentially. # RemoveProbes modifies the specified CDF and PROBE environments # in a consistent BUT destructive way. Take this in consideartion if # your code relays on absolute references to probe names. # # A chunk of code to illustrate this # # library(affy) # library(affydata) # source("removeProbes.R") # # data(Dilution) # Dilution@cdfName<-"hgu95av2" # fix cdf name # # cleancdf <- cleancdfname(Dilution@cdfName,addcdf=FALSE) # cdfpackagename <- paste(cleancdf,"cdf",sep="") # probepackagename <- paste(cleancdf,"probe",sep="") # # ResetEnvir(cdfpackagename,probepackagename) # pm(Dilution,"1000_at") # as.data.frame(get(probepackagename))[1:16,1:4] # # RemoveProbes(c("1000_at2"),NULL,cdfpackagename,probepackagename) # pm(Dilution,"1000_at") # as.data.frame(get(probepackagename))[1:15,1:4] # # NOTE2 : See my April 20 post to BioC mailing list (and eventually its # continuation) regarding differences reported # between GCRMA 1.1.0 and GCRMA 1.1.3 # RemoveProbes<-function(listOutProbes=NULL, listOutProbeSets=NULL, cdfpackagename,probepackagename,cleancdf,destructive=TRUE){ require(tcltk) # source("fixed.indexProbes.R") #default probe dataset values probe.env.orig <- get(probepackagename) time <- system.time( if(!is.null(listOutProbes)){ # taking probes out from CDF env probes<- unlist(lapply(listOutProbes,function(x){ a<-strsplit(x,"at") aux1<-paste(a[[1]][1],"at",sep="") aux2<-as.integer(a[[1]][2]) c(aux1,aux2) })) n1<-as.character(probes[seq(1,(length(probes)/2))*2-1]) n2<-as.integer(probes[seq(1,(length(probes)/2))*2]) probes<-data.frame(I(n1),n2) probes[,1]<-as.character(probes[,1]) probes[,2]<-as.integer(probes[,2]) pset<-unique(probes[,1]) psetlen <- length(pset) #pbt <- new("ProgressBarText", as.integer(length(pset)), barsteps = as.integer(20)) pbt <- tkProgressBar(title = "Removing probes from probesets", min = 0, max = psetlen, width = 300) #open(pbt) for(i in seq(along=pset)){ ii <-grep(paste("^",pset[i],sep=""),probes[,1]) iout<-probes[ii,2] a<-get(pset[i],env=get(cdfpackagename)) a<-a[-iout,] assign(pset[i],a,env=get(cdfpackagename)) setTkProgressBar(pbt, i, label=paste( round(i/psetlen*100, 0), "% done")) #update(pbt) } close(pbt) } ) cat("Time for removing ",length(listOutProbes)," Probes: ",time[3],"\n") time <- system.time( # taking probesets out from CDF env if(!is.null(listOutProbeSets)){ rm(list=listOutProbeSets,envir=get(cdfpackagename)) } ) cat("Time for removing ",length(listOutProbeSets)," ProbeSets: ",time[3],"\n") # setting the PROBE env accordingly (idea from gcrma compute.affinities.R) cat("Setting the PROBE environment.\n ") tmp <- get("xy2i",paste("package:",cdfpackagename,sep="")) newAB <- new("AffyBatch",cdfName=cleancdf) pmIndex <- unlist(indexProbes(newAB,"pm")) subIndex<- match(tmp(probe.env.orig$x,probe.env.orig$y),pmIndex) rm(newAB) iNA <- whichis.na(subIndex)) if(length(iNA)>0){ ipos<-grep(probepackagename,search()) # browser() -> Code breaks here !!! assign(probepackagename,probe.env.orig[-iNA,],pos=ipos) } } regards Benjamin ___________________________________________ Benjamin Otto, PhD University Medical Center Hamburg-Eppendorf Center for Diagnostics Institute For Clinical Chemistry / Central Laboratories Center for Internal Medicine I. Department of Internal Medicine Campus Forschung N27 Martinistr. 52, D-20246 Hamburg Tel.: +49 40 7410 51908 Fax.: +49 40 7410 54971 ___________________________________________ On 02/06/2012 07:35 AM, James W. MacDonald wrote: > Hi Benjamin, > > You need to supply us with a small reproducible example of the problem > you are experiencing, so we can either track down the error, or explain > how you can fix things on your end. The packages asprgdtua520520fcdf, asprgdtua520520fprobe are not Bioconductor, and seem like a likely source of problems, so you will want to make sure that these (a) have not changed unexpectedly and (b) are available to people like Jim. Martin > > Best, > > Jim > > > > On 2/6/2012 7:43 AM, Benjamin Høyer wrote: >> Hi >> >> I have made a pipeline for automating analysis of custom affymetrix >> chips for a colleague. After an R-update, the pipeline no longer >> runs. Working backwards from where the error occurs, I have found >> that the structure produced by ReadAffy() is no longer the same. I am >> using exactly the same data set. (The error itself a failed assign() >> due to a locked binding, but I don't think that's related.) >> >> Below are the lists of packages installed before and after the update, >> with version numbers. The 'old' R version is 2.13.2, the 'new' one is >> 2.14.1. >> >> OldPackVers<- structure(c("1.30.0", "1.20.0", "1.28.5", "1.14.1", >> "1.30.0", >> "0.0.1", "0.0.1", "2.13.2", "2.12.2", "2.20.4", "1.0-4.1", "1.3-2", >> "1.12", "7.3-3", "1.14.1", "0.2-8", "2.13.2", "1.6.2", "2.13.2", >> "0.2-5", "1.30.0", "1.6", "1.2.8", "0.8-48", "2.24.1", "2.8.2", >> "1.2.5", "2.10.1", "2.13.2", "2.13.2", "2.13.2", "2.6.2", "1.10.6", >> "2.23-6", "0.19-33", "3.8.3", "1.1.7", "1.30.0", "1.30.0", "7.3-14", >> "1.0-3", "2.13.2", "2.10.0", "1.7-11", "3.1-103", "7.3-1", "1.14.0", >> "3.1-51", "0.11.1", "7.3-3", "2.13.2", "2.13.2", "2.13.2", "2.36-10", >> "2.13.2", "1.24.0", "1.30.0", "2.13.2", "2.13.2", "1.30.0"), .Names = >> c("affy", >> "affyio", "affyPLM", "AnnotationDbi", "asprgdtua520520fcdf", >> "asprgdtua520520fprobe", "bac01a520746fprobe", "base", "Biobase", >> "Biostrings", "bitops", "boot", "caTools", "class", "cluster", >> "codetools", "compiler", "corpcor", "datasets", "DBI", "DynDoc", >> "e1071", "fdrtool", "foreign", "gcrma", "gdata", "GeneNet", "gplots", >> "graphics", "grDevices", "grid", "gtools", "IRanges", "KernSmooth", >> "lattice", "limma", "longitudinal", "makecdfenv", "marray", "MASS", >> "Matrix", "methods", "Mfuzz", "mgcv", "nlme", "nnet", "preprocessCore", >> "rpart", "RSQLite", "spatial", "splines", "stats", "stats4", >> "survival", "tcltk", "timecourse", "tkWidgets", "tools", "utils", >> "widgetTools")) >> >> NewPackVers<- structure(c("1.32.1", "1.22.0", "1.30.0", "1.16.11", >> "1.32.0", >> "0.0.1", "2.14.1", "2.14.0", "1.2.1", "2.22.0", "1.0-4.1", "1.3-4", >> "1.12", "7.3-3", "1.14.1", "0.2-8", "2.14.1", "1.6.2", "2.14.1", >> "0.2-5", "1.32.0", "1.6", "1.2.8", "0.8-48", "2.26.0", "2.8.2", >> "1.2.5", "2.10.1", "2.14.1", "2.14.1", "2.14.1", "2.6.2", "1.12.5", >> "2.23-7", "0.20-0", "3.10.2", "1.1.7", "1.32.0", "1.32.0", "7.3-16", >> "1.0-3", "2.14.1", "2.12.0", "1.7-13", "3.1-103", "7.3-1", "2.14.1", >> "1.16.0", "3.1-51", "0.11.1", "7.3-3", "2.14.1", "2.14.1", "2.14.1", >> "2.36-10", "2.14.1", "1.26.0", "1.32.0", "2.14.1", "2.14.1", >> "1.32.0", "1.0.0"), .Names = c("affy", "affyio", "affyPLM", >> "AnnotationDbi", >> "asprgdtua520520fcdf", "asprgdtua520520fprobe", "base", "Biobase", >> "BiocInstaller", "Biostrings", "bitops", "boot", "caTools", "class", >> "cluster", "codetools", "compiler", "corpcor", "datasets", "DBI", >> "DynDoc", "e1071", "fdrtool", "foreign", "gcrma", "gdata", "GeneNet", >> "gplots", "graphics", "grDevices", "grid", "gtools", "IRanges", >> "KernSmooth", "lattice", "limma", "longitudinal", "makecdfenv", >> "marray", "MASS", "Matrix", "methods", "Mfuzz", "mgcv", "nlme", >> "nnet", "parallel", "preprocessCore", "rpart", "RSQLite", "spatial", >> "splines", "stats", "stats4", "survival", "tcltk", "timecourse", >> "tkWidgets", "tools", "utils", "widgetTools", "zlibbioc")) >> >> The differences in structure are thus: >> >> datOld<- ReadAffy(celfile.path=subdircel) >> datOld >>> AffyBatch object >>> size of arrays=716x716 features (24 kb) >>> cdf=AsprgDTUa520520F (11186 affyids) >>> number of samples=27 >>> number of genes=11186 >>> annotation=asprgdtua520520f >>> notes= >> datNew<- ReadAffy(celfile.path=subdircel) >> datNew >>> AffyBatch object >>> size of arrays=716x716 features (15 kb) >>> cdf=AsprgDTUa520520F (11186 affyids) >>> number of samples=27 >>> number of genes=11186 >>> annotation=asprgdtua520520f >>> notes= >> >> The particular slot which causes trouble is phenoData; >> datOld <at> phenoData >>> An object of class "AnnotatedDataFrame" >>> sampleNames: 1a_N6h_ t1.CEL, 1b_N6h_t2.CEL, ..., 9c_N192h_t3.CEL (27 >>> total) >>> varLabels and varMetadata description: >>> sample: arbitrary numbering >> datNew <at> phenoData >>> An object of class "AnnotatedDataFrame" >>> sampleNames: 1a_N6h_ t1.CEL 1b_N6h_t2.CEL ... 9c_N192h_t3.CEL (27 total) >>> varLabels: sample >>> varMetadata: labelDescription >> >> Is this a likely culprit? Another option could be how 'get' works. >> Basically my function 'get's the probe package name (the script >> follows Gillespie's doi:10.1186/1756-0500-3-81). In the first case, I >> get >> >> get(probepackagename) >>> Object of class probetable data.frame with 121507 rows and 6 columns. >> And in the second case, I get: >>> Object of class probetable data.frame with 468384 rows and 6 columns. >>> First 3 rows are: >>> sequence x y Probe.Set.Name Probe.Interrogation.Position >>> 1<seq1> 5 1<name1> 13 >>> 2<seq2> 6 1<name2> 13 >>> 3<seq3> 7 1<name3> 13 >>> Target.Strandedness >>> 1 Antisense >>> 2 Antisense >>> 3 Antisense >> Basically, I would like to know what has changed in the newer versions >> that I have! Could also be that the 'probepackagename' (which is >> generated earlier in the script) has changed. This would be the >> result of an update in makeProbePackage() in AnnotationDbi. >> >> I'll keep looking for ways to get round it, but thanks in advance! >> >> Regards, >> >> _______________________________________________ >> Bioconductor mailing list >> Bioconductor@... >> https://stat.ethz.ch/mailman/listinfo/bioconductor >> Search the archives: >> http://news.gmane.org/gmane.science.biology.informatics.conductor > -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793 -- Pflichtangaben gemäß Gesetz über elektronische Handelsregister und Genossenschaftsregister sowie das Unternehmensregister (EHUG): Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; Gerichtsstand: Hamburg Vorstandsmitglieder: Prof. Dr. Guido Sauter (Vertreter des Vorsitzenden), Dr. Alexander Kirstein, Joachim Prölß, Prof. Dr. Dr. Uwe Koch-Gromus [[alternative HTML version deleted]]
Cancer cdf probe affy gcrma AnnotationDbi Cancer cdf probe affy gcrma AnnotationDbi • 1.3k views
ADD COMMENT

Login before adding your answer.

Traffic: 620 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6