How to use datapackage homology

Claudio Lottaz ▴ 110

@claudio-lottaz-211

Last seen 10.3 years ago

Hi all,

I tried to determine probesets of hgu133a which correspond to transcripts which are homologous to all rat transcripts measured on rae230a. I thought using the homology datapackage would be a good idea but didn't find any HGIDs in rae230aHGID for which entries in homology10116HGID2HGID exist in the released datapackages. While trying the development data packages I discovered that the rae230aHGID hash is gone in version 1.6.2 of rae230a.

Any suggestions?

Claudio

--
Claudio Lottaz
Computational Diagnostics Group
Department for Computational Molecular Biology
Max-Planck-Institute for Molecular Genetics
Ihnestr. 73, D-14195 Berlin (Germany)
office: 2.116, phone: +49 30 8413 1177, fax: +49 30 8413 1176

hgu133a rae230a • 1.1k views

ADD COMMENT • link updated 20.4 years ago by rgentleman ★ 5.5k • written 20.4 years ago by Claudio Lottaz ▴ 110

John Zhang ★ 2.9k

@john-zhang-6

Last seen 10.3 years ago

> I tried to determine probesets of hgu133a which correspond to transcripts
> which are homologous to all rat transcripts measured on rae230a. I thought
> using the homology datapackage would be a good idea but didn't find any
> HGIDs in rae230aHGID for which entries in homology10116HGID2HGID exist in
> the released datapackages. While trying the development data packages I
> discovered that the rae230aHGID hash is gone in version 1.6.2 of rae230a.

The development version (1.6.2) of the homology package has a homologyLL2HGID environment that maps LocusLink IDs to HGIDs. The HGID environment has been removed from version 1.6.2 of the chip packages because the LOCUSID environment can be used to make the link: go from probe set to LocusLink ID with LOCUSID, then from LocusLink ID to HGID with homologyLL2HGID.

> Any suggestions?
> Claudio
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@stat.math.ethz.ch
> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor

Jianhua Zhang
Department of Biostatistics
Dana-Farber Cancer Institute
44 Binney Street
Boston, MA 02115-6084

ADD COMMENT • link 20.4 years ago John Zhang ★ 2.9k

rgentleman ★ 5.5k

@rgentleman-7725

Last seen 9.6 years ago

United States

Hi,

And so, using version 1.6.2 of all data packages:

    ## load the data packages (assumed to be installed)
    library(rae230a)
    library(homology)

    v1 = unlist(as.list(rae230aLOCUSID))
    ## drop those with no mappings
    v2 = v1[!is.na(v1)]
    ## get the homology codes
    ## at most one per LL so we can unlist
    v3 = unlist(mget(as.character(v2), homologyLL2HGID, ifnotfound=NA))
    ## see how many we got
    sum(is.na(v3))
    length(v3)
    ## I get about 11000

Now you have them. Note that there is a vignette on using the homology data in the annotate package; any comments or contributions would be welcome.

Robert

--
Robert Gentleman
Associate Professor
Department of Biostatistics
Harvard School of Public Health
phone: (617) 632-5250, fax: (617) 632-2444, office: M1B20
email: rgentlem@jimmy.harvard.edu
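
For readers without the 1.6.2 data packages, the same two-step lookup can be tried on toy data. The sketch below is only an illustration: the environments are small hand-made stand-ins that reuse the names rae230aLOCUSID and homologyLL2HGID, and every probe set ID, LocusLink ID and HGID in it is invented. It also carries the probe set names through to the result, which the code above does not do.

```r
# Toy stand-ins for the annotation environments; all IDs are invented.
rae230aLOCUSID <- new.env()
assign("1367452_at", 24152L, envir = rae230aLOCUSID)
assign("1367453_at", NA_integer_, envir = rae230aLOCUSID)  # no LocusLink ID
assign("1367454_at", 29366L, envir = rae230aLOCUSID)
assign("1367455_at", 81000L, envir = rae230aLOCUSID)       # no homology entry

homologyLL2HGID <- new.env()
assign("24152", 1001L, envir = homologyLL2HGID)
assign("29366", 1002L, envir = homologyLL2HGID)

# Step 1: probe set -> LocusLink ID, dropping unmapped probe sets.
ll <- unlist(as.list(rae230aLOCUSID))
ll <- ll[!is.na(ll)]

# Step 2: LocusLink ID -> HGID; ifnotfound = NA keeps one slot per key,
# so the result stays aligned with ll.
hgid <- unlist(mget(as.character(ll), envir = homologyLL2HGID, ifnotfound = NA))

# Label each HGID with its probe set so it can be traced back.
names(hgid) <- names(ll)
mapped <- hgid[!is.na(hgid)]
print(mapped)
```

Because every key yields exactly one value (an HGID or NA), `hgid` has the same length and order as `ll`, which is what makes the `names(hgid) <- names(ll)` step safe here. If a LocusLink ID could map to several HGIDs, that assumption would no longer hold.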

ADD COMMENT • link 20.4 years ago rgentleman ★ 5.5k

Hello there,

I have been puzzling over the help file for 'read.maimages' from the limma package when it is used for generic import.

I have extracted the data portion of my QuantArray files, since a lot of each file just wastes space on the hard disk. Now I want to import the result, but the data section looks like a regular rectangular tab-delimited file. Is there a parameter (like the ones specifying which column is R, which is G, etc.) for specifying the Ignore Filter column, so that the ignore-filter function can be used to generate the $weights matrix on import?

I looked at the source code and did not see one. Am I correct?

Peter W.

ADD REPLY • link 20.4 years ago Peter Wilkinson ▴ 80

This is a request for Gordon Smyth ...

Well, it seems that I have answered my own question "Limma read.maimages generic import" ... I think.

The following seems to import the weights just fine when I import from a file that contains only the data portion of the QuantArray file:

    RG.test <- read.maimages("SEKH0406.data",
        columns=list(Rf="ch2 Intensity", Gf="ch1 Intensity",
                     Rb="ch2 Background", Gb="ch1 Background"),
        wt.fun=wtIgnore.Filter)

This was not immediately evident, since I was using a generic import. However, the wtIgnore.Filter function is

    > wtIgnore.Filter
    function (qta)
    {
        qta[, "Ignore Filter"]
    }

and that seems to find the right column.

I then tried to include Garea = "ch1 Area" in the columns list, because I would like to have area information for quality control. This DOES NOT work.

I would like to propose that the function that handles the column input be altered so that we can extract any column we wish from the QuantArray file with the read.maimages function. If the RG object could initialize itself from the arbitrary list supplied in the columns parameter, then we could import whatever data we wish with the generic importer :)

The reason is that I am exploring the use of other information from the QuantArray output that may help in refining quality weight assignments, but I do not want the calculation to be done on import, as I want the data available in R. Otherwise I would have to re-import every time I wanted to apply a new weight function, which would be taxing.

I think that some of the relevant code portions for this modification are:

    # Initialize RG list object
    Y <- matrix(0,nspots,nslides)
    colnames(Y) <- names
    RG <- list(R=Y,G=Y,Rb=Y,Gb=Y)
    if(!is.null(wt.fun)) RG$weights <- Y

Change the RG assignment to reflect the list in the columns parameter.

    # Now read remainder of files
    for (i in 1:nslides) {
        if(i > 1) {
            fullname <- slides[i]
            if(!is.null(path)) fullname <- file.path(path,fullname)
            obj <- read.table(fullname,skip=skip,header=TRUE,sep=sep,as.is=TRUE,quote=quote,check.names=FALSE,comment.char="",nrows=nspots,...)
        }
        RG$R[,i] <- obj[,columns$Rf]
        RG$G[,i] <- obj[,columns$Gf]
        RG$Rb[,i] <- obj[,columns$Rb]
        RG$Gb[,i] <- obj[,columns$Gb]
        if(!is.null(wt.fun)) RG$weights[,i] <- wt.fun(obj)
        if(verbose) cat(paste("Read",fullname,"\n"))
    }
    new("RGList",RG)
    }

Modify this portion in the same way.

Would this be sensible, or is there another alternative?

Peter

At 09:25 AM 7/21/2004, you wrote:
> Hello there,
>
> I have been pondering over the help files of 'read.maimages' from the
> limma package when it is used for generic import.
>
> I have data that I have extracted the data portion of the quantarray
> files, since alot of the file is just wasting space on the HD. Now I want
> to import but the data section just looks like a regular square tab
> delimited file. Is there a parameter (like for specifying which column is
> R and which is G etc) for specifying the ignore filter column so the
> ignore filter function can be used to generate the $weights matrix on import?
>
> I looked at the source code, and I did not see that this was the case, am
> I correct?
>
> Peter W.
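
Since the excerpt above shows that wt.fun is called on the whole table read from each file, a weight function can look at any column, not only "Ignore Filter". Here is a minimal sketch of that idea on toy data. The function name wtIgnoreAndArea, the min.area cutoff and all values are made up for illustration; only the column names come from this thread.

```r
# Toy stand-in for the table read from one QuantArray data file;
# the values are invented.
qta <- data.frame(
  "ch1 Intensity" = c(1200, 850, 40, 2300),
  "ch1 Area"      = c(110, 95, 12, 130),
  "Ignore Filter" = c(1, 1, 0, 1),
  check.names = FALSE  # keep the column names with spaces intact
)

# Hypothetical weight function: respect the Ignore Filter flag and also
# give zero weight to spots whose channel 1 area is below a cutoff.
wtIgnoreAndArea <- function(qta, min.area = 50) {
  as.numeric(qta[, "Ignore Filter"] != 0 & qta[, "ch1 Area"] >= min.area)
}

w <- wtIgnoreAndArea(qta)
print(w)
```

The excerpt calls wt.fun(obj) with a single argument, so a non-default cutoff would need a small wrapper such as `function(q) wtIgnoreAndArea(q, min.area = 100)`. This still computes the weights at import time, so it does not by itself keep the area values available in R afterwards.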

ADD REPLY • link 20.4 years ago Peter Wilkinson ▴ 80

At 02:00 AM 22/07/2004, Peter Wilkinson wrote:
> This is a request for Gordon Smyth ...
>
> well, it seems that I have answered my own question "Limma read.maimages
> generic import ".... I think:
>
> The following seems to import the weights just fine, if I am importing
> from a file that contains only the data portion of the quantarray file.
>
> RG.test <- read.maimages("SEKH0406.data", columns=list(Rf="ch2 Intensity",
>     Gf="ch1 Intensity" , Rb ="ch2 Background" , Gb ="ch1 Background"),
>     wt.fun=wtIgnore.Filter)
>
> The was not immediately evident, since I was using a generic import,
> however the wtIgnore.Filter function is:
>
> > wtIgnore.Filter
> function (qta)
> {
>     qta[, "Ignore Filter"]
> }
>
> and that seems to find the right column.
>
> So then I tried the following so that I could include a Garea = "ch1 Area"
> in the column list, because I would like to include area information for
> quality control. This DOES NOT work.
>
> I would like to propose that the function that handles the column input be
> altered so that we can extract any column we wish from the quantarray file
> with the read.maimages function:
>
> If the initialization of the RG object could initialize itself with the
> arbitrary list that is supplied with the columns parameter, then we can
> import whatever data we wish with the generic importer :)

I understand that you may like to read in extra columns to play with, but initializing with an arbitrary column list conflicts with the basic philosophy of the software development environment. read.maimages() produces a microarray data object (an RGList object) which contains specified components which have pre-determined meanings. This means that any programmer can write functions which take RGList objects as arguments. ("Design by contract"). Allowing an object to be produced with completely arbitrary components defeats the purpose of having a classed object.

If you want to read in arbitrary columns you can do it using read.series() or simply do it yourself using read.table(). But in the end, if you are playing with your own analysis, you may have to write some private functions.

Gordon
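
The "do it yourself using read.table()" route might look roughly like the sketch below. It is only an illustration in base R: readColumn is a hypothetical helper, not a limma function, and the file names and values are invented. It collects one named column from each file into a spots-by-slides matrix, the same layout as the R, G, Rb and Gb matrices in an RGList.

```r
# Write two small tab-delimited files standing in for the data portion of
# QuantArray output; file names and values are invented.
tmp <- tempfile("qta")
dir.create(tmp)
slides <- c("slideA.data", "slideB.data")
toy <- list(
  data.frame("ch1 Intensity" = c(1200, 850, 40),
             "ch1 Area" = c(110, 95, 12), check.names = FALSE),
  data.frame("ch1 Intensity" = c(900, 700, 55),
             "ch1 Area" = c(105, 90, 20), check.names = FALSE)
)
for (i in seq_along(slides)) {
  write.table(toy[[i]], file.path(tmp, slides[i]),
              sep = "\t", quote = FALSE, row.names = FALSE)
}

# Hypothetical helper: read one named column from every file and bind the
# results into a spots x slides matrix.
readColumn <- function(files, column, path = NULL) {
  full <- if (is.null(path)) files else file.path(path, files)
  cols <- lapply(full, function(f) {
    obj <- read.table(f, header = TRUE, sep = "\t", as.is = TRUE,
                      quote = "", check.names = FALSE, comment.char = "")
    obj[, column]
  })
  out <- do.call(cbind, cols)
  colnames(out) <- basename(full)
  out
}

Garea <- readColumn(slides, "ch1 Area", path = tmp)
print(Garea)
```

Keeping the extra matrix outside the RGList leaves the classed object untouched, in line with the point above, while still making the area values available for experimenting with new weight functions without re-importing the intensities.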

ADD REPLY • link 20.4 years ago Gordon Smyth 52k
