How to permute the CEL files
1
0
Entering edit mode
@javier-perez-florido-3121
Last seen 6.8 years ago
Hello everybody, I'd like to permutate the values (intensity raw data) of the genes within a CEL file of an Affymetrix experiment for using the "new" CEL file as a control. The key idea is that I want to keep the order of the genes in the chip, but permutate the intensities of the genes, that is, asign to each gene contained in the array the intensity of other gene selected randomly (if and only if the number of probes is the same). To do so, and using only AFFX genes, the code is: data<-Dilution dataBackup<-Dilution ->geneNames<-featureNames(data) # Get gene names ->AFFX_genes<-grep("^AFFX",geneNames) # Get positions of AFFX genes ->numArrays<-length(sampleNames(data)) # Num of arrays in the experiment ->for (i in 1:numArrays) #Permutate each array ->{ ->permutation_AFFX<-sample(1:numAFFX_genes) # Get a permutation for AFFX genes # Indexes of PMs and MMs probes for AFFX genes ->indexes_pm<-indexProbes(data,which="pm",genenames=geneNames[AFFX_gen es]) ->indexes_mm<-indexProbes(data,which="mm",genenames=geneNames[AFFX_gen es]) # Indexes of PMs and MMs probes for the AFFX genes selected randomly ->indexes_pm_permutation<-indexProbes(data,which="pm",genenames=geneNa mes[AFFX_genes[permutation_AFFX]]) ->indexes_mm_permutation<-indexProbes(data,which="mm",genenames=geneNa mes[AFFX_genes[permutation_AFFX]]) -> for (j in 1:numAFFX_genes)# Change the intensity value of each AFFX gene (the intensity value of each probe) using the intensities of other AFFX gene selected randomly... ->{ # ...if and only if the number of PM and MM probes for the current gene and the one used for permutation is the same -> if(length(indexes_pm[[j]])==length(indexes_pm_permutation[[j]])) ->{ intensity(data)[indexes_pm[[j]],i]<-intensity(dataBackup)[indexes_pm_p ermutation[[j]],i] # Assign new intensity to the PM probes -> intensity(data)[indexes_mm[[j]],i]<-intensity(dataBackup)[indexes_mm_p ermutation[[j]],i] # Assign new intensity to the MM probes ->} ->} The problem is that, this way, it takes so much time to perform the operation. I've also tried using lapply functions, but the are not improvements in time. Is there another way of changing the intensities rather than using indexProbes and intensity R functions? Thanks in advance, Javier
ASSIGN ASSIGN • 1.0k views
ADD COMMENT
0
Entering edit mode
@martin-morgan-1513
Last seen 5 months ago
United States
Hi Javier -- Have you tried to use Rprof() to profile your code, and see where it is slow? my guess is below... Javier P?rez Florido <jpflorido at="" gmail.com=""> writes: > Hello everybody, > I'd like to permutate the values (intensity raw data) of the genes > within a CEL file of an Affymetrix experiment for using the "new" CEL > file as a control. The key idea is that I want to keep the order of > the genes in the chip, but permutate the intensities of the genes, > that is, asign to each gene contained in the array the intensity of > other gene selected randomly (if and only if the number of probes is > the same). To do so, and using only AFFX genes, the code is: > > data<-Dilution > dataBackup<-Dilution > > ->geneNames<-featureNames(data) # Get gene names > ->AFFX_genes<-grep("^AFFX",geneNames) # Get positions of AFFX genes > ->numArrays<-length(sampleNames(data)) # Num of arrays in the experiment > > > ->for (i in 1:numArrays) #Permutate each array > ->{ > > ->permutation_AFFX<-sample(1:numAFFX_genes) # Get a permutation for > AFFX genes > > # Indexes of PMs and MMs probes for AFFX genes > ->indexes_pm<-indexProbes(data,which="pm",genenames=geneNames[AFF X_genes]) > ->indexes_mm<-indexProbes(data,which="mm",genenames=geneNames[AFF X_genes]) > # Indexes of PMs and MMs probes for the AFFX genes selected randomly > ->indexes_pm_permutation<-indexProbes(data,which="pm",genenames=g eneNames[AFFX_genes[permutation_AFFX]]) > > ->indexes_mm_permutation<-indexProbes(data,which="mm",genenames=g eneNames[AFFX_genes[permutation_AFFX]]) > > > -> for (j in 1:numAFFX_genes)# Change the intensity value of each > AFFX gene (the intensity value of each probe) using the intensities of > other AFFX gene selected randomly... > ->{ > # ...if and only if the number of PM and MM probes for the > current gene and the one used for permutation is the same > -> if(length(indexes_pm[[j]])==length(indexes_pm_permutation[[j]])) > ->{ > intensity(data)[indexes_pm[[j]],i]<-intensity(dataBackup)[indexes_pm _permutation[[j]],i] > # Assign new intensity to the PM probes > -> > intensity(data)[indexes_mm[[j]],i]<-intensity(dataBackup)[indexes_mm _permutation[[j]],i] > # Assign new intensity to the MM probes > ->} > ->} I suspect the slow part is this 'for'. I did the test as ok <- sapply(indexes_pm, length) == sapply(indexes_mm, length) To perform the assignment I made a helper function ulist <- function(x) unlist(x, uses.names=FALSE) and then intensity(data)[ulist(indexes_pm[ok]),i] <- intensity(dataBackup)[ulist(indexes_pm_permutation[ok]),i] intensity(data)[ulist(indexes_mm[ok]),i] <- intensity(dataBackup)[ulist(indexes_mm_permutation[ok]),i] your code wasn't directly reproducible (data Dilution not loaded, numAFFX_genes not defined, unbalanced }, -> at start of lines, long comment lines wrapped by your or my email client) so I don't know whether this actually works, but hopefully gets you going. Martin > The problem is that, this way, it takes so much time to perform the > operation. > I've also tried using lapply functions, but the are not improvements > in time. Is there another way of changing the intensities rather than > using indexProbes and intensity R functions? > > Thanks in advance, > Javier > > _______________________________________________ > Bioconductor mailing list > Bioconductor at stat.math.ethz.ch > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor -- Martin Morgan Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
ADD COMMENT
0
Entering edit mode
Thanks for the tips. With the use of "unlist" function, now is pretty faster (the inner loop is avoided). Thanks Martin, Javier 2009/6/21 Martin Morgan <mtmorgan@fhcrc.org> > Hi Javier -- > > Have you tried to use Rprof() to profile your code, and see where it > is slow? my guess is below... > > Javier Pérez Florido <jpflorido@gmail.com> writes: > > > Hello everybody, > > I'd like to permutate the values (intensity raw data) of the genes > > within a CEL file of an Affymetrix experiment for using the "new" CEL > > file as a control. The key idea is that I want to keep the order of > > the genes in the chip, but permutate the intensities of the genes, > > that is, asign to each gene contained in the array the intensity of > > other gene selected randomly (if and only if the number of probes is > > the same). To do so, and using only AFFX genes, the code is: > > > > data<-Dilution > > dataBackup<-Dilution > > > > ->geneNames<-featureNames(data) # Get gene names > > ->AFFX_genes<-grep("^AFFX",geneNames) # Get positions of AFFX genes > > ->numArrays<-length(sampleNames(data)) # Num of arrays in the experiment > > > > > > ->for (i in 1:numArrays) #Permutate each array > > ->{ > > > > ->permutation_AFFX<-sample(1:numAFFX_genes) # Get a permutation for > > AFFX genes > > > > # Indexes of PMs and MMs probes for AFFX genes > > > ->indexes_pm<-indexProbes(data,which="pm",genenames=geneNames[AFFX_ genes]) > > > ->indexes_mm<-indexProbes(data,which="mm",genenames=geneNames[AFFX_ genes]) > > # Indexes of PMs and MMs probes for the AFFX genes selected randomly > > > ->indexes_pm_permutation<-indexProbes(data,which="pm",genenames=gen eNames[AFFX_genes[permutation_AFFX]]) > > > > > ->indexes_mm_permutation<-indexProbes(data,which="mm",genenames=gen eNames[AFFX_genes[permutation_AFFX]]) > > > > > > -> for (j in 1:numAFFX_genes)# Change the intensity value of each > > AFFX gene (the intensity value of each probe) using the intensities of > > other AFFX gene selected randomly... > > ->{ > > # ...if and only if the number of PM and MM probes for the > > current gene and the one used for permutation is the same > > -> if(length(indexes_pm[[j]])==length(indexes_pm_permutation[[j]])) > > ->{ > > > intensity(data)[indexes_pm[[j]],i]<-intensity(dataBackup)[indexes_pm _permutation[[j]],i] > > # Assign new intensity to the PM probes > > -> > > > intensity(data)[indexes_mm[[j]],i]<-intensity(dataBackup)[indexes_mm _permutation[[j]],i] > > # Assign new intensity to the MM probes > > ->} > > ->} > > I suspect the slow part is this 'for'. I did the test as > > ok <- sapply(indexes_pm, length) == sapply(indexes_mm, length) > > To perform the assignment I made a helper function > > ulist <- function(x) unlist(x, uses.names=FALSE) > > and then > > intensity(data)[ulist(indexes_pm[ok]),i] <- > intensity(dataBackup)[ulist(indexes_pm_permutation[ok]),i] > intensity(data)[ulist(indexes_mm[ok]),i] <- > intensity(dataBackup)[ulist(indexes_mm_permutation[ok]),i] > > your code wasn't directly reproducible (data Dilution not loaded, > numAFFX_genes not defined, unbalanced }, -> at start of lines, long > comment lines wrapped by your or my email client) so I don't know > whether this actually works, but hopefully gets you going. > > Martin > > > The problem is that, this way, it takes so much time to perform the > > operation. > > I've also tried using lapply functions, but the are not improvements > > in time. Is there another way of changing the intensities rather than > > using indexProbes and intensity R functions? > > > > Thanks in advance, > > Javier > > > > _______________________________________________ > > Bioconductor mailing list > > Bioconductor@stat.math.ethz.ch > > https://stat.ethz.ch/mailman/listinfo/bioconductor > > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > -- > Martin Morgan > Computational Biology / Fred Hutchinson Cancer Research Center > 1100 Fairview Ave. N. > PO Box 19024 Seattle, WA 98109 > > Location: Arnold Building M1 B861 > Phone: (206) 667-2793 > [[alternative HTML version deleted]]
ADD REPLY

Login before adding your answer.

Traffic: 688 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6