vsn in BioConductor 1.2
@white-charles-e-wrair-wash-dc-241
Last seen 10.3 years ago
Could someone help me interpret (develop an action plan to correct ...) the error message that follows?

Thanks.

> Monkey.sub<-Monkey.expr[!is.na(Monkey.matrix[,1]),1]
> Monkey.sub
Expression Set (exprSet) with
    13838 genes
    1 samples
    phenoData object with 6 variables and 1 cases
    varLabels
        : Slide
        : FileName
        : Cy3
        : Cy5
        : date
        : Comments
> Monkey.sub<-exprs(Monkey.expr[!is.na(Monkey.matrix[,1]),1])
> Monkey.vsn<-vsn(Monkey.sub)
vsn is working on a 13838 x 1 matrix, with lts.quantile=0.5; please wait for 11 dots:
.Error in optim(par = p0, fn = ll, gr = grll, method = "L-BFGS-B", control = control, :
    L-BFGS-B needs finite values of fn
@wolfgang-huber-3550
Last seen 4 months ago
EMBL European Molecular Biology Laborat…
Hi Charles,

vsn is a normalization method that brings the different columns (colors, arrays) of an expression matrix onto the same scale. As input, it takes an n x d matrix with d >= 2. You passed it a matrix with d = 1. Apparently this makes some of the likelihood calculations singular, hence the error message you received.

Action plan:
1. For you: read the paper on vsn, then call it with expression matrices of size d >= 2.
2. For me: fix vsn so that it throws an intelligible error message if called with d <= 1.

Best regards
Wolfgang

-------------------------------------
Wolfgang Huber
Division of Molecular Genome Analysis
German Cancer Research Center
Heidelberg, Germany
Phone: +49 6221 424709
Fax: +49 6221 42524709
Http: www.dkfz.de/mga/whuber
-------------------------------------

On Thu, 17 Jul 2003, White, Charles E WRAIR-Wash DC wrote:
> Could someone help me interpret (develop an action plan to correct ...) the
> error message that follows?
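The d >= 2 requirement described above can be checked before the normalization is called at all. The following is a minimal sketch in base R; `check_vsn_input` is a hypothetical helper written for illustration, not a function of the vsn package, and the toy matrices are made-up values standing in for the output of `exprs(...)`.

```r
# Hypothetical helper (not part of the vsn package): validate the shape of
# the input before handing it to a multi-column normalization such as vsn.
check_vsn_input <- function(x) {
  if (!is.matrix(x) || !is.numeric(x)) {
    stop("expected a numeric matrix")
  }
  if (ncol(x) < 2) {
    stop("need at least 2 columns (d >= 2), got ", ncol(x))
  }
  invisible(TRUE)
}

# Toy data; the values are invented for illustration only.
set.seed(1)
two_channel <- matrix(rexp(20, rate = 1 / 500), nrow = 10, ncol = 2)
one_channel <- two_channel[, 1, drop = FALSE]  # 10 x 1, like the failing call

check_vsn_input(two_channel)  # passes silently

# The single-column case is rejected with a readable message.
result <- tryCatch(check_vsn_input(one_channel),
                   error = function(e) conditionMessage(e))
print(result)
```

Note that `drop = FALSE` is what keeps the single-column subset a matrix; without it the subset would collapse to a plain vector and fail the first check instead.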
Hi Wolfgang:

I am also getting the same error with a matrix that is 4992 x 376. Here are the R commands:

> data <- read.table("file.msk", header=T, sep = "\t", row.names=1)
> data <- as.matrix(data)
> vsn.data <- vsn(data)
vsn is working on a 4992 x 376 matrix, with lts.quantile=0.5; please wait for 11 dots:
.

and then dies. Any suggestions? After I do a traceback I get the following:

> traceback()
2: optim(par = p0, fn = ll, gr = grll, method = "L-BFGS-B", control = control, lower = plower)
1: vsn(data)

Any help is greatly appreciated.

Isaac
@white-charles-e-wrair-wash-dc-241
Last seen 10.3 years ago
I spent the weekend getting to know this program better than I wanted <grin>, but I probably still don't know it well enough. The fool's gold of my wisdom is as follows:

1) I would seriously consider reducing the amount of data you feed this program. It took 4.5 hours to process a 15,552 x 38 matrix on a 1.2 GHz Pentium III. There is a reason why the function vsnh exists. Unless you have some serious GHz, you probably want to run vsn on a random sample of genes or on one array at a time.

2) Assuming that you are using data from a two-channel microarray, I strongly suspect that the red and green channels need to be side by side in your matrix. I think the point is to quantify measurement variation without contamination from any unnecessary source. I don't see any other way that pair information is being passed to vsn.

3) I think that your problem and my old problem are likely to be quite different. I fed the program data in a format it didn't understand, and you probably fed the program more data than it could process in a reasonable amount of time. (Since the program doesn't use "much" memory, you wouldn't have heard the hard drive running even if the program was still running.)

4) I am pleased with the results I'm now getting from vsn. My initial problems with this program were related to how I understand the relationships between data elements and Bioconductor objects versus what appears to be a somewhat different relationship in vsn.

(In reply to Isaac Neuhaus, Monday, July 21, 2003 1:37 PM, Subject: Re: [BioC] vsn in BioConductor 1.2)
Hi Charles,

> 1) I would seriously consider reducing the amount of data you feed this
> program. It took 4.5 hours to process a 15,552 x 38 matrix on a 1.2 GHz
> Pentium III. There is a reason why the function vsnh exists. Unless you have
> some serious GHz, you probably want to run vsn on a random sample of genes
> or on one array at a time.

The program is indeed quite slow. The run time is roughly t = c * (number of rows) * (number of columns), and according to your numbers c is about 27 ms on your machine. There is a lot of number crunching in vsn. With Dennis Kostka I have an experimental version that is written in C, but even that is "only" faster by a factor of 2-3.

A good strategy is indeed to run the program on a random sample of genes (rows), and then use vsnh to apply the transformation to the whole data matrix. See normalize.AffyBatch.vsn for an example. A random subset of a few thousand spots should usually do. Splitting the task up by arrays (e.g. one array at a time) will not help, since the net run time will be the same.

> 2) Assuming that you are using data from a two channel microarray, I
> strongly suspect that the red and green channels need to be side by side in
> your matrix. I think the point is to quantify measurement variation without
> contamination from any unnecessary source. I don't see any other way that
> pair information is being passed to vsn.

If you pass a data matrix with 2*k columns from k red/green slides, vsn does not care about the ordering of the columns, so it does not make a difference whether the columns are ordered R1, G1, R2, G2, ..., Rk, Gk or R1, ..., Rk, G1, ..., Gk. If someone is not comfortable with this, they can also call vsn in turn for each array separately. Empirically, I've found that this makes hardly any difference. (The parameter estimation is not affected by the different correlations within and between arrays.) However, there should not be pronounced batch effects (e.g. arrays 1...50 looking technically very different from arrays 51...100).

> 3) I think that your problem and my old problem are likely to be quite
> different. I fed the program data in a format it didn't understand and you
> probably fed the program more data than it could process in a reasonable
> amount of time. (Since the program doesn't use "much" memory, you wouldn't
> have heard the hard drive running even if the program was still running.)

Yes. The error message about infinite likelihood has nothing to do with the program's long, but finite, CPU time consumption.

> 4) I am pleased with the results I'm now getting from vsn. ...

That's always nice to hear :)

Best regards
Wolfgang
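The run-time relation t = c * rows * columns can be checked against the figures quoted in this thread (4.5 hours for a 15,552 x 38 matrix), which work out to roughly 27 ms per matrix element. The sketch below does that arithmetic in base R and then shows the row-subsampling step on a toy matrix. The subset size of 3000 and the toy data are assumptions made for illustration; the actual fit and the later application to all rows (vsn and vsnh in the thread) are deliberately left out.

```r
# Figures quoted in the thread.
hours  <- 4.5
n_rows <- 15552
n_cols <- 38

# t = c * rows * columns  =>  c = t / (rows * columns), in seconds per element
c_sec <- hours * 3600 / (n_rows * n_cols)
c_ms  <- round(c_sec * 1000, 1)

# Projected time (hours) for a 4992 x 376 matrix at the same rate.
full_hours <- c_sec * 4992 * 376 / 3600

# Projected time if only a random subset of rows is used for the fit.
rows_sub  <- 3000  # arbitrary choice within "a few thousand"
sub_hours <- c_sec * rows_sub * 376 / 3600

# Row subsampling on a toy matrix (made-up values, 4 columns for brevity).
set.seed(42)
toy     <- matrix(rexp(5000 * 4, rate = 1 / 500), nrow = 5000)
sel     <- sample(nrow(toy), rows_sub)     # row indices, without replacement
toy_sub <- toy[sel, , drop = FALSE]        # matrix to fit the parameters on

cat("c =", c_ms, "ms per element\n")
cat("full matrix:", round(full_hours, 1), "h; subsample:",
    round(sub_hours, 1), "h\n")
```

Because the estimated cost scales with the number of columns as well, subsampling rows shortens the fit in proportion to the rows dropped but does nothing about a large number of arrays.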
@wolfgang-huber-3550
Last seen 4 months ago
EMBL European Molecular Biology Laborat…
Hi Isaac,

does your data matrix contain Inf (infinity) or an excessive number of 0s (e.g. through "flooring" the negative values)? If there are infinities in the data, this will probably also lead to an infinite likelihood, which could explain your error message.

If there are other singularities (e.g. if a whole column of the data matrix has the same value), this may also lead to infinite values in the likelihood calculations.

If these suggestions do not lead to the solution of your problem, you could send me your data matrix (anonymized) and I could try to figure out where things go wrong. The calculations in vsn are not that complicated, and looking at a failing case may help make it more robust, or at least make it produce more intelligible error messages.

Note that I'll be away from my email from now till Thursday.

Best regards
Wolfgang

On Mon, 21 Jul 2003, Isaac Neuhaus wrote:
> Wolfgang:
>
> I chopped some of the output; here is everything.
>
> Isaac
>
> > data <- read.table("file.msk", header=T, sep = "\t", row.names=1)
> > data <- as.matrix(data)
> > vsn.data <- vsn(data)
> vsn is working on a 4992 x 376 matrix, with lts.quantile=0.5; please
> wait for 11 dots:
> .Error in optim(par = p0, fn = ll, gr = grll, method = "L-BFGS-B",
> control = control, :
>   non-finite value supplied by optim
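The three conditions mentioned above (non-finite entries, a large share of zeros, constant columns) can be looked for with a few lines of base R before running the normalization. This is only a sketch: `diagnose_matrix` is a hypothetical helper, not part of vsn, and the toy matrix is invented to trigger each condition once.

```r
# Hypothetical diagnostic helper (not part of the vsn package).
diagnose_matrix <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x))
  list(
    # Count of Inf, -Inf, NaN and NA entries.
    n_nonfinite = sum(!is.finite(x)),
    # Fraction of exact zeros per column (NA entries ignored).
    zero_fraction = colMeans(x == 0, na.rm = TRUE),
    # Columns whose finite entries all share a single value.
    constant_cols = which(apply(x, 2, function(col) {
      v <- col[is.finite(col)]
      length(v) > 0 && length(unique(v)) == 1
    }))
  )
}

# Toy matrix with made-up values: one column of mostly zeros, one constant
# column, and one column containing Inf.
toy <- cbind(a = c(1, 2, 3, 4),
             b = c(0, 0, 0, 5),
             c = c(7, 7, 7, 7),
             d = c(1, Inf, 3, 4))

report <- diagnose_matrix(toy)
str(report)
```

Which zero fraction counts as "excessive" is not stated in the thread, so the helper only reports the fractions and leaves the threshold to the reader.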
w.huber@dkfz-heidelberg.de wrote:
> Hi Isaac,
>
> does your data matrix contain Inf (infinity) or an excessive number of 0s
> (e.g. through "flooring" the negative values?). If there are infinities
> in the data, this will probably also lead to an infinite likelihood, which
> could explain your error message.

Yes, in some cases it contains up to 75% 0s. I will exclude these samples and try to run vsn again.

Thanks for your help.

Isaac
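Excluding the zero-heavy samples, as proposed above, might look like the following in base R. The helper name `drop_zero_heavy` and the 0.5 cutoff are assumptions chosen for illustration; the thread only mentions that some samples contain up to 75% zeros and does not give a threshold.

```r
# Hypothetical helper: keep only the columns (samples) whose fraction of
# exact zeros does not exceed max_zero_frac. The default is arbitrary.
drop_zero_heavy <- function(x, max_zero_frac = 0.5) {
  stopifnot(is.matrix(x), is.numeric(x))
  frac <- colMeans(x == 0, na.rm = TRUE)
  # Columns that are entirely NA give NaN here and are dropped as well.
  keep <- !is.na(frac) & frac <= max_zero_frac
  x[, keep, drop = FALSE]
}

# Toy matrix with made-up values; sample s2 is 75% zeros.
toy <- cbind(s1 = c(5, 8, 0, 3),
             s2 = c(0, 0, 0, 9),
             s3 = c(2, 4, 6, 1))

filtered <- drop_zero_heavy(toy)
print(colnames(filtered))
```

With `drop = FALSE` the result stays a matrix even if only one sample survives, which makes it easy to notice when too few columns are left for a multi-column normalization.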