Subtraction with NA values


Daniel F. Simola ▴ 10

@daniel-f-simola-632


Hello,

I have a microarray experiment using dye-swapped slides. I am trying to combine (average) the intensities of a gene from a slide and its dye-swapped pair, but just discovered that the subtraction operator in R does not work the way I would like it to for missing (NA) values.

I am doing ( M - M' ) / 2, where M is an array of intensities for genes and M' is the same, except dye-swapped.

Say I want the result of "5 - NA", where 5 is the intensity of one spot and NA is that of the same spot on the dye-swapped slide. I get NA for an answer. Because I want to average the values, ( 5 - NA ) / 2, I would like my average value to be 5 instead of NA. It seems better to make use of the available data than to disregard a gene completely.

So, does anyone know of either a workaround for this, or a function that I can use to perform element-wise subtraction over a matrix that will work how I want (or that will let me define my own function to be applied on an element-wise basis)?

Thanks a lot,
Dan Simola


Updated 20.9 years ago by A.J. Rossini • written 20.9 years ago by Daniel F. Simola


A.J. Rossini ▴ 810

@aj-rossini-209

Comments:

1. You could impute 5 by replacing the missing values with the values from the other slide. Judicious application of is.na and assignment will help with this.

2. Do you really want to do this? The end result will be some dye-corrected genes and some dye-non-corrected genes (don't get me started on whether this is a reasonable thing to do, I still think the jury is out), and then you want to compare how extreme they are. So you've got some genes with more inherent variance in the measure, not being averages, and if there is a gene-by-dye effect...

Now, I wish I had a positive suggestion, and would appreciate hearing any (rather than the negative one I have above!).

best,
-tony
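The is.na-and-assignment idea in comment 1 might look roughly like the sketch below. It is only an illustration: the names `M`, `Mswap` and the toy values are made up, and it assumes the values are log-ratios, so that the dye-swapped slide measures the same effect with the opposite sign (which is what the formula ( M - M' ) / 2 implies). Under that assumption a missing value is filled with the negated value from the partner slide.

```r
# Toy log-ratios for four genes (hypothetical values, not from the thread).
M     <- c(geneA = 5,  geneB = 2,  geneC = NA, geneD = NA)  # original slide
Mswap <- c(geneA = NA, geneB = -1, geneC = -3, geneD = NA)  # dye-swapped slide

# Spots missing on exactly one of the two slides.
fillM     <- is.na(M) & !is.na(Mswap)
fillMswap <- is.na(Mswap) & !is.na(M)

# Fill each gap with the sign-flipped value from the partner slide.
M2 <- M
Mswap2 <- Mswap
M2[fillM] <- -Mswap[fillM]
Mswap2[fillMswap] <- -M[fillMswap]

# Dye-swap average; genes missing on both slides stay NA.
avg <- (M2 - Mswap2) / 2
print(avg)
```

The same indexing works unchanged when `M` and `Mswap` are matrices of equal dimensions. A shorter route to the same numbers is `rowMeans(cbind(M, -Mswap), na.rm = TRUE)`, with the difference that genes missing on both slides come out as NaN rather than NA. The statistical caveat in comment 2 still applies: the filled-in genes are single measurements, not averages.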

Written 20.9 years ago by A.J. Rossini


This is pretty standard analysis with linear models (limma package):

    dat <- cbind(M', M)
    fit <- lmFit(dat, design=c(1,-1))

Then fit$coef contains the combined log-ratios you want.

Gordon
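To see what a no-intercept linear model with design c(1,-1) estimates for each gene, here is a base-R sketch of the arithmetic. It is not limma's implementation. It assumes that missing observations are dropped gene by gene before fitting, which is how I understand lmFit to behave but which is not stated in this thread. `Mswap` stands in for M' (not a valid R name), the values are invented, and the column order is chosen here so that the coefficient matches ( M - M' ) / 2.

```r
# Toy log-ratios: rows are genes, columns are the two slides (hypothetical data).
dat <- cbind(M     = c(5,  2,  NA, NA),
             Mswap = c(NA, -1, -3, NA))
rownames(dat) <- c("geneA", "geneB", "geneC", "geneD")

# One coefficient per gene; +1 for the first column, -1 for the dye-swapped one.
design <- c(1, -1)

# Least-squares estimate for y = design * beta, using only the observed slides:
#   beta = sum(x * y) / sum(x^2)
ls_coef <- apply(dat, 1, function(y) {
  obs <- !is.na(y)
  if (!any(obs)) return(NA_real_)      # nothing observed for this gene
  sum(design[obs] * y[obs]) / sum(design[obs]^2)
})
print(ls_coef)
```

With both slides present the estimate is (first column - second column) / 2, and with one slide missing it falls back to the single (sign-corrected) value, which is the behaviour asked for in the original question. Swapping the column order flips the sign of the coefficient.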

Written 20.9 years ago by Gordon Smyth


A.J. Rossini ▴ 810

@aj-rossini-209


Gordon --

Is it really standard? Mathematically, I understand it, but statistically/scientifically, how does that jibe for trusting the order of calls with/without those genes? We've been seeing some really interesting dye effects, though it isn't clear how much is background and cross-hyb vs. low-copy results (2-channel Agilent chips). Of course, restricting to the genes above median expression, or some slightly more advanced filtering, seems to solve this problem, but I'm still worrying about the use of averaging for dye effects.

best,
-tony
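The "genes above median expression" filter mentioned above could be sketched as follows. The thread does not say how expression was measured, so this assumes a hypothetical vector `A` of average log-intensities per gene alongside the log-ratios `M`; all values are invented.

```r
# Hypothetical average log-intensities (A) and log-ratios (M) for five genes.
A <- c(g1 = 6.2, g2 = 11.5, g3 = 8.0, g4 = 13.1, g5 = NA)
M <- c(g1 = 1.9, g2 = 0.4,  g3 = -1.2, g4 = 0.1, g5 = 0.7)

# Keep genes whose intensity is known and above the median intensity.
keep <- !is.na(A) & A > median(A, na.rm = TRUE)
M_filtered <- M[keep]
print(M_filtered)
```

Genes with a missing intensity are dropped here rather than kept, which is a design choice of this sketch, not something the thread specifies.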

Written 20.9 years ago by A.J. Rossini


At 03:28 PM 11/02/2004, A.J. Rossini wrote:

> Is it really standard?

It is for us. We try to balance for dye-effects but seldom find them very important in themselves.

> Mathematically, I understand it, but statistically/scientifically, how does that jibe for trusting the order of calls with/without those genes? We've been seeing some really interesting dye effects, though it isn't clear how much is background and cross-hyb vs. low-copy results (2-channel agilent chips).

I'm not entirely sure what you mean here - maybe send me more details off-line and we can discuss?

Cheers
Gordon

Written 20.9 years ago by Gordon Smyth
