Question

Two populations on microarray

Ben Tupper ▴ 60

@ben-tupper-5045

Hello,

By virtue of experiment design we have two populations to analyze on each of a suite of Genepix microarrays. You can see an example in an MA plot here (generated using the excellent limma package):

http://dl.dropbox.com/u/8433654/BE%20T46h%20slide%2052.png

We have been following the steps in the limma user guide, and Ben Bolstad's helpful notes http://tinyurl.com/7346mh9

All of the examples we see appear to have just one population to contend with, which gives us an inkling that we are being naive about our analysis. We suspect that we'll have to separate the two populations before normalization and analysis. Are there any guides available for managing two populations like this?

Thanks!
Ben

Ben Tupper
Bigelow Laboratory for Ocean Sciences
180 McKown Point Rd. P.O. Box 475
West Boothbay Harbor, Maine 04575-0475
http://www.bigelow.org

limma • 1.2k views

updated 13.3 years ago by Joaquin Martinez ▴ 50 • written 13.3 years ago by Ben Tupper ▴ 60

score 0 · Answer 1 · 2012-01-15

Dear Ben,

Are you saying that you have deliberately designed two different populations of probes onto your arrays?

Your MA-plot suggests that there is a substantial body of spots on the array for which the green channel has failed, hence the 45-degree line at the top of the plot. These dots likely represent spots with a normal red channel value but close to zero for green. Normally this would have a technical rather than biological cause. An imageplot may help you identify where the offending spots are on your array.

On the other hand, if you have deliberately spotted your arrays with two quite different populations of probes, then they probably need to be analysed as separate arrays.

Best wishes
Gordon

> Date: Thu, 12 Jan 2012 14:28:36 -0500
> From: Ben Tupper <btupper at bigelow.org>
> To: bioconductor at r-project.org
> Subject: [BioC] Two populations on microarray
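A small simulation can illustrate why a failed green channel shows up as a straight diagonal band. With M = log2(R) - log2(G) and A = (log2(R) + log2(G))/2, a green value stuck near a constant g0 gives M = 2A - 2*log2(g0), which is a line in the MA plane. The intensities, the 20% failure rate, and g0 = 2 below are made-up values for illustration only and are not taken from the arrays in this thread.

```r
# Simulated two-colour intensities; every number here is an illustrative assumption.
set.seed(1)
n  <- 2000
mu <- rnorm(n, mean = 10, sd = 1.5)        # shared log2 signal per spot
R  <- 2^(mu + rnorm(n, sd = 0.3))          # red channel
G  <- 2^(mu + rnorm(n, sd = 0.3))          # green channel

failed <- runif(n) < 0.2                   # assume 20% of spots lose the green signal
g0 <- 2                                    # assumed near-zero green value
G[failed] <- g0

M <- log2(R) - log2(G)                     # log-ratio
A <- (log2(R) + log2(G)) / 2               # average log-intensity

plot(A, M, pch = 16, cex = 0.4,
     col = ifelse(failed, "red", "grey40"))
abline(a = -2 * log2(g0), b = 2, lty = 2)  # the line M = 2A - 2*log2(g0)
abline(h = 0, lty = 3)                     # where well-behaved spots should sit
```

The failed spots fall exactly on the dashed line while the remaining spots scatter around M = 0. How steep the band looks on screen depends on the axis scaling of the plot.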

score 0 · Answer 2 · 2012-01-19

Joaquin Martinez ▴ 50

@joaquin-martinez-5060

Dear Naomi, Gordon and Ben,

Thank you for your replies to Ben Tupper's (and my) question.

We are using spotted oligonucleotide microarrays containing probes for both host and virus genes. In our experiment we had cultures grown under high and low phosphate conditions, inoculated with 2 different viruses (separately) or kept virus-free, in triplicate. RNA purified from those cultures at different time points was fluorescently labeled (with Cy-dyes) and hybridized onto the microarray slides. You can see a flow chart of our experimental design here:

http://dl.dropbox.com/u/8433654/design-concept.pdf

One slide contains 2 samples which had different experimental treatments. Each sample was split into 3, labeled (dye swap) and hybridized onto 3 different microarray slides in combination with another sample to allow technical replication.

I quantified labeling efficiency prior to hybridizing the samples onto the microarray slide; for both dyes I got between 30 and 60 dye molecules per 1000 nt (which is the range indicated by the manufacturer for good labeling). We also produced FB plots for the green and the red channels; both had similar z-range and saturation range, which we interpreted as evidence of good labeling (?). See example:

http://dl.dropbox.com/u/8433654/R-G-imageplot.png

Both MA clusters that we observe contain a mixture of both host and virus probes, ruling out that one complete set of probes failed. Naomi mentioned that the nondifferentially expressing genes should cluster around M=0, so does that mean that the top cluster corresponds to differentially expressed genes?

We used GenePix Pro to scan and analyze the microarrays. Could we use the normalization function in the software (normalize the data in each image so that the mean of the median of ratios of all features is equal to 1) as an alternative to MA? Or would that simply hide the problem? And then do normalization between arrays using the quantile method?

Thanks,
Joaquin

> From: Naomi Altman <naomi@stat.psu.edu>
> Date: January 18, 2012 9:56:45 AM EST
> To: Gordon K Smyth <smyth@wehi.edu.au>, Ben Tupper <btupper@bigelow.org>
> Cc: Bioconductor mailing list <bioconductor@r-project.org>
> Subject: Re: [BioC] Two populations on microarray
>
> Dear Ben,
>
> A typical MA plot has most of the points scattered around the line M=0. Even if you have 2 populations of probes, the nondifferentially expressing genes should be in that central ellipse. (The lower cluster does look somewhat like the typical MA plot for raw data.) I suggest that you do separate MA plots for each population of probes, to see if one set of probes failed. Or, as Gordon suggests, a population for which labelling failed.
>
> --Naomi

13.3 years ago • Joaquin Martinez ▴ 50
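The check Naomi suggests (separate MA plots per probe population), together with the cross-tabulation behind the statement that both clouds contain host and virus probes, could look roughly like the sketch below. Everything in it is simulated: the vectors `M`, `A` and `probe_type`, the 25% share of the upper cloud, and the cutoff of 3 used to split the clouds are hypothetical stand-ins for the real array data and annotation.

```r
# Simulated data; names, proportions and the cutoff are illustrative assumptions.
set.seed(3)
n <- 1000
probe_type <- factor(sample(c("host", "virus"), n, replace = TRUE,
                            prob = c(0.8, 0.2)))
A <- rnorm(n, mean = 9, sd = 1.5)
in_upper <- runif(n) < 0.25                      # assumed share of the upper cloud
M <- rnorm(n, sd = 0.4) + ifelse(in_upper, 6, 0)

# Assign each spot to a cloud with a simple cutoff on M
cloud <- factor(ifelse(M > 3, "upper", "lower"))

# Cross-tabulate probe population against cloud membership
tab <- table(probe_type, cloud)
print(tab)
print(round(prop.table(tab, margin = 1), 2))     # row proportions

# One MA plot per probe population
op <- par(mfrow = c(1, 2))
for (type in levels(probe_type)) {
  keep <- probe_type == type
  plot(A[keep], M[keep], pch = 16, cex = 0.4,
       main = type, xlab = "A", ylab = "M")
  abline(h = 0, lty = 3)
}
par(op)
```

If the upper cloud were made up of one probe population only, one row of the proportion table would be close to 0 for that cloud. Similar proportions in both rows point away from a probe-set explanation.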

Dear Joaquin,

What I had in mind was that you would make a vector z which takes values TRUE or FALSE depending on whether each probe on the array belongs to group 1 or group 2 according to your MA plot. Then

    imageplot(z,layout,low="white",high="blue")

There is no way for you to normalize out this problem, and certainly not within the limited capabilities of GenePix software.

Best wishes
Gordon

---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
smyth at wehi.edu.au
http://www.wehi.edu.au
http://www.statsci.org/smyth

13.3 years ago • Gordon Smyth 52k
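Gordon's suggestion might be sketched as follows. The data are simulated, and the layout dimensions, the three grid blocks in which green is lost, and the rule used to assign spots to the upper cloud (M above a cutoff of 3) are all assumptions for illustration. On real data, `z` would come from whatever rule separates the two clouds in your own MA plot, and `layout` would describe the actual print layout of the slide. The `imageplot` call is guarded so that the rest of the sketch runs even without limma installed.

```r
# Simulated array; layout, failure pattern and cutoff are illustrative assumptions.
set.seed(2)
layout <- list(ngrid.r = 4, ngrid.c = 4, nspot.r = 10, nspot.c = 10)
n <- with(layout, ngrid.r * ngrid.c * nspot.r * nspot.c)

# Grid block of each spot, assuming spots are stored block by block
block <- rep(seq_len(layout$ngrid.r * layout$ngrid.c),
             each = layout$nspot.r * layout$nspot.c)

logR <- rnorm(n, mean = 10, sd = 1.5)
logG <- logR + rnorm(n, sd = 0.4)
logG[block <= 3] <- 1            # assume green is lost in the first three blocks

M <- logR - logG
z <- M > 3                       # TRUE = spot belongs to the upper cloud

# Tabulate cloud membership by grid block
print(table(block, z))

# Spatial view of the flagged spots
if (requireNamespace("limma", quietly = TRUE)) {
  limma::imageplot(as.numeric(z), layout, low = "white", high = "blue")
}
```

If the flagged spots concentrate in particular blocks, rows, or regions of the slide, that supports a technical cause such as a printing or hybridization problem. A scattered pattern would call for a different explanation.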

I agree with Gordon. I doubt that the double cloud has anything to do with differential expression. There is something odd going on technically. The usual types of normalization are not going to fix the problem.

--Naomi

13.3 years ago • Naomi Altman ★ 6.0k
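To illustrate the point about normalization with made-up numbers: a global scaling, such as setting the average ratio to 1, amounts to subtracting one constant from every log-ratio M. It therefore moves both clouds together and leaves the gap between them unchanged. The cluster sizes and the offset of 6 below are assumptions, and median-centring is used as a simple stand-in for a global normalization, not as a reproduction of the GenePix Pro procedure.

```r
# Simulated bimodal log-ratios; sizes and offset are illustrative assumptions.
set.seed(4)
n <- 2000
in_upper <- runif(n) < 0.3
M <- rnorm(n, sd = 0.4) + ifelse(in_upper, 6, 0)

# Global centring: the same shift is applied to every spot
M_norm <- M - median(M)

# Distance between the two clouds before and after
gap_before <- median(M[in_upper]) - median(M[!in_upper])
gap_after  <- median(M_norm[in_upper]) - median(M_norm[!in_upper])
print(c(before = gap_before, after = gap_after))

op <- par(mfrow = c(1, 2))
hist(M, breaks = 60, main = "Before", xlab = "M")
hist(M_norm, breaks = 60, main = "After global centring", xlab = "M")
par(op)
```

Both histograms stay bimodal with the same separation, so the rescaled data would still carry the artefact into any downstream analysis.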