Ben Tupper
Hi,
On Jan 21, 2012, at 2:59 PM, Naomi Altman wrote:
> I agree with Gordon.
>
> I doubt that the double cloud has anything to do with differential
> expression. There is something odd going on technically. The usual
> types of normalization are not going to fix the problem.
Thanks for the assistance - we took up the suggestions that Gordon
proposed. We have successfully assigned weight = 0 to the problematic
points. I encouraged us to use a brute force identify-and-kill
approach, but Joaquin's more nuanced inter-slide comparison approach
prevailed. The MA plots look great now, but the subsequent between-array
normalizations seem problematic, or at least the diagnostic
plotDensities() graphics point to continuing issues. This plot shows
4 diagnostic plots for one array ...
http://dl.dropbox.com/u/8433654/slide-52-MA-diagnostics.png
In the left column are shown the results of plotMA(MA,...) with
zero.weights set to TRUE/FALSE so that we can show/hide the weight = 0
spots.
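For anyone following along, the weight = 0 bookkeeping can be sketched in base R roughly as below. This is only an illustration of the brute-force idea, not Joaquin's inter-slide comparison. The simulated intensities, the `green_floor` cutoff and the helper `flag_failed_green()` are all assumptions made up for the example; `z` plays the role of the TRUE/FALSE vector Gordon suggested.

```r
# Simulated log2 intensities for one array (hypothetical data, not our slides).
set.seed(1)
n <- 1000
R <- rnorm(n, mean = 11, sd = 1)            # red channel, log2 scale
G <- R + rnorm(n, sd = 0.3)                 # green tracks red on good spots
failed <- seq_len(n) <= 200                 # pretend the first 200 spots lost green
G[failed] <- rnorm(sum(failed), mean = 5, sd = 0.3)

M <- R - G                                  # log-ratio
A <- (R + G) / 2                            # average log-intensity

# Hypothetical rule: call green "failed" when it sits below a fixed floor.
flag_failed_green <- function(G, green_floor = 6.5) G < green_floor

z <- flag_failed_green(G)                   # TRUE = problematic spot
w <- ifelse(z, 0, 1)                        # spot weights, 0 = ignore downstream

# Show or hide the weight = 0 spots when drawing M against A.
show_zero <- FALSE
keep <- if (show_zero) rep(TRUE, n) else w > 0
plot(A[keep], M[keep], pch = 16, cex = 0.4, xlab = "A", ylab = "M")
abline(h = 0, col = "blue")
```

Setting `show_zero <- TRUE` brings the second cloud back into the plot, which is the same show/hide toggle used for the left column.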
In the right column are shown the results of a slightly modified
plotDensities(MA,...) where I have added a zero.weights argument to
the original plotDensities() function. The upper plot is identical to
the output from the original plotDensities() function, while the lower
plot simply removes the weight = 0 spots before computing the density
distribution. Because the MA-to-RG transformation in the original
plotDensities() function doesn't take weights into account, it becomes
difficult to use the function with our data to visually diagnose the
effect of the normalization functions.
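To make the idea concrete, here is a minimal base R sketch of a weight-aware density calculation. It is not the actual modified plotDensities(); the function name `channel_densities()`, its arguments and the simulated data are hypothetical. It converts M and A back to log2 red and green (R = A + M/2, G = A - M/2), optionally drops the weight = 0 spots, and then estimates one density per channel.

```r
# Hypothetical helper: per-channel densities for one array, optionally
# dropping spots whose weight is 0. M, A and w are equal-length numeric vectors.
# zero.weights = TRUE keeps the weight = 0 spots; FALSE removes them.
channel_densities <- function(M, A, w = NULL, zero.weights = TRUE) {
  keep <- is.finite(M) & is.finite(A)
  if (!zero.weights && !is.null(w)) keep <- keep & !is.na(w) & w > 0
  R <- A[keep] + M[keep] / 2   # log2 red, since M = R - G and A = (R + G) / 2
  G <- A[keep] - M[keep] / 2   # log2 green
  list(R = density(R), G = density(G), n = sum(keep))
}

# Simulated array with a block of bad spots (assumed data for illustration).
set.seed(2)
n <- 1000
A <- rnorm(n, mean = 10, sd = 1)
M <- rnorm(n, mean = 0, sd = 0.3)
bad <- seq_len(n) <= 200
M[bad] <- M[bad] + 5           # shifted log-ratios mimic the second cloud
w <- ifelse(bad, 0, 1)

all_spots <- channel_densities(M, A, w, zero.weights = TRUE)
good_only <- channel_densities(M, A, w, zero.weights = FALSE)

plot(good_only$R, col = "red", main = "weight > 0 spots only",
     xlab = "log2 intensity")
lines(good_only$G, col = "green")
```

Comparing `all_spots` with `good_only` corresponds to the upper-right versus lower-right panels: the only difference is whether the zero-weight spots enter the density estimate.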
The upper right plot leads us to believe that we have some serious
issues. But the lower right plot tells us that we are ok - obviously
we like the lower right one better!
So, are we fooling ourselves by thinking the histogram at lower right
is enough to tell us that we are good to go on to the next step? If
we are fooling ourselves, then what would you advise us to do instead?
Thanks so much!
Ben Tupper
>
> --Naomi
>
>
> At 12:03 AM 1/20/2012, Gordon K Smyth wrote:
>> Dear Joaquin,
>>
>> What I had in mind was that you would make a vector z which takes
>> values TRUE or FALSE depending on whether each probe on the array
>> belongs to group 1 or group 2 according to your MA plot. Then
>>
>> imageplot(z,layout,low="white",high="blue")
>>
>> There is no way for you to normalize out this problem, and certainly not
>> within the limited capabilities of GenePix software.
>>
>> Best wishes
>> Gordon
>>
>> ---------------------------------------------
>> Professor Gordon K Smyth,
>> Bioinformatics Division,
>> Walter and Eliza Hall Institute of Medical Research,
>> 1G Royal Parade, Parkville, Vic 3052, Australia.
>> smyth at wehi.edu.au
>> http://www.wehi.edu.au
>> http://www.statsci.org/smyth
>>
>>
>> On Thu, 19 Jan 2012, Joaquin Martinez wrote:
>>
>>> Dear Naomi, Gordon and Ben,
>>>
>>>
>>>
>>> Thank you for your replies to Ben Tupper's (and my) question.
>>>
>>>
>>>
>>> We are using spotted oligonucleotide microarrays containing probes for both
>>> host and virus genes. In our experiment we had cultures grown under high
>>> and low phosphate conditions, inoculated with 2 different viruses
>>> (separately) or kept virus-free, in triplicate. RNA purified from those
>>> cultures at different time points was fluorescently labeled (with Cy-dyes)
>>> and hybridized onto the microarray slides. You can see a flow chart of our
>>> experimental design here:
>>>
>>> http://dl.dropbox.com/u/8433654/design-concept.pdf
>>>
>>>
>>>
>>> One slide contains 2 samples which had different experimental treatments.
>>> Each sample was split into 3, labeled (dye swap) and hybridized onto 3
>>> different microarray slides in combination with another sample to allow
>>> technical replication.
>>>
>>>
>>>
>>> I quantified labeling efficiency prior to hybridizing the samples onto the
>>> microarray slide; for both dyes I got between 30 and 60 dye molecules per
>>> 1000 nt (which is the range indicated by the manufacturer for good
>>> labeling). Also we produced FB plots for the green and the red channels;
>>> both had similar z-range and saturation range, which we interpreted as
>>> proof of good labeling (?). See example:
>>>
>>> http://dl.dropbox.com/u/8433654/R-G-imageplot.png
>>>
>>>
>>>
>>> Both MA clusters that we observe contain a mixture of both host and virus
>>> probes, ruling out that one complete set of probes failed. Naomi mentioned
>>> that the nondifferentially expressing genes should cluster around M=0, so
>>> does that mean that the top cluster corresponds to differentially expressed
>>> genes?
>>>
>>>
>>>
>>> We used GenePix Pro to scan and analyze the microarrays. Could we use the
>>> normalization function in the software (normalize the data in each image so
>>> that the mean of the median of ratios of all features is equal to 1) as an
>>> alternative to MA? Or would that simply hide the problem? And then do
>>> normalization between arrays using the quantile method?
>>>
>>>
>>> Thanks,
>>>
>>> Joaquin
>>>
>>>
>>>
>>>>> From: Naomi Altman <naomi at stat.psu.edu>
>>>>> Date: January 18, 2012 9:56:45 AM EST
>>>>> To: Gordon K Smyth <smyth at wehi.edu.au>, Ben Tupper <btupper at bigelow.org>
>>>>> Cc: Bioconductor mailing list <bioconductor at r-project.org>
>>>>> Subject: Re: [BioC] Two populations on microarray
>>>>>
>>>>> Dear Ben,
>>>>> A typical MA plot has most of the points scattered around the line M=0.
>>>>> Even if you have 2 populations of probes, the nondifferentially expressing
>>>>> genes should be in that central ellipse. (The lower cluster does look
>>>>> somewhat like the typical MA plot for raw data.) I suggest that you do
>>>>> separate MA plots for each population of probes, to see if one set of
>>>>> probes failed. Or, as Gordon suggests, a population for which labelling
>>>>> failed.
>>>>>
>>>>> --Naomi
>>>>>
>>>>>
>>>>> At 05:48 PM 1/14/2012, Gordon K Smyth wrote:
>>>>>> Dear Ben,
>>>>>>
>>>>>> Are you saying that you have deliberately designed two different
>>>>>> populations of probes onto your arrays?
>>>>>>
>>>>>> Your MA-plot suggests that there is a substantial body of spots on the
>>>>>> array for which the green channel has failed, hence the 45-degree line at
>>>>>> the top of the plot. These dots likely represent spots with a normal red
>>>>>> channel value but close to zero for green. Normally this would have a
>>>>>> technical rather than biological cause. An imageplot may help you identify
>>>>>> where the offending spots are on your array.
>>>>>>
>>>>>> On the other hand, if you have deliberately spotted your arrays with
>>>>>> two quite different populations of probes, then they probably need to be
>>>>>> analysed as separate arrays.
>>>>>>
>>>>>> Best wishes
>>>>>> Gordon
>>>>>>
>>>>>>> Date: Thu, 12 Jan 2012 14:28:36 -0500
>>>>>>> From: Ben Tupper <btupper at bigelow.org>
>>>>>>> To: bioconductor at r-project.org
>>>>>>> Subject: [BioC] Two populations on microarray
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> By virtue of experiment design we have two populations to analyze on
>>>>>>> each of a suite of Genepix microarrays. You can see an example in an MA
>>>>>>> plot here (generated using the excellent limma package):
>>>>>>>
>>>>>>> http://dl.dropbox.com/u/8433654/BE%20T46h%20slide%2052.png
>>>>>>>
>>>>>>> We have been following the steps in the limma user guide, and Ben
>>>>>>> Bolstad's helpful notes (http://tinyurl.com/7346mh9). All of the examples we
>>>>>>> see appear to have just one population to contend with, which gives us an
>>>>>>> inkling that we are being naive about our analysis. We suspect that we'll
>>>>>>> have to separate the two populations before normalization and analysis.
>>>>>>> Are there any guides available for managing two populations like this?
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Ben
>>>>>>>
Ben Tupper
Bigelow Laboratory for Ocean Sciences
180 McKown Point Rd. P.O. Box 475
West Boothbay Harbor, Maine 04575-0475
http://www.bigelow.org