Morning List,
I've been toying with arrayQualityMetrics which gives me a great
overview of my data without too much work.
Several things are still unclear to me though:
- how does it calculate the '*' that indicate that a chip may be
bad. How stringent/conservative are they? Because several of my
spotted cDNA chips show up as having issues, and now I'm unsure
whether or not I should remove them from my analysis.
- is there any way to "see inside" the package? I'd like to see how
the stringency is calculated, and adapt some of the output
formatting. But when I try to look inside, all I get is something
called "environment":
> arrayQualityMetrics
standardGeneric for "arrayQualityMetrics" defined from package "arrayQualityMetrics"
function (expressionset, outdir = getwd(), force = FALSE,
    do.logtransform = FALSE,
    split.plots = FALSE, intgroup = "Covariate")
standardGeneric("arrayQualityMetrics")
<environment: 0x4882ad8>
Methods may be defined for arguments: expressionset, outdir, force,
    do.logtransform, split.plots, intgroup
Use showMethods("arrayQualityMetrics") for currently available ones.
- on one NChannelSet containing 280 slides, arrayQualityMetrics
didn't calculate the heatmap, but it didn't display any error messages
either. Possibly because of memory constraints on my 3GB Mac?
Thanks for putting me on the track to resolving this.
Best,
Yannick
--------------------------------------------
yannick . wurm @ unil . ch
Ant Genomics, Ecology & Evolution @ Lausanne
http://www.unil.ch/dee/page28685_fr.html
Hi Yannick,
- Here is how the outlier detection is performed:
For the MA-plot, the mean of the absolute value of M is computed for
each array, and those that lie beyond the extremes of the boxplot's
whiskers are considered as possible outlier arrays. The same approach,
i.e. using the whiskers of the boxplot, is applied to the following:
the mean and interquartile range (IQR) from the boxplots and NUSE, the
sums of the rows of the distance matrix (for the heatmap), and the
amplitude of low frequencies of the periodogram (for the spatial
intensity distribution). In the case of the RLE plot, any array with a
median RLE higher than 0.1 is considered as a possible outlier.
To decide whether or not you should remove some chips from your
analysis, I advise you to run the report after normalisation. If, after
normalisation, some arrays are flagged with a star in several quality
assessment sections, I would remove them. Of course, it mainly depends
on the context. For instance, if there is a good biological reason for
an array to be an outlier, keep it.
- To see the "inside" of the arrayQualityMetrics function:
showMethods("arrayQualityMetrics")
gives you the classes for which a method exists. Then you can see the
function for one of these classes using selectMethod, for instance:
selectMethod("arrayQualityMetrics","AffyBatch")
However, if you want to modify it, you can download the source of the
package; the functions are in the directory "arrayQualityMetrics/R".
I am currently working on a new version of the package where it will be
easier to adapt the functions and to modify the report. If you are
interested, you can have a look at the devel branch of Bioconductor; I
will update the development version of the package soon.
- For the missing heatmap, I have sometimes seen that the plot is done
but for some reason does not show in the report. You can check for the
files heatmap.png and heatmap.pdf in the directory where you created
the report.
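
As a rough illustration of the two flagging rules described above, here
is a minimal sketch in base R. It is not the package's actual code: the
helper name flag_whisker_outliers, the per-array scores, and the choice
of R's default whisker length (1.5 times the hinge spread, as used by
boxplot.stats) are all assumptions made for the example. The package
may, for instance, only look at the upper whisker for some scores.

```r
# Flag the values that lie beyond the boxplot whiskers.
# Assumption: whiskers are computed as in R's default boxplot
# (coef = 1.5), and both tails are checked.
flag_whisker_outliers <- function(x) {
  # boxplot.stats()$stats holds: lower whisker, lower hinge, median,
  # upper hinge, upper whisker.
  whiskers <- boxplot.stats(x)$stats[c(1, 5)]
  which(x < whiskers[1] | x > whiskers[2])
}

# Hypothetical per-array scores, e.g. the mean of |M| from the MA-plots.
mean_abs_m <- c(a1 = 0.21, a2 = 0.19, a3 = 0.22,
                a4 = 0.20, a5 = 0.65, a6 = 0.18)
ma_flagged <- flag_whisker_outliers(mean_abs_m)

# RLE rule: a fixed threshold of 0.1 on the median RLE of each array.
# The values are again made up for illustration.
median_rle <- c(a1 = 0.01, a2 = -0.02, a3 = 0.15,
                a4 = 0.03, a5 = 0.00, a6 = 0.05)
rle_flagged <- which(median_rle > 0.1)

print(names(ma_flagged))
print(names(rle_flagged))
```

With these made-up numbers, array a5 is flagged by the whisker rule and
array a3 by the RLE threshold. Since the whisker positions depend on the
whole set of arrays, the same chip can be flagged in one batch and not
in another, which is one reason to judge a star in context.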
Audrey
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
--
Audrey Kauffmann
EMBL - EBI
Cambridge UK
http://www.ebi.ac.uk/~audrey
Thanks for the quick and exhaustive reply!!
Regarding the missing heatmap - the files were indeed missing (this
was on a G5 Mac with 3 gigs of RAM; R 2.7.2 with
arrayQualityMetrics_1.6.1).
But when I reran arrayQualityMetrics on a Linux machine with 64 gigs
of RAM, the heatmap files were generated (this time under a freshly
compiled R 2.8 RC 2008-10-14 with arrayQualityMetrics_1.7.17).
Best,
Yannick
On Oct 16, 2008, at 12:08 , Audrey Kauffmann wrote: