Samantha Jane England
Dear Bioconductor Mailing List,
We are working with custom-designed Agilent 4x44 arrays in our lab and
we use Agi4x44PreProcess to perform the pre-processing. I am
looking for some advice about whether or not to summarise probe sets,
and if we should, recommendations for methods/approaches to use.
I have been looking over the GMANE archive and I know that probe-set
summarisation is a common feature of Affymetrix data
analysis, due to the shorter oligos being used. In contrast, given
the longer length of the oligos with Agilent chips, the consensus
opinion seems to be that summarising Agilent probe sets isn't
necessarily a good idea. Indeed, this may explain why there doesn't
appear to be an obvious R routine to summarise Agilent one-colour
probe sets (but please correct me if I am wrong). We have agreed with
this consensus opinion (and agree that probes can behave differently
within a probe set) and have so far been running the statistics and
clustering analysis where each individual probe (that passes filtering
by flags) is kept in the list.
A colleague suggested that we use a non-R-based stats and clustering
program for differential expression. In principle this works fine;
the problem is that it can't cope with the large data sets in which
the individual probes remain. We have to break the data set up into
chunks and perform the analysis that way, which sits a little
uncomfortably with me. So the conflict is: do we try to summarise the
probe sets to try and overcome this problem, or should we keep
individual probes separate and look for alternative clustering
programs that would enable us to process the intact data set?
So, in summary the questions I have are these:
1. Is summarisation ever a good idea for Agilent probe sets (we
have 8 probes per transcript), and if so, are there routines in R that
would enable us to do this?
2. If summarisation is a bad idea for Agilent data sets, would
taking the median signal intensity be a better strategy?
3. Can anybody recommend a good hierarchical clustering routine in
R that would be suitable for our Agi one-colour data, whether we take
all individual probes or just the median signal intensity? (I thought
maybe oompa or BiClust?)
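To make questions 2 and 3 concrete, here is a minimal sketch in base R of
what I mean by taking the median over the probes of each transcript and
then clustering the arrays. The toy data and the names transcript_id,
log_intensity and summarise_median are made up for illustration; they are
not from Agi4x44PreProcess or any other package, and I am not suggesting
this is the right approach.

```r
# Hypothetical toy data: 3 transcripts x 8 probes each, measured on 4 arrays.
set.seed(1)
transcript_id <- rep(c("tx1", "tx2", "tx3"), each = 8)
log_intensity <- matrix(rnorm(24 * 4, mean = 8), nrow = 24,
                        dimnames = list(NULL, paste0("array", 1:4)))

# Median of the probes belonging to each transcript, computed per array.
# Returns a transcripts x arrays matrix.
summarise_median <- function(x, id) {
  groups <- split(seq_len(nrow(x)), id)
  t(vapply(groups,
           function(rows) apply(x[rows, , drop = FALSE], 2, median),
           numeric(ncol(x))))
}

probe_set_median <- summarise_median(log_intensity, transcript_id)

# Hierarchical clustering of the arrays on the summarised values
# (Euclidean distance, average linkage; both choices are arbitrary here).
array_tree <- hclust(dist(t(probe_set_median)), method = "average")
```

The same hclust call would also accept the unsummarised probe-level
matrix, which is the comparison I am unsure about.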
I would really appreciate any advice or suggestions that people can
give me.
Thank you all very much in anticipation of your help.
With Very Best Wishes
Sam England
Samantha England, PhD
Lewis Lab, Syracuse University
Department of Biology
110 Life Sciences Complex
107 College Place
Syracuse
NY 13244, USA
Email: sjenglan@syr.edu
Tel: (1) 315 443 7253 (lab)
Tel: (1) 315 443 1929 (office)