Dear Jeremy,
Personally, I'd treat all the genes as duplicated twice. In this approach,
the small group of special genes which are actually duplicated 20 times
would each be treated as 10 different genes.
Best wishes
Gordon
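
A minimal sketch of the approach described in this reply, assuming the data are sorted by gene ID so that copies of the same gene sit in adjacent rows and limma can pair them with ndups = 2 and spacing = 1. The simulated matrix, the gene names, and the two-group design are hypothetical stand-ins; with real arrays the normalized data object and its gene annotation column would be used instead.

```r
library(limma)

set.seed(1)

# Simulated layout (assumption): 50 ordinary genes printed twice and
# 3 "special" genes printed 20 times, at random positions on the array.
ids <- sample(c(rep(paste0("gene", 1:50), each = 2),
                rep(paste0("special", 1:3), each = 20)))
n.arrays <- 6
M <- matrix(rnorm(length(ids) * n.arrays), nrow = length(ids),
            dimnames = list(NULL, paste0("array", 1:n.arrays)))

# Hypothetical two-group design; replace with the real design matrix.
design <- cbind(Intercept = 1, Treatment = rep(c(0, 1), each = 3))

# Sort rows by gene ID so that copies of the same gene are adjacent.
o <- order(ids)
M.sorted <- M[o, ]
ids.sorted <- ids[o]

# With ndups = 2 and spacing = 1, rows are paired as (1,2), (3,4), ...
# so every gene needs an even number of spots for the pairs to line up.
stopifnot(all(table(ids.sorted) %% 2 == 0))

# One ID per consecutive pair: a gene printed 20 times gives 10 entries.
pair.ids <- ids.sorted[seq(1, length(ids.sorted), by = 2)]

# Estimate the consensus correlation between duplicates, then fit.
corfit <- duplicateCorrelation(M.sorted, design, ndups = 2, spacing = 1)
fit <- lmFit(M.sorted, design, ndups = 2, spacing = 1,
             correlation = corfit$consensus.correlation)
fit <- eBayes(fit)
```

Each row of the fitted object then corresponds to one entry of pair.ids, so a gene printed 20 times appears as 10 separate results and the spot printed 4 times appears as 2. The evenness check matters: any ID with an odd number of spots (some control spots, for example) would shift the pairing and would have to be removed or handled separately first.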
> Date: Wed, 4 Apr 2007 14:51:13 -0700
> From: "Jeremy Davis-Turak" <jeremydt at gmail.com>
> Subject: [BioC] Limma: different numbers of duplicated spots
> To: bioconductor at stat.math.ethz.ch
> Message-ID:
> <378b225b0704041451h72a7fbb3hc206614aec7cdc27 at mail.gmail.com>
> Content-Type: text/plain
>
> Hi BioC list,
>
> I am analyzing some new Agilent 4x44 C. elegans arrays, and, as on our
> previous Agilent C. elegans arrays, there are 120 genes that are printed
> many (> 10) times. However, now on each array everything is duplicated:
> those 120 spots are printed 20 times (not 10), and all others are printed
> twice (and one spot is printed 4 times... it probably was meant to be 2
> different genes). As far as I can tell, the duplicated spots are randomly
> spaced. I would like to use duplicateCorrelation on the normalized data,
> sorted by gene name, as described previously on the list:
>
> https://stat.ethz.ch/pipermail/bioconductor/attachments/20060123/4890ea8e/attachment.pl
>
> My only problem now is the spots that are replicated 20 times. In the past
> I haven't dealt with them using very stringent statistics, since it was only
> 120 spots that I was dealing with (and maybe 1 gene in the group was
> differentially expressed). Now, however, since all 20K spots are duplicated,
> we need to take care of the duplicates. Clearly duplicateCorrelation is the
> simplest way to do this, but it won't work if we have 120 genes that are
> printed 20 times.
>
> My question is: how do I deal with these genes? Could I just ignore those
> 120 genes for the calculation of the consensus correlation? I read on this
> list that small numbers of genes won't affect this calculation: 120/20K is
> less than 1% of the genes.
>
> If I do that, what would become of the 120 spots? Can I somehow apply the
> same consensus correlation to them?
>
> What other solutions do people propose?
>
>
> Thanks in advance for your time.
>
> Jeremy Davis-Turak