Entering edit mode
Calin-Jageman, Robert
▴
30
@calin-jageman-robert-6431
Last seen 7.8 years ago
I've been working on a two-color analysis of a custom-printed agilent
array. I'm examining gene expression in CNS samples from Aplysia
californica that have undergone unilateral sensitization training.
I've got 8 animals, each with a sample from the trained and untrained
side of the CNS. Each pair is hybridized to one two-color array.
With the limma user guide and Google, I came up with the following
script which seems to give quite reasonable results:
library(limma)
targets <- readTargets("targets.txt")
RG <- read.maimages(targets, source="agilent")
RG <- backgroundCorrect(RG, method="normexp", offset=16)
MA <- normalizeWithinArrays(RG, method="loess")
MA2 <- MA[MA$genes$ControlType==0,]
MA2.avg <- avereps(MA2, ID=MA2$genes$GeneName)
fit <- lmFit(MA2.avg)
fit2 <- eBayes(fit)
allgenes <- topTable(fit2,number=nrow(fit2))
write.table(allgenes,file="all_genes.txt")
# I know topTable is not the only output to examine; this is just a
start.
Couple of quick questions for anyone out there who is reading:
* I ended up with a much larger list of regulated transcripts than
when I conducted a similar analysis in GeneSpring. In GeneSpring I
ended up with ~400 transcripts regulated at with a FDR of 0.05. With
limma, though, I get nearly 1,000 at FDR = 0.01! Many of these,
though, have fairly small fold-changes. In fact, when I filter for at
least 1.5x change, the list shrinks dramatically to about 60
transcripts. Is the much larger list I'm initially getting due to the
enhanced power of using the bayes approach for estimating sampling
error, or have I failed to properly adjust for multiple comparisons in
this script?
* My qPCR data from the same samples fits the limma analysis MUCH
better. Over 15 transcripts where I've done qPCR on the same samples
(mix of predicted up/down/stable), the r2 with GeneSpring estimates is
about 0.5; with the limma estimates it's 0.83! Is limma really this
much better?
* I'm not sure if I fully understand the choice of offset in
background correction. I've seen no offset, offsets of 16, 30....
Despite reading the documentation on the function, it's not clear to
me how users select an offset or if it's actually worth using. Any
feedback?
Cheers,
Bob
========
Robert Calin-Jageman
Associate Professor, Psychology
Neuroscience Program Director
Dominican University
rcalinjageman at dom.edu
708.524.6581
Through painful experience I found that the script I quoted is NOT correct (in most cases). Specifically,
fit <- lmFit(MA2.avg)
should have been:
What's tricky about this bug is that lmFit doesn't complain about not receiving the design array...instead it simply proceeds assuming each dye is 1 condition. If you have no dye swaps, this will produce the expected results. But if you have dye swaps..disaster.
I should have posted a correction long ago, but forgot I had posted this. Just thought I'd add this comment just in case anyone finds this code via Google.