lepalmer@notes.cc.sunysb.edu
This is the pipeline I am currently using for analysis. I would like
people's opinions on whether anything could be done better. (The
experiment is 3 sets of dye-swaps, with 2 spots per ORF per chip.)
library(limma)
targets<-readTargets("targets.txt")
RG<-read.maimages(targets$FileName,source="genepix",wt.fun=wtflags(0))
RG$printer<-getLayout(RG$genes)
RG$genes<-readGAL("Y_pestis.sorted.gal")
spottypes<-readSpotTypes("spotTypes.txt")
RG$genes$Status<-controlStatus(spottypes,RG)
RGb<-backgroundCorrect(RG,method="normexp")
MA<-normalizeWithinArrays(RGb)
MA<-normalizeBetweenArrays(MA)
cor<-duplicateCorrelation(MA,ndups=2,spacing=240)
design<-c(1,-1,1,-1,-1,1)
fit<-lmFit(MA,design,ndups=2,correlation=cor$consensus.correlation,spacing=240)
fit<-eBayes(fit)
tt<-topTable(fit,adjust="fdr",n=6000)
write.table(tt,file="tmp.txt",sep="\t")
I have also recently read about the Kooperberg method for background
correction. Is this a preferred method?
I have been able to run it with the following commands:
targets<-readTargets("targets.txt")
RG<-read.maimages(targets$FileName,source="genepix",wt.fun=wtflags(0))
RG$printer<-getLayout(RG$genes)
RG$genes<-readGAL("Y_pestis.sorted.gal")
spottypes<-readSpotTypes("spotTypes.txt")
RG$genes$Status<-controlStatus(spottypes,RG)
read.series(targets$FileName, suffix=NULL, skip=31, sep="\t")
RGb <- kooperberg(targets$FileName, layout=RG$printer)
RGb$genes<-RG$genes
RGb$printer<-RG$printer
RGb$weights<-RG$weights
RGb$targets<-RG$targets
MA<-normalizeWithinArrays(RGb)
MA<-normalizeBetweenArrays(MA)
cor<-duplicateCorrelation(MA,ndups=2,spacing=240)
design<-c(1,-1,1,-1,-1,1)
fit<-lmFit(MA,design,ndups=2,correlation=cor$consensus.correlation,spacing=240)
fit<-eBayes(fit)
topTable(fit,adjust="fdr",n=32)
tt<-topTable(fit,adjust="fdr",n=6000)
write.table(tt,file="tmp.txt",sep="\t")
I recently had a small argument with an advisor who told me to do
background correction by subtracting background from foreground and
flagging negative numbers. This is obviously the default for limma. But
when I took this approach, a lot of spots popped up that didn't make
sense (i.e. non-specific DNA), while normexp fixed that problem. I
recently discovered the Kooperberg method, which was designed for the
problem of negative intensities with GenePix data. So which is the best
method, and how do I convince this guy that I should use it?
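To make the subtraction issue concrete, here is a minimal base-R sketch
of what "subtract and flag negatives" does to a few spots. The
intensities are made up purely for illustration; they are not from my
arrays.

```r
# Toy foreground and background intensities (made-up values)
fg <- c(500, 120, 80, 2000)
bg <- c(100, 130, 80, 150)

# Plain subtraction: spots at or below background become zero or negative
corrected <- fg - bg

# Flag non-positive values as missing, since log2 is undefined for them
corrected[corrected <= 0] <- NA

# Log-intensities; the flagged spots drop out of downstream analysis
logInt <- log2(corrected)
```

Every spot whose foreground does not exceed its background is lost here,
and spots just above background get very noisy log-values, which is my
guess as to why odd spots show up with this approach.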
One last question: these methods give you statistics on gene expression
differences. Often people report genes that are differentially
regulated by more than two-fold. It seems to me that to do this, one
would need an intensity cutoff, as genes with little or no expression
can easily slip into that category. How would one calculate such a
cutoff? There are spots on the array that contain oligos that are
definitely not found in the species being studied (bacteria vs.
Arabidopsis). Can this information be used?
Thanks,
Lance Palmer