Entering edit mode
David Iles
▴
130
@david-iles-4487
Last seen 9.8 years ago
Hi Folks,
As a relative newcomer to BioC, having spent most of the last 28 years
at the bench, I am finally getting my head round R programming. I've
been using limma and affy to analyse a fairly chunky (and expensive!)
Affymetrix hgu133plus2 data set and have been successful in generating
topTable results that actually make sense. Always good when an
experiment works (or seems to!)
One thing that has completely stumped me so far though, despite
extensive vignette and email string searches, and attempts to adapt
code written for Agilent single channel data, is how to
'automatically' include gene names, symbols, GO IDs etc in the
topTable output. While it may be easy enough to use mget to retrieve
the necessary info for small numbers of probesets, it gets tedious
when one needs either to cut-and-paste long lists of affy IDs into
DAVID, or convert them to long lines, with each probeset ID flanked by
" ", which is what I have done so far.
Since there is such a rich repository of data in hgu133plus2.db, there
must be a way to tap into this without going 'outside' limma. Can
anyone suggest how to do this? I'd be most grateful.
Code (comments on errors/shortcuts etc appreciated) and session info
below.
Thanks.
Dr David Iles
Institute for Integrative and Comparative Biology
University of Leeds
Leeds LS2 9JT
d.e.iles at leeds.ac.uk
The experiment is designed to detect differences in muscle gene
expression between patients with a myopathy (S) and controls (N), and
also how gender affects these differences.
> library(affy)
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material. To view, type
'openVignette()'. To cite Bioconductor, see
'citation("Biobase")' and for packages 'citation(pkgname)'.
> library(limma)
> library(hgu133plus2.db)
Loading required package: AnnotationDbi
Loading required package: org.Hs.eg.db
Loading required package: DBI
> library(hgu133plus2cdf)
> mtargets<-readTargets("/Users/daveiles/Documents/R/muscle_data/muscl
edat/mustargets.txt")
> mtargets
filename phen gen
1 DF1U133plus25222M.CEL S M
2 DF1U133plus25526M.CEL S F
3 DF2U133plus22264M.CEL S M
4 DF2U133plus22341M.CEL N M
5 DF2U133plus22469M.CEL S M
6 DF2U133plus22539M.CEL S M
7 DF2U133plus22632M.CEL N F
8 DF2U133plus23490M.CEL N F
9 DF2U133plus23690M.CEL S M
10 DF2U133plus24018M.CEL S M
# plus 49 others, deleted for brevity
> mdat<-ReadAffy(widget=T)
Loading required package: tkWidgets
Loading required package: widgetTools
Loading required package: tcltk
Loading Tcl/Tk interface ... done
Loading required package: DynDoc
Loading required package: tools
> meset<-rma(mdat)
Background correcting
Normalizing
Calculating Expression
> mphengen<-paste(mtargets$phen,mtargets$gen,sep=".")
> mphengen
[1] "S.M" "S.F" "S.M" "N.M" "S.M" "S.M" "N.F" "N.F" "S.M" "S.M" "N.F"
"S.M"
# etc - deleted for brevity
> mphengen<-factor(mphengen,levels=c("S.M","S.F","N.M","N.F"))
> design<-model.matrix(~0+mphengen)
> colnames(design)<-levels(mphengen)
> fit<-lmFit(meset,design)
> fit<-eBayes(fit)
# 1. influence of genotype within each phenotype
> cont.matrix <- makeContrasts(S.MvsF=S.M-S.F, N.MvsF=N.M-N.F,
Diff=(N.M-N.F)-(S.M-S.F), levels=design)
> fit2<-contrasts.fit(fit,cont.matrix)
> fit2<-eBayes(fit2)
> msMvrfes<-topTable(fit2,coef="S.MvsF",n=100,adjust="BH")
> mnMvrfes<-topTable(fit2,coef="N.MvsF",n=100,adjust="BH")
> write.table(msMvrfes,file="mS_MvFres.txt")
> write.table(mnMvrfes,file="mN_MvFres.txt")
# 2. influence of phenotype within each genotype
> cont.matrix1 <- makeContrasts(M.SvsN=S.M-N.M, F.SvsN=S.F-N.F,
Diff=(S.M-S.F)-(N.M-N.F), levels=design)
> fit3<-contrasts.fit(fit,cont.matrix1)
> fit3<-eBayes(fit3)
> mM_SvsNres<-topTable(fit3,coef="M.SvsN",n=100,adjust="BH")
> mF_SvsNres<-topTable(fit3,coef="F.SvsN",n=100,adjust="BH")
> write.table(mM_SvsNres,file="mM_SvsNres.txt")
> write.table(mF_SvsNres,file="mF_SvsNres.txt")
> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/C/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] limma_3.6.9 affy_1.28.0 hgu133plus2cdf_2.7.0
[4] hgu133plus2.db_2.4.5 org.Hs.eg.db_2.4.6 RSQLite_0.9-4
[7] DBI_0.2-5 AnnotationDbi_1.12.0 Biobase_2.10.0
loaded via a namespace (and not attached):
[1] affyio_1.18.0 preprocessCore_1.12.0 tools_2.12.0
>