Hi All,
I have tried to process data from Affymetrix Human Gene 2.0 ST Array. This is my first time investigating data from this array. I have a couple of questions. I would appreciate any advice in advance.
1) I have realized "affy" package does not work for data from this array, and found "oligo" package does work. My normalizing processes are the followings:
celfiles = list.files(path = ".", pattern = ".CEL$", all.files = FALSE,
full.names = FALSE, recursive = FALSE, ignore.case = FALSE)
Data <- read.celfiles(celfiles)
celfiles.rma <- rma(Data, target="core")
celfiles.rma
ExpressionSet (storageMode: lockedEnvironment)
assayData: 53617 features, 6 samples
element names: exprs
protocolData
rowNames: #8_(HuGene-2_0-st).CEL E_(HuGene-2_0-st).CEL ... T_(HuGene-2_0-st).CEL (6 total)
varLabels: exprs dates
varMetadata: labelDescription channel
phenoData
rowNames: #8_(HuGene-2_0-st).CEL E_(HuGene-2_0-st).CEL ... T_(HuGene-2_0-st).CEL (6 total)
varLabels: index
varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.hugene.2.0.st
Now, How can I to remove control probesets from ST arrays? Thank you
@ James W. MacDonald Could you please help me in this?
Hi James,
Im working on Affymetrix Human Gene 2.0 ST Array.
library(oligo)
#Read cel files
celfiles = list.files(path = ".", pattern = ".CEL$", all.files = FALSE,
full.names = FALSE, recursive = FALSE, ignore.case = FALSE)
celfiles
[1] "#8_(HuGene-2_0-st).CEL" "E_(HuGene-2_0-st).CEL" "GG7_(HuGene-2_0-st).CEL"
[4] "J_(HuGene-2_0-st).CEL" "O_(HuGene-2_0-st).CEL" "T_(HuGene-2_0-st).CEL"
Data <- read.celfiles(celfiles)
#RMA
celfiles.rma <- rma(Data, target="core")
library(limma)
design <- model.matrix(~ 0+factor(c(1,1,1,2,2,2)))
colnames(design) <- c("group1", "group2")
fit <- lmFit(celfiles.rma, design)
contrast.matrix <- makeContrasts(group2-group1, group1-group2, levels=design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
tab <- topTable(fit2, coef=2, adjust="fdr", sort.by="none", number=Inf)
head(tab)
logFC AveExpr t P.Value adj.P.Val B
16650001 -0.67713091 0.7013795 -2.11246203 0.07463104 0.8414740 -3.923065
16650003 0.01775323 0.8173575 0.05752413 0.95581904 0.9922585 -5.247542
16650005 0.20727356 1.2926154 0.28654513 0.78318782 0.9636612 -5.214740
16650007 0.13198594 0.8028477 0.37395467 0.72008320 0.9502862 -5.191016
16650009 -0.31117759 0.7187522 -0.92432176 0.38765340 0.8815813 -4.917247
16650011 -0.18142907 0.7450577 -0.46483921 0.65689248 0.9382330 -5.160082
write.table(tab, file="DEG.xls",row.names=F, sep="\t")
results <- tab[which(tab$logFC >= 1.5 & tab$P.Value <= 0.05),]
write.table(results, file="DEGp.xls", row.names=T, sep="\t")
idx = which(tab$P.Value < 0.05 & tab$logFC > 1.5)
heatmap(exprs(celfiles.rma[idx,]),trace='none',scale='row')
Could you please tell me how can I add the annotation for the genes?
Thank you