Question

Need expert suggestion about Bioconductor package for microarray data anlaysis

0

Entering edit mode

mahmudornob • 0

@mahmudornob-19533

Last seen 5.9 years ago

Dear all,

I am just a beginner of R program. I need to analyze my microarray data. So, I have just started (just 2 days only) to practice with one data set available in GEO called "GSE58231"

I have compiled the script following the wiki page "Analyze your own microarray data in R/Bioconductor". However, a lot of functions don't work properly, and unfortunately, as I am a beginner, I am unable to figure out the problem. So, I am requesting experts' opinion on this issue. Here is the script. I have made bold where I am facing the problem

# Install bioconductor
source("http://bioconductor.org/biocLite.R")

#Install requried packages of bioconductor for Microarray
source("https://bioconductor.org/biocLite.R")
biocLite("affy")
source("https://bioconductor.org/biocLite.R")
biocLite("affydata")
source("https://bioconductor.org/biocLite.R")
biocLite("ArrayExpress")
source("https://bioconductor.org/biocLite.R")
biocLite("limma")
source("https://bioconductor.org/biocLite.R")
biocLite("Biobase")
source("https://bioconductor.org/biocLite.R")
biocLite("Biostrings")
source("https://bioconductor.org/biocLite.R")
biocLite("genefilter")

# Load required libraries
library(affy)
library(affydata)
library(ArrayExpress)
library(limma)
library(Biobase)
library(Biostrings)
library(genefilter)

#Open and import raw CEL File data in Affay
celpath = "C:/Users/Ornia/Documents/R/Microarray Practice"

data = ReadAffy (celfile.path=celpath)

# Retrieve sample annotation
ph = data@phenoData
ph
ph@data

## If there is no sample annatation in previous section then rewrite the sample annotaion.
ph@data[ ,1] = c("wt1","wt2","wt3","erf96oe1","erf96oe2","erf96oe3")
ph

#Plot histogram of each sample
for (i in 1:6)
{
  name = paste("histogram",i,".tiff",sep="")
  tiff(name)
  hist(data[,i],lwd=2,which='pm',ylab='Density',xlab='Log2 intensities',main=ph@data$sample[i])
  dev.off()
}

#Plot histogram of all sample combined (this one is not working)
**op = par(mfrow = c(2,3))
for(i in 1:6){hist(data[,i],lwd=2,which='pm',ylab='Density',xlab='Log2 intensities',main=ph@data$sample[i])}**

#Plot histogram of all sample in one graph (this one is not working)
**color=c('green','green','green','red','red','red')
hist(data[,1:6],lwd=2,which='pm',col=color,ylab='Density',xlab='Log2 intensities',main='Histogram of raw data')**

#Quantificatioin of Quality (None of them are working)
**data.qc = qc(data)
avbg(data.qc)
sfs(data.qc)
percent.present(data.qc)
ratios(data.qc)
Showed there is no qc**

# Data normalization
data.rma = rma(data)
data.matrix = exprs(data.rma)

#MA plot for normalized data (this one is not working)
**for (i in 1:6)
{
  name = paste("MAplotnorm",i,".tiff",sep="")
  tiff(name)
  MAplot(data.rma,which=i)
  dev.off()
}**

#Principal Component Analysis (this one is not working)
**color=c('green','green','green','red','red','red')
data.PC = prcomp(t(data.matrix),scale.=TRUE)
plot(data.PC$x[1:2],col=color)**


# Identification of DEGs
ph@data[ ,2] = c("wt","wt","wt","ox","ox","ox")
colnames(ph@data)[2]="source"
ph@data
groups = ph@data$source
f = factor(groups,levels=c("wt","ox"))
design = model.matrix(~ 0 + f)
colnames(design) = c("wt","ox")

data.fit = lmFit(data.matrix,design)
data.fit$coefficients[1:10,]

contrast.matrix = makeContrasts(ox-wt,levels=design)
data.fit.con = contrasts.fit(data.fit,contrast.matrix)

data.fit.eb = eBayes(data.fit.con)
names(data.fit.eb)
data.fit.eb$coefficients[1:10,]
data.fit.eb$t[1:10,]
data.fit.eb$p.value[1:10,]

#List of IDs that are DEG (this one is not working)
**IDs.up = rownames(topups)
IDs.down = rownames(topdowns)
ups = subset(DEresults, DEresults[,1]==1)
downs = subset(DEresults, DEresults[,1]==-1)
IDs.up = rownames(ups)
IDs.down = rownames(downs)**

I believe any expert can figure out where I am making the mistakes. I am eagerly looking forward to hearing from you all.

Sincerely, Mahmudul Korea University

affy limma • 1.2k views

ADD COMMENT • link updated 6.1 years ago by Gordon Smyth 52k • written 6.1 years ago by mahmudornob • 0

0

Entering edit mode

You say that you need to analyze your own microarray data and that this current dataset is just for practice. Do you have your own data already and is it from an Affymetrix BeadChip?

ADD REPLY • link 6.1 years ago Gordon Smyth 52k

0

Entering edit mode

Thanks for your reply. And sorry for being late to respond to you. I somehow missed it.

I fixed the problem. However, I am not using microarray anymore as the price of the microarray is much higher than RNAseq in Korea (I have to compare only two samples)

ADD REPLY • link 5.9 years ago mahmudornob • 0