I have an Affymetrix single-channel DNA microarray dataset in which normal and diseased tissue were taken from the same organ in each of 5 people, so I have 10 CEL files. I want to analyse differential gene expression between the normal and diseased tissues. I think I should use the limma paired-samples design matrix and compute a paired moderated t-test (limma User's Guide, section 9.4.1, pages 42 and 43).
Can anyone please confirm whether my approach is correct, or should I follow a different approach?
My phenodata file looks like this:
FileName | Subject | Tissue |
100.CEL | 1 | NORMAL |
101.CEL | 1 | DISEASED |
102.CEL | 2 | NORMAL |
103.CEL | 2 | DISEASED |
104.CEL | 3 | NORMAL |
105.CEL | 3 | DISEASED |
106.CEL | 4 | NORMAL |
107.CEL | 4 | DISEASED |
108.CEL | 5 | NORMAL |
109.CEL | 5 | DISEASED |
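Rebuilding this table by hand in base R gives a quick way to check the factor levels and the design matrix columns before touching the CEL files. This is only a sketch: the data frame `pheno` and the variable `mismatched` are illustrative names typed in here, not something read from my actual phenodata.txt.

```r
# Hand-typed copy of the phenodata table (assumption: labels are upper case, as in the table)
pheno <- data.frame(
  FileName = paste0(100:109, ".CEL"),
  Subject  = rep(1:5, each = 2),
  Tissue   = rep(c("NORMAL", "DISEASED"), times = 5)
)

# Level labels are case-sensitive: levels that do not occur in the data give NA
mismatched <- factor(pheno$Tissue, levels = c("Normal", "Diseased"))
all(is.na(mismatched))

# Levels spelled exactly as in the table, with NORMAL as the reference level
Subject <- factor(pheno$Subject)
Tissue  <- factor(pheno$Tissue, levels = c("NORMAL", "DISEASED"))

# Paired design: one column per extra subject plus one column for the tissue effect
design <- model.matrix(~ Subject + Tissue)
dim(design)
colnames(design)
```

With the levels spelled this way the tissue column of the design matrix is named `TissueDISEASED`, and that is the name any later `coef=` argument would need to use.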
My code:
library(affy)
library(limma)
# Read all CEL files in the working directory into an AffyBatch
affy.data = ReadAffy()
# Import the phenotype data (row names are the CEL file names)
pData(affy.data) = read.table("phenodata.txt", header=TRUE, row.names=1, as.is=TRUE)
# Inspect the phenotype data
pData(affy.data)
# Background-correct, normalize and summarize with RMA
eset = rma(affy.data)
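# Differential gene expression analysis (paired design)
pData(eset)
Subject <- factor(eset$Subject)
# Level labels must match the phenodata exactly (upper case), otherwise Tissue becomes all NA
Tissue <- factor(eset$Tissue, levels = c("NORMAL", "DISEASED"))
# Subject is the blocking factor; the TissueDISEASED column is the paired diseased-vs-normal effect
design <- model.matrix(~Subject+Tissue)
design
fit <- lmFit(eset, design)
eBayesfit <- eBayes(fit)
View(eBayesfit)
# Top genes for diseased vs normal; the coefficient name follows the factor level label
topTable(eBayesfit, coef="TissueDISEASED")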
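To convince myself that a `~Subject+Tissue` design really corresponds to a paired test, here is a small base R sketch for a single simulated gene. All the numbers are made up for illustration (the expression values, the subject effects and the seed are assumptions, not my data). It compares the classical paired t-test with the ordinary t-statistic for the tissue coefficient in a linear model that blocks on subject. limma's moderated t additionally shrinks the gene-wise variances across genes, so its statistic will generally differ from this ordinary one.

```r
set.seed(1)
n <- 5  # number of subjects, as in my dataset

# Simulated log-expression for one gene: a shared subject effect plus noise (made-up values)
subject_effect <- rnorm(n, sd = 2)
normal   <- 8 + subject_effect + rnorm(n, sd = 0.3)
diseased <- 9 + subject_effect + rnorm(n, sd = 0.3)

# Classical paired t-test on the within-subject differences
paired <- t.test(diseased, normal, paired = TRUE)

# Same data in long format, fitted with subject as a blocking factor
long <- data.frame(
  y       = c(normal, diseased),
  Subject = factor(rep(seq_len(n), times = 2)),
  Tissue  = factor(rep(c("NORMAL", "DISEASED"), each = n),
                   levels = c("NORMAL", "DISEASED"))
)
fit_lm <- lm(y ~ Subject + Tissue, data = long)
t_lm   <- coef(summary(fit_lm))["TissueDISEASED", "t value"]

# The two t-statistics should agree up to numerical tolerance
all.equal(unname(paired$statistic), t_lm)
```

Both fits have 4 residual degrees of freedom (5 pairs minus 1, or 10 observations minus 6 coefficients), which is why the ordinary statistics coincide.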
Thank you for the clarification.