Hello
So I finally fixed some problems I was having with R in intial processing and now have questions of the workflow/pipeline for DE analysis for semitime series dataset in limma. My experiment is trying to find the DE genes between 4 different time points. The reason I say semi time series is because there are months in between instead of hours like bacterial time series so I am not exactly sure it qualifies but they are definitely linked because it is the same samples. I have 8 biological control replicates and 9 replicates for the other time points. I am basically completely lost. I know for a fact there are DE genes but my script so far shows none. Can any one help me or point me to good tutorials in this. Find my script below. Thank you so much in advanced. Note the code below is just me testing out the workflow with two time points.
source("http://bioconductor.org/biocLite.R") biocLite("oligo") biocLite("limma") biocLite("pd.hugene.2.1.st") library(genefilter) library(limma) library(oligo) library(pd.hugene.2.0.st) #directory for the data mydir <- "C:\\Users\\hakim\\Desktop\\Bioinformatics_Thesis_Concordia_Microarry_Data\\Control" #setting seed for reproducibility set.seed(1) #listing the files from directory using special CEL file read function celList <- list.celfiles(mydir, full.names=TRUE) #reading data from cellist and setting annotation package to approiate one for this microarray rawData <- read.celfiles(celList, pkgname='pd.hugene.2.0.st') #normalizing the data using RMA algorithm normData <- rma(rawData) #checking boxplot of raw data par(mar=c(10,4.5,2,1)) boxplot(rawData,las=3) #checking boxplot of normalized data boxplot(normData,las=3) #the annotation package biocLite("hugene20sttranscriptcluster.db") library(hugene20sttranscriptcluster.db) # Strategy is to create data frame objects and merge them together - put expression info into a data frame my_frame <- data.frame(exprs(normData)) # Put annotation information in a data frame. To get specific fields, use packageNameSYMBOL, where the caps part names the type of data you're after hugene20sttranscriptcluster() Annot <- data.frame(ACCNUM=sapply(contents(hugene20sttranscriptclusterACCNUM), paste, collapse=", "), SYMBOL=sapply(contents(hugene20sttranscriptclusterSYMBOL), paste, collapse=", "), DESC=sapply(contents(hugene20sttranscriptclusterGENENAME), paste, collapse=", ")) #retreaving feature data featureData(normData) <- getNetAffx(normData, "transcript") #addomg phenotypic data phn = normData@phenoData phn@data[1:14,2]= "1" phn@data[15:27,2]="0" colnames(phn@data)[2]="source" phn@data # Merge data frames together (like a database table join) also removing NAs all <- merge(Annot, my_frame, by.x=0, by.y=0, all=T) merge <- subset(all, SYMBOL!="NA") # Write out to a file: write.table(merge,file="thesis.ann.txt",sep="\t") #plm Pset = fitProbeLevelModel(rawData) op = par(mfrow = c(1,1)) for (i in 1:27){image(Pset,which=i,main=ph@data$sample[i])} design = cbind(control = 1, controlvstreatment = c((rep.int(0,13)),rep.int(1,13))) fit<- lmFit(normData, design) fit <-eBayes(fit) topTable(fit, coef="controlvstreatment", adjust="BH") results <- decideTests(fit) vennDiagram(results)