Hi there, I am trying to analyse a data set to identify differentially expressed genes between plant development stages. Currently, I have generated the data for the wildtype. I have followed the manual for deseq using the htseqinput pipeline, and generated the dds dataset. The problem I am having is comparing the expression between the 8 developmentally stages I have. I do not have "treated" versus "untreated" as is commonly the case in the online forums I have read.
I believe I need to use the LTR test. However, my design has just 1 variable (stage) and so I am confused as to how to write a formula for my reduced model. I don't want to compare the DEG in a pairwise manor against my reference, but overall.
Can anyone give me some advice?
Thanks in advance, C
sampleFiles <- grep("readCount",list.files(directory),value=TRUE)
sampleStage <- c('12dpaWT', '12dpaWT', '2mmWT','2mmWT','2mmWT', '4DPAWT', '4DPAWT','4DPAWT', '4mmWT', '4mmWT', '4mmWT', '8dpaWT', '8dpaWT', '8dpaWT','8mmWT', '8mmWT','8mmWT','AntFleurWT','AntFleurWT','AntFleurWT','AntFruitWT', 'AntFruitWT', 'AntFruitWT', 'FRUITO')
sampleTable <- data.frame(sampleName = sampleFiles,
fileName = sampleFiles,
Stage = sampleStage)
sampleTable$Stage <- factor(sampleTable$Stage)
library("DESeq2")
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
directory = directory,
design= ~ Stage)
ddsHTSeq # Provides user with table containing information on the matrix
keep <- rowSums(counts(ddsHTSeq)) >= 10 # Remove rows which have less than 10 reads, rowSums calculates the sum of rows of a matrix or an array, counts is the number of occurences)
dds <- ddsHTSeq[keep,] # This removed ~10 000 elements, reducing size of matrix
dds$Stage <- relevel(dds$Stage, ref = "FRUITO")
reduced <- #Need to create a reduced formula ??
dds <- DESeq(dds, test="LRT", reduced= )
Hi Basti, thank you for that quick response. Yes I had read that, but maybe I have interrupted it wrong... as I have just 1 design variable, would I simply write my reduced formula as:
I was trying to create a new variable, but was unsure which design term to remove as i have just one, i.e. here:
Cheers!
I don't understand why would you want to create a new variable, if you want to test for multiple stages at once as you formerly described, just keep the DESeq() function like this