Hi there!
I would like to use DESeq2 on microbiome data to investigate gene abundance across different groups.
To keep it simple, here is some information about my samples. I have 2 main groups of patients: (1) patients with the disease and (2) controls, one matched to each disease sample. Each patient has 2 samples, one at each of 2 time points (TP).
I'm trying to run DESeq2 to test for differential expression of genes between the two time points within the disease group and within the control group, and then between the disease and control groups at the same time point. This picture should make the analysis strategy clearer: https://www.dropbox.com/s/ee7gsb21rx9ge81/Screenshot%202018-11-05%2016.21.34.png?dl=0
I split my data into 4 subsets: 1- Disease group: first and last TP, 2- Control group: first and last TP, 3- First TP: disease and control groups, and 4- Last TP: disease and control groups.
I have 2 columns in my metadata/factor table (colData). The first column, Group, gives the time point: First (Early) or Last (Late). The second column, Condition, is a patient number that pairs the 2 samples from one patient. For example, the Early-collection and Late-collection samples from one patient are given the same number.
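To illustrate the layout, here is a minimal sketch of what the metadata table looks like for the Disease subset. The sample names (S1 to S6) and the use of only 3 patients are made up for illustration and are not my real data; the Group level names are the ones that appear in my resultsNames(dds) output.

```r
# Hypothetical metadata for 3 patients in the Disease subset.
# Sample names and the number of patients are invented for illustration.
AllData <- data.frame(
  Group     = rep(c("Disease_Early", "Disease_Late"), times = 3),  # time point
  Condition = rep(1:3, each = 2),                                  # patient number
  row.names = paste0("S", 1:6)
)
AllData$Group <- factor(AllData$Group)
AllData$Condition <- factor(AllData$Condition)

# Each patient number should appear exactly once per time point
pairing <- table(AllData$Condition, AllData$Group)
print(pairing)
```

So every patient number appears in exactly two rows, one Early and one Late.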
This is the design I used for my model:
AllData$Group <- factor(AllData$Group)
AllData$Condition <- factor(AllData$Condition)
dds <- DESeqDataSetFromMatrix(countData = GenesCount, colData = AllData, design = ~ Group + Condition)
dds <- DESeq(dds)
design(dds) <- formula(~ Group + Condition)
dds <- DESeq(dds, betaPrior = FALSE)
However, when I run the first group of data (the Disease group, with both Early and Late collection samples), I get this for resultsNames(dds):
resultsNames(dds)
 [1] "Intercept" "Group_Disease_Late_vs_Disease_Early" "Condition_2_vs_1" "Condition_3_vs_1" "Condition_4_vs_1"
 [6] "Condition_5_vs_1" "Condition_6_vs_1" "Condition_7_vs_1" "Condition_8_vs_1" "Condition_9_vs_1"
[11] "Condition_10_vs_1" "Condition_11_vs_1" "Condition_12_vs_1" "Condition_13_vs_1" "Condition_14_vs_1"
[16] "Condition_15_vs_1"
I believe this analysis compares every pair of samples to the first pair (number 1), which looks strange to me.
I am wondering whether my model is correct for this experiment. Any advice on improving the analysis methodology?
Thank you!