Hi all,
I have 12 RNA-seq samples representing 4 different cell lines, each done in triplicate, as below:
        fastq                          Condition  CellLine
    64  64-C232_3_S11_R1_001.fastq.gz  Control    C1
    73  73-C232_2_S10_R1_001.fastq.gz  Control    C1
    84  84-C232_1_S4_R1_001.fastq.gz   Control    C1
    65  65-C229_2_S3_R1_001.fastq.gz   Control    C2
    76  76-C229_3_S12_R1_001.fastq.gz  Control    C2
    81  81-C229_3_S5_R1_001.fastq.gz   Control    C2
    62  62-232_3_S1_R1_001.fastq.gz    Disease    D1
    63  63-232_3_S2_R1_001.fastq.gz    Disease    D1
    89  89-232_1_S7_R1_001.fastq.gz    Disease    D1
    71  71-348_4_S8_R1_001.fastq.gz    Disease    D2
    86  86-348_3_S6_R1_001.fastq.gz    Disease    D2
    88  88-348_4_S9_R1_001.fastq.gz    Disease    D2
So, C1, C2, D1, and D2 are all independent cell lines, and what I'd really like to know is what differs between Control and Disease. Of note, C1/D1 and C2/D2 are not paired cell lines (i.e. they are not healthy and disease lines derived from the same individual). Because each cell line belongs to exactly one condition, the straightforward design ~ CellLine + Condition gives the "model matrix is not full rank" error. I can, however, use a custom design matrix as suggested in the vignette, as follows:
    dds <- DESeqDataSetFromMatrix(countData=counts, colData=mapping, design= ~1)
    dds$indn <- factor(c(1,1,1,2,2,2,1,1,1,2,2,2))
    mm1 <- model.matrix(~ Condition + Condition:indn, colData(dds))
    > mm1
       (Intercept) ConditionDisease ConditionControl:indn2 ConditionDisease:indn2
    64           1                0                      0                      0
    73           1                0                      0                      0
    84           1                0                      0                      0
    65           1                0                      1                      0
    76           1                0                      1                      0
    81           1                0                      1                      0
    62           1                1                      0                      0
    63           1                1                      0                      0
    89           1                1                      0                      0
    71           1                1                      0                      1
    86           1                1                      0                      1
    88           1                1                      0                      1
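For anyone wanting to see the rank problem directly, here is a minimal base-R sketch (no DESeq2 needed) that compares the two designs. The `mapping` data frame below is rebuilt by hand from the sample table above, and the names `mm0`, `rank0`, and `rank1` are just ones chosen for this illustration.

```r
# Rebuild the sample annotation from the table in the post (an approximation
# of the real colData, containing only the columns the designs need).
mapping <- data.frame(
  Condition = factor(rep(c("Control", "Disease"), each = 6)),
  CellLine  = factor(rep(c("C1", "C2", "D1", "D2"), each = 3))
)
# Within-condition cell line index: 1 for C1/D1, 2 for C2/D2.
mapping$indn <- factor(rep(rep(1:2, each = 3), times = 2))

# Design that fails: ConditionDisease equals CellLineD1 + CellLineD2,
# so one column is a linear combination of the others.
mm0 <- model.matrix(~ CellLine + Condition, mapping)

# Nested design from the post: cell line index within each condition.
mm1 <- model.matrix(~ Condition + Condition:indn, mapping)

rank0 <- qr(mm0)$rank
rank1 <- qr(mm1)$rank

cat("~ CellLine + Condition:       ", ncol(mm0), "columns, rank", rank0, "\n")
cat("~ Condition + Condition:indn: ", ncol(mm1), "columns, rank", rank1, "\n")
```

A design is full rank when the rank equals the number of columns, which holds for the nested design but not for ~ CellLine + Condition.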
I then extract the results as follows:
    > resultsNames(dds)
    [1] "Intercept"              "ConditionDisease"       "ConditionControl.indn2"
    [4] "ConditionDisease.indn2"
    > res <- results(dds, contrast=c(0,1,0,0))
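One way to check what a numeric contrast actually compares is to translate it into weights on the four cell line groups. The sketch below uses base R only. `group_weights` is a hypothetical helper written for this post, not a DESeq2 function, and `X` is typed in by hand as one row of `mm1` per cell line. Since the coefficients are on the log2 scale, the weights apply to log2-scale group means.

```r
# One row of the design matrix per cell line (copied by hand from mm1).
X <- rbind(
  C1 = c(1, 0, 0, 0),
  C2 = c(1, 0, 1, 0),
  D1 = c(1, 1, 0, 0),
  D2 = c(1, 1, 0, 1)
)
colnames(X) <- c("Intercept", "ConditionDisease",
                 "ConditionControl.indn2", "ConditionDisease.indn2")

# Hypothetical helper: group means are mu = X %*% beta, so beta = solve(X) %*% mu
# and a contrast on beta equals (contrast %*% solve(X)) applied to mu.
group_weights <- function(contrast, X) {
  w <- drop(contrast %*% solve(X))
  names(w) <- rownames(X)
  round(w, 10)  # remove floating-point noise
}

# The contrast used in the post.
w_post <- group_weights(c(0, 1, 0, 0), X)
print(w_post)  # C1 = -1, C2 = 0, D1 = 1, D2 = 0

# A contrast that weights both cell lines of each condition equally.
w_avg <- group_weights(c(0, 1, -0.5, 0.5), X)
print(w_avg)   # C1 = -0.5, C2 = -0.5, D1 = 0.5, D2 = 0.5
```

Under this parameterization, c(0,1,0,0) compares D1 against C1 only, while c(0,1,-0.5,0.5) compares the average of D1 and D2 against the average of C1 and C2. This is only a reading of the model matrix, not a statement about which contrast is appropriate for the experiment.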
Is this the correct way to extract the Condition effect while controlling for the fact that I have triplicate samples from different cell lines? I've looked through the various posts here/Biostars/seqanswers but can't seem to find this exact situation. I suppose we should have used paired cell lines, right? Thanks in advance for any help!
Thanks for the answer, Michael. I suppose in the future I should ask this group to use paired cell lines if possible.