Hello,
I'm running DESeq with the design=~Cell_Treat + MGMT. I'm working with 6 samples from two different patients, so 12 samples in total. There are two factors, Cell line (HF2303 or HF 2927) and Treatment (untrt or trt). I'm trying to include a continuous covariate into the design (MGMT) to control for the effect of the factors based off of MGMT presence in the cell.
> colData(se) DataFrame with 12 rows and 5 columns sample Cell Treat MGMT Cell_Treat <factor> <factor> <integer> <factor> <factor> 1 accepted_hits.CSC.Contc.HF2303 A 1 pos A-1 2 accepted_hits.CSC.TMZa.HF2303 A 2 pos A-2 3 accepted_hits.SDC.Contb.HF2303 A 1 pos A-1 4 accepted_hits.SDC.Contc.HF2303 A 1 pos A-1 5 accepted_hits.SDC.TMZb.HF2303 A 2 pos A-2 ... ... ... ... ... ... 8 accepted_hits.CSC.TMZb.HF2927 B 2 neg B-2 9 accepted_hits.CSC.TMZc.HF2927 B 2 neg B-2 10 accepted_hits.SDC.Contc.HF2927 B 1 neg B-1 11 accepted_hits.SDC.TMZa.HF2927 B 2 neg B-2 12 accepted_hits.SDC.TMZc.HF2927 B 2 neg B-2
When I try to set up the DESeqDataSet with design=~Cell_Treat + MGMT it gives me an error that the model matrix isn't in full rank. I thought this might be due to that MGMT factors correlate the same as the Cell factor, but I'm not sure if I should get rid of the column because id like to base the results off of whether or not the MGMT variable is present of not and the cell column would be needed to separate HF2303 from HF2927?
Am I doing something wrong in the way the design is set up or columns?
This is the full error message:
> dds<-DESeqDataSet(se, design=~Cell_Treat + MGMT) Error in checkFullRank(modelMatrix) : the model matrix is not full rank, so the model cannot be fit as specified. One or more variables or interaction terms in the design formula are linear combinations of the others and must be removed. See the section 'Model matrix not full rank' in vignette('DESeq2')
Oh okay thanks! I see what you're saying, I converted MGMT to a integer value as "1" and "0" instead of "pos" and "neg", but I still get the same error when I run DESeqDataSet.
When I ran DESeqDataSet it gave me an error that to convert my numeric integer to a factor along with the same matrix not full error.
Is there a mistake in how i have the design set up? I'm still trying to figure out how the order of how I list things in the design work and I'm worried the order I have the variables set up in might be causing the problem
This is the code I used to make the DESeqDataSet:
Changing MGMT to 0/1 doesn't make it continuous! It will still be treated by R as a factor, and will still be confounded with the cell type. If MGMT is only produced in one cell type, then saying 'this cell produces MGMT' and 'this cell is HF2303' tells you the same thing (e.g., if you know it's a HF2303 cell, then you also know it produces MGMT, and vice versa). Since you are not adding any new information, well, you aren't adding any new information.
You should also note that if these samples are all just from two different patients, there may be a patient-level factor that you should be accounting for.
Okay that's what I thought, but I wasn't sure if I had done something else wrong. I ended up removing the Cell column and now I'm left with 2 factors, MGMT and Treatment. My current design is ~Treat_MGMT with the two factors combined into a group and I am trying to contrast the untrt vs the trt for the MGMT positive and negative lines
Am I just going about the comparisons all wrong? I had previously gotten like 1400 differentially expressed genes and it doesn't support what I expected to happen at all. I thought about adding the cell types (SDC vs CSC) as another column, the cells being CSC or SDC doesn't have a direct comparison that i'm trying to make and I'm not sure whether or not to incorporate it without making a CSC contrast against a SDC
You are getting an error, which says
And in your post you also list the elements of resultsNames(object). If I parse that further, it reads
Does that not tell you what the problem is?
Yes, that makes sense. I've been reading through forum posts and the vignette on interaction terms and it's cleared up a few things for me. Currently I'm using:
From what I understand, this design should give me the effect of treated vs untreated based off MGMT presence. My goal is to find 1) the effect of MGMT presence on treatment vs untreated (does MGMT presence influence response to treatment) 2) what are the differentially expressed genes between treatments for MGMT neg 3) what are the differentially expressed genes between treatments for MGMT pos.
After running dds with the design "dds<-DESeqDataSet(se, design=~MGMT*Treat)" the resultsNames(dds) that i get are:
However, I'm not sure what to contrast to find the effect of MGMT presence on treatment vs untreated based off the resultNames. Is the design MGMT*Treat any different from using MGMT:Treat or MGMT + Treat?
Thanks! sorry for all the questions i'm just starting to pick up bioinformatics and r