Interaction terms in DESeq2
1
0
Entering edit mode
Mimi • 0
@6b9a44fb
Last seen 12 months ago
United Kingdom

Hi, I am hoping this isn't a stupid question as I am really lost here. I have extensively read the manual and other forum posts but am struggling to find a solution.

I am using DESeq2 to analyse my data set but running into problems with an interaction term in the model design. To break down the experiment I have two species A and B, which are orthologed to compare directly between them, and then I have two treatments (trt) termed X and Y. My model incorporates the species and treatment to look at the effect of treatment in a species dependent manner.

To break it down; I have 4 possible interactions. -speciesa.trtX -speciesa.trtY -speciesb.trtX -speciesb.trtY

My model allows me to look at DE genes for speciesA.trtX, so i can get these genes out, and when I make a heatmap, these genes look up/down regulated in this column compared to the other 3 as expected. However, I now want to change the interaction term to look at speciesB.trtX. I have tried changing the reference level to be species B, and/or treatment reference to be X, but it doesnt change the output of the genes when i re-run after re-level. For all the other model terms i can successfully change the reference level and the gene number for up and down and p values etc change

These for example are the model terms fully. Intercept, replicate_1_vs_2, species_a_vs_b, trt_X_vs_Y, speciesa.trtX.

I have also tried changing the contrast as per the manual, but unlike the other model terms where both options are available the interaction term seems to not be able to be changed via a contrast. Am I missing something really obvious? Any help would really be appreciated.


    > cleaned_matrix <- counts[complete.cases(counts), ]
    > counts<- cleaned_matrix

    > cts <- as.matrix(counts)
    > table(rownames(pheno) == colnames(counts))

TRUE 
  16                     

    > dds <- DESeqDataSetFromMatrix(countData = cts,
    +                               colData = pheno,
    +                               design = ~ replicate + species + trt + species:trt) 

Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
  some variables in design formula are characters, converting to factors

> #try to set reference level to b not a for species 

    > dds$species <- relevel(dds$species, ref = "b")

> # try another method to set reference level 

    > dds$species <- factor(dds$species, levels = c("b","a"))

    > dds <- DESeq(dds)

estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing

    > res <- results(dds)
    > res

log2 fold change (MLE): speciesa.trtX 
Wald test p-value: speciesa.trtX 
DataFrame with 7463 rows and 6 columns
             baseMean log2FoldChange     lfcSE       stat      pvalue      padj
            <numeric>      <numeric> <numeric>  <numeric>   <numeric> <numeric>
AGAP000002    934.719    -0.00237484  0.181715 -0.0130691 0.989572696 0.9973618
AGAP000007    303.459    -1.02785205  0.264296 -3.8890181 0.000100651 0.0234986
AGAP000009   1501.501    -0.17567466  0.208026 -0.8444850 0.398398467 0.8038127
AGAP000010    296.587     0.03751027  0.311100  0.1205729 0.904029348 0.9796408
AGAP000011    916.324    -1.18476455  0.605364 -1.9571107 0.050334453 0.4712653
...               ...            ...       ...        ...         ...       ...
AGAP013538    388.339     -0.2402605  0.248789  -0.965720    0.334184  0.764737
AGAP013542    149.048      0.0895397  0.428572   0.208926    0.834506  0.968139
AGAP013543 101924.825      0.5902212  0.689588   0.855904    0.392051  0.801294
AGAP013544   5463.594      0.2554445  0.367638   0.694825    0.487165  0.847763
AGAP013545     18.887     -0.3016756  1.693431  -0.178145    0.858609  0.971362

> #what comparison is it looking at?

    > mcols(res)$description

[1] "mean of normalized counts for all samples" "log2 fold change (MLE): speciesa.trtX"    
[3] "standard error: speciesa.trtX"             "Wald statistic: speciesa.trtX"            
[5] "Wald test p-value: speciesa.trtX"          "BH adjusted p-values"      

sessionInfo( )
R version 4.2.3 (2023-03-15)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS 14.1.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pheatmap_1.0.12             EnhancedVolcano_1.16.0      ggrepel_0.9.4              
 [4] ggplot2_3.4.4               RColorBrewer_1.1-3          DEGreport_1.34.0           
 [7] tibble_3.2.1                dplyr_1.1.3                 DESeq2_1.38.3              
[10] SummarizedExperiment_1.28.0 Biobase_2.58.0              MatrixGenerics_1.10.0      
[13] matrixStats_1.0.0           GenomicRanges_1.50.2        GenomeInfoDb_1.34.9        
[16] IRanges_2.32.0              S4Vectors_0.36.2            BiocGenerics_0.44.0
RNASeqData RnaSeqSampleSize DESeq2 • 1.0k views
ADD COMMENT
0
Entering edit mode
@james-w-macdonald-5106
Last seen 8 hours ago
United States

You shouldn't use replicate as a coefficient in your model. That will do one of two things that you don't want. If the replicate values are numeric, then DESeq2 will fit a linear regression using those numbers (so e.g., replicate 4 will be assumed to have four times as much replicateness as replicate 1, which isn't a thing). If the replicate values are character or factor, DESeq2 will treat them as groups, and any replicates with the same value will be grouped together. So if you have replicates A-E for each species/treatment, then all the A replicates will be grouped, and all the B replicates, etc. Again, not what you want.

You are also misinterpreting the interaction coefficient. If you remove replicate from your model, then the interaction is testing for a difference in treatment response that varies by species. Algebraically that is this:

(species_a_trtX - species_a_trtY) - (species_b_trtX - species_b_trtY)

Which is what I think you want.

ADD COMMENT

Login before adding your answer.

Traffic: 725 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6