Combat and dds Object
1
1
Entering edit mode
Bine ▴ 50
@bine-23912
Last seen 8 months ago
UK

Dear all,

I want to correct my dataset for batch instead of including batch in the formula. The reason is that it seems that my data is not coming for certain sub-analysis from all batches and therefore I get a "Full Rank"-Error. The Bioinformatics Core therefore suggested to run Combat (not Limma) to correct for the batch.

Now I am struggling a bit how to do it.. I have my data - colData and cts which I want to run after with

dds <- DESeqDataSetFromMatrix(countData = cts,
                               colData = colData,
                               design = ~SEX + DIAGNOSIS

How do I run combat with my colData and cts? I need to build a matrix?! But I need colData and cts seperate after to build the dds object?

I am a bit confused right now...

Thank you for any hint!

Bine

DESeq2 combat • 3.3k views
ADD COMMENT
1
Entering edit mode
Kevin Blighe ★ 4.0k
@kevin
Last seen 7 days ago
Republic of Ireland

Hi Bine,

If you wish to correct for batch in this way, then I would do this using ComBat-seq on the raw counts, prior to using any DESeq2 function. Then, when you use DESeq2, technically, you can ignore batch (although I would still check for batch effects via PCA and regression).

ComBat-seq information is here: https://github.com/zhangyuqing/ComBat-seq

Kevin

ADD COMMENT
0
Entering edit mode

Hi Kevin,

Thank you. I have been on that page and I find it very difficult to understand. The remove batch effect function of Limma was three lines of code (unfortunately I was told to use combat not Limma if I do Deseq2 after)...

mat <- assay(vsd)
mat <- limma::removeBatchEffect(mat, vsd$batch)
assay(vsd) <- mat

I have my raw counts (already quality control done and removed samples) in cts and my metadata in colData (colData$BATCH having my three batches).. any chance you can direct me to the part of the code relevant for me?

ADD REPLY
1
Entering edit mode

Can you show the colData? If DESeq2 refuses it due to linear combinations with other covariates then odds are good then any other regression-based approach will do the same.

ADD REPLY
1
Entering edit mode

Hi Bine, you have 2 choices in relation to batch adjustment:

Choice 1

  1. ComBat-seq on raw counts
  2. DESeq2 with ~ condition
  3. rld / vst

Choice 2

  1. DESeq2 with ~ condition + batch
  2. rld / vst with limma::removeBatchEffect()
ADD REPLY
0
Entering edit mode

Hi Kevin,

If I choose for the option 1, then DESeqDataSetFromMatrix from DESeq2 complains that the count matrix have negative values. This is because ComBat expects normalized data (like log2 transformed I guess) and therefore its output has negative values. Should I then apply a log2 of the data before running ComBat and then DESeqDataSetFromMatrix and then vst? I am concerned because I am analyzing ATAC-seq data, so my input are raw peak counts, and I am worried that by doing some of this I would be losing valuable information.

Thank you,

Sergio

ADD REPLY
1
Entering edit mode

Combat-Seq is not the same as the Combat function. See the sva package for the Combat-Seq function that works on raw counts and returns corrected raw counts.

ADD REPLY
0
Entering edit mode

I used the combat-seq but the deseq complain as its not integer what should i do

ADD REPLY
0
Entering edit mode

I cannot reproduce this, please provide a minimal reproducible example.

This here works just fine:

library(DESeq2)
library(sva)


dds <- makeExampleDESeqDataSet(m=6)
dds$batch <- factor(c(1,2,1,2,1,2))

corrected <- ComBat_seq(counts=counts(dds), batch = dds$batch, group = dds$condition)

DESeqDataSetFromMatrix(countData = corrected, colData = colData(dds), design = ~ condition)
ADD REPLY

Login before adding your answer.

Traffic: 583 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6