Question

Best way to do RNA-seq analysis for a small number of samples obtained from 2 different batches

theemperor79 • 0

Hi all

I have a set of 24 samples from 8 different experimental mouse groups. Each group has 3 samples (2 from batch 1 and 1 from batch 2).

I am unsure whether I should use ComBat_seq to batch-correct the counts before running DESeq2, or whether I should (as is often advised) include the batch variable in the model used for the DE analysis.

One problem I am facing with ComBat_seq is that several genes have negative count values after the batch correction, so DESeq2 errors out and won't create the dds object.

However, if I run DESeq2 with the batch variable included in the model, the clusters are still separated by batch when I go on to generate heatmaps etc., and I can't seem to find any way to "combine" the batches.

Any suggestions will be greatly appreciated.

RNASeq • 6.4k views

updated 3.0 years ago by swbarnes2 ★ 1.4k • written 3.0 years ago by theemperor79 • 0

Can you show some plots and the colData? You should not have negative counts after ComBat-seq. Can you show the code? (Plots and code should always be posted rather than just text.)

3.0 years ago by ATpoint ★ 4.8k

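Including batch in the model isn't going to affect PCA much, because the PCA is computed from the transformed counts rather than from the fitted model. The batch term will just modify the fold changes and p-values. If you have a strong batch effect, there is no magic that will remove it completely.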

3.0 years ago by swbarnes2 ★ 1.4k

score 0 · Answer 1 · 2022-04-19

Indeed, for statistical inference, go for including the batch variable in your model.
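A minimal sketch of what that could look like, assuming the column names `group` and `batch` (both hypothetical) and the sample layout from the question. The counts are simulated with `makeExampleDESeqDataSet()` purely as a stand-in for the real count matrix, and the `G2` vs `G1` contrast is only an example:

```r
library(DESeq2)

set.seed(1)

# Simulated counts stand in for the real gene-by-sample count matrix.
counts <- counts(makeExampleDESeqDataSet(n = 1000, m = 24))

# Hypothetical sample table matching the question: 8 groups x 3 samples,
# 2 samples from batch 1 and 1 sample from batch 2 in every group.
coldata <- data.frame(
  group = factor(rep(paste0("G", 1:8), each = 3)),
  batch = factor(rep(c("b1", "b1", "b2"), times = 8)),
  row.names = colnames(counts)
)

# Raw (uncorrected) counts go in; batch is a term in the design,
# with the variable of interest placed last.
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData   = coldata,
                              design    = ~ batch + group)
dds <- DESeq(dds)

# Example group comparison, adjusted for batch.
res <- results(dds, contrast = c("group", "G2", "G1"))
head(res[order(res$padj), ])
```

Because every group here has samples in both batches, the batch term can be estimated alongside the group terms.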

For visualization (e.g. heatmaps), transformed count data are the standard input. Count data transformed with the variance-stabilizing transformation (VST) are well suited to all kinds of downstream analyses, including heatmap generation. The batch effect can be removed from the VST-transformed data using the function removeBatchEffect() from limma. See the DESeq2 FAQ for more info on this: http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#why-after-vst-are-there-still-batches-in-the-pca-plot
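The visualization step could be sketched roughly as follows, along the lines of the FAQ entry linked above. The column names `group` and `batch` are assumptions, and the data are simulated with `makeExampleDESeqDataSet()` only so that the snippet runs on its own:

```r
library(DESeq2)
library(limma)

set.seed(1)

# Simulated stand-in for the real dataset; 'group' and 'batch' are
# hypothetical column names following the layout in the question.
dds <- makeExampleDESeqDataSet(n = 5000, m = 24)
dds$group <- factor(rep(paste0("G", 1:8), each = 3))
dds$batch <- factor(rep(c("b1", "b1", "b2"), times = 8))
design(dds) <- ~ batch + group

# Variance-stabilizing transformation, using the design (blind = FALSE).
vsd <- vst(dds, blind = FALSE)

# Remove the batch effect from the transformed values while protecting
# the group differences through the design matrix.
mat <- assay(vsd)
mm  <- model.matrix(~ group, as.data.frame(colData(vsd)))
mat_corrected <- limma::removeBatchEffect(mat, batch = vsd$batch, design = mm)

# Put the corrected values back for plotting (PCA, heatmaps).
assay(vsd) <- mat_corrected
plotPCA(vsd, intgroup = c("group", "batch"))
```

The corrected matrix is meant for plots only; the differential expression test itself still runs on the raw counts with batch in the design.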