Question

Robust way of dealing with low number of samples for Differential Gene Expression

Satoshi • 0

@762f5205

Last seen 10 hours ago

United States

Hello,

We have single-cell data from 12 breast cancer patients, with 3 biopsies from each patient (baseline, treatment one, treatment two), for 36 samples in total. Of the 12 patients, 4 are responders (R) and 8 are non-responders (NR). I have done cell typing and sub-typing for all cells in my dataset. I want to test for differential expression between responders and non-responders for each cell type and sub-type at each time point (baseline, treatment one, treatment two). I also want to test baseline vs treatment one, baseline vs treatment two, and treatment one vs treatment two for each cell type and sub-type within each response category (i.e., R and NR).

Based on https://www.nature.com/articles/s41467-021-25960-2, I am performing pseudo-bulk DE analysis using DESeq2/edgeR and was wondering how robust that would be. As I understand it, there are two other ways to do this: 1) run DESeq2/edgeR/MAST on single cells instead of pseudo-bulk, and 2) perform a rank-sum test on a single-cell basis and estimate the error per sample. I wasn't able to find the thread, but I remember reading a discussion about this from one of Michael Love's publications.

Thank you for your time and suggestions in advance.

limma edgeR DESeq2 MAST • 693 views

ADD COMMENT • link updated 2 hours ago by Kang • 0 • written 12 days ago by Satoshi • 0

score 3 · Accepted Answer · 2024-10-27

Michael Love 42k

@mikelove

Last seen 1 day ago

United States

I find pseudo-bulking to be a robust approach to DE, provided that cell types are identified reliably across samples and that technical variation is controlled appropriately, e.g. with methods like RUVSeq.
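For readers unfamiliar with the term, pseudo-bulking usually means summing raw counts over all cells that share a sample and a cell type, so that each sample contributes one count profile per cell type. The following is a minimal sketch of that aggregation step only, written in plain Python for illustration. The function name `pseudobulk`, the sample IDs, and the toy counts are all invented here and do not come from the thread; in practice this step would more likely be done in R before handing the counts to DESeq2 or edgeR.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum raw counts over all cells sharing a (sample, cell_type) pair.

    `cells` is an iterable of (sample_id, cell_type, {gene: count}) tuples.
    Returns {(sample_id, cell_type): {gene: summed_count}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample_id, cell_type, counts in cells:
        for gene, n in counts.items():
            totals[(sample_id, cell_type)][gene] += n
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(genes) for key, genes in totals.items()}


# Toy data, made up for illustration: four cells from two samples.
cells = [
    ("P01_Baseline", "T cell", {"CD3E": 4, "GAPDH": 10}),
    ("P01_Baseline", "T cell", {"CD3E": 2, "GAPDH": 7}),
    ("P01_Baseline", "B cell", {"MS4A1": 5, "GAPDH": 3}),
    ("P02_Baseline", "T cell", {"CD3E": 1, "GAPDH": 9}),
]

pb = pseudobulk(cells)
print(pb[("P01_Baseline", "T cell")])  # {'CD3E': 6, 'GAPDH': 17}
```

Each key of `pb` then corresponds to one pseudo-bulk "sample" for one cell type, which is the unit the downstream DE test treats as a replicate.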

"3 biopsies from each patient... out of 12 patients, 4 are responders (R) and 8 are non-responders (NR)"

With such a design, it may be better approached with mixed effects models, using e.g. duplicateCorrelation with limma-voom.
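One reason a random patient effect may suit this design: each patient is either R or NR, so patient is nested within response, and a fixed patient term would be confounded with the R vs NR comparison. The sketch below does not fit any model. It only lays out, in plain Python, how the 36 biopsies and the comparisons from the question could be organized as a combined response-by-time-point group with patient as the blocking variable. The patient IDs, group labels, and contrast names are invented for illustration; the contrast strings are written in the style one might pass to limma's makeContrasts, which is an assumption about how the analysis would be set up.

```python
from itertools import combinations

timepoints = ["Baseline", "Treatment1", "Treatment2"]

# 4 responders and 8 non-responders, as in the question; IDs are hypothetical.
response = {f"P{i:02d}": ("R" if i <= 4 else "NR") for i in range(1, 13)}

# One row per biopsy: `patient` is the blocking variable,
# `group` combines response and time point.
samples = [
    {"patient": patient, "group": f"{resp}.{tp}"}
    for patient, resp in response.items()
    for tp in timepoints
]

contrasts = {}

# R vs NR within each time point (a between-patient comparison).
for tp in timepoints:
    contrasts[f"RvsNR_{tp}"] = f"R.{tp} - NR.{tp}"

# Pairwise time-point changes within each response category
# (a within-patient comparison).
for resp in ("R", "NR"):
    for early, late in combinations(timepoints, 2):
        contrasts[f"{resp}_{late}_vs_{early}"] = f"{resp}.{late} - {resp}.{early}"

for name, expr in contrasts.items():
    print(name, "=", expr)
```

This gives 6 groups and 9 contrasts per cell type or sub-type. With so few responders, some cell types may have too few cells in some biopsies to form a usable pseudo-bulk sample, so the number of testable contrasts may be smaller in practice.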

ADD COMMENT • link 11 days ago Michael Love 42k

Alright, thanks, that answers my question.

ADD REPLY • link 11 days ago Satoshi • 0

I believe that pseudo-bulking is a strong way to approach DE, as long as it is possible to reliably identify the cell type across data and as long as technical variation is taken into account using tools such as RUVSeq.

ADD REPLY • link 2 hours ago Kang • 0