Question

Using DESeq2 for T-cell receptor clonotypes

jirkanov • 0

@jirkanov-14624

Last seen 4.2 years ago

Hello,

I need to run a differential expression analysis of T-cell receptor (TCR) clonotypes. Put simply, TCRs are proteins of the immune system that can recognize a wide variety of other proteins. Because of this flexible recognition ability, the gene encoding the TCR has variable regions that differ between T-cells; T-cells that differ in these regions are called clonotypes.

I have RNA-Seq data from a 300 bp paired-end MiSeq run: four different sample types with three biological replicates each, 12 samples in total. Every read contains the variable region of the TCR and a UMI barcode at the 5' end. After running a standard pipeline for TCR analysis (MIGEC), I got a count matrix where columns are samples and rows are clonotypes. In my opinion, this is very similar to a normal RNA-Seq count matrix where rows are genes.

Unfortunately, the sequencing didn't go very well, so there are large differences in depth between samples. In addition, a few dominant clonotypes are highly abundant across all samples, while other clonotypes are very rare, with zero counts in almost all samples. Overall, I have 615 clonotypes. To get rid of those "zero" clonotypes, I applied standard rowSums thresholding: dds[rowSums(counts(dds)) >= 10, ]

but only 60 clonotypes were left! With a threshold of 5, 134 clonotypes were left.
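For readers unfamiliar with this filter, here is a minimal sketch of what rowSums thresholding does, run on a small made-up matrix rather than the real data. The matrix `cts` and its values are purely illustrative assumptions standing in for counts(dds); only base R is used.

```r
# Toy count matrix standing in for counts(dds): 6 clonotypes x 4 samples.
# All values are invented for illustration.
cts <- matrix(
  c(120, 95, 130, 110,   # dominant clonotype, row sum 455
      0,  0,   1,   0,   # nearly absent, row sum 1
      3,  0,   4,   2,   # row sum 9
      0,  6,   0,   5,   # row sum 11
      0,  0,   0,   0,   # never observed, row sum 0
      2,  1,   1,   1),  # row sum 5
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("clonotype", 1:6), paste0("sample", 1:4))
)

# Keep clonotypes whose total count across all samples reaches the threshold.
keep10 <- rowSums(cts) >= 10
keep5  <- rowSums(cts) >= 5

# drop = FALSE keeps the result a matrix even if a single row survives.
filtered10 <- cts[keep10, , drop = FALSE]
filtered5  <- cts[keep5, , drop = FALSE]

nrow(filtered10)
nrow(filtered5)
```

Lowering the threshold retains more of the sparse rows, which is the same pattern as going from 60 to 134 clonotypes above.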

My question is whether this type of data is suitable for analysis with DESeq2.

To see my existing results, you can download RMarkdown HTML report: https://owncloud.cesnet.cz/index.php/s/UtWukFacNR6kD3Y

Thank you in advance for any help!

deseq2 immunology tcr clonotypes • 1.8k views

ADD COMMENT • link updated 6.7 years ago by Michael Love 43k • written 6.7 years ago by jirkanov • 0

score 0 · Answer 1 · 2018-08-16

Michael Love 43k

@mikelove

Last seen 2 days ago

United States

I think the dispersion shrinkage can be useful even with, e.g., 100 rows. The depth differences alone are not a problem unless they are strongly confounded with condition. Can you plot sizeFactors(dds) over dds$condition? With so few genes you may want to use fitType="mean".
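The two suggestions above could be sketched roughly as follows. This is only an illustration on simulated counts, not the poster's data: the matrix `counts`, the sample names, the condition labels A to D, and the helper data frame `sf` are all assumptions made up for the example.

```r
library(DESeq2)

set.seed(1)
n_clonotypes <- 100
n_samples <- 12

# Simulated negative binomial counts with a different mean per clonotype.
# The matrix is filled column by column, so the 100 means recycle per sample.
mu <- 2^runif(n_clonotypes, min = 2, max = 10)
counts <- matrix(
  rnbinom(n_clonotypes * n_samples, mu = mu, size = 2),
  nrow = n_clonotypes,
  dimnames = list(paste0("clonotype", seq_len(n_clonotypes)),
                  paste0("s", seq_len(n_samples)))
)

# Four hypothetical conditions with three replicates each.
coldata <- data.frame(
  condition = factor(rep(c("A", "B", "C", "D"), each = 3)),
  row.names = colnames(counts)
)

dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = coldata,
                              design = ~ condition)

# Size factors per sample, plotted by condition to check for confounding
# between sequencing depth and condition.
dds <- estimateSizeFactors(dds)
sf <- data.frame(size_factor = sizeFactors(dds), condition = dds$condition)
stripchart(size_factor ~ condition, data = sf, vertical = TRUE,
           method = "jitter", pch = 19, ylab = "size factor")

# With few rows, use the mean of the gene-wise dispersion estimates
# as the fitted value instead of the parametric trend.
dds <- DESeq(dds, fitType = "mean")
res <- results(dds)
```

If the size factors of one condition sit clearly apart from the others, depth and condition are confounded and the results deserve extra caution. By default, results(dds) here compares the last condition level against the first.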

ADD COMMENT • link 6.7 years ago Michael Love 43k

Thanks for the quick answer. Here it is: [plot of sizeFactors(dds) over dds$condition; image not preserved]

ADD REPLY • link 6.7 years ago jirkanov • 0

That looks fine. The overall range really isn’t so bad.

ADD REPLY • link 6.7 years ago Michael Love 43k

OK. And should I use fitType="mean"?

ADD REPLY • link 6.7 years ago jirkanov • 0

That's what I recommended above.

ADD REPLY • link 6.7 years ago Michael Love 43k

Sure, I was just a little confused by the wording "you may want" :-) Many thanks, Michael!

And by the way, thumbs up for your talks at CSAMA2018, they were fantastic :-) Unfortunately, I didn't have these data at the time, so I couldn't ask you in person...

ADD REPLY • link 6.7 years ago jirkanov • 0