Using DESeq2 on CEL-Seq data
@solgakarbitechnionacil-6453

Hello all,

I am working with CEL-Seq data. CEL-Seq is a protocol designed for small starting amounts of RNA.

Most of the CEL-Seq pipeline is similar to a standard RNA-Seq pipeline. The big difference shows up after the counting step, which uses HTSeq-count or a modified version of it that collapses reads originating from a single transcript using Unique Molecular Identifiers (UMIs): the number of reads counted to features is much lower.

After collapsing, we may end up with around 100,000 reads per sample or even fewer, which is far below the number usually counted in regular RNA-Seq data.

Can this kind of data, with such a low number of reads, be used for differential gene expression testing with DESeq2? If so, are there any modifications to the normalization method, or to other steps in the DESeq2 workflow, that I should consider?

Thank you very much,

Olga Karinky.

@mikelove

Yes, the inference automatically adjusts when counts are low. I've looked at UMI counts, which tend to have lower dispersion than RNA-Seq counts, and I think the Negative Binomial GLM can be used on them.
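
As a rough illustration of why low counts need no special handling: in a Negative Binomial model the variance is the Poisson sampling variance plus an overdispersion term, var = mu + alpha * mu^2. The sketch below is plain Python, not DESeq2 code, and the means and the dispersion value of 0.05 are made up for illustration. It shows that at low means almost all of the variance is sampling noise, which the model already accounts for.

```python
def nb_variance(mean, dispersion):
    """Negative Binomial variance: Poisson term plus overdispersion term."""
    return mean + dispersion * mean ** 2


def poisson_share(mean, dispersion):
    """Fraction of the total variance that is plain sampling (Poisson) noise."""
    return mean / nb_variance(mean, dispersion)


dispersion = 0.05  # hypothetical value, chosen only for illustration
for mean in (2, 20, 200, 2000):
    share = poisson_share(mean, dispersion)
    print(f"mean={mean:5d}  variance={nb_variance(mean, dispersion):10.1f}  "
          f"Poisson share={share:.3f}")
```

With lower dispersion, as suggested above for UMI counts, the Poisson share is larger still at any given mean.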


Dear Michael,

I am trying to use DESeq2 on UMI count data. In my case, the counts are extremely low. Because of the design of the experiment, I end up with around 30,000 total counts per sample (coming from around 2 million reads). If I use UMI counts instead of reads as input, this causes problems because the counts per DNA fragment of interest are so low (range 1-100).

My idea was to divide each UMI count by the total number of counts and multiply this by the total number of reads for the sample before inputting to DESeq2. Do you think this is a valid approach?


The counts being low is not a problem. If the differences across conditions rise above the expected sampling variance and the estimated extra variance (overdispersion), then you will have the sensitivity to detect the changes.
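
To make the role of sampling variance concrete, here is a small sketch that assumes pure Poisson noise and reuses the figures from the question (30,000 UMI counts from about 2 million reads). It is an illustration under that assumption, not something taken from DESeq2. Multiplying a UMI count by a constant does not reduce its relative noise, but a count model fed the inflated number would assume far less noise than is really there.

```python
import math


def poisson_cv(mean):
    """Coefficient of variation (sd / mean) of a Poisson count."""
    return 1.0 / math.sqrt(mean)


umi_count = 10                      # a low UMI count, as in the 1-100 range
scale = 2_000_000 / 30_000          # reads per UMI count, from the question
scaled_value = umi_count * scale    # the proposed rescaled input

true_cv = poisson_cv(umi_count)        # noise actually present in the data
assumed_cv = poisson_cv(scaled_value)  # noise a count model would assume
understatement = true_cv / assumed_cv  # equals sqrt(scale)

print(f"true CV      : {true_cv:.3f}")
print(f"assumed CV   : {assumed_cv:.3f}")
print(f"understated by a factor of {understatement:.2f}")
```

Under this assumption, the unscaled UMI counts are the values that carry the correct information about sampling variance.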


Is it possible to combine raw counts from CEL-Seq and raw counts from a "regular RNA-Seq" experiment in DESeq2? I guess the CEL-Seq counts will be scaled up a lot?

When you are looking for differences in a ratio across conditions, scaling doesn't matter; see my other comment in this thread.
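
A minimal sketch of why scaling drops out, using a simplified re-implementation of the median-of-ratios size factor idea in plain Python (not DESeq2's own code; the count matrix is made up). Multiplying some samples by a constant changes their size factors, but the between-sample ratios of normalized counts stay the same. Note that this only covers a uniform scaling of depth, not any other differences between protocols.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors; counts is a list of genes x samples."""
    n_samples = len(counts[0])
    # Use only genes without zeros so the log geometric mean is defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - lg
                      for row, lg in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts):
    """Divide each sample's counts by that sample's size factor."""
    sf = size_factors(counts)
    return [[c / s for c, s in zip(row, sf)] for row in counts]


# Hypothetical counts: 4 genes x 4 samples.
counts = [
    [10, 12, 20, 22],
    [5, 4, 6, 5],
    [30, 33, 31, 29],
    [8, 9, 2, 3],
]
# Pretend samples 3 and 4 come from a library that is 50 times deeper.
scaled = [[row[0], row[1], row[2] * 50, row[3] * 50] for row in counts]

norm_original = normalize(counts)
norm_scaled = normalize(scaled)

for gene_orig, gene_scaled in zip(norm_original, norm_scaled):
    ratio_orig = gene_orig[2] / gene_orig[0]
    ratio_scaled = gene_scaled[2] / gene_scaled[0]
    print(f"{ratio_orig:.4f}  {ratio_scaled:.4f}")
```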
