Running DESeq2 on viral infection without control/host data
f99942 • 0
@13d60551
Austria

Hi,

I have a raw count data set from a viral infection, produced with featureCounts. It consists of 3 time points and a virus-free control, with 3 replicates each.

Normalization after running template_script_DESeq2.r: [image not shown]

It does not look like the distributions across samples are stabilized. Apparently there were some problems with the controls, especially replicate A. Since I am working only on the viral side of the replication cycle, I changed condRef in the template script from "control" to one of the other groups (a time point), thus dropping the controls. Normalization improved!

Is this the correct/intended way to do that?

Thank you!

swbarnes2 ★ 1.4k
@swbarnes2-14086
San Diego

I think you might have to drop your controls. Are those counts of viral genes alone, and is that why the controls have almost none? If so, I think you might have to include host genes in the normalization, if not in the whole analysis.


Yes, these counts are viral genes alone. There is no host genome available yet...


I agree that you may have to drop the controls.


So isn't it correct for the controls to have almost no reads? Why would you want to do anything to pretend that they have counts comparable to the infected samples? The premise of size normalization, that some genes with median expression are unchanging, looks wrong for the viral genes alone, even in the non-control samples. Did your prep collect any host RNA? If so, I'm not sure it's right to proceed by ignoring it entirely. Including it would probably make normalization work.
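To make the point about the normalization premise concrete, here is a small Python sketch of median-of-ratios size factors. It is an illustration only, not DESeq2's own code. All counts are made up, `size_factors` and `normalize` are hypothetical helpers, and the host counts are purely hypothetical since no host genome is available in this thread. When every viral gene rises together, size factors estimated from the viral genes alone absorb that rise, whereas size factors estimated from unchanging host genes preserve it.

```python
import math
from statistics import median

def size_factors(rows):
    """Median-of-ratios size factors for a list of per-gene count rows."""
    # Genes with a zero in any sample have no finite log geometric mean; skip them.
    rows = [r for r in rows if all(x > 0 for x in r)]
    n = len(rows[0])
    log_gm = [sum(math.log(x) for x in r) / n for r in rows]
    factors = []
    for j in range(n):
        log_ratios = [math.log(r[j]) - m for r, m in zip(rows, log_gm)]
        factors.append(math.exp(median(log_ratios)))
    return factors

def normalize(rows, factors):
    """Divide each sample's counts by that sample's size factor."""
    return [[x / f for x, f in zip(r, factors)] for r in rows]

# Toy data, columns = [early, late]. Purely illustrative numbers.
host = [[100, 100], [250, 250], [40, 40]]   # hypothetical host genes, unchanged
viral = [[10, 100], [20, 200], [5, 50]]     # every viral gene rises 10-fold

# Size factors from viral genes alone: the global rise is treated as "depth".
norm_viral_only = normalize(viral, size_factors(viral))
fold_viral_only = norm_viral_only[0][1] / norm_viral_only[0][0]

# Size factors from host genes: the viral rise survives normalization.
norm_host_based = normalize(viral, size_factors(host))
fold_host_based = norm_host_based[0][1] / norm_host_based[0][0]

print("fold change, viral-only factors:", round(fold_viral_only, 3))
print("fold change, host-based factors:", round(fold_host_based, 3))
```

In DESeq2 itself, `estimateSizeFactors` has a `controlGenes` argument for restricting the estimate to a chosen set of genes. Whether that helps here depends on having host counts at all.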


Yes, having almost no viral reads in the virus-free control is what one would expect. Since including the controls disturbs the normalization, and the biological question (for now) is the expression dynamics of the viral side, it should be fine to drop them. The results I have produced this way make more sense in the context of the promoter and proteomics data...
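As a rough illustration of why the controls can disturb the normalization, here is a minimal Python sketch of the median-of-ratios idea behind DESeq2's size factors. It is not DESeq2's own code. The gene names, sample labels, and counts are made up, and `size_factors` is a hypothetical helper. A gene with a zero count in any sample has a geometric mean of zero and cannot serve as a reference, so near-empty control columns can leave very few genes to estimate the size factors from.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors; counts maps gene -> list of per-sample counts.

    Returns the size factors and the sorted names of the genes actually used.
    """
    n_samples = len(next(iter(counts.values())))
    # Only genes that are nonzero in every sample have a finite log geometric mean.
    usable = {g: c for g, c in counts.items() if all(x > 0 for x in c)}
    if not usable:
        raise ValueError("every gene contains at least one zero")
    log_gm = {g: sum(math.log(x) for x in c) / n_samples for g, c in usable.items()}
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(c[j]) - log_gm[g] for g, c in usable.items()]
        factors.append(math.exp(median(log_ratios)))
    return factors, sorted(usable)

# Made-up viral gene counts; the first two columns are virus-free controls.
labels = ["ctrl_A", "ctrl_B", "t1_A", "t1_B", "t2_A", "t2_B"]
counts = {
    "v1": [0, 1, 200, 220, 800, 760],
    "v2": [0, 0, 50, 60, 400, 420],
    "v3": [2, 1, 500, 480, 900, 950],
    "v4": [0, 0, 30, 25, 600, 640],
}

# With controls included, only genes without any zero remain usable.
sf_all, used_all = size_factors(counts)

# Drop the control columns before estimating size factors.
keep = [i for i, lab in enumerate(labels) if not lab.startswith("ctrl")]
infected = {g: [c[i] for i in keep] for g, c in counts.items()}
sf_infected, used_infected = size_factors(infected)

print("genes used with controls:   ", used_all)
print("genes used without controls:", used_infected)
```

With the controls included, the estimate in this toy example rests on a single gene. Once they are dropped, all four genes contribute.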
