tech replicates or bio replicate?
Xianjun Dong ▴ 10
@xianjun-dong-7069
United States

Hi,

I have a quick question about replicates in DESeq2.

A good number of genes (912, 5.7%) were marked as outliers in my DESeq2 summary(res) output. I am wondering what that means. I found the following passage in the vignette:

"The results function automatically flags genes which contain a Cook's distance above a cutoff for samples which have 3 or more replicates....When there are 7 or more replicates for a given sample, the DESeq function will automatically replace counts with large Cook’s distance with the trimmed mean over all samples, scaled up by the size factor or normalization factor for that sample."

Do you mean technical or biological replicates here? In my case, I only have biological replicates, no technical replicates. Does it matter?

Also, when you say "replicate", do you mean a replicate with respect to the contrast? For example, I have 80 samples (from 10 case and 10 control subjects, each subject having 4 different cell types, no technical replicates). If my design is

design(dds) = ~ cell + condition

How will DESeq2 calculate Cook's distance? Will it consider the 40 control samples as 40 replicates when calling results(dds)?

Thanks,

Xianjun

@mikelove
United States

That refers to biological replicates. Basically, the vignette assumes that you collapse "technical replicates" (defined by me as: sequencing more reads from the same cDNA library) at the start of an analysis.

(Biological) replicate is with respect to the groups defined by the design. For the purpose of counting the number of replicates for the heuristic that decides this outlier behavior, DESeq2 counts all of the samples that have the same value for every variable listed in the design.
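As a rough illustration of that counting rule, here is a small base R sketch that tabulates the samples in the question's experiment by the design variables. The sample sheet is made up to match the description (20 subjects, 4 cell types), the cell-type labels are hypothetical, and this only mimics the counting idea rather than calling DESeq2's internal code. The thresholds 3 and 7 are the ones quoted from the vignette above.

```r
# Hypothetical sample sheet: 10 case + 10 control subjects, 4 cell types each
coldata <- expand.grid(subject = 1:20,
                       cell = c("cellA", "cellB", "cellC", "cellD"))
coldata$condition <- ifelse(coldata$subject <= 10, "case", "control")

# Replicates are counted per unique combination of the design variables,
# here cell and condition (design ~ cell + condition)
reps <- table(cell = coldata$cell, condition = coldata$condition)
print(reps)

# Thresholds quoted from the vignette: flagging needs 3 or more replicates,
# count replacement needs 7 or more
flag_eligible    <- reps >= 3
replace_eligible <- reps >= 7
```

Under these assumptions each cell-by-condition group holds 10 samples, so the 40 control samples are treated as four groups of 10 replicates rather than one group of 40, and every group is large enough for both the flagging and the replacement behavior.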

I'd recommend taking a look at the genes which were filtered. You can do this by examining the highest values of mcols(dds)$maxCooks, and using plotCounts() for these genes. If you think they should not be filtered, you can instead set cooksCutoff=FALSE when you call results().
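A minimal sketch of those steps follows. It assumes the DESeq2 package is installed and uses a simulated dataset from makeExampleDESeqDataSet() as a stand-in for the real dds. Looking at the top 3 genes is an arbitrary choice, and "condition" is the grouping variable of the simulated data, not necessarily the one in your design.

```r
library(DESeq2)

# Simulated stand-in for the real dataset (assumption: 500 genes, 12 samples)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 500, m = 12)
dds <- DESeq(dds, quiet = TRUE)

# Genes with the largest maximum Cook's distance across samples
top <- head(order(mcols(dds)$maxCooks, decreasing = TRUE), 3)
top_genes <- rownames(dds)[top]
print(top_genes)

# Inspect the counts of the most extreme gene, grouped by condition
plotCounts(dds, gene = top_genes[1], intgroup = "condition")

# Default results (Cook's filtering on) versus filtering turned off
res_default  <- results(dds)
res_nofilter <- results(dds, cooksCutoff = FALSE)
summary(res_default)
summary(res_nofilter)
```

Comparing the "outliers" line of the two summaries shows how many genes the Cook's distance cutoff is removing.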
