Finding genes that are expressed only in one condition within a contrast
@charlesfoster-17652
Australia

Hi,

I have carried out differential expression analyses comparing conditions using DESeq2. Intuitively, I have considered genes to be expressed if they have a count of at least 10 in at least some libraries (sensu Chen et al: https://f1000research.com/articles/5-1438). Hence, I carried out a filtering step before the DE analysis using the filterByExpr function of edgeR. In my results, in addition to the p-values, LFC, etc., I have columns with per-condition baseMeans:

Gene    sampleA  sampleB  baseMeanA_cond1_vs_cond2  baseMeanB_cond1_vs_cond2
Gene1   cond1    cond2    0                         70.0618858219621
Gene2   cond1    cond2    0                         13.8155035471724

(apologies if the tab-delimited table shows up poorly)

To get these, I did (e.g.):

baseMeanA_cond1_vs_cond2 <- rowMeans(counts(dds, normalized=TRUE)[,colData(dds)$Tissue == "cond1"])
baseMeanB_cond1_vs_cond2 <- rowMeans(counts(dds, normalized=TRUE)[,colData(dds)$Tissue == "cond2"])

Now, I am looking to further refine my results to find any genes that are expressed in one condition and not expressed at all in the other. In this case, I do not want to know that Gene1 is upregulated in Condition2 relative to Condition1 but is still expressed in Condition1. I would just like to know that Gene1 is expressed in Condition2 and is not expressed in Condition1.

What would be the best way to do this?

From reading this site and the DESeq2 vignette, I know that the baseMean is "the mean of normalized counts of all samples, normalizing for sequencing depth." However, I'm a bit confused about 1) how my criterion on counts having to be >=10 to be expressed has been factored into the final baseMean results, and 2) how to subset my DE results to get expressed vs not expressed.

Is it as simple as getting all genes where the baseMean for condition1 = 0, and the baseMean for condition2 > 0? Or would it be genes where the baseMean for condition1 < 10, and the baseMean for condition2 >= 10?

Also, if it's easier to do this separately to the DESeq2 results, I'm happy to do so, e.g. by subsetting a matrix of count values or TPM values or TMM values.
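To make the matrix-subsetting idea concrete, here is a minimal sketch in base R. It is only an illustration: the toy matrix, the sample names, and the two thresholds (`off_max`, `on_min`) are hypothetical, and what counts as "expressed" remains a judgment call. With real data, `norm_counts` would presumably be `counts(dds, normalized=TRUE)` and `tissue` would be `colData(dds)$Tissue`.

```r
# Toy normalized-count matrix (hypothetical values): genes in rows, samples in columns.
norm_counts <- matrix(
  c( 0,  0,  0, 65, 72, 73,
     0,  1,  0, 12, 15, 14,
    40, 38, 45, 90, 85, 95),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Gene1", "Gene2", "Gene3"), paste0("s", 1:6))
)
tissue <- factor(rep(c("cond1", "cond2"), each = 3))

# Hypothetical thresholds; these are assumptions, not DESeq2 defaults.
off_max <- 0    # "not expressed": no sample in the condition exceeds this
on_min  <- 10   # "expressed": every sample in the condition reaches this

cond1_cols <- tissue == "cond1"
cond2_cols <- tissue == "cond2"

# A gene is "off" in cond1 if none of its cond1 samples exceed off_max.
is_off_cond1 <- rowSums(norm_counts[, cond1_cols, drop = FALSE] > off_max) == 0
# A gene is "on" in cond2 if all of its cond2 samples are at least on_min.
is_on_cond2 <- rowSums(norm_counts[, cond2_cols, drop = FALSE] >= on_min) == sum(cond2_cols)

cond2_only <- rownames(norm_counts)[is_off_cond1 & is_on_cond2]
print(cond2_only)
```

Requiring the criterion in every sample of a condition, rather than on the condition mean, avoids calling a gene "off" when a single replicate has a nonzero count; whether that is too strict depends on the number of replicates.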

Thanks!

Charles

filter deseq2 • 958 views
@mikelove
United States

We don’t have a way in DESeq2 to determine which counts or TPM values correspond to “expressed” and which to “not expressed”. When students or collaborators want to do this, I typically recommend looking at histograms of abundance (TPM) over all genes.
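As a rough illustration of the histogram approach, the sketch below uses base R on simulated values. The `tpm` vector is hypothetical (a mixture of near-zero and clearly abundant genes); with real data it would be one TPM value per gene, for example from a single sample or averaged over a condition. The log2(TPM + 1) transform is an assumption made here for readability, not something DESeq2 prescribes.

```r
set.seed(1)

# Simulated TPM values (hypothetical): a low-abundance group and a high-abundance group.
tpm <- c(rexp(2000, rate = 5),
         rlnorm(3000, meanlog = 3, sdlog = 1))

# Log scale makes the two groups easier to tell apart; +1 avoids log2(0).
log_tpm <- log2(tpm + 1)

# Histogram of abundance over all genes.
h <- hist(log_tpm, breaks = 50,
          main = "Abundance over all genes",
          xlab = "log2(TPM + 1)")
```

If the histogram shows two reasonably separated modes, the dip between them is one candidate for an "expressed" cutoff; if it does not, any threshold is largely arbitrary.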
