This is a follow-up to my post about DESeq2 with 2-factor design here.
Nascent RNA-seq experiments analyzed by slamdunk report a Conversion Rate for each gene, which is the ratio of Nascent Reads to Total Reads for a given gene (simplified description). DESeq2 requires unnormalized counts (integers), so a direct comparison of Conversion Rate is not possible. Instead, a 2-factor design should be used. However, 2-factor design comes with the caveat of needing to define a threshold value for LFC. I don't have a way to determine the appropriate threshold biologically, and different values yield quite different results (see code below). Further, the appropriate threshold may change between experiments.
For this reason, I'm wondering if there is a way to compare Conversion Rates directly? For instance, if I excluded low-count genes, and then multiplied the ratios by 1E6 and then rounded to the nearest integer? I'm sure a statistician is screaming somewhere, heh. Just looking for alternatives that could potentially be used.
The following shows genes from a 2-factor design with padj <0.05 when log2FC threshold is set to 0.2, 0.3, 0.585, 0.848 (1.15x, 1.2x, 1.5x, and 1.8x respectively)
results2 <- results(dds_filtered, alpha=0.05, contrast=c(0,0,1,0), lfcThreshold=.2, altHypothesis="lessAbs")
length(results2@rownames[results2$padj <= 0.05])
[1] 0
results3 <- results(dds_filtered, alpha=0.05, contrast=c(0,0,1,0), lfcThreshold=.3, altHypothesis="lessAbs")
length(results3@rownames[results3$padj <= 0.05])
[1] 2046
results4 <- results(dds_filtered, alpha=0.05, contrast=c(0,0,1,0), lfcThreshold=.585, altHypothesis="lessAbs")
length(results4@rownames[results4$padj <= 0.05])
[1] 2690
results5 <- results(dds_filtered, alpha=0.05, contrast=c(0,0,1,0), lfcThreshold=.848, altHypothesis="lessAbs")
length(results5@rownames[results5$padj <= 0.05])
[1] 3735