barakdror
Hello,
I'm using DESeq2 to compare differences in amplicon counts between 2 conditions. My input data is a raw count table (as required), and since my dataset has many 0's, I used a previously suggested solution:
dds_lettuce <- DESeqDataSetFromMatrix(countData = countData_lettuce,
                                      colData = metaData_lettuce,
                                      design = ~source, tidy = TRUE)
#deal with many 0's in the dataset:
dds_lettuce <- dds_lettuce[ rowSums(counts(dds_lettuce)) > 5, ]
cts <- counts(dds_lettuce)
geoMeans <- apply(cts, 1, function(row) if (all(row == 0)) 0 else exp(mean(log(row[row != 0]))))
dds_lettuce <- estimateSizeFactors(dds_lettuce, geoMeans=geoMeans)
dds_lettuce <- DESeq(dds_lettuce)
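To make the geoMeans step above concrete, here is a minimal base-R sketch on a made-up count matrix. The toy values and the helper name geo_mean_nonzero are hypothetical and not taken from the post; the function body is the same expression as in the apply() call above.

```r
# Toy count matrix (hypothetical values): rows are features, columns are samples
cts <- matrix(c(0, 0, 4, 16,
                0, 0, 0, 0,
                2, 8, 2, 8),
              nrow = 3, byrow = TRUE)

# Geometric mean over the nonzero entries of a row; 0 if the row is all zeros
geo_mean_nonzero <- function(row) {
  if (all(row == 0)) 0 else exp(mean(log(row[row != 0])))
}

# One value per feature, in the shape expected by the geoMeans argument
geoMeans <- apply(cts, 1, geo_mean_nonzero)
print(geoMeans)
```

The point of the sketch is that zeros are ignored rather than forcing the geometric mean of a row to 0, so rows with some zeros can still contribute to size factor estimation.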
However, for some of the genes I'm getting very high log2FoldChange values (>20), along with low pvalue and padj. What could be the reason for that? Looking at the raw counts for these genes clearly shows they are present only in the treatment group (an average of 4500 vs. 0 in the control).
Thank you, Barak
It would seem that you just answered your own question, no?
In your opinion, what would be a better estimate of the fold change in this scenario?
Correct me if I'm wrong, but a log2FoldChange of x means a fold change of 2^x, which in this case means a fold change of 2^27 for this particular gene in the treatment group. Shouldn't it be closer to 7-10?
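The arithmetic behind this exchange can be sketched in base R. The group means (4500 vs. 0) come from the question; the pseudocount approach and the helper pseudo_lfc are assumptions made for illustration only and are not a description of what DESeq2 does internally.

```r
# Group means as described in the question: ~4500 in treatment, 0 in control
treatment_mean <- 4500
control_mean <- 0

# With a zero denominator the plain ratio has no finite value
naive_lfc <- log2(treatment_mean / control_mean)

# Hypothetical helper: log2 fold change after adding a pseudocount to both groups
pseudo_lfc <- function(a, b, pc) log2((a + pc) / (b + pc))
lfc_pc1 <- pseudo_lfc(treatment_mean, control_mean, 1)
lfc_pc_half <- pseudo_lfc(treatment_mean, control_mean, 0.5)

# What a log2 fold change of 27 corresponds to on the linear scale
fold_27 <- 2^27

print(c(naive = naive_lfc, pc1 = lfc_pc1, pc_half = lfc_pc_half))
print(fold_27)
```

With one group at exactly zero, the ratio itself is unbounded, so any finite number depends on a choice such as the pseudocount: the result moves by a full log2 unit when the pseudocount is halved. DESeq2 also provides lfcShrink for shrunken fold change estimates; whether that is the right tool for this dataset is not something this sketch settles.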