I have a question about "dba.report".
I used "DiffBind" to analyze my dataset as follows:
H3K27ac_CellA_CellD <- dba(sampleSheet = "sampleconfiguration.csv")
H3K27ac_CellA_CellD_Count <- dba.count(H3K27ac_CellA_CellD,
                                       score = DBA_SCORE_TMM_READS_EFFECTIVE,
                                       mapQCth = 0,
                                       bRemoveDuplicates = FALSE)
H3K27ac_CellA_CellD_Diff <- dba.analyze(H3K27ac_CellA_CellD_Count, method = DBA_DESEQ2)
H3K27ac_CellA_CellD_Diff_All <- dba.report(H3K27ac_CellA_CellD_Diff, method = DBA_DESEQ2, th = 1)
H3K27ac_All <- data.frame(Conc_WT = H3K27ac_CellA_CellD_Diff_All@elementMetadata$Conc_WT,
                          Conc_GeneXcKO = H3K27ac_CellA_CellD_Diff_All@elementMetadata$Conc_GeneXcKO)

H3K27ac_CellA_CellD_Diff_All@elementMetadata
DataFrame with 6523 rows and 6 columns
          Conc Conc_GeneXcKO   Conc_WT      Fold   p-value       FDR
     <numeric>     <numeric> <numeric> <numeric> <numeric> <numeric>
1         4.77         -0.51      5.75     -6.26  6.51e-11  4.11e-07
2         4.41          0.27      5.37     -5.11  2.55e-08  8.06e-05
...
6523      3.88          4.06      3.67      0.39         1         1
I was wondering how these values are calculated:
Conc, Conc_GeneXcKO, Conc_WT
Since "DESeq2" was used to perform the differential analysis, does this mean that the "regularized log transformation" from "DESeq2" was applied to the raw count values? Or, in my case above, is it a simple log transformation of "TMM_READS_EFFECTIVE"?
Either way, it seems that the values were "normalized" one way or another.
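One quick check on the scale of these values, sketched in base R using only the three rows printed in the output above: in those rows, "Fold" appears to equal Conc_GeneXcKO minus Conc_WT, which would be consistent with both "Conc" columns being on a log2 scale. This only compares the rounded printed numbers; it is not a statement about how "DiffBind" computes them internally.

```r
# Rows copied from the dba.report() output above (rows 1, 2 and 6523).
report <- data.frame(
  Conc_GeneXcKO = c(-0.51, 0.27, 4.06),
  Conc_WT       = c(5.75, 5.37, 3.67),
  Fold          = c(-6.26, -5.11, 0.39)
)

# If both Conc columns are log2 values, their difference is a log2 fold change.
report$diff <- report$Conc_GeneXcKO - report$Conc_WT

# The printed values are rounded to two decimals, so allow a small tolerance.
report$matches <- abs(report$diff - report$Fold) <= 0.015

print(report)
```

The same comparison could be run on the full report to see whether the relationship holds for all 6523 sites.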
When I plotted "H3K27ac_All", which contains "Conc_GeneXcKO" and "Conc_WT" in my case:
p <- ggplot(H3K27ac_All, aes(Conc_WT, Conc_GeneXcKO))
p + geom_point() +
  geom_abline(intercept = 0, slope = 1, linetype = 2) +
  scale_x_continuous(name = "Epifeatures\n in WT Cell ", limits = c(-1, 11)) +
  scale_y_continuous(name = "Epifeatures \n in KO Cell ", limits = c(-1, 11))
I expected the points to be distributed along the line with a slope of 1, but the results turned out differently.
Could this be the result of improper data processing?
I just noticed two related posts, "DiffBind: Normalization for DESeq2" and "DiffBind dba.report() output: Is "concentration" normalized?":
So it seems to me that, in my case, "Conc" is the log-transformed counts normalized by library size,
or with the library size set as the "sizeFactor" in "DESeq2":
"DESeq2::sizeFactors(DESeqDataSeq) <- libsize/min(libsize)"
Just to make sure: is the log transformation here equivalent to "rlog" in "DESeq2"?
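To make the library-size interpretation above concrete, here is a minimal base R sketch. The count matrix, sample names, and library sizes are hypothetical, and the definition of a per-group concentration as log2 of the mean normalized count is my assumption from the related posts, not something confirmed from the "DiffBind" source.

```r
# Hypothetical raw read counts: 3 peaks (rows) x 4 samples (columns).
counts <- matrix(
  c(120,  90,   4,   2,
     60,  75,  50,  80,
     10,  12,  40,  55),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("peak", 1:3), c("WT_1", "WT_2", "KO_1", "KO_2"))
)
group   <- c(WT_1 = "WT", WT_2 = "WT", KO_1 = "KO", KO_2 = "KO")
libsize <- c(WT_1 = 2.0e7, WT_2 = 2.5e7, KO_1 = 1.0e7, KO_2 = 1.5e7)  # hypothetical

# Size factors as in the quoted snippet: libsize / min(libsize).
size_factor <- libsize / min(libsize)

# Divide each sample (column) by its size factor.
norm_counts <- sweep(counts, 2, size_factor, "/")

# Assumed definition: per-group concentration = log2 of the mean normalized count.
conc <- sapply(c("WT", "KO"), function(g) {
  log2(rowMeans(norm_counts[, group == g, drop = FALSE]))
})

print(round(conc, 2))
```

This is a plain log2 of scaled counts with no shrinkage step, so it is only meant to show what "log-transformed counts normalized by library size" would look like, for comparison against "rlog".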