Question

Is there a way to run Diffbind without the normalization ?

1

Entering edit mode

Thomas ▴ 10

@thomas-16893

Last seen 6.7 years ago

Hello,

I was wondering if there was an option to run the differential analysis without the normalisation step ?

Apparently there is an option like that for some plots but I can't happen to find anything for the differential analysis.

Thank you for your time

Thomas

diffbind differential binding analysis • 2.0k views

ADD COMMENT • link updated 8 weeks ago by tifaGis • 0 • written 6.7 years ago by Thomas ▴ 10

score 0 · Answer 1 · 2018-10-03

0

Entering edit mode

Rory Stark ★ 5.2k

@rory-stark-5741

Last seen 3 months ago

Cambridge, UK

I am currently working on giving users much more control over normalization in DiffBind. In the meantime, there is a trick you can do you avoid normalization when using DESeq2 as the analysis method.

After running dba.count(), you can change the individual library sizes to all be the same. Then if you run dba.analyze() with bFullLibrarySize=TRUE (default) the normalization facts will all be equal to 1. You probably want to run dba.analyze()with bSubControl=FALSE to avoid having the control reds subtracted if do not want normalization of any kind.

You can set the library size values as follows:

> myDBA <- dba.count(myDBA, ...)
> myDBA$class[8,] <- 1e+07
> myDBA <- dba.analyze(myDBA, method=DBA_DESEQ2, bFullLibrarySize=TRUE, bSubControl=FALSE)

Also, there is a "backdoor" in DiffBind to allow you to provide your own normalized read counts -- if that is of interest, I can show the code required to use it.

ADD COMMENT • link 6.6 years ago Rory Stark ★ 5.2k

0

Entering edit mode

In my opinion, for visualization purposes it's accurate enough to convert the raw reads (bam files) to tdf format and load them in IGV with the option "normalize coverage data" enabled, this will rescale the raw counts to 1000000 / total counts.thus correcting for different library sizes. This assuming you are using/willing to use IGV, other browsers surely have similar options.

DiffBind, or rather the underlying edgeR and DESeq packages, don't make use of the read profiles. All they use are the raw counts of reads in given intervals (ChipSeq peaks, genes, whatever), in fact the read profile within an interval doesn't enter in the analysis at all.

If you want to show the log fold change you could prepare a bedgraph file where each line has as coordinates the region tested, or maybe the midpoint of the region tested, whichever looks better, and as value (4th column) the log fold change. Then load this file in IGV with or without the raw reads file as above.

ADD REPLY • link 6.6 years ago cchristopherdeveauret • 0

score 0 · Answer 2 · 2020-11-06

0

Entering edit mode

Rory Stark ★ 5.2k

@rory-stark-5741

Last seen 3 months ago

Cambridge, UK

The recently released version of DiffBind includes a new interface function, dba.normalize(), that allows library sizes and normalization factors (or offsets) to be set and retrieved directly. The updated vignette includes a detailed exploration of the impact of different normalization schemes.

ADD COMMENT • link 4.5 years ago Rory Stark ★ 5.2k

0

Entering edit mode

Dear author, I have a similar question. dba.plotMA could easily generate ma plots under raw counts (bNormalized = F), but there is no way to retrieve that data behind the MA plot? I followed your advice in specifying the same library size for my samples, and use full library size in dba.normalize() step. But I still get results in the dba.analyze step which is different from what my raw counts ma plot shows. Looks like there is some normalization going on.

If you check on the demo data tamoxifen.

The ma plot for the raw data is: enter image description here And I run the following to run the analysis without normalization:

tamoxifen$class[8,] = 1e+07
tamoxifen = dba.normalize(tamoxifen,  normalize = DBA_NORM_LIB, library = DBA_LIBSIZE_FULL) ### at this stage the library sizes should be all the same, hence the normalization factors
tamoxifen = dba.analyze(tamoxifen, method = DBA_DESEQ2)
dba.plotMA(tamoxifen, bUsePval = F)

And the ma plot is now: enter image description here

I appreciate your suggestions as why there is still some sort of "normalization" in my attempt to suppress normalization.

ADD REPLY • link 8 weeks ago tifaGis • 0