I have ChIP-seq data which I am analysing with DiffBind. As I understand it, the "blocking factor" functionality can be used to control for confounding variables. In the DiffBind manual, the confounding factors used as examples are strings (e.g. cell line names, or whether the cells in those samples were "resistant" or not), but I would like to use numeric variables such as age or post-mortem delay (e.g. 80 years, 24 hours). I also have another string variable, sex (M for male and F for female). When I build my sample sheet and dba object with a string variable, everything runs smoothly (the column is named Factor, and I refer to it as DBA_FACTOR).
To do the same for age, I simply replaced the Fs and Ms in that column with the age of each individual (each sample comes from a different individual). The ages do not differ much, so I did not expect the results to change much. The PCA plot shows exactly the same clustering as before. Nevertheless, when running later parts of the script (dba.contrast, dba.plotMA, dba.plotVolcano, dba.analyze) I run into problems.
Context: I have conditions A, B, C and D with multiple replicates each (6, 4, 5 and 7 respectively). They are progressive stages of a disease, with A the mildest and D the most severe, and I know that the number of differentially bound sites increases with stage as well.
The errors I get are the following. For conditions A, B and C:
Error in pv.DBAplotVolcano(DBA, contrast = contrast, method = method, :
  object 'sigSites' not found
In addition: Warning message:
No sites above threshold
(but there should be sites above threshold because of my previous knowledge of the data)
For condition D:
results=dba.analyze(results, method=DBA_DESEQ2)
converting counts to integer mode
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
converting counts to integer mode
DESeq2 multi-factor analysis
Error in checkForExperimentalReplicates(object, modelMatrix) :
  The design matrix has the same number of samples and coefficients to fit, so estimation of dispersion is not possible. Treating samples as replicates was deprecated in v1.20 and no longer supported since v1.22.
Question: Does DiffBind have a problem handling numeric rather than string variables as a blocking factor, or is something else wrong in my script? Since I used exactly the same script in both cases and only changed the values in the column, I cannot think of a bug in my code that would be responsible, so I decided to double-check here.
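One possible reading of the DESeq2 error, which I have not been able to confirm: if each distinct value in the blocking column is treated as its own categorical level, then a column of near-unique ages adds almost one coefficient per sample. The sketch below is only a counting exercise in Python, not DiffBind or DESeq2 code. The function name `n_design_columns`, the 4 + 4 sample layout, and the sex and age values are all made up for illustration and are not my real data.

```python
def n_design_columns(condition, block):
    """Columns of a treatment-coded design matrix for ~ block + condition,
    ASSUMING both variables are treated as categorical factors:
    intercept + (condition levels - 1) + (block levels - 1)."""
    return 1 + (len(set(condition)) - 1) + (len(set(block)) - 1)


# Hypothetical two-group contrast with 4 + 4 samples (not my real data).
condition = ["A"] * 4 + ["D"] * 4
sex = ["M", "F", "M", "F", "F", "M", "F", "M"]   # 2 levels
age = [78, 80, 81, 83, 80, 82, 84, 85]           # made up: 7 distinct values

for name, block in [("sex", sex), ("age", age)]:
    k = n_design_columns(condition, block)
    n = len(condition)
    print(f"block={name}: {n} samples, {k} coefficients, {n - k} left over")
```

With sex as the block there are 3 coefficients for 8 samples, whereas with these ages there are 8 coefficients for 8 samples, which is the situation the error message describes. If that is what is happening, I would guess that either binning age into a few groups or using a design that accepts a numeric covariate would avoid it, but that is an assumption on my part.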