Dear Bioconductor Community,
based on the very interesting question in a previous post (batch effect : comBat or blocking in limma ?) regarding possible batch correction methodologies, and on the answers it received, I decided to raise an issue that is very important in my opinion, as I am still a newbie in these specific statistics, and for which I have not found a relevant explanation in any paper or MOOC. In detail, Dr Johnson kindly mentioned that limma should not be used in conjunction with ComBat when the latter is applied for batch effect correction. With all respect (and of course without arguing against it), I would like to ask for some further explanation on this specific matter:
Could this be more context- and experimental-design-specific? That is, does it depend on the magnitude of the batch effect, the variances between the batches, or the experimental design used in limma itself (i.e. paired or unpaired comparison)? Or should one always avoid it? And if so, what is the main idea behind it? (Does it have to do with the "correlation" mentioned in the above post, and possibly with the overestimation of DE?)
Please excuse me for creating this post, but I believe this question is very important and I would like to get as much feedback as possible, because I used a combination of the two in one of my recent analyses and would like to know whether the combination could still be considered "valid" under certain circumstances!
Thank you for your consideration on this matter !!
Efstathios
Dear Gordon,
thank you for your crucial explanation. One thing still bothers me, though, and I would like to give an example from an older conversation about a specific analysis for which both you and Aaron kindly provided explanations (C: Correct creation of design matrix in limma regarding a multi-level experiment). For this analysis, I compared the DE genes from two approaches: one running ComBat followed by limma (without including the batch effect variable, just the pairs factor), and one without running ComBat but including the batch effect variable in limma along with the pairs (although Aaron has indicated that this would be redundant: Merge different datasets and perform differential expression analysis in limma). Only a small group of genes (40) differs when I compare the two lists (the one with the batch effect correction has 37 more DE genes). Could this provide "relative" confidence about the "above general principle" of the overestimation of statistical significance that you underline?
Thank you in advance,
Efstathios
One good way to see the overestimation of significance in action would be to simulate a dataset where all the null hypotheses are true, both with and without an added batch effect. Then analyze both the datasets with all the methods you are considering and compare the resulting p-values to see if any method is systematically over-conservative or over-liberal. A well-behaved method should give a completely uniform (i.e. flat) p-value distribution.
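As a rough sketch of the kind of check described above: the snippet below is written in Python rather than R, and it does not run limma or ComBat. It uses a plain z-test with known unit variance as a stand-in, and the sample layout, batch size and function names are all made up for illustration. It simulates null genes with and without an added batch shift, analyzes both datasets once ignoring the batch and once blocking on it, and reports the fraction of p-values below 0.05, which should be close to 0.05 for a well-behaved method.

```python
import math
import random

# Hypothetical, deliberately unbalanced layout: (group, batch) for 8 samples.
LAYOUT = [("A", 1), ("B", 1), ("B", 1), ("B", 1),
          ("A", 2), ("A", 2), ("A", 2), ("B", 2)]


def mean(xs):
    return sum(xs) / len(xs)


def z_pvalue(estimate, variance):
    """Two-sided p-value for a normal estimate with known variance."""
    z = estimate / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_gene(rng, batch_sd):
    """One null gene: unit-variance noise plus a shift shared by batch 2."""
    shift = rng.gauss(0.0, batch_sd)
    return [rng.gauss(0.0, 1.0) + (shift if batch == 2 else 0.0)
            for _, batch in LAYOUT]


def select(values, group, batch=None):
    """Values of one group, optionally restricted to one batch."""
    return [v for v, (g, b) in zip(values, LAYOUT)
            if g == group and (batch is None or b == batch)]


def p_ignore_batch(values):
    """Compare group means as if no batch existed."""
    a, b = select(values, "A"), select(values, "B")
    return z_pvalue(mean(a) - mean(b), 1.0 / len(a) + 1.0 / len(b))


def p_block_on_batch(values):
    """Compare groups within each batch, then pool by inverse variance."""
    diffs, weights = [], []
    for batch in (1, 2):
        a, b = select(values, "A", batch), select(values, "B", batch)
        diffs.append(mean(a) - mean(b))
        weights.append(1.0 / (1.0 / len(a) + 1.0 / len(b)))
    pooled = sum(w * d for w, d in zip(weights, diffs)) / sum(weights)
    return z_pvalue(pooled, 1.0 / sum(weights))


def fraction_below(pvalues, alpha=0.05):
    return sum(p < alpha for p in pvalues) / len(pvalues)


def run(batch_sd, n_genes=2000, seed=1):
    """Fraction of null genes called significant by each analysis."""
    rng = random.Random(seed)
    genes = [simulate_gene(rng, batch_sd) for _ in range(n_genes)]
    return (fraction_below([p_ignore_batch(g) for g in genes]),
            fraction_below([p_block_on_batch(g) for g in genes]))


if __name__ == "__main__":
    for batch_sd in (0.0, 2.0):
        naive, blocked = run(batch_sd)
        print(f"batch sd {batch_sd}: ignore batch {naive:.3f}, "
              f"block on batch {blocked:.3f}")
```

The fraction below 0.05 is only a crude summary; a histogram of the p-values is the fuller check for flatness. For a real comparison the z-test would be replaced by the actual pipelines under consideration.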
Dear Dr Thompson,
thank you for the idea! Firstly, could the observation I mention above provide even a small "confidence" about the overestimation due to batch effect correction?
Secondly, as I have been using R for about a year and I am "relatively" new to statistics:
1. Regarding the simulation, do you mean something like permuting the labels and leaving the gene expression as it is, while creating two cases: a) take the batch-corrected expression set, permute the labels and continue; b) the same without correcting for the batch effect?
Please excuse me if my question is naive or irrelevant, but so far I have not performed any "simulation" of the kind you mention!
Best,
Efstathios
I would be hesitant to draw conclusions about differences in performance between methods based on data where you don't already know the answer. That's why I recommended simulating a dataset from scratch, so you can control all the variables and you will know exactly which simulated genes are DE and which are not. This allows you to draw ROC curves and such for each method to evaluate their relative performance.
If you want to pursue this, there are a number of "benchmark" papers that give examples of how to do such a simulation. I don't have one handy at the moment, but you should be able to find one on PubMed.
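To make the ROC idea concrete, here is a minimal sketch in Python (not R), assuming a toy simulation in which 10% of genes are truly DE and each gene is scored by the absolute z-statistic of a two-group comparison with known unit variance. The function names, effect sizes and group sizes are hypothetical choices for illustration, not taken from any benchmark paper. Because the truth is known, each method's ranking can be turned into a ROC curve and an area under the curve (AUC).

```python
import math
import random


def simulate(rng, n_genes=1000, frac_de=0.1, effect=1.5, n_per_group=4):
    """Return (scores, truth); truth[i] is True when gene i is really DE."""
    scores, truth = [], []
    for _ in range(n_genes):
        is_de = rng.random() < frac_de
        shift = effect if is_de else 0.0
        a = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        diff = sum(a) / n_per_group - sum(b) / n_per_group
        # Known unit variance, so the difference has variance 2 / n.
        scores.append(abs(diff / math.sqrt(2.0 / n_per_group)))
        truth.append(is_de)
    return scores, truth


def roc_points(scores, truth):
    """(false positive rate, true positive rate) as the cutoff is relaxed."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(truth)
    n_neg = len(truth) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if truth[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    scores, truth = simulate(random.Random(7))
    print(f"AUC: {auc(roc_points(scores, truth)):.3f}")
```

In a real comparison, `scores` would come from each candidate analysis applied to the same simulated data, so that the curves differ only because of the method.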
I see. Then, in simple "naive" words: simulate two datasets, one from the batch-corrected dataset and the other from the same dataset without batch effect correction, right? I will search extensively for papers; with a quick search I have found:
http://www.researchgate.net/publication/255702173_A_Flexible_Microarray_Data_Simulation_Model
Maybe I could also use the tools below from Mr Vegard, if they can assist in evaluating this specific issue.
Yes, I'm not familiar with that particular method, but it looks like a reasonable simulation method. They also provide an R package, which certainly helps.
Efstathios, I'm not quite sure what is bothering you or what you mean by "relative confidence".
For your paired experiment, you don't even need a batch correction. The correct analysis is also the simplest.
Other analyses are more complicated and are either not needed or are theoretically wrong.
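One way to see why a paired analysis can make a separate batch correction unnecessary is a small arithmetic sketch, given here in Python with made-up numbers. It assumes that both members of each pair come from the same study and that the batch effect is an additive shift per study; under those assumptions the shift cancels in every within-pair difference, so the paired comparison is unchanged. The variable names and values are hypothetical.

```python
import random


def paired_differences(treated, control):
    """Within-pair differences, the quantity a paired analysis works on."""
    return [t - c for t, c in zip(treated, control)]


def add_batch_shift(values, batches, shift_by_batch):
    """Add an additive, batch-specific shift to every sample."""
    return [v + shift_by_batch[b] for v, b in zip(values, batches)]


rng = random.Random(3)

# Assumption: pair i (both its samples) comes from study batches[i].
batches = [1, 1, 1, 2, 2, 2]
control = [rng.gauss(8.0, 1.0) for _ in batches]
treated = [c + rng.gauss(0.5, 0.3) for c in control]

# Hypothetical additive batch effect: study 2 reads 2.5 units higher.
shift = {1: 0.0, 2: 2.5}

clean = paired_differences(treated, control)
shifted = paired_differences(add_batch_shift(treated, batches, shift),
                             add_batch_shift(control, batches, shift))

# The shift is present in both members of a pair, so it cancels.
max_gap = max(abs(x - y) for x, y in zip(clean, shifted))
print(f"largest change in any paired difference: {max_gap:.2e}")
```

If a pair were split across studies, or the batch effect were not additive, this cancellation would no longer hold.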
Dear Gordon,
because I do not have much experience with these issues, I was more "affected" by the literature arguing that batch effect correction is essential when merging two or more different datasets! Thus, I was afraid that the pairs factor could not handle a batch effect arising from the two studies. Nevertheless, as I trust your opinion and your consideration on this matter is crucial, I will perform the same analysis without batch effect correction, including only the pairs, and inspect and possibly stick with these results.