As a general rule, you can't have your cake and eat it too. Either you eliminate the disease effect so that cells from different conditions can cluster together, or you keep the disease effect and get separate disease-specific clusters. If you want the same cell type from different diseases to end up in the same cluster, you have to remove the effect of disease from your expression/PC space; otherwise, you'll just end up with multiple clusters (one per disease condition) for the same cell type.
Now, neither of these outcomes is "wrong", _per se_. You could argue that getting multiple clusters for the same cell type is the right thing to do, as the disease effect represents genuine biology. However, regardless of whether this is "right", it is most definitely annoying, because we now have to go through all the clusters to see which ones match up to the same cell type across disease conditions. This is necessary to perform the most interesting comparison, i.e., between expression profiles or frequencies of the same cell type in different conditions. So if you have to do this cross-matching between conditions anyway, why not make it easier for yourself and eliminate the disease effect at the start so that you can get a single cluster per cell type? Note that the removal of disease only relates to the clustering: it doesn't mean that you discard the disease effect from your entire analysis, and it is straightforward to characterize the effect in later DE analyses. See here for a discussion and here for an actual pancreas + diabetic example.
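To make the "cluster on corrected values, compare on original values" idea concrete, here is a toy Python sketch. Everything in it is an assumption for illustration: the data values, the function names, and especially the per-condition centering, which is only a crude stand-in for a real correction method and is not what batchelor does. It just shows that removing the condition effect for clustering does not stop you from measuring that effect afterwards.

```python
from statistics import mean

# Hypothetical toy data: (condition, original expression of one gene).
# Two cell types (low ~1, high ~5); disease adds +2 to every cell.
cells = [
    ("healthy", 1.0), ("healthy", 1.2), ("healthy", 5.0), ("healthy", 5.2),
    ("disease", 3.0), ("disease", 3.2), ("disease", 7.0), ("disease", 7.2),
]

def center_by_condition(cells):
    """Crude stand-in for a real correction: subtract each condition's mean."""
    means = {
        cond: mean(x for c, x in cells if c == cond)
        for cond in {c for c, _ in cells}
    }
    return [x - means[c] for c, x in cells]

def assign_clusters(corrected):
    """Toy clustering that only ever sees the corrected values: split at zero."""
    return ["low" if x < 0 else "high" for x in corrected]

def condition_effect(cells, clusters):
    """Per cluster, disease-minus-healthy mean of the ORIGINAL values."""
    effect = {}
    for cl in sorted(set(clusters)):
        by_cond = {"healthy": [], "disease": []}
        for (cond, x), k in zip(cells, clusters):
            if k == cl:
                by_cond[cond].append(x)
        effect[cl] = mean(by_cond["disease"]) - mean(by_cond["healthy"])
    return effect

corrected = center_by_condition(cells)   # used for clustering only
clusters = assign_clusters(corrected)    # one cluster per cell type
effect = condition_effect(cells, clusters)  # computed on uncorrected values
print(clusters)
print(effect)
```

Each cell type ends up in a single cluster containing cells from both conditions, and the disease effect is still recoverable within each cluster because the comparison goes back to the uncorrected values.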
As a final point: if you absolutely must merge your datasets without eliminating the disease effect, you need to give batchelor a control set of cells that should be the same between disease conditions. Most functions will have a `restrict=` argument; if this is set, only the specified cells are used to compute the correction, and the same correction is extrapolated to all other cells. This was initially designed for experiments involving spiked-in cell controls, but personally I found it to be rather disappointing. In any case, if you can identify some shared cell population that is not expected to change across your samples, you _could_ use this option... but honestly, I would suggest just doing the correction with all cells as discussed above.
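The idea of computing the correction from control cells only and then extrapolating it to everything else can be sketched in a few lines of Python. This is a conceptual toy, not batchelor's algorithm: it assumes the batch effect is a simple constant shift, and the function names, indices, and expression values are all hypothetical.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length expression vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def restricted_shift_correct(reference, target, ref_controls, target_controls):
    """Estimate a shift from the control cells only, then apply it to ALL target cells."""
    ref_mean = mean_vector([reference[i] for i in ref_controls])
    tgt_mean = mean_vector([target[i] for i in target_controls])
    shift = [r - t for r, t in zip(ref_mean, tgt_mean)]
    return [[x + s for x, s in zip(cell, shift)] for cell in target]

# Hypothetical data: two genes per cell. Cells 0 and 1 are the shared control
# population; cell 2 is the cell type we actually care about.
healthy = [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0]]
# Disease sample: batch offset of +1 on both genes for every cell, plus a
# genuine disease effect of +2 on the second gene in cell 2 only.
disease = [[1.0, 1.0], [1.2, 1.0], [5.0, 7.0]]

corrected_disease = restricted_shift_correct(
    healthy, disease, ref_controls=[0, 1], target_controls=[0, 1]
)

# Difference that remains in the cell type of interest after correction.
residual = [d - h for d, h in zip(corrected_disease[2], healthy[2])]
print(corrected_disease)
print(residual)
```

Because the shift is estimated from the controls alone, the batch offset is removed from every cell while the disease-specific difference in the non-control cell is left in place. Whether a real dataset behaves this nicely depends entirely on the controls being truly unchanged between conditions.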