Question

Using SC3 with batch corrected MNN values

hamza_karakurt ▴ 60

@hamza_karakurt-17704

Turkey

Hello, I want to use SC3 on data sets from multiple batches. I use the fastMNN() function from the scran/scater packages for batch correction, but it does not affect logcounts. Instead, it creates a reduced dimension, "MNN", that holds the corrected data and is also used in the clustering step. How can I use SC3 with these values? Can I create a new SingleCellExperiment from the MNN matrix and run SC3 on that matrix? The MNN matrix includes negative values, so I know I should not set the gene_filter parameter to TRUE.

Thank you in advance.

SC3 Scater Scran scRNA-Seq Batch Correction

updated 6.1 years ago by Vladimir Kiselev ▴ 150 • written 6.1 years ago by hamza_karakurt ▴ 60

score 1 · Answer 1 · 2019-04-05

Vladimir Kiselev ▴ 150

@vladimir-kiselev-9342

Sanger Institute, Cambridge, UK

Hi, you can always copy your corrected matrix into logcounts, so there is no need to create a new object. If you need to keep the original logcounts, then yes, creating a separate object would be a good idea.

However, I think SC3 won't work well with negative values (like the majority of other scRNA-seq methods), so I cannot guarantee a good result.

written 6.1 years ago by Vladimir Kiselev ▴ 150

Thank you for the answer. As you said, negative values affect the results, as expected. To test this, I ran the sc3_estimate_k function on both the data set itself and the reduced dimensions (PCA with the first 50 PCs in this case); the estimated k was 27 for the full data and 5 for the reduced dimensions. This is probably not a good way to do it. Since data sets from multiple batches are very common, what is the optimal way to use SC3 on this kind of data? I looked for a method that corrects the full logcounts matrix but could not find one.

written 6.1 years ago by hamza_karakurt ▴ 60

There are lots of batch correction methods at the moment, though not all of them correct the expression matrix. For those that don't, you could use other clustering methods, such as Louvain clustering on a kNN graph (the default in the scanpy package). Here we cover some of the batch correction methods:

R - https://github.com/cellgeni/notebooks/blob/master/files/notebooks/10X-batch-correction-harmony-mnn-cca-other.Rmd

Python - https://github.com/cellgeni/notebooks/blob/master/files/notebooks/10X-batch-correction-bbknn-scanorama.ipynb

written 6.0 years ago by Vladimir Kiselev ▴ 150

Thank you for the answer. Actually, I am planning to use MNN correction; it is more suitable for my situation and for the further analyses I have planned. MNN can create a corrected expression matrix, but it also has negative values (due to cosine normalization, I believe). I took the risk and ran SC3 on this corrected matrix, but I get NAs in the clustering results.

written 6.0 years ago by hamza_karakurt ▴ 60

I'll chip in here and mention that a batch correction method will only be able to preserve zeroes if it is aware that the data are derived from counts. This is not the case for the vast majority of methods, which operate on transformed expression values where the count-based nature of the data is lost. And for good reason; the theory for count-based models is difficult. (See batchelor::rescaleBatches() for a limited exception.) Indeed, there is no philosophical reason that log-expression values should be non-negative. The fact that they often are is simply a matter of practical convenience to avoid loss of sparsity.
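
To make the point about zeroes concrete, here is a toy Python sketch. It is not MNN and not any batchelor function: it assumes a hypothetical correction that simply shifts one batch's log-expression values for a single gene so that its mean matches the other batch. The counts are made up. A correction that works purely on log-values moves the zeroes along with everything else, and can push them below zero.

```python
import math

# Made-up counts for ONE gene in two batches (illustration only).
batch_a_counts = [0, 1, 3, 0]
batch_b_counts = [0, 7, 15, 3]  # same gene, systematically higher in batch B


def log_expr(counts):
    """log2(count + 1): zero counts map to exactly 0."""
    return [math.log2(c + 1) for c in counts]


def mean(values):
    return sum(values) / len(values)


a = log_expr(batch_a_counts)
b = log_expr(batch_b_counts)

# Hypothetical "correction": shift batch B so its mean matches batch A.
# The shift is applied in log space and knows nothing about counts.
shift = mean(b) - mean(a)
b_corrected = [x - shift for x in b]

print("batch B before:", b)
print("batch B after: ", b_corrected)
print("zero preserved?", b_corrected[0] == 0.0)
```

The cell with a zero count in batch B ends up with a negative corrected value, even though nothing about it was "wrong"; the method simply has no way to know that 0 was special.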

Now, I can't remember exactly what special stuff SC3 does, but if you just want to do no-frills k-means clustering, you can apply kmeans on the low-dimensional MNN corrected values. Any feature selection should have been done before MNN correction anyway.
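
As a rough illustration of the no-frills route, the sketch below runs plain Lloyd's k-means on a small matrix of low-dimensional coordinates that contains negative values. It is written in dependency-free Python only to show the idea; in the R setting of this thread, the built-in kmeans function applied to the MNN reduced dimensions plays the same role. The `embedding` values are made-up stand-ins for corrected coordinates, and the helper `kmeans_sketch` is hypothetical, not part of SC3, scran, or batchelor.

```python
import random


def kmeans_sketch(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on a list of equal-length coordinate lists."""
    rng = random.Random(seed)
    # Initialise centers from k distinct cells.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centers[j])),
            )
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centers


# Made-up 2-D "corrected" coordinates: rows are cells, columns are dimensions.
# Negative values are not a problem for Euclidean distances.
embedding = [
    [-2.0, -1.0], [-2.2, -0.8], [-1.8, -1.2],  # first group of cells
    [3.0, 2.0], [3.2, 1.8], [2.8, 2.2],        # second group of cells
]

labels, centers = kmeans_sketch(embedding, k=2)
print(labels)
```

Because k-means only relies on distances between cells, the sign of the coordinates does not matter here, which is why clustering on the corrected low-dimensional space sidesteps the negative-value problem.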

written 6.0 years ago by Aaron Lun ★ 28k