Dear all,
Could you please advise on the following? I am running the Seurat package on the dataset published in:
https://www.ncbi.nlm.nih.gov/pubmed/27565351
https://github.com/broadinstitute/BipolarCell2016
The article has 6 datasets on bipolar cells that are combined in a single matrix, with the columns labelled:
Bipolar1_barcode1, ...., Bipolar1_barcodeXYZ,
Bipolar2_barcode1, ...., Bipolar2_barcodeXYZ,
Bipolar3_barcode1, ...., Bipolar3_barcodeXYZ,
Bipolar4_barcode1, ...., Bipolar4_barcodeXYZ,
Bipolar5_barcode1, ...., Bipolar5_barcodeXYZ,
Bipolar6_barcode1, ...., Bipolar6_barcodeXYZ,
Am I right to understand that, when we would like to include multiple experiments in the same matrix for analysis with Seurat, we just need to label the columns according to the scheme ExperimentA_Barcode, ...., ExperimentX_Barcode?
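To make the labelling scheme concrete, here is a minimal sketch in plain Python (not Seurat code) of how an experiment label can be recovered from column names of the form Experiment_Barcode. The helper name `batch_of` and the barcodes are made up for illustration. As far as I know, Seurat's `CreateSeuratObject` does an equivalent split through its `names.field` and `names.delim` arguments, but please check the documentation of your installed version.

```python
from collections import Counter


def batch_of(column_name: str) -> str:
    """Return the experiment prefix of a label like 'Bipolar1_barcode1'."""
    prefix, sep, _barcode = column_name.partition("_")
    if not sep:
        raise ValueError(f"no '_' separator in column name: {column_name!r}")
    return prefix


# Hypothetical column labels following the scheme described above.
columns = ["Bipolar1_AAAC", "Bipolar1_GGTT", "Bipolar2_AAAC", "Bipolar6_CCGA"]

# One batch label per cell, in column order.
batches = [batch_of(c) for c in columns]

# Number of cells contributed by each experiment.
cells_per_batch = Counter(batches)
print(cells_per_batch)
```

The split is on the first underscore only, so it assumes the experiment name itself contains no underscore.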
thanks a lot,
-- bogdan
Thank you Steve! Yes, I am using both pipelines: 1) Seurat and 2) the workflow based on simpleSingleCell.
On batch effects, if I may add a question, I have noted three strategies:
a. A strategy where the samples from multiple experiments are concatenated in a large matrix (as I have described above).
For the batch correction, one may apply the ComBat function from the sva package to the matrix.
https://ucdavis-bioinformatics-training.github.io/2017_2018-single-cell-RNA-sequencing-Workshop-UCD_UCB_UCSF/day2/scRNA_Workshop-PART3.html
b. Another strategy is to use CCA (canonical correlation analysis), as recently published:
https://satijalab.org/seurat/immune_alignment.html
c. MNN-based correction, as presented at the link you've provided :
https://bioconductor.org/packages/devel/workflows/vignettes/simpleSingleCell/inst/doc/work-5-mnn.html
Would one strategy work better than the others? What would you advise? Thanks!
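For readers unfamiliar with option c, the sketch below illustrates only the core idea behind MNN (mutual nearest neighbours): a cell in one batch is paired with a cell in the other batch when each is the other's nearest neighbour, and the differences between paired cells give an estimate of the batch shift. This is a toy in plain Python with made-up 2-D points and k = 1. It is not the Bioconductor implementation, which (as I understand the paper) works on normalised expression profiles, uses more than one neighbour, and smooths the correction vectors locally.

```python
import math


def nearest(query, reference):
    """Index of the reference point closest to query (Euclidean distance)."""
    return min(range(len(reference)), key=lambda j: math.dist(query, reference[j]))


def mutual_nearest_pairs(batch_a, batch_b):
    """Return (i, j) pairs where a[i] and b[j] are each other's nearest neighbour."""
    a_to_b = [nearest(p, batch_b) for p in batch_a]
    b_to_a = [nearest(q, batch_a) for q in batch_b]
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


# Toy data: batch_b contains the cells of batch_a shifted by a made-up batch
# effect of (1, 1), plus one extra cell type that is absent from batch_a.
batch_a = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
batch_b = [(1.0, 1.0), (6.0, 6.0), (11.0, 1.0), (3.0, 9.0)]

pairs = mutual_nearest_pairs(batch_a, batch_b)

# Average difference over the pairs: a single global estimate of the shift.
shift = tuple(
    sum(batch_b[j][d] - batch_a[i][d] for i, j in pairs) / len(pairs)
    for d in range(2)
)
print(pairs, shift)
```

The extra cell in batch_b ends up in no pair, which is the property that lets this kind of approach tolerate cell types present in only one batch.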
I would advise you to reference the relevant literature ;-)
The MNN paper compares its method against ComBat and shows MNN to be superior, and the Seurat preprint claims its method to be superior to MNN.
If it were me, I'd likely ignore ComBat and take my time with MNN, LIGER, and Seurat v3 to see how they compare to each other. Each has its own set of parameters that you should spend some time playing with to understand how they affect the results.
(Note that I've updated the original answer to add reference to LIGER as a 3rd approach to tackle dataset integration)
Thanks a lot Steve! That is very helpful and very informative!