Question

Does hashedDrops support 'imbalanced' combinations of HTOs?

Peter Hickey ▴ 740

@petehaitch

WEHI, Melbourne, Australia

I've got a bit of a weird 10x scRNA-seq dataset with 6 samples that are labelled using HTOs as follows:

Samples 1-5 labelled with a single, unique HTO (e.g., sample 1 labelled with human-1, sample 2 labelled with human-2, etc.)
Sample 6 labelled with all 5 HTOs (i.e. human-1, human-2, ..., human-5)

Does DropletUtils::hashedDrops() with combinations support this? Thanks

DropletUtils scRNAseq • 1.5k views

Updated 3.8 years ago by Aaron Lun ★ 28k • written 3.8 years ago by Peter Hickey ▴ 740

score 1 · Answer 1 · 2021-02-15

Aaron Lun ★ 28k

@alun

The city by the bay

tl;dr No. Also, that sounds insane.

Sample 6 will never be picked up by any parametrization of hashedDrops(), which relies on the majority of HTOs being absent from any given cell. You'll have to use a cruder strategy, e.g., fit a two-component mixture distribution to each HTO, call a cell positive for an HTO if it falls in that HTO's "high" mode, and then generate sample calls from the combinations of positive HTOs that you see. This approach is implemented in CiteFuse, but I don't know whether it supports wacky designs like this; you'll probably have to write the code yourself.

FYI, you can reach deep inside the package to get DropletUtils:::.get_lower_dist, which is the k-means-based two-component fitting code used by ambientProfileBimodal(). One could use this to get present/absent calls for each HTO in each cell, and then proceed from there.
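The cruder strategy described above can be sketched roughly as follows. This is an illustrative Python sketch only, not DropletUtils or CiteFuse code: the function names (`two_means_threshold`, `call_present`, `assign_samples`), the toy counts, and the use of a simple 1-D two-means split on log counts as the "two-component" fit are all assumptions made for the example.

```python
import math

def two_means_threshold(values, n_iter=100):
    """Split 1-D values into a low and a high group with 2-means and
    return the midpoint between the two group means."""
    lo, hi = min(values), max(values)
    for _ in range(n_iter):
        mid = (lo + hi) / 2
        low = [v for v in values if v <= mid]
        high = [v for v in values if v > mid]
        if not low or not high:  # no bimodality to exploit
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):  # converged
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2

def call_present(counts):
    """counts: dict of HTO name -> list of counts (one per cell).
    Returns, for each cell, the set of HTOs called as present."""
    n_cells = len(next(iter(counts.values())))
    calls = [set() for _ in range(n_cells)]
    for hto, row in counts.items():
        logs = [math.log1p(c) for c in row]
        cut = two_means_threshold(logs)
        for i, v in enumerate(logs):
            if v > cut:
                calls[i].add(hto)
    return [frozenset(s) for s in calls]

def assign_samples(calls, design):
    """Map each cell's set of present HTOs to a sample.
    Returns None where the combination is not in the design."""
    lookup = {frozenset(htos): sample for sample, htos in design.items()}
    return [lookup.get(c) for c in calls]

# Design from the question: samples 1-5 carry one HTO each, sample 6 carries all five.
HTOS = ["human-1", "human-2", "human-3", "human-4", "human-5"]
design = {f"sample-{i + 1}": [h] for i, h in enumerate(HTOS)}
design["sample-6"] = HTOS

# Made-up counts for 7 cells: one per sample, plus a sample-1/sample-2 doublet.
counts = {
    "human-1": [800, 10, 12, 8, 15, 650, 700],
    "human-2": [12, 900, 9, 14, 11, 720, 640],
    "human-3": [9, 15, 750, 10, 13, 810, 14],
    "human-4": [14, 8, 11, 820, 9, 590, 10],
    "human-5": [11, 13, 10, 12, 880, 770, 12],
}

labels = assign_samples(call_present(counts), design)
print(labels)
```

In this toy data the last cell carries human-1 and human-2 only, so it matches no sample in the design and comes back as None. Note that under this design a doublet between sample 6 and any other sample would show the same HTO set as a sample 6 singlet, so presence/absence calls alone cannot flag it.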

Answered 3.8 years ago by Aaron Lun ★ 28k

Yeah it's definitely wacky and not something I want to deal with again. I've got a set of labels from an ad-hoc approach that clusters the HTOs followed by manually assigning each cluster to a sample; not fun.

In general, for when there are fewer HTOs available than samples, labelling each sample with a unique combination of, say, 2 HTOs would be compatible with hashedDrops() via combinations, right?

Reply • 3.8 years ago • Peter Hickey ▴ 740

Yes, that would be the plan, provided that you have more than 4 HTOs. Otherwise, doublets containing all HTOs would be indistinguishable from deeply-sequenced empty droplets.

There is a way to rewrite the hashedDrops() algorithm to overcome this restriction, at the cost of (i) assuming that all HTOs exhibit a bimodal profile and (ii) increasing errors due to variation in sequencing depth across cells.
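The "more than 4 HTOs" restriction for 2-HTO labels can be checked by brute force. The following is a small illustrative Python sketch, not DropletUtils code; the helper names `pair_design` and `doublet_can_cover_all` are made up for the example. It enumerates every 2-HTO label available from a given number of HTOs and asks whether some doublet of two differently labelled samples would carry every HTO at once.

```python
from itertools import combinations

def pair_design(n_htos):
    """All unique 2-HTO labels (as 1-based index pairs) available from n_htos HTOs."""
    return list(combinations(range(1, n_htos + 1), 2))

def doublet_can_cover_all(n_htos, per_sample=2):
    """True if some pair of distinct samples jointly carries every HTO,
    i.e. a doublet could contain all HTOs."""
    design = list(combinations(range(1, n_htos + 1), per_sample))
    everything = set(range(1, n_htos + 1))
    return any(set(a) | set(b) == everything for a, b in combinations(design, 2))

for n in (4, 5, 6):
    print(n, len(pair_design(n)), doublet_can_cover_all(n))
# prints:
# 4 6 True
# 5 10 False
# 6 15 False
```

With 4 HTOs, a doublet of the samples labelled (1, 2) and (3, 4) contains all four HTOs; with 5 or more, a doublet carries at most 4 of them, so at least one HTO is always absent. Index pairs of this kind are, as far as I understand it, what the combinations argument of hashedDrops() expects (one row per sample), but check the function's documentation for the exact format.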

Reply • 3.8 years ago • Aaron Lun ★ 28k

See the documentation on the constant.ambient=TRUE option in version 1.11.20 of DropletUtils, which supports situations where the total number of HTOs is less than or equal to twice the expected number per cell.

Reply • 3.8 years ago • Aaron Lun ★ 28k