Trajectory analysis for single cell sequencing
@lirongrossmann-23954
Last seen 3.8 years ago

Hi everyone, I have a fundamental question regarding trajectory analysis in single-cell analysis: do the trajectories predicted by different algorithms offer any directionality between the cells/clusters? For example, which cluster is more immature in terms of differentiation? If so, which package would you recommend to obtain that information?

I know that RNA velocity may address the directionality question, but if this is the case, what added value does trajectory analysis have over RNA velocity?

My understanding of the basic principle of trajectory analysis (with differences between specific algorithms) is that it finds a minimum spanning tree between cells/clusters, which may help in understanding connectivity but does not tell you the direction.

Any clarifications/comments are welcome!

single cell trajectory analysis tscan monocle • 2.8k views
Aaron Lun ★ 28k
@alun
Last seen 46 minutes ago
The city by the bay

All right, I'll bite.

do the trajectories predicted by different algorithms offer any directionality between the cells/clusters? For example, which cluster is more immature in terms of differentiation?

In and of itself, no. The trajectory is just a way of stringing together cellular states. You can impose directionality with prior biological knowledge about differentiation markers, the entropy/potency relationship or RNA velocity. I talk about this briefly in the relevant chapter of the book.
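
To make the "no inherent direction" point concrete, here is a minimal sketch in plain Python. It assumes hypothetical cluster centroids in a 2-D reduced space and builds a minimum spanning tree with Prim's algorithm; none of the names below come from any package mentioned in this thread. The tree edges are unordered pairs, and an ordering only appears once a root is chosen from outside knowledge (markers, entropy, velocity).

```python
import math

# Hypothetical cluster centroids in a 2-D reduced space (e.g. the first two PCs).
centroids = {
    "A": (0.0, 0.0),
    "B": (1.0, 0.2),
    "C": (2.0, 0.0),
    "D": (2.5, 1.0),
}

def mst_edges(points):
    """Prim's algorithm; returns undirected edges as frozensets."""
    names = sorted(points)
    in_tree = {names[0]}
    edges = set()
    while len(in_tree) < len(names):
        # Pick the shortest edge joining the tree to a node outside it.
        u, v = min(
            ((a, b) for a in in_tree for b in names if b not in in_tree),
            key=lambda e: math.dist(points[e[0]], points[e[1]]),
        )
        edges.add(frozenset((u, v)))
        in_tree.add(v)
    return edges

def order_from_root(edges, root):
    """Breadth-first walk from a user-chosen root; this is where direction enters."""
    adjacency = {}
    for edge in edges:
        a, b = tuple(edge)
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    order, seen, queue = [], {root}, [root]
    while queue:
        node = queue.pop(0)
        order.append(node)
        for nxt in sorted(adjacency.get(node, [])):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order

tree = mst_edges(centroids)
print(order_from_root(tree, "A"))  # ordering if A is assumed to be the origin
print(order_from_root(tree, "D"))  # same tree, opposite ordering if D is the origin
```

The same tree supports both orderings, so the choice of root is a biological decision rather than something the tree provides.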

I know that RNA velocity may address the directionality question, but if this is the case, what added value does trajectory analysis have over RNA velocity?

To me, the biggest and most practical one is that trajectory analysis doesn't require unspliced counts. It's hard to overstate how relevant this is; of the ~50 public datasets in the scRNAseq package, only one has unspliced counts and that's because we specifically put it in to test RNA velocity methods. If you're dealing with a public scRNA-seq dataset and it doesn't provide unspliced counts, that's just too bad - unless you're masochistic enough to pull down the FASTQs and start from there. Even for datasets where I already have FASTQs, I'm not entirely enthusiastic about regenerating the counts because (i) I already did my analysis with the existing counts and I don't want to repeat it and (ii) the directionality is biologically obvious (or at least clear enough to define the relevant experimental hypothesis that needs to be tested in the lab).

Another reason is that trajectory analysis works for any continuum unrelated to temporal progression. For example, if a cell is moderately active in a particular pathway, that doesn't necessarily mean it's becoming more or less active; it could just be maintaining its current activity. RNA velocity would not be helpful here because the velocity would be zero for all cells, but you can still build a trajectory of pathway activity to describe this phenomenon. Then you can test for DE genes with respect to activity and so on.
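
As a toy illustration of the steady-state case, the sketch below assumes the simplified velocity model ds/dt = u - gamma * s, with a single hypothetical degradation rate and made-up abundances for one gene. Every cell sits at its own steady state, so the velocity is zero everywhere, yet the cells can still be ordered by activity.

```python
GAMMA = 0.5  # hypothetical degradation rate, shared by all cells

# Hypothetical cells at steady state with different pathway activity levels.
# At steady state the unspliced abundance equals GAMMA * spliced.
spliced = {"cell1": 8.0, "cell2": 2.0, "cell3": 5.0}
unspliced = {cell: GAMMA * s for cell, s in spliced.items()}

# Velocity under the simplified model ds/dt = u - GAMMA * s.
velocity = {cell: unspliced[cell] - GAMMA * spliced[cell] for cell in spliced}

# A "trajectory" of activity is still available: order cells by expression.
activity_order = sorted(spliced, key=spliced.get)

print(velocity)
print(activity_order)
```

The velocities carry no information here, while the activity ordering is exactly the continuum a trajectory would describe.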

Even for temporal processes like differentiation, it's likely that you'll want to build a trajectory (in addition to the velocity vectors) to explicitly nail down the branch events, relationships between clusters, etc. To me, a trajectory is a continuous generalization of clusters, serving the same purpose of summarizing the data in a convenient form. Then you can talk about (and compute on) "the cells on branch 4" or "the cells lying on the path between clusters X and Y"... unless you favor communicating your results by waving your hands at a t-SNE.

And of course, the directionality information has its own cost in that the results are sensitive to the choice of how (un)spliced counts are obtained. Different quantification methods will yield different velocity vectors, possibly pointing in opposing directions (see here) so YMMV. In addition, once you start counting reads in introns, you're exposing yourself to questions like: how should intron retention be handled? Are unspliced counts driven by novel transcripts (e.g., miRNAs inside genes) or unannotated exons? What about repeats inside introns that tend to accumulate "read stacks"? By comparison, trajectory construction cares not about these things.

Thanks, Aaron. That’s helpful. I guess I’m still not sure why I would do trajectory analysis if I can’t get information about directionality. By clustering the cells I already get a sense of how close they are to each other (transcriptionally), and I can look at the UMAP to see how close/connected the clusters are to each other. So trajectory analysis basically gives you a mathematical and visual way of confirming it?

Thanks again!

You can either join the dots on the UMAP manually or you can ask a computer to do it for you by making a trajectory. That's what it really comes down to. Hell, the same could be said for clustering. Why bother running a clustering algorithm when you can just circle blobs on a t-SNE?

(Of course, I'm being facetious with my suggestion there. I would never use a low-dimensional visualization like t-SNE or UMAP as the basis for any analysis; there's an uncomfortable amount of magic happening under the hood. I would only use it to visualize findings from more quantitative analyses in higher-dimensional spaces. Or more bluntly: with enough tuning of the parameters and suckers for reviewers, you can probably "prove" anything on a t-SNE/UMAP.)

Just think about what you would do if you found an interesting "path" through your dataset. You want to find genes that are DE along this path. How would you do it?
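
One possible answer, sketched in plain Python: once cells have a position along the path (a pseudotime), each gene can be scored by how strongly its expression changes with that position. The pseudotime values, gene names and expression values below are invented for illustration, and Spearman correlation is a deliberately simple stand-in for the model-based tests that dedicated packages provide, not the method of any particular one.

```python
def ranks(values):
    """Average ranks (1-based), handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical pseudotime for six cells on one path, and made-up expression values.
pseudotime = [0.0, 0.1, 0.3, 0.5, 0.8, 1.0]
expression = {
    "geneUp": [0.2, 0.5, 1.1, 1.9, 3.0, 4.2],
    "geneDown": [5.0, 4.1, 3.3, 2.0, 1.2, 0.4],
    "geneFlat": [2.0, 2.1, 1.9, 2.1, 1.9, 2.0],
}

scores = {gene: spearman(pseudotime, values) for gene, values in expression.items()}

# Rank genes by the strength of their association with position on the path.
for gene, rho in sorted(scores.items(), key=lambda kv: -abs(kv[1])):
    print(gene, round(rho, 2))
```

Reversing the pseudotime only flips the sign of each score, so this kind of test does not depend on knowing which end of the path is the start. It does depend on having the path itself, which circling blobs on a plot does not give you.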
