Clustering a million points: transcript start and end sites
@abhishek-pratap-5083
Hey guys, I have about a million points per chromosome that are basically putative start and end positions of transcripts. My goal is to cluster them with an unsupervised learning algorithm such as DBSCAN (open to suggestions) that can handle big datasets, in O(n log n) time if possible, treating each start/end pair as X,Y coordinates. Any ideas? I need to make sure I don't run out of memory during the distance calculation. What I would like to get out is a set of clusters, with the number of points in each, which will give me a putative set of transcripts (possibly fragmented) from these points. Thanks! -Abhi
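To make the memory concern concrete, here is a minimal sketch of DBSCAN that never builds the full n x n distance matrix. It bins the (start, end) points into a grid of cells with side eps, so each neighbour lookup only scans the 3x3 block of cells around a point. The function name `grid_dbscan`, the toy coordinates, and the `eps` / `min_samples` values are all made up for illustration; sensible values depend on how tightly real transcript ends are expected to agree. This is plain Python for clarity, not a tuned implementation.

```python
from collections import Counter, defaultdict, deque

def grid_dbscan(points, eps, min_samples):
    """Label each (start, end) point with a cluster id, or -1 for noise."""
    # Bin points into square cells of side eps. Any point within eps of
    # another must sit in the same cell or one of the 8 adjacent cells.
    cells = defaultdict(list)
    for i, (x, y) in enumerate(points):
        cells[(int(x // eps), int(y // eps))].append(i)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        x, y = points[i]
        cx, cy = int(x // eps), int(y // eps)
        found = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    px, py = points[j]
                    if (px - x) ** 2 + (py - y) ** 2 <= eps * eps:
                        found.append(j)
        return found

    UNVISITED, NOISE = -2, -1
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] != UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_samples:  # core point: keep expanding
                queue.extend(nb)
        cluster += 1
    return labels

# Toy (start, end) pairs: two tight groups plus one isolated point.
points = [(100, 900), (102, 905), (98, 897), (101, 902),
          (5000, 7000), (5003, 7004), (4998, 6996),
          (20000, 20500)]
labels = grid_dbscan(points, eps=10, min_samples=3)
sizes = dict(Counter(l for l in labels if l != -1))  # points per cluster
print(labels, sizes)
```

The `sizes` dictionary is the "number of points in each cluster" asked for above. Memory stays proportional to the number of points plus the neighbour lists being expanded, and the run time depends on how many points fall into each eps-sized cell. If scikit-learn is available, its `sklearn.cluster.DBSCAN` with `algorithm='kd_tree'` follows the same idea of tree-based neighbour queries instead of a dense distance matrix and is likely much faster than this pure-Python sketch.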