Clustering a million points: transcript start and end sites


Abhishek Pratap ▴ 410

@abhishek-pratap-5083


Hey guys, I have about a million points per chromosome that are basically putative start and end regions of transcripts. My goal is to cluster them with an unsupervised learning algorithm such as DBSCAN (I'm open to suggestions) that can handle big datasets, in O(n log n) time if possible, treating each start/end pair as (X, Y) coordinates. Any ideas? I need to make sure I don't run out of memory during the distance calculation. What I would like to get out is a set of clusters, with the number of points in each, which will give me a putative (possibly fragmented) set of transcripts from these points. Thanks! -Abhi
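To make the idea in the question concrete, here is a minimal sketch of DBSCAN over (start, end) pairs that never builds an all-pairs distance matrix. It bins points into a grid of cell size `eps` so that each neighbour lookup only inspects the 3x3 block of cells around a point. This is a plain-Python illustration and not the asker's method: the function name `grid_dbscan`, the example coordinates, and the values of `eps` and `min_pts` are all assumptions chosen for the demo. Memory stays linear in the number of points, but run time depends on how dense the cells are, so it is not a guaranteed O(n log n) bound.

```python
from collections import Counter, defaultdict, deque

NOISE = -1  # label for points that belong to no cluster


def grid_dbscan(points, eps, min_pts):
    """Return one label per (start, end) point: a cluster id, or NOISE."""
    # Bin points into square cells of side eps. Any point within eps of a
    # query point must lie in the same cell or one of the 8 adjacent cells,
    # so no full pairwise distance matrix is ever stored.
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(int(x // eps), int(y // eps))].append(idx)

    def neighbours(idx):
        # Indices of all points within eps of points[idx] (including itself).
        x, y = points[idx]
        cx, cy = int(x // eps), int(y // eps)
        found = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    px, py = points[j]
                    if (px - x) ** 2 + (py - y) ** 2 <= eps * eps:
                        found.append(j)
        return found

    labels = [None] * len(points)  # None means "not visited yet"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # core point: keep growing
                queue.extend(reachable)
        cluster += 1
    return labels


# Hypothetical (start, end) pairs: two tight groups and one stray point.
points = [
    (100, 900), (102, 905), (98, 898), (101, 903),
    (5000, 7000), (5003, 7004), (4998, 6997),
    (20000, 20500),
]
labels = grid_dbscan(points, eps=10, min_pts=3)
sizes = Counter(label for label in labels if label != NOISE)
# labels == [0, 0, 0, 0, 1, 1, 1, -1]; sizes gives points per cluster
```

The cluster sizes in `sizes` correspond to the "number of points in each" cluster asked for above. For real data at the million-point scale, a library implementation with a spatial index (for example scikit-learn's `DBSCAN` with `algorithm="kd_tree"`) would likely be a more practical starting point than this pure-Python sketch.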


Posted 12.7 years ago by Abhishek Pratap ▴ 410
