Fahim Md
Hello,

I have a file containing a list of aligned RefSeq identifiers in the format shown below. I want to convert this file into RangedData format in such a way that the "findOverlaps" function is as efficient as possible. I know how to do this when each identifier has just one start and one end coordinate; the IRanges package is very efficient at finding overlapping intervals in that case. But when each identifier is made up of several blocks (as shown below in the last three fields), I am not sure how to convert this structure into RangedData format and how to subsequently call the "findOverlaps" function.

RefSeqID        targetName  strand  blockSizes        queryStart       targetStart
XM_001065892.1  chr4        +       127,986,          0,127,           124513961,124514706,
XM_578205.2     chr2        -       535,137,148,      0,535,672,       155875533,155879894,155895543,
NM_012543.2     chr1        +       506,411,212,494,  0,506,917,1129,  96173572,96174920,96176574,96177991,

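To make the block structure concrete, here is a minimal Python sketch (not IRanges code) of one possible reading of the blockSizes and targetStart fields: the i-th block starts at the i-th targetStart value and spans the i-th blockSizes value. The helper name `expand_blocks` is hypothetical, and treating the coordinates as half-open (end = start + size) is an assumption; the file itself does not say which convention applies.

```python
def expand_blocks(block_sizes, target_starts):
    """Turn two comma-separated block fields into a list of (start, end) pairs.

    Assumption: coordinates are half-open, so end = start + size.
    """
    # The fields end with a trailing comma, so skip empty pieces.
    sizes = [int(x) for x in block_sizes.split(",") if x]
    starts = [int(x) for x in target_starts.split(",") if x]
    if len(sizes) != len(starts):
        raise ValueError("blockSizes and targetStart must have the same number of entries")
    return [(start, start + size) for start, size in zip(starts, sizes)]


# First row of the listing above (XM_001065892.1 on chr4)
blocks = expand_blocks("127,986,", "124513961,124514706,")
print(blocks)  # [(124513961, 124514088), (124514706, 124515692)]
```

Under this reading, each row yields one interval per block, and the intervals of a row stay grouped under its RefSeqID.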
Thanks, I appreciate your help.
--Fahim