Convert a sequence to a Ranges object
Assa Yeroslaviz ★ 1.5k
@assa-yeroslaviz-1597
Germany

Hi,

Is there a way to convert a sequence (in my case a FASTA character vector) into an IRanges object based on a numeric vector? The vector contains the positions of a specific pattern in the FASTA sequence.

> library(seqinr)  # provides words.pos()
> myseq
[1] "MKLSVNEAQLGFPESLKTGQMMDESDEDFKELCASFFQRVKKHGIKEVSGE"
> Positions <- words.pos("K", myseq)
> Positions
[1]  2 17 30 41 42 46

 

I would like to convert the sequence into an IRanges object where the positions of the pattern give the end position of each range. Each start position should be one greater than the previous end position.

It should look something like this:

IRanges object with 90 ranges and 0 metadata columns:
           start       end     width
       <integer> <integer> <integer>
   [1]         1         2         2
   [2]         3        17        15
   [3]        18        30        13 ...

What I have so far is this:

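> library(IRanges)
> Start <- c(1, Positions + 1)          # each range starts one past the previous match
> End <- c(Positions, nchar(myseq))     # each range ends at a match; the last one at the sequence end
> myRanges <- IRanges(start = Start, end = End)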

Is there a more efficient way to do this?

I also have the constraint here that the pattern positions are taken as the end positions. But what if I want the pattern at the beginning of each range and not at the end?

Thanks for any advice.

Assa

 

 

@michael-lawrence-3846
United States
PartitioningByEnd(c(Positions, nchar(myseq)))
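
To make the one-liner above concrete, here is a minimal sketch of what it partitions. It assumes base R's gregexpr() as a stand-in for words.pos() (a substitution, not what the original poster used), and it only calls IRanges if the package happens to be installed. The unique() call is an added safeguard against a zero-width final range when the sequence itself ends in the pattern.

```r
myseq <- "MKLSVNEAQLGFPESLKTGQMMDESDEDFKELCASFFQRVKKHGIKEVSGE"

# Base-R stand-in for words.pos("K", myseq): 1-based positions of each "K".
Positions <- as.integer(gregexpr("K", myseq, fixed = TRUE)[[1]])
Positions <- Positions[Positions > 0]  # gregexpr() returns -1 when nothing matches

# Segment ends: every match, plus the end of the sequence for the tail.
End <- unique(c(Positions, nchar(myseq)))
# Each segment starts one past the previous end.
Start <- c(1L, head(End, -1L) + 1L)

print(data.frame(start = Start, end = End, width = End - Start + 1L))

# With IRanges installed, the same partitioning comes from the one-liner.
if (requireNamespace("IRanges", quietly = TRUE)) {
  print(IRanges::PartitioningByEnd(End))
}
```

For the example sequence this gives seven ranges, the first three of which match the ones shown in the question (1-2, 3-17, 18-30).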

This covers my problem when the pattern I am looking for is at the end of the sub-sequences, as in the case above. But what if I would like to have the pattern at the beginning of my sub-sequences? (Here I can probably use Positions - 1.) Or what if I am looking for two different amino acids (like "K" and "R") and would like to cut the sequence before "K" but after "R", etc.?

I know it sounds very complicated, but is there a more flexible way of looking for specific patterns and deciding how to handle each one based on the pattern(s) I am looking for?


It depends on the specific case. In special cases, just do the math directly and pass the endpoints to the IRanges constructor.
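
As one possible reading of "do the math directly", the sketch below handles the mixed rule from the question (cut before "K", cut after "R") by turning every rule into a segment end point. The helper find_pos() is hypothetical and uses base R's gregexpr() rather than words.pos(); the final IRanges call runs only if the package is installed.

```r
myseq <- "MKLSVNEAQLGFPESLKTGQMMDESDEDFKELCASFFQRVKKHGIKEVSGE"
n <- nchar(myseq)

# Hypothetical helper: 1-based positions of a single-letter residue.
find_pos <- function(residue, s) {
  hits <- as.integer(gregexpr(residue, s, fixed = TRUE)[[1]])
  hits[hits > 0]  # gregexpr() returns -1 when there is no match
}

cut_before <- find_pos("K", myseq)  # a new segment starts at each K
cut_after  <- find_pos("R", myseq)  # a segment ends at each R

# Express every rule as a segment end: cutting before p ends a segment at p - 1.
End <- sort(unique(c(cut_before - 1L, cut_after, n)))
End <- End[End >= 1L]               # drop the 0 produced by a match at position 1
Start <- c(1L, head(End, -1L) + 1L)

segments <- substring(myseq, Start, End)
print(data.frame(start = Start, end = End, segment = segments))

if (requireNamespace("IRanges", quietly = TRUE)) {
  print(IRanges::IRanges(start = Start, end = End))
}
```

Reducing each rule to a sorted, de-duplicated vector of end points keeps the start computation identical for every case, so adding another residue or rule only means adding another term to the c() call.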


Thanks, that is what I was doing, but it is sometimes not so straightforward.
