Ranges on AAStrings
@tobiaskockmann-11966

Hi BioC,

can anyone help me with the following:

I mapped a set of peptides (short AA sequences) to a set of proteins using exact string matching. Now I would like to compare the locations of the mapped peptides to each other, in order to find peptides that lie next to one another. To me this sounds like something one could tackle with ranges on AAString objects. So I created an AAString to represent the protein of interest:

> poi
  384-letter "AAString" instance
seq: MSSMQMDPELAKQLFFEGATVVILNMPKGTEFGIDYNSWEVGPKFR...AVEATLRKKAEKFQAHLTKKFRWDFTSEPEDCAPVVVELPEGIETA

and a set of Views to represent the mapped peptides:

> v1
  Views on a 384-letter AAString subject
subject: MSSMQMDPELAKQLFFEGATVVILNMPKGTEFGIDYNSWEVGPK...EATLRKKAEKFQAHLTKKFRWDFTSEPEDCAPVVVELPEGIETA
views:
    start end width
[1]     1  10    10 [MSSMQMDPEL]
[2]    10  17     8 [LAKQLFFE]
[3]    25  29     5 [NMPKG]

Are there any BioC functions that can be used to analyse these Views, for example to find views that follow each other at zero distance, or to find the nearest neighbour of a view? I saw that such functions exist for integer ranges (IRanges):

pcompare(x,y)

findOverlaps(x,y)

etc.

But somehow these functions do not like Views on AAStrings:

> pcompare(v1[1], v1[-1])
Error in (function (classes, fdef, mtable)  :
  unable to find an inherited method for function 'pcompare' for signature '"AAString", "AAString"'
> class(v1)
[1] "XStringViews"
attr(,"package")
[1] "Biostrings"

Is there a way to make this work?

Greetings,

Tobi


Is there a reason you can't work with these as ranges directly? Presumably, if you've matched the peptides against the proteins, you already have a set of start and end points for the matches, which you're using to construct the XStringViews object. Might it be easier to simply construct an IRanges or IRangesList at this stage? You can give each range a name so you can relate it back to the specific peptide if you need to.
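
A minimal sketch of that idea, assuming the start and end positions from the matching step are at hand. Here they are simply copied from the v1 printout in the question, and the variable names pep and nn are made up for illustration:

```r
library(IRanges)

# Coordinates copied from the question's `v1` printout; in practice they
# would come straight from the matching step.
pep <- IRanges(start = c(1, 10, 25),
               end   = c(10, 17, 29),
               names = c("MSSMQMDPEL", "LAKQLFFE", "NMPKG"))

# For each peptide, the index of its nearest neighbour among the others
nn <- nearest(pep)

# Translate the indices back to peptide sequences via the names
neighbours <- data.frame(peptide = names(pep), nearest = names(pep)[nn])
```

Because the names travel with the ranges, any subset or reordering of pep can still be related back to the peptide sequences.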


I agree with Mike. Put differently, these range operations (pcompare, findOverlaps, etc.) don't work on Views objects, but they do work on the ranges of the Views objects. You can extract the ranges of a Views object with e.g. ranges(v1), so, instead of pcompare(v1[1], v1[-1]), do pcompare(ranges(v1)[1], ranges(v1)[-1]).
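
As a rough sketch of what this looks like in practice: the IRanges below stands in for ranges(v1), with coordinates taken from the printout in the question. The names r, codes, gap and adjacent are only illustrative, and distance() is an extra IRanges function that was not mentioned above but speaks to the "zero distance" part of the question:

```r
library(IRanges)

# Stand-in for `ranges(v1)`, using the coordinates shown in the question
r <- IRanges(start = c(1, 10, 25), end = c(10, 17, 29))

# Parallel comparison of the first range against the remaining ones;
# see ?pcompare for the meaning of the returned integer codes
codes <- pcompare(r[1], r[-1])

# Number of residues separating consecutive peptides
# (0 when they overlap or are directly adjacent)
gap <- distance(r[-length(r)], r[-1])

# Strict adjacency: the next peptide starts right after the previous one ends
adjacent <- end(r[-length(r)]) + 1L == start(r[-1])
```

Note that a distance of 0 does not tell overlap and adjacency apart. In this example peptides 1 and 2 share residue 10, so they overlap rather than abut, which is why the explicit end/start check is kept separate.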

H.


I found two exceptions: findOverlaps() and countOverlaps() work directly on the Views object if it is queried against itself (i.e. within one protein, for an AAString subject):

hits <- findOverlaps(query = v1, maxgap = 1, drop.self = TRUE, drop.redundant = TRUE)
# named n_hits rather than c, to avoid masking base::c()
n_hits <- countOverlaps(query = v1, maxgap = 1, drop.self = TRUE, drop.redundant = TRUE)

> hits
SelfHits object with 1 hit and 0 metadata columns:
      queryHits subjectHits
      <integer>   <integer>
  [1]         1           2
  -------
  queryLength: 3 / subjectLength: 3
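
For comparison, a small sketch of the same self-overlap query run on plain ranges instead of the Views object. The IRanges is rebuilt from the coordinates in the question, maxgap is left at its default here, and the names r and pairs are made up for illustration:

```r
library(IRanges)

# Stand-in for `ranges(v1)`, using the coordinates shown in the question
r <- IRanges(start = c(1, 10, 25), end = c(10, 17, 29))

# Self-overlaps, dropping i-vs-i hits and keeping each pair only once
hits <- findOverlaps(r, drop.self = TRUE, drop.redundant = TRUE)

# The hits as a two-column matrix of range indices
pairs <- cbind(query = queryHits(hits), subject = subjectHits(hits))
```

With these three ranges the only pair reported should be (1, 2), matching the SelfHits output above.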

True though!
