How to get 'shortest' distance with ChIPpeakAnno::binOverFeature
@liruiradiant-8906
United States

Dear Jianhong,

I'm working on a project to plot the distance from differentially methylated cytosines (DMCs) to CpG islands. The DMCs are just 1 bp wide, whereas the CpG islands have width. I want to get the shortest distance between each DMC and the CpG islands.

That is, if the relative location is DMC------distance--------CpG.start-------width-------CpG.end, I want to get 'distance'.

If the relative location is CpG.start-----width-------CpG.end-------distance2-----DMC, I want to get 'distance2'.

If the relative location is CpG.start------DMC-----CpG.end, I want to get a distance of zero.

If there are two CpG islands near the DMC: CpG1.start----CpG1.end------d1--------DMC---------d2------CpG2.start------CpG2.end, I want to get the shorter of {d1, d2}.
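The four cases above can be pinned down as a small base R function. This is only a sketch of the definition, assuming all positions are on the same chromosome and that distance means the plain coordinate difference to the nearer island edge; `shortest_distance` and its argument names are hypothetical and not part of ChIPpeakAnno.

```r
# Shortest distance from each 1-bp DMC to any CpG island on one chromosome.
# dmc_pos:   numeric vector of DMC positions
# cpg_start: numeric vector of island starts
# cpg_end:   numeric vector of island ends (same length as cpg_start)
shortest_distance <- function(dmc_pos, cpg_start, cpg_end) {
  vapply(dmc_pos, function(p) {
    # Distance to each island: positive when p lies outside it, 0 when inside
    d <- pmax(cpg_start - p, p - cpg_end, 0)
    # Keep the closest island
    min(d)
  }, numeric(1))
}

# One island at 200-300, DMCs to the left of, inside, and to the right of it
one_island <- shortest_distance(c(100, 250, 350), cpg_start = 200, cpg_end = 300)

# Two islands flanking a DMC at 250: d1 = 50, d2 = 150
two_islands <- shortest_distance(250, cpg_start = c(100, 400), cpg_end = c(200, 500))
```

Any result from a package function can be compared against this on a handful of hand-checked positions before running it on the full data.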

I tried the following:

binOverFeature(CpG.gr, annotationData=mes.gr,
               select = "nearest",
               # PeakLocForDistance="middle",
               # featureSite="FeatureStart",
               PeakLocForDistance = "all",
               radius=5000, nbins=100, FUN=length,
               errFun=sd,
               ylab="count",
               main="Distribution of CpG around DMC")

And I got the following output: https://www.dropbox.com/s/re1lltzosvqw2oq/hist.pdf?dl=0

R workspace including gr objects: https://www.dropbox.com/s/hamw29p67kfgobr/workspace.RData?dl=0

Questions:

  1. Will the above code achieve my analysis goal? If not, how can I achieve that goal?
  2. Is it possible to get the output data (distance, count information) in the output of the function, so that users can plot with more custom style?
  3. Can we use 'density' rather than 'count' in the y-axis?
  4. Should I use 'errFun' for my purpose? What does it do?
  5. What does 'featureSite="bothEnd"' do? I got an error when I used that option.

Sorry about the long list of questions. Thanks!

Ray

ChIPpeakAnno • 1.1k views

You may want to try:

out <- binOverFeature(CpG.gr, annotationData=mes.gr,
               select = "nearest",
               PeakLocForDistance="middle",
               featureSite="bothEnd",
               radius=5000, nbins=100, FUN=length,
               errFun=sd,
               ylab="count",
               main="Distribution of CpG around DMC")

In this case, "bothEnd" only considers the region outside of the feature; it is used to calculate the distance from the peaks. The returned value (out) can be used to draw a plot in a custom style. However, location 0 will be ignored.

I think the best way to answer your question is to split it into 2 steps: 1. use annotatePeakInBatch to annotate the nearest features; 2. use the distance function to calculate the distance from each peak to its feature, and then apply a sign to the distance.
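A minimal sketch of the same two-step idea, with GenomicRanges::nearest standing in for annotatePeakInBatch in step 1 (a substitution made here for brevity, not the exact call suggested above). The toy coordinates and object names are made up. As far as I know, distance() counts the bases strictly between two ranges, so overlapping and adjacent ranges both give 0.

```r
library(GenomicRanges)

# Toy data (hypothetical coordinates, not taken from the linked workspace)
dmc <- GRanges("chr1", IRanges(start = c(100, 250, 350, 700), width = 1))
cpg <- GRanges("chr1", IRanges(start = c(200, 900), end = c(300, 1000)))

# Step 1: index of the nearest CpG island for each DMC
# (NA when there is no island on the same chromosome)
idx <- nearest(dmc, cpg)
keep <- !is.na(idx)
dmc_hit <- dmc[keep]
cpg_hit <- cpg[idx[keep]]

# Step 2: unsigned gap between each DMC and its nearest island (0 if overlapping)
gap <- distance(dmc_hit, cpg_hit)

# Apply a sign: negative when the DMC lies to the left of the island
side <- ifelse(end(dmc_hit) < start(cpg_hit), -1L, 1L)
signed_gap <- side * gap
```

The signed values can then be passed to hist() or any other plotting function, which also gives full control over the binning and the y-axis.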

Hope this will help.

Jianhong.


Ray, you could also use annotatePeakInBatch with output="both" and maxgap = 0 to generate the nearest/overlapping features. Then select the feature with the smallest shortestDistance for each DMC, after setting shortestDistance = 0 for features with fromOverlappingOrNearest = "Overlapping".

Best, Julie
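The selection step could look roughly like this in base R. The data frame below is a made-up stand-in for the annotated output converted with as.data.frame(): the column names follow the ones mentioned above, but the rows and the "NearestLocation" label are illustrative assumptions.

```r
# Hypothetical stand-in for as.data.frame() of the annotatePeakInBatch result
anno <- data.frame(
  peak = c("dmc1", "dmc1", "dmc2", "dmc3"),
  feature = c("cpgA", "cpgB", "cpgB", "cpgC"),
  shortestDistance = c(120, 40, 15, 300),
  fromOverlappingOrNearest = c("NearestLocation", "NearestLocation",
                               "Overlapping", "NearestLocation"),
  stringsAsFactors = FALSE
)

# DMCs that fall inside an island get a distance of zero
anno$shortestDistance[anno$fromOverlappingOrNearest == "Overlapping"] <- 0

# Keep the smallest distance per DMC
best <- aggregate(shortestDistance ~ peak, data = anno, FUN = min)
```

Here `best` has one row per DMC, which is the vector of distances to plot.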
