I read your concern again:
I would like to have an output containing only the nearest feature for those peaks that do not reside within genes and only overlapping features for those peaks that are inside genes...
It seems you would like to assign only one type of feature to each peak: the "overlapping" feature when there is one, and the "nearest" feature otherwise. As you mentioned, if you set output = "both", select = "all", the tool assigns both "overlapping" and "nearest" features to peaks. To obtain what you want, I suggest three steps: first, annotate peaks to the overlapping features; second, annotate the peaks that have no overlapping features to the nearest features; last, concatenate the two. Below is some example code.
library(ChIPpeakAnno)  # provides annoGR(), annotatePeakInBatch() and the myPeakList data
library(ensembldb)
library(EnsDb.Hsapiens.v75)
data(myPeakList)
annoData <- annoGR(EnsDb.Hsapiens.v75)
# Step 1: annotate peaks to the overlapping features. With select = "all",
# multiple features can be assigned to a single peak; "first" keeps one per peak.
anno_overlapping <- annotatePeakInBatch(myPeakList,
                                        AnnotationData = annoData,
                                        output = "overlapping", select = "first")
# Keep only the peaks that actually received an overlapping feature
anno_overlapping_non_na <- anno_overlapping[!is.na(anno_overlapping$feature)]
# Step 2: annotate peaks that have no overlapping feature to the nearest feature
myPeakList_non_overlapping <- myPeakList[!(names(myPeakList) %in% anno_overlapping_non_na$peak)]
anno_nearest <- annotatePeakInBatch(myPeakList_non_overlapping,
                                    AnnotationData = annoData,
                                    output = "nearestLocation", select = "first")
# Step 3: concatenate the two
anno_final <- c(anno_overlapping_non_na, anno_nearest)
The above code assigns either an "overlapping" or a "nearest" feature to each peak. If a peak has an "overlapping" feature that is not its "nearest" feature, only the "overlapping" one is reported. Hope this is what you want.
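To make the "overlapping if available, otherwise nearest" logic easier to check, here is a minimal sketch of the same three steps on toy data. It uses `findOverlaps()` and `nearest()` from GenomicRanges directly rather than `annotatePeakInBatch()`, so it is not what ChIPpeakAnno does internally. The coordinates and the names `p1`..`p3`, `geneA`, `geneB` are made up for illustration, and strand is ignored.

```r
library(GenomicRanges)

# Toy data (hypothetical coordinates, for illustration only)
peaks <- GRanges("chr1", IRanges(start = c(100, 500, 900), width = 50))
names(peaks) <- c("p1", "p2", "p3")
genes <- GRanges("chr1", IRanges(start = c(80, 600), end = c(200, 700)))
names(genes) <- c("geneA", "geneB")

# Step 1: first overlapping feature per peak (NA when nothing overlaps)
hits <- findOverlaps(peaks, genes, select = "first")
has_overlap <- !is.na(hits)

# Step 2: nearest feature, only for peaks without an overlap
nearest_idx <- nearest(peaks[!has_overlap], genes)

# Step 3: combine into one assignment per peak
assignment <- data.frame(
  peak = names(peaks),
  feature = NA_character_,
  type = ifelse(has_overlap, "overlapping", "nearest"),
  stringsAsFactors = FALSE
)
assignment$feature[has_overlap] <- names(genes)[hits[has_overlap]]
assignment$feature[!has_overlap] <- names(genes)[nearest_idx]

print(assignment)
```

Each peak ends up with exactly one row, which is the property the three-step approach above is meant to guarantee.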
Thanks James for your kind reply! I read what you mentioned in your comment and I tried it. The point is that when the peak overlaps a feature, the tool gives me both the overlapping feature and the nearest feature to the peak. I would like to have an output containing only the nearest feature for those peaks that do not reside within genes and only overlapping features for those peaks that are inside genes...
Hi Ilaria,
Thank you for your great question! To achieve your specific goal, you can use the insideFeature column in the annotation output. Filtering for rows where insideFeature is "inside" isolates the peaks that fall within features.
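A minimal sketch of that filtering, using a toy data frame as a stand-in for `as.data.frame()` of an `annotatePeakInBatch()` result. The column names `peak`, `feature`, `insideFeature` and `shortestDistance` follow the ChIPpeakAnno output; the rows themselves are made up, and the fallback for peaks with no "inside" row is one possible choice rather than a prescribed one.

```r
# Toy stand-in for an annotation table (hypothetical values)
anno_df <- data.frame(
  peak = c("p1", "p1", "p2", "p3"),
  feature = c("geneA", "geneB", "geneB", "geneC"),
  insideFeature = c("inside", "upstream", "upstream", "downstream"),
  shortestDistance = c(20, 350, 50, 199),
  stringsAsFactors = FALSE
)

# Rows where the peak lies inside the feature
inside_rows <- anno_df[anno_df$insideFeature == "inside", ]

# For peaks with no "inside" row, keep their remaining annotation rows
outside_rows <- anno_df[!(anno_df$peak %in% inside_rows$peak), ]

# One table: "inside" features where available, other features otherwise
result <- rbind(inside_rows, outside_rows)
print(result)
```

Note that filtering on "inside" is stricter than "overlapping": a peak that only partly overlaps a feature would not be kept by this filter.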
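The insideFeature column can take several other values. According to the ChIPpeakAnno documentation, these are "upstream", "downstream", "overlapStart", "overlapEnd", and "includeFeature".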
Hope this fits your needs.
Best regards,
Julie