How to apply a function to an RleViewsList object?
Assa Yeroslaviz ★ 1.5k
@assa-yeroslaviz-1597
Last seen 17 days ago
Germany

I have an RleList to which I want to apply some functions.

A small reprex of the object I need:

IRanges::RleList(
  chr1 = 1:20,
  chr2 = 20:1
) -> cvg
GenomicRanges::GRanges(
  S4Vectors::Rle(c("chr1", "chr2"), c(2, 1)),
  IRanges::IRanges(c(1, 4, 2), width=c(4, 8, 6), names=head(letters, 3)),
  S4Vectors::Rle(GenomicRanges::strand(c("-", "+", "-")), c(1, 1, 1))
) -> gr

(v <- IRanges::Views(cvg, gr))
RleViewsList object of length 2:
$chr1
Views on a 20-length Rle subject

views:
    start end width
[1]     1   4     4 [1 2 3 4]
[2]     4  11     8 [ 4  5  6  7  8  9 10 11]

$chr2
Views on a 20-length Rle subject

views:
    start end width
[1]     2   7     6 [19 18 17 16 15 14]

I would like to apply some functions to this object, but I can't understand the behavior. Sometimes it returns an error I can't identify.

For one, the mean function doesn't work on the list:

sapply(X = v, FUN = mean)
Warning in mean.default(x) :
  argument is not numeric or logical: returning NA
Warning in mean.default(x) :
  argument is not numeric or logical: returning NA
chr1 chr2 
  NA   NA

but the sum function does:

sapply(X = v, FUN = sum)
$chr1
 a  b 
10 60 

$chr2
 c 
99

Any ideas why there is this difference?

My main difficulty is that I am trying to bin the individual elements within each list element of the RleList object, using the cut function.

For that I have created a function:

my_cut <- function(x) {
  cut(1:IRanges::width(x), 3)
}

But when I try to use the function, it takes into account only the first element in each of the groups (hence the warning).

sapply(v, my_cut)
Warning in 1:IRanges::width(x) :
  numerical expression has 2 elements: only the first used
$chr1
[1] (0.997,2] (0.997,2] (2,3]     (3,4]    
Levels: (0.997,2] (2,3] (3,4]

$chr2
[1] (0.995,2.67] (0.995,2.67] (2.67,4.33]  (2.67,4.33]  (4.33,6]     (4.33,6]    
Levels: (0.995,2.67] (2.67,4.33] (4.33,6]

I would appreciate help with these two problems.

Thanks in advance,

Assa


What is your ultimate goal? It looks like you might be trying to generate window summaries within each view? You may be looking for functions like viewSums() and viewMeans(), which can operate directly on the RleViewsList, returning another list.

Mike Smith ★ 6.6k
@mike-smith
Last seen 13 hours ago
EMBL Heidelberg

Regarding the difference in behaviour between sum() and mean(), I think this is because there is a sum() method for the RleViews class (inherited from Views), which presumably knows how to handle it in a sensible fashion. There is no such method for mean(), so the call falls through to mean.default(), which doesn't know what to do with an RleViews object and returns NA with the warning you see.

> showMethods("sum")
Function: sum (package base)
x="CompressedIntegerList"
x="CompressedLogicalList"
x="CompressedNumericList"
x="Rle"
    (definition from function "Summary")
x="RleViews"
    (inherited from: x="Views")
    (definition from function "Summary")

> showMethods("mean")
Function "mean":
 <not an S4 generic function>

However, you can make a reasonable stab at replicating the behaviour you see with sum(), e.g.

lapply(v, FUN = function(x) {
    vapply(x, FUN = mean, FUN.VALUE = numeric(1))
})
# $chr1
#   a   b 
# 2.5 7.5 
#
# $chr2
#    c 
# 16.5

or you can use viewMeans() as Michael suggested. That's probably much more efficient if you're working on real data:

viewMeans(v)
# NumericList of length 2
# [["chr1"]] a=2.5 b=7.5
# [["chr2"]] c=16.5

I don't think I follow what you're trying to do with cut(). What's the expected output if it works as you intend?
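
That said, the warning itself has a likely explanation: width(x) returns one width per view, so for chr1 the expression 1:width(x) becomes 1:c(4, 8) and R uses only the first value. Below is a minimal sketch in base R, assuming the aim is to bin the positions of each view separately. The names widths and bin_positions are hypothetical, and the plain integer vector only stands in for what IRanges::width() returns for the chr1 views.

```r
# Hypothetical stand-in for the widths of the two views on chr1
# (views a and b in the reprex have widths 4 and 8).
widths <- c(a = 4L, b = 8L)

# Assumed goal: bin the positions 1..width of each view into n_bins intervals.
# Looping over the widths avoids passing a length-2 vector to `:`.
bin_positions <- function(w, n_bins = 3) {
  lapply(w, function(n) cut(seq_len(n), breaks = n_bins))
}

bins <- bin_positions(widths)
lengths(bins)  # one factor per view, each as long as that view
```

If this is the behaviour you are after, something along the lines of `lapply(v, function(x) bin_positions(IRanges::width(x)))` should apply it to each element of the RleViewsList, though I have not checked it against real data.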
