Dear All,
'seqnames<-' can not work as expected in the following scene.
library(GenomicRanges)
gr <- GRanges(seqnames = "chr1", strand = c("+", "-", "+"),
ranges = IRanges(start = c(1,3,5), width = 3))
seqlevels(gr) <- c("chr1", "chr2", "chr10")
seqnames(gr) <- c("chr1", "chr2", "chr10")
## Error in .normalize_seqnames_replacement_value(value, x) :
## levels of supplied 'seqnames' must be identical to 'seqlevels(x)'
GenomicRanges:::.normalize_seqnames_replacement_value
## function (value, x)
## {
## if (!is(value, "Rle"))
## value <- Rle(value)
## if (!is.factor(runValue(value)))
## runValue(value) <- factor(runValue(value))
## if (!identical(levels(value), seqlevels(x)))
## stop("levels of supplied 'seqnames' must be ", "identical to 'seqlevels(x)'")
## S4Vectors:::V_recycle(value, x, x_what = "value", skeleton_what = "x")
## }
## <environment: namespace:GenomicRanges>
value <- c("chr1", "chr2", "chr10")
x <- gr
if (!is(value, "Rle"))
value <- Rle(value)
if (!is.factor(runValue(value)))
runValue(value) <- factor(runValue(value)) # cause the problem
if (!identical(levels(value), seqlevels(x)))
stop("levels of supplied 'seqnames' must be ", "identical to 'seqlevels(x)'")
## Error: levels of supplied 'seqnames' must be identical to 'seqlevels(x)'
levels(value)
## [1] "chr1" "chr10" "chr2"
seqlevels(x)
## [1] "chr1" "chr2" "chr10"
# Maybe the codes can be replaced by value <- c("chr1", "chr2", "chr10") x <- gr if (!is(value, "Rle")) value <- Rle(value) if (!is.factor(runValue(value))) runValue(value) <- factor(runValue(value), levels = sortSeqlevels(unique(value), X.is.sexchrom = TRUE)) # modified if (!identical(levels(value), seqlevels(x))) stop("levels of supplied 'seqnames' must be ", "identical to 'seqlevels(x)'") # No Error!
# The above depend on X.is.sexchrom # The below can be better value <- c("chr1", "chr2", "chr10") x <- gr if (!is(value, "Rle")) value <- Rle(value) if (!is.factor(runValue(value))) runValue(value) <- factor(runValue(value), levels = seqlevels(x)) # modified if (any(is.na(value))) # modified stop("levels of supplied 'seqnames' must be ", "identical to 'seqlevels(x)'") # No Error!
My sessionInfo:
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936
[2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
[3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_People's Republic of China.936
attached base packages:
[1] parallel stats4 stats graphics
[5] grDevices utils datasets methods
[9] base
other attached packages:
[1] GenomicRanges_1.26.1 GenomeInfoDb_1.10.0
[3] IRanges_2.8.0 S4Vectors_0.12.0
[5] BiocGenerics_0.20.0
loaded via a namespace (and not attached):
[1] zlibbioc_1.20.0 XVector_0.14.0
[3] tools_3.3.1
Are there any better solutions? Or do I use 'seqnames<-' the wrong way?
Thanks in advance,
Can Wang
I think the last modification is on the right track, but it should still check for identical levels (to check factor input). The NA check needs to be added, with its own error message. It should be specific to the coercion branch, where it says something like "Some 'seqnames' are not in 'seqlevels(x)'".
Hi Micheal, thanks for your reply. I agree that the first modification is not right. Because it would not work on
seqnames(gr) <- c("chr1", "chr2", "chr2").
By the way, I also find a way to do the assignment without error:I don't know why but it indeed works.