Entering edit mode
Ana Rodrigues
▴
30
@ana-rodrigues-2623
Last seen 10.2 years ago
Dear all,
I am using the rtracklayer package to fetch phastCons scores for
certain positions from the UCSC genome browser. For some of these
positions, there will be no phastCons score available (ie no
alignment), and the code below will crash, rather than spit out an
error an move on to the next position.
library("rtracklayer")
get.phastCons.rtl <- function(pos, ref="ce6") {
if(!exists("session")) {
session <<- browserSession("UCSC")
}
pos <- unlist(strsplit(pos, "[:-]"))
pos.track <- GenomicData(IRanges(start=as.numeric(pos[2]),
end=as.numeric(pos[3])),
strand = "+", chrom = pos[1],
genome = ref)
track(session, "pos") <- pos.track
query <- ucscTableQuery(session, "multiz6way",
GenomicRanges(as.numeric(pos[2]),
as.numeric(pos[3]),
pos[1]))
tableName(query) <- "phastCons6way"
return(track(query)$score)
}
positions <- c("chrX:16039269-16039273",
"chrX:16039357-16039361",
"chrX:16039609-16039613") # no phastCons score
# this works
pc.scores.works <- sapply(positions[1:2], get.phastCons.rtl)
# this crashes with an error and no scores get reported
pc.scores.fails <- sapply(positions, get.phastCons.rtl)
Here is the error:
Error in read.table(con, colClasses = bedClasses, as.is = TRUE) :
no lines available in input
I am wondering if you could give me some advice on making this
function able to cope with missing phastCons data, so that I can apply
it for any given position?
Thank you in advance for any help you can provide,
Ana
> sessionInfo()
R version 2.10.1 (2009-12-14)
i386-apple-darwin9.8.0
locale:
[1] en_US.US-ASCII/en_US.UTF-8/C/C/en_US.US-ASCII/en_US.US-ASCII
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rtracklayer_1.6.0 RCurl_1.3-1 bitops_1.0-4.1
loaded via a namespace (and not attached):
[1] Biobase_2.6.1 Biostrings_2.14.11 BSgenome_1.14.2
IRanges_1.4.11
[5] XML_2.6-0
> traceback()
35: stop("no lines available in input")
34: read.table(con, colClasses = bedClasses, as.is = TRUE)
33: DataFrame(read.table(con, colClasses = bedClasses, as.is = TRUE))
32: .local(con, variant, trackLine, genome, ...)
31: fun(con, ...)
30: fun(con, ...)
29: import(con, format, ...)
28: import(con, format, ...)
27: import(text = lines, format = "bed", variant = "bedGraph", genome
= genome)
26: import(text = lines, format = "bed", variant = "bedGraph", genome
= genome)
25: .local(con, genome, ...)
24: fun(con, ...)
23: fun(con, ...)
22: import(con, format, ...)
21: import(con, format, ...)
20: import(format = subformat, text = text, ...)
19: import(format = subformat, text = text, ...)
18: FUN(1L[[1L]], ...)
17: lapply(seq_along(trackLines), makeTrackSet)
16: lapply(seq_along(trackLines), makeTrackSet)
15: import.ucsc(con, "wig", TRUE, genome = genome)
14: import.ucsc(con, "wig", TRUE, genome = genome)
13: .local(con, genome, ...)
12: fun(con, ...)
11: fun(con, ...)
10: import(con, format, ...)
9: import(con, format, ...)
8: import(text = output, format = format)
7: import(text = output, format = format)
6: .local(object, ...)
5: track(query)
4: track(query)
3: FUN(c("chrX:16039269-16039273", "chrX:16039357-16039361",
"chrX:16039609-16039613"
)[[3L]], ...)
2: lapply(X, FUN, ...)
1: sapply(positions, get.phastCons.rtl)
--
Ana P. C. Rodrigues, Ph.D.
Salk Institute for Biological Studies
T +1 858.453.4100 x1067
W bioinformatics.salk.edu/people/ana/