Of course this track really exists or I wouldn't post an issue about it. :)
> browseUCSCtrack("hg38", "refGene")
Easy to reproduce the error:
> makeTxDbFromUCSC(genome="hg38", tablename="refGene")
Error in .tablename2track(tablename, session) :
UCSC table "refGene" is not supported
The RefSeq Genes track actually doesn't exist anymore for hg38. It has been replaced with a new composite track named NCBI RefSeq and made of 6 subtracks. See announcement here (from March 3, 2017):
I just fixed GenomicFeatures in devel (GenomicFeatures 1.27.11) to support this new track. Will port the fix to BioC release later today. It will take about 24 hours for the fix to propagate to the public repositories and become available via biocLite().
Also I still need to fix browseUCSCtrack(). It's still taking you to what looks like a stale page for the old RefSeq Genes track for hg38.
Cheers,
H.
Edit: This is now ported to GenomicFeatures in release (GenomicFeatures 1.26.4). For reasons I don't really understand, browseUCSCtrack() seems to be working as expected again (I didn't touch it).
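For anyone unsure whether their installation already carries the fix, here is a minimal sketch of a version check. The helper name has_refgene_fix() is made up for illustration and is not part of GenomicFeatures. It only encodes the two version numbers mentioned above (1.26.4 in release, 1.27.11 in devel) and assumes every later version keeps the fix.

```r
# Hypothetical helper (not part of GenomicFeatures): TRUE if the given
# version is at least 1.26.4 on the release branch or 1.27.11 on devel.
has_refgene_fix <- function(version) {
  v <- as.package_version(version)
  if (v >= "1.27.0") {
    v >= "1.27.11"   # devel branch (1.27.x) and anything newer
  } else {
    v >= "1.26.4"    # release branch (1.26.x)
  }
}

has_refgene_fix("1.26.3")
has_refgene_fix("1.26.4")
has_refgene_fix("1.27.11")
```

With the package installed, has_refgene_fix(packageVersion("GenomicFeatures")) should run the same check on the local copy.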
You still have access to it. As reported by supportedUCSCtables("hg38") (from GenomicFeatures 1.27.11), this table is now associated with the new UCSC RefSeq subtrack of the composite NCBI RefSeq track.
Is there any way rtracklayer could help with that function? Hard-coding the mappings does not seem sustainable. I'm not sure why the names of the tracks are even needed there. The table browser (and UCSCTableQuery) supports direct table access, without need for a track name. Calling trackNames(session) is particularly problematic, because long names get truncated for the UI and won't match the track names in the table browser.
Yes, hard-coding the mapping between tables and tracks in supportedUCSCtables() is ugly and I welcome any suggestion to improve this. This function has 2 purposes:
1. Provide the list of tables/tracks that are known to be compatible with makeTxDbFromUCSC().
2. Map tables to tracks (many-to-one mapping), and to subtracks (if any).
makeTxDbFromUCSC() requires a table name but table names can be somewhat obscure. Most of the time the user knows the name of the track/subtrack that s/he is interested in so supportedUCSCtables() provides a quick and easy way for him/her to find the name of the central table for a given track/subtrack.
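To make the table-to-track lookup concrete, here is a rough sketch of what such a hard-coded mapping and a lookup on it could look like. The data frame and the helper lookup_tables() are hand-written for illustration. They are not the package's internal table, and the rows other than refGene are assumptions about how the NCBI RefSeq subtracks are named.

```r
# Illustrative, hand-written mapping (NOT the actual internals of
# supportedUCSCtables()). Several tables map to the same track.
table2track <- data.frame(
  tablename = c("ncbiRefSeq", "ncbiRefSeqCurated", "refGene"),
  track     = c("NCBI RefSeq", "NCBI RefSeq", "NCBI RefSeq"),
  subtrack  = c("RefSeq All", "RefSeq Curated", "UCSC RefSeq"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: find the table(s) behind a track, optionally
# narrowed down to one subtrack.
lookup_tables <- function(mapping, track, subtrack = NULL) {
  keep <- mapping$track == track
  if (!is.null(subtrack)) {
    keep <- keep & mapping$subtrack == subtrack
  }
  mapping$tablename[keep]
}

lookup_tables(table2track, "NCBI RefSeq", "UCSC RefSeq")
```

Because the lookup is a plain subset of a small in-memory table, it needs no network access, which is the "snappy" behaviour a hard-coded list buys.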
Probably the reason for specifying the track when calling ucscTableQuery() is that neither the signature of the function nor its documentation suggests that the track argument can be omitted. It actually seems to work for some tables but not all of them:
library(rtracklayer)
session <- browserSession()
genome(session) <- "hg19"
ucscTableQuery(session, track="RefSeq Genes", table="hgFixed.refLink") # OK
# Get table 'hgFixed.refLink' from track 'RefSeq Genes' within hg19:*:*-*
ucscTableQuery(session, table="hgFixed.refLink") # error!
# Error in normArgTable(value, x) : unknown table name 'hgFixed.refLink'
I should add that one benefit of having supportedUCSCtables() use a hard-coded list of tables/tracks that are known to be compatible with makeTxDbFromUCSC() is that it makes the function very snappy. Having a smart supportedUCSCtables() that builds that list on-the-fly by querying the Genome Browser would probably make the function much slower. This function is typically called interactively by the user before calling makeTxDbFromUCSC() (and is called again inside makeTxDbFromUCSC()) so it should not take too long (e.g. < 20 sec.).
Yea, I guess there's no feasible way to automate it, even if cached inside the package. If a table moves to another track we'd never find it without an exhaustive search.
Seems like supportedUCSCtables() should look at tableNames(ucscTableQuery(session)) instead of names(trackNames(session)) when filtering.
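A rough sketch of what filtering on table names could look like. The helper filter_by_table() and both vectors are invented for illustration. In practice the available tables would come from the call suggested above, and a stand-in vector is used here so the sketch runs without a connection to UCSC.

```r
# Illustrative list of tables assumed to be known to work
# (example values only, not the package's real list).
known_tables <- c("knownGene", "refGene", "ncbiRefSeq", "ensGene")

# Hypothetical helper: keep only the known tables that the genome
# actually offers, matching on table names rather than track names.
filter_by_table <- function(known, available) {
  known[known %in% available]
}

# With a live session this would be something like:
#   available <- tableNames(ucscTableQuery(session))
# Stand-in values so the sketch runs offline:
available <- c("refGene", "ncbiRefSeq", "ncbiRefSeqCurated", "rmsk")

filter_by_table(known_tables, available)
```

Matching on table names sidesteps the truncated track names returned by trackNames(session), since table names are not shortened for the UI.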