Hi Gnanendra,
Thanks for reporting this problem. Please always tag your question with the package that is causing the problem, which in this case is GenomicFeatures (the package that makeTxDbFromBiomart() belongs to). The problem should now be fixed in GenomicFeatures 1.28.5.
As Mike pointed out, using makeTxDbFromGFF() works. Note, however, that mpoae_eg_gene is the dataset for Magnaporthe poae, not for Magnaporthe oryzae, so the URL to the matching GFF3 file is:
gff3_url <- "ftp://ftp.ensemblgenomes.org/pub/fungi/current/gff3/magnaporthe_poae/Magnaporthe_poae.Mag_poae_ATCC_64411_V1.37.gff3.gz"
Anyway, I would recommend using makeTxDbFromBiomart() over makeTxDbFromGFF() here. You get the same set of transcripts, but the former also gives you the chromosome lengths (seqlengths) and cleaner transcript names:
library(GenomicFeatures)
txdb <- makeTxDbFromBiomart(biomart="fungal_mart",
                            dataset="mpoae_eg_gene",
                            host="fungi.ensembl.org")
txdb2 <- makeTxDbFromGFF(gff3_url)
length(transcripts(txdb))
# [1] 12760
length(transcripts(txdb2))
# [1] 12760
all(transcripts(txdb) == transcripts(txdb2))
# [1] TRUE
tail(transcripts(txdb))
# GRanges object with 6 ranges and 2 metadata columns:
#            seqnames       ranges strand |     tx_id      tx_name
#               <Rle>    <IRanges>  <Rle> | <integer>  <character>
#   [1] supercont1.97 [2746, 3603]      - |     12755 MAPG_11970T0
#   [2] supercont1.98 [ 598, 5413]      + |     12756 MAPG_11974T0
#   [3] supercont1.98 [   1,  699]      - |     12757 MAPG_11973T0
#   [4] supercont1.98 [5439, 5647]      - |     12758 MAPG_11975T0
#   [5] supercont1.99 [1261, 5381]      + |     12759 MAPG_11977T0
#   [6] supercont1.99 [   1, 1191]      - |     12760 MAPG_11976T0
#   -------
#   seqinfo: 205 sequences from an unspecified genome
tail(transcripts(txdb2))
# GRanges object with 6 ranges and 2 metadata columns:
#            seqnames       ranges strand |     tx_id                 tx_name
#               <Rle>    <IRanges>  <Rle> | <integer>             <character>
#   [1] supercont1.97 [2746, 3603]      - |     12755 transcript:MAPG_11970T0
#   [2] supercont1.98 [ 598, 5413]      + |     12756 transcript:MAPG_11974T0
#   [3] supercont1.98 [   1,  699]      - |     12757                    <NA>
#   [4] supercont1.98 [5439, 5647]      - |     12758 transcript:MAPG_11975T0
#   [5] supercont1.99 [1261, 5381]      + |     12759 transcript:MAPG_11977T0
#   [6] supercont1.99 [   1, 1191]      - |     12760                    <NA>
#   -------
#   seqinfo: 200 sequences from an unspecified genome; no seqlengths
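If you still want to go the GFF route, the "transcript:" prefix on the names can be stripped with plain string handling. This is only a minimal base R sketch on a vector of names copied from the output above (gff_tx_names and clean_tx_names are names I made up for illustration). It is not something GenomicFeatures does for you, and it cannot recover the names that come back as NA:

```r
# Transcript names copied from the tail(transcripts(txdb2)) output above.
# 'gff_tx_names' is an illustrative name, not an object created by GenomicFeatures.
gff_tx_names <- c("transcript:MAPG_11970T0", "transcript:MAPG_11974T0", NA,
                  "transcript:MAPG_11975T0", "transcript:MAPG_11977T0", NA)

# Strip a leading "transcript:" prefix. Entries that are NA stay NA.
clean_tx_names <- sub("^transcript:", "", gff_tx_names)
clean_tx_names
```

The missing names are the main reason I would still go with makeTxDbFromBiomart() here.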
GenomicFeatures 1.28.5 should become available via biocLite() in the next 48 hours.
Cheers,
H.