makeTxDbFromGFF returns empty object
@timotheeflutre-6727

I would like to make a TxDb package from a GFF file using GenomicFeatures, but can't get it to work. Below is a reproducible example on a small subset.

Retrieve the GFF file:

gff.file <- "Vitis_vinifera_annotation.gff.gz"
url <- paste0("https://urgi.versailles.inra.fr/content/download/2157/19376/file/", gff.file)
cmd <- paste0("wget -O ", gff.file, " ", url)  # -O sets the output file name
system(cmd)

Extract a small subset:

gff.file.small <- "subset.gff"
cmd <- paste0("zcat ", gff.file, " | grep -w 'chr2' | head -100 > ", gff.file.small)
system(cmd)

Make a txdb object:

library(GenomicFeatures)
library(BSgenome.Vvinifera.URGI.IGGP12Xv0)
txdb <- makeTxDbFromGFF(file=gff.file.small, format="auto", dataSource=url,
                        organism="Vitis vinifera", taxonomyId=29760,
                        chrominfo=seqinfo(BSgenome.Vvinifera.URGI.IGGP12Xv0))
txdb # shows transcript_nrow=0, exon_nrow=0, etc
length(tmp <- transcripts(txdb)) # 0

Is it because the initial GFF file is badly formatted?
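
One way to start answering that question is to look at which feature types the subset actually contains. Below is a minimal sketch in base R, assuming the file is tab-separated with the feature type in column 3 (as in the GFF family of formats). `count_feature_types` is a hypothetical helper written for illustration, not a GenomicFeatures function.

```r
# Hypothetical helper (not part of GenomicFeatures): tabulate the feature
# types found in column 3 of a GFF-like, tab-separated file.
count_feature_types <- function(path) {
  gff <- read.delim(path, header = FALSE, comment.char = "#", quote = "",
                    stringsAsFactors = FALSE)
  table(gff[[3]])
}

# Usage on the subset created above, if it exists in the working directory.
if (file.exists("subset.gff")) {
  print(count_feature_types("subset.gff"))
}
```

If the table shows no gene, transcript or exon-like types, or only types whose relationships are not spelled out in column 9, an empty TxDb would not be surprising.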

@michael-lawrence-3846

It's a GFF2 file, whereas the TxDb machinery only supports GTF and GFF3. GFF2 has no standard way of expressing gene models, so there is nothing generic to parse. You could probably figure out a way to convert that file to GFF3.
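
For anyone wanting to check this on their own file, here is a rough sketch in base R that guesses the dialect from the syntax of column 9. It assumes the usual conventions: GFF3 attributes look like `tag=value`, and GTF attributes include `gene_id` and `transcript_id` written as `tag "value"`. Anything else is reported as GFF2 or unknown. `guess_gff_dialect` is a hypothetical helper, and the heuristic is an illustration only, not how GenomicFeatures or rtracklayer detect the format.

```r
# Hypothetical helper: guess the GFF dialect from the attribute column (9).
guess_gff_dialect <- function(lines) {
  # Drop comment/pragma lines and empty lines.
  lines <- lines[nzchar(lines) & !grepl("^#", lines)]
  fields <- strsplit(lines, "\t", fixed = TRUE)
  attrs <- vapply(fields,
                  function(f) if (length(f) >= 9) f[9] else NA_character_,
                  character(1))
  attrs <- attrs[!is.na(attrs)]
  if (length(attrs) == 0) return("unknown")
  # GFF3 convention: attributes start with tag=value.
  if (all(grepl("^[[:space:]]*[^=;[:space:]]+=", attrs))) return("GFF3")
  # GTF convention: every record carries gene_id and transcript_id.
  has_ids <- grepl("gene_id ", attrs, fixed = TRUE) &
    grepl("transcript_id ", attrs, fixed = TRUE)
  if (all(has_ids)) return("GTF")
  "GFF2 or other"
}

# Usage on the subset from the question, if it exists.
if (file.exists("subset.gff")) {
  print(guess_gff_dialect(readLines("subset.gff", n = 100)))
}
```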


Just jumping in; alternatively you could try to get a GFF3 or a GTF file from Ensembl plants, e.g.

ftp://ftp.ensemblgenomes.org/pub/plants/current/gff3/vitis_vinifera

(for other versions than "current" just browse the ftp)

By the way, if you're working with Ensembl annotations you could also take a quick look at the ensembldb package. The EnsDb objects from that package provide functionality similar to (almost the same as) that of TxDb objects. It also has the ensDbFromGtf and ensDbFromGff functions to create such an EnsDb from a GTF or GFF3 file; ideally check out the current devel version of the package (it will be released soon with Bioc 3.3).

cheers, jo


@MichaelLawrence Thanks, I will (try to) figure out a way to convert the GFF2 file into GFF3

@Johannes Rainer I am aware that I can retrieve annotations from Ensembl, but I specifically want these ones. They may differ somewhat from the Ensembl annotations, which is something I should indeed check at some point.


From here: https://urgi.versailles.inra.fr/Species/Vitis/Annotations

It looks like there are GFF3 annotations under the "V1" heading. Only the V0 annotations are GFF2.


@Michael Lawrence yes, but I would like the V0, too. (I'll ask another question about makeTxDbFromGFF because I don't even understand how it works on the example given in the official specification.)
