Hi,
I am trying to extract promoter regions from a GENCODE GTF file, but only for entries whose "gene_type" is "lincRNA" or "antisense" ("gene_type" is one of the attributes in the GTF file downloaded from the GENCODE website).
I think I need to start by importing the GTF file with GenomicFeatures:
library("GenomicFeatures")
# Download the version 19 gencode gtf file from gencode website, and then load here
txdb = makeTxDbFromGFF("file.gtf", format="gtf")
But I don't think the "gene_type" attribute from the GTF file is imported into the TxDb object.
How can I import this attribute and select only the "lincRNA" and "antisense" entries from the TxDb object?
The gtf file looks like this:
#!genome-build GRCh37.p13
#!genome-version GRCh37
#!genome-date 2009-02
#!genome-build-accession NCBI:GCA_000001405.14
#!genebuild-last-updated 2013-09
chr16 HAVANA gene 53069602 53086785 . - . gene_id "ENSG00000261550.1"; transcript_id "ENSG00000261550.1"; gene_type "antisense"; gene_status "NOVEL"; gene_name "RP11-467J12.4"; transcript_type "antisense"; transcript_status "NOVEL"; transcript_name "RP11-467J12.4"; level 2; havana_gene "OTTHUMG00000173186.1";
chr5 HAVANA transcript 10493639 10502840 . - . gene_id "ENSG00000249396.1"; transcript_id "ENST00000515243.1"; gene_type "lincRNA"; gene_status "NOVEL"; gene_name "RP11-1C1.4"; transcript_type "lincRNA"; transcript_status "KNOWN"; transcript_name "RP11-1C1.4-001"; level 2; tag "basic"; havana_gene "OTTHUMG00000162051.1"; havana_transcript "OTTHUMT00000367039.1";
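To make clearer what I am after, here is a minimal base-R sketch of the filtering, which sidesteps the TxDb entirely. It is only an illustration under several assumptions: the two records are abbreviated from the sample above and written with explicit tabs, `get_attr` and `promoters_df` are names I made up, and the 2000 bp upstream / 200 bp downstream window is just a placeholder choice (it matches the defaults of `promoters()` in GenomicRanges).

```r
# Two records abbreviated from the sample above (real GTF fields are tab-separated).
gtf_lines <- c(
  'chr16\tHAVANA\tgene\t53069602\t53086785\t.\t-\t.\tgene_id "ENSG00000261550.1"; transcript_id "ENSG00000261550.1"; gene_type "antisense";',
  'chr5\tHAVANA\ttranscript\t10493639\t10502840\t.\t-\t.\tgene_id "ENSG00000249396.1"; transcript_id "ENST00000515243.1"; gene_type "lincRNA";'
)

# quote = "" keeps the double quotes inside the attribute column intact;
# comment.char = "#" skips the "#!" header lines of a real file.
gtf <- read.delim(
  text = gtf_lines, header = FALSE, quote = "", comment.char = "#",
  stringsAsFactors = FALSE,
  col.names = c("seqname", "source", "feature", "start", "end",
                "score", "strand", "frame", "attribute")
)

# Hypothetical helper: pull the value of one key out of the attribute column.
get_attr <- function(attribute, key) {
  pattern <- paste0('^(.*; )?', key, ' "([^"]*)".*$')
  ifelse(grepl(pattern, attribute), sub(pattern, "\\2", attribute), NA_character_)
}

gtf$gene_type     <- get_attr(gtf$attribute, "gene_type")
gtf$transcript_id <- get_attr(gtf$attribute, "transcript_id")

# Keep transcript records of the two biotypes of interest.
keep <- gtf$feature == "transcript" & gtf$gene_type %in% c("lincRNA", "antisense")
tx <- gtf[keep, ]

# Strand-aware promoter window around the transcription start site (TSS).
# Assumed window sizes, not taken from any requirement in this post.
upstream   <- 2000L
downstream <- 200L
minus <- tx$strand == "-"
tss   <- ifelse(minus, tx$end, tx$start)

promoters_df <- data.frame(
  seqname       = tx$seqname,
  start         = ifelse(minus, tss - downstream + 1L, tss - upstream),
  end           = ifelse(minus, tss + upstream, tss + downstream - 1L),
  strand        = tx$strand,
  transcript_id = tx$transcript_id,
  gene_type     = tx$gene_type,
  stringsAsFactors = FALSE
)
print(promoters_df)
```

For the real file, `text = gtf_lines` would presumably be swapped for the path to the GTF. What I would prefer, though, is a way to get the same result through GenomicFeatures, so that I can call `promoters()` on the filtered transcripts directly.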
Thanks very much for your help.