IRanges, GenomicRanges, GenomicFeatures?
Oleg Moskvin ▴ 60
@oleg-moskvin-4293
Last seen 9.8 years ago
United States
Hello list members,

For an RNA-seq analysis, what would you suggest for converting raw sequence-based read coverage into annotated ORF-based coverage when the genome of interest is supported by neither UCSC nor Ensembl? That means a TranscriptDb object cannot be created in the straightforward way (i.e. following the GenomicFeatures pipeline).

What would you recommend for importing a .gff file (containing the annotation of a particular genome, from GenBank) into R/Bioconductor, so that I can eventually generate a gene-centric count table readable by packages like DESeq?

Thank you!
Oleg
Coverage Annotation TranscriptDb convert GenomicFeatures • 972 views
@steve-lianoglou-2771
Last seen 19 months ago
United States
Hi,

On Sun, Oct 31, 2010 at 11:10 PM, Oleg Moskvin <moskvin at wisc.edu> wrote:
> Hello list members,
>
> For an RNA-seq analysis, what would you suggest for converting raw sequence-based read coverage into annotated ORF-based coverage when the genome of interest is supported by neither UCSC nor Ensembl? That means a TranscriptDb object cannot be created in the straightforward way (i.e. following the GenomicFeatures pipeline). What would you recommend for importing a .gff file (containing the annotation of a particular genome, from GenBank) into R/Bioconductor, so that I can eventually generate a gene-centric count table readable by packages like DESeq?

Assuming I've understood your question and how your data is available to you, here is one (maybe too simple) approach:

I think I'd parse the GFF into a GRangesList object. Each item of the list would be a GRanges object that stores the exon structure of one of your transcripts (or genes), which I'm assuming is what's in your GFF file.

If you had your RNA-seq data in its own GRanges object, you could then use countOverlaps between your data and the GRangesList transcript info pretty easily, and use the result to create your count table.

Hope that helps,
-steve

ps - I think rtracklayer has some facilities to import GFF files, which might be helpful to you.

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
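The counting step described above could look roughly like the following. This is only a minimal sketch, not Steve's actual code: the contig name, gene names, coordinates and reads are all made up for illustration, and in practice the exons would come from the GFF file and the reads from the aligner output.

```r
library(GenomicRanges)

# Hypothetical annotation: exon ranges for two genes on one contig.
# The "gene" metadata column is an assumed name used only for this example.
exons <- GRanges(
  seqnames = "contig1",
  ranges   = IRanges(start = c(100, 300, 1000), end = c(200, 400, 1200)),
  strand   = "+",
  gene     = c("geneA", "geneA", "geneB")
)

# One GRanges per gene, each holding that gene's exons.
exons_by_gene <- split(exons, exons$gene)

# Hypothetical aligned reads, 36 bp each.
reads <- GRanges(
  seqnames = "contig1",
  ranges   = IRanges(start = c(110, 150, 350, 1100, 5000), width = 36),
  strand   = "+"
)

# For each gene, count the reads that overlap any of its exons.
counts <- countOverlaps(exons_by_gene, reads)

# One column per sample; further samples would be added as extra columns.
count_table <- data.frame(
  sample1   = as.integer(counts),
  row.names = names(exons_by_gene)
)
print(count_table)
```

Repeating the countOverlaps() call for each sample and binding the results column-wise would give a gene-by-sample table of the kind DESeq expects. Note that a read overlapping two genes is counted for both in this simple scheme.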
On Sun, Oct 31, 2010 at 10:19 PM, Steve Lianoglou <mailinglist.honeypot@gmail.com> wrote:
> ps - I think rtracklayer has some facilities to import GFF files,
> which might be helpful to you.

Right. Just call import(asRangedData=FALSE) on the GFF file and then split() the result into a GRangesList.
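As a rough sketch of the import-then-split suggestion: the example below writes a tiny, made-up GFF3 file so that it is self-contained, and it assumes the gene identifier lives in a `locus_tag` attribute (check which attribute your GenBank GFF actually uses). The `asRangedData` argument belongs to the rtracklayer release current at the time of this thread; to my knowledge, more recent releases return a GRanges from import() by default, so the argument is left out here.

```r
library(rtracklayer)

# Write a small, invented GFF3 file; the contig, source and locus_tag
# values are placeholders, not real annotation.
gff_file <- tempfile(fileext = ".gff3")
writeLines(c(
  "##gff-version 3",
  "contig1\tGenBank\tCDS\t100\t200\t.\t+\t0\tlocus_tag=geneA",
  "contig1\tGenBank\tCDS\t300\t400\t.\t+\t0\tlocus_tag=geneA",
  "contig1\tGenBank\tCDS\t1000\t1200\t.\t-\t0\tlocus_tag=geneB"
), gff_file)

# Import the annotation as a GRanges; GFF attributes become metadata columns.
features <- import(gff_file)

# Keep the feature type of interest, then group the ranges by gene.
cds <- features[features$type == "CDS"]
cds_by_gene <- split(cds, as.character(cds$locus_tag))

print(names(cds_by_gene))
```

The resulting `cds_by_gene` is a GRangesList with one element per gene, which is the shape countOverlaps() needs for gene-level counting.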