On Wed, Oct 3, 2012 at 6:50 AM, Fenton Christopher Graham
<christopher.fenton at="" uit.no=""> wrote:
> library(GenomicFeatures)
>
> #get known genes
> genes <- makeTranscriptDbFromUCSC(genome = "hg19",tablename =
"knownGene")
> saveFeatures(genes, file="hg19_knownGene.sqlite")
> txdb <- loadFeatures("hg19_knownGene.sqlite")
>
> #get introns
> introns <- intronsByTranscript(txdb)
> introns <- unlist(introns)
>
> #get exons
> exons <- exonsBy(txdb)
> exons <- unlist(exons)
>
> #take chr1 and "+" stand
> introns_ind <- which(seqnames(introns) == "chr1" & strand(introns)
== "+")
> exons_ind <- which(seqnames(exons) == "chr1" & stand(exons) ==
"+")
>
> #convert to ranges
> introns_chr1 <- ranges(introns[introns_ind,])
> exons_chr1 <- ranges(exons[exons_ind,])
>
> #look for introns in exons?
> sum(countOverlaps(introns_chr1, exons_chr1, type="within"))
>
> Why isn't this number 0?
> Why are we looking at over 1000 introns within exons?
The genome is a strange place, isn't it? To see what is going on,
extend your example just a bit:
intronsWithinExons_ind =
queryHits(findOverlaps(introns_chr1,exons_chr1,type='within'))
intronsWithinExons = introns_chr1[intronsWithinExons_ind,]
Try looking at a few of these in the UCSC genome browser. The few I
checked are there.
Sean
Thanks, though that each transcript (splice variant) would negate
overlap.
But I see from UCSC and the code, that the logic is flawed, and not
the genome.
Chris
________________________________________
Fra: seandavi at gmail.com [seandavi at gmail.com] på vegne av
Sean Davis [sdavis2 at mail.nih.gov]
Sendt: 3. oktober 2012 13:12
Til: Fenton Christopher Graham
Kopi: bioconductor
Emne: Re: [BioC] Introns in exons and vice versa (Genomic Features)
On Wed, Oct 3, 2012 at 6:50 AM, Fenton Christopher Graham
<christopher.fenton at="" uit.no=""> wrote:
> library(GenomicFeatures)
>
> #get known genes
> genes <- makeTranscriptDbFromUCSC(genome = "hg19",tablename =
"knownGene")
> saveFeatures(genes, file="hg19_knownGene.sqlite")
> txdb <- loadFeatures("hg19_knownGene.sqlite")
>
> #get introns
> introns <- intronsByTranscript(txdb)
> introns <- unlist(introns)
>
> #get exons
> exons <- exonsBy(txdb)
> exons <- unlist(exons)
>
> #take chr1 and "+" stand
> introns_ind <- which(seqnames(introns) == "chr1" & strand(introns)
== "+")
> exons_ind <- which(seqnames(exons) == "chr1" & stand(exons) ==
"+")
>
> #convert to ranges
> introns_chr1 <- ranges(introns[introns_ind,])
> exons_chr1 <- ranges(exons[exons_ind,])
>
> #look for introns in exons?
> sum(countOverlaps(introns_chr1, exons_chr1, type="within"))
>
> Why isn't this number 0?
> Why are we looking at over 1000 introns within exons?
The genome is a strange place, isn't it? To see what is going on,
extend your example just a bit:
intronsWithinExons_ind =
queryHits(findOverlaps(introns_chr1,exons_chr1,type='within'))
intronsWithinExons = introns_chr1[intronsWithinExons_ind,]
Try looking at a few of these in the UCSC genome browser. The few I
checked are there.
Sean
On 03.10.2012 11:50, Fenton Christopher Graham wrote:
> library(GenomicFeatures)
>
> #get known genes
> genes <- makeTranscriptDbFromUCSC(genome = "hg19",tablename =
> "knownGene")
> saveFeatures(genes, file="hg19_knownGene.sqlite")
> txdb <- loadFeatures("hg19_knownGene.sqlite")
>
> #get introns
> introns <- intronsByTranscript(txdb)
> introns <- unlist(introns)
>
> #get exons
> exons <- exonsBy(txdb)
> exons <- unlist(exons)
>
> #take chr1 and "+" stand
> introns_ind <- which(seqnames(introns) == "chr1" & strand(introns)
> == "+")
> exons_ind <- which(seqnames(exons) == "chr1" & stand(exons) ==
"+")
>
> #convert to ranges
> introns_chr1 <- ranges(introns[introns_ind,])
> exons_chr1 <- ranges(exons[exons_ind,])
>
> #look for introns in exons?
> sum(countOverlaps(introns_chr1, exons_chr1, type="within"))
>
> Why isn't this number 0?
> Why are we looking at over 1000 introns within exons?
>
>
> Chris
Alternative splicing. An exon in one transcript from a given gene may
well have introns removed in other transcripts from the same gene (and
will hence overlap).
--
Alex Gutteridge