Entering edit mode
Wu, Huiyun
▴
70
@wu-huiyun-5271
Last seen 10.1 years ago
Hello there,
I am new in seq data analysis, and have difficulty in miRNA
annotation. To clarify my question, I have the following codes.
bamfile <- "GSE33858_1.bowtie_alignment.bam"
aln <- readBamGappedAlignments(bamfile)
txdb<-makeTranscriptDbFromUCSC(genome="hg19",tablename="knownGene")
exonRangesList <- exonsBy(txdb, by = "gene")
exonRanges <- unlist(exonRangesList)
strand(exonRanges) <- "*"
geneNames <- sub("\\..*$", "", names(exonRanges))
exonRangesListNoStrand <- split(exonRanges, geneNames)
numberOfexons <- sapply(width(exonRangesListNoStrand), length)
geneLength <- sum(width(reduce(exonRangesListNoStrand)))
# Counting reads for the BAM file
counts <- countOverlaps(exonRangesListNoStrand, aln)
names(counts) <- names(exonRangesListNoStrand)
> length(counts)
[1] 22932
So my question is from aligned file to annotated file. Specifically,
why I got 22932 features after mapping to hg19? I learned there are
only less than 2000 miRNAs matched genes. I also know there are
several RNA databases including miRBase for annotation. Did I do
something logically wrong? It would be greatly appreciated if someone
could show me how to implement this annotation in R.
Many thanks,
William Wu
[[alternative HTML version deleted]]