Hi all, I would like to know if there is a Bioconductor package that could help me with small RNAs quantifications. In particular, I would like to use BAM files as input and obtain a table with the read counts for each small RNA.
Thanks.
Riccardo
EDIT: It's not clear what you mean by 'small RNAs quantifications'; that could mean lots of different things. Assuming you mean something like 'I have some aligned data in BAM files, and I want to count the number of reads that overlap just the small RNAs', then the conventional way to do that would be to use summarizeOverlaps from the GenomicAlignments package. You need a GRanges or GRangesList that identifies the genomic regions that the small RNAs come from, for which you can use an EnsDb package. Using the EnsDb.Hsapiens.v79 package as an example:
> library(EnsDb.Hsapiens.v79)
> tx <- transcripts(EnsDb.Hsapiens.v79, filter = list(TxbiotypeFilter(c("miRNA","snRNA","snoRNA"))))
> tx
GRanges object with 7597 ranges and 5 metadata columns:
                  seqnames               ranges strand |           tx_id
                     <Rle>            <IRanges>  <Rle> |     <character>
  ENST00000619216        1      [ 17369, 17436]      - | ENST00000619216
  ENST00000607096        1      [ 30366, 30503]      + | ENST00000607096
  ENST00000410691        1     [157784, 157887]      - | ENST00000410691
  ENST00000612080        1     [187891, 187958]      - | ENST00000612080
  ENST00000611868        1     [200880, 201017] + | ENST00000611868
              ...      ...                  ...    ... .             ...
  ENST00000516617        Y [25723342, 25723495]      + | ENST00000516617
  ENST00000516816        Y [25928979, 25929142]      + | ENST00000516816
  ENST00000515987        Y [26247384, 26247521]      + | ENST00000515987
  ENST00000517139        Y [26360989, 26361092]      + | ENST00000517139
  ENST00000620883        Y [26411059, 26411158]      - | ENST00000620883
                   tx_biotype tx_cds_seq_start tx_cds_seq_end         gene_id
                  <character>        <numeric>      <numeric>     <character>
  ENST00000619216       miRNA             <NA>           <NA> ENSG00000278267
  ENST00000607096       miRNA             <NA>           <NA> ENSG00000274890
  ENST00000410691       snRNA             <NA>           <NA> ENSG00000222623
  ENST00000612080       miRNA             <NA>           <NA> ENSG00000273874
  ENST00000611868       miRNA             <NA>           <NA> ENSG00000275135
              ...         ...              ...            ...             ...
  ENST00000516617       snRNA             <NA>           <NA> ENSG00000252426
  ENST00000516816       snRNA             <NA>           <NA> ENSG00000252625
  ENST00000515987      snoRNA             <NA>           <NA> ENSG00000251796
  ENST00000517139       snRNA             <NA>           <NA> ENSG00000252948
  ENST00000620883       miRNA             <NA>           <NA> ENSG00000275510
  -------
  seqinfo: 193 sequences from GRCh38 genome
You can read the GenomicAlignments vignette and the help page for summarizeOverlaps to figure out the remaining steps.
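As a rough idea of what those remaining steps might look like, here is a minimal sketch of the counting call. So that it runs without any data files, it uses two hand-made ranges (copied from the first rows of tx above) and three made-up in-memory alignments; the objects features and reads, and the read positions, are illustrative assumptions and not part of the original answer. The counting mode and strand handling are choices you would need to adapt to your own library prep.

```r
library(GenomicRanges)
library(GenomicAlignments)

# Toy stand-in for `tx`: the first two small-RNA ranges shown above.
features <- GRanges(
  seqnames = "1",
  ranges   = IRanges(start = c(17369, 30366), end = c(17436, 30503)),
  strand   = c("-", "+")
)
names(features) <- c("ENST00000619216", "ENST00000607096")

# Made-up 22 nt reads standing in for real alignments (assumption).
reads <- GAlignments(
  seqnames = Rle(factor(c("1", "1", "1"))),
  pos      = c(17380L, 17400L, 30400L),
  cigar    = rep("22M", 3),
  strand   = strand(c("-", "-", "+"))
)

# Count reads per feature; "Union" mode is one of the documented modes.
se <- summarizeOverlaps(features, reads, mode = "Union", ignore.strand = FALSE)

# The count table: one row per small RNA, one column per sample.
counts <- assay(se)
counts

# With real data (hypothetical file names), the call would look like:
#   library(Rsamtools)
#   bfl <- BamFileList(c("sample1.bam", "sample2.bam"))
#   se  <- summarizeOverlaps(tx, bfl, mode = "Union")
```

One thing to check before counting real data is that the chromosome names in the BAM files match those in the annotation (EnsDb uses "1", not "chr1"); otherwise every count will be zero.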