ChIPpeakAnno to find peaks nearest to miRNA

Paolo Kunderfranco ▴ 350

@paolo-kunderfranco-5158

Last seen 7.4 years ago

Dear All,

I would like to use ChIPpeakAnno to find the peaks nearest to miRNAs.

I loaded my BED file and created a RangedData object, loaded the mmusculus_gene_ensembl dataset through biomaRt, and annotated my peaks, which seems to work:

    test.rangedData = BED2RangedData(test.bed)
    mart <- useMart(biomart="ensembl", dataset="mmusculus_gene_ensembl")
    Annotation = getAnnotation(mart, featureType="miRNA")
    annotatedPeak = annotatePeakInBatch(test.rangedData, AnnotationData=Annotation)
    as.data.frame(annotatedPeak)

                                     <factor>            <iranges> |   <character> <character>        <character>
    MACS_peak_109 ENSMUSG00000089245        1 [54494876, 54496209] | MACS_peak_109           + ENSMUSG00000089245
                                     <numeric> <numeric> <character>
                                      54826062  54826166    upstream
                                     <numeric> <numeric>  <character>
                                       -331186    329853 NearestStart

Now I would like to add the miRNA ID, as I already did when I annotated for TSS, but something goes wrong. Any ideas how to solve it?

    library("org.Mm.eg.db")
    b <- addGeneIDs(annotatedPeak, "org.Mm.eg.db", c("symbol"))
    Error: No entrez identifier can be mapped by input data based on the
    feature_id_type. Please consider to use correct feature_id_type,
    orgAnn or annotatedPeak

Thanks,

Paolo

    > traceback()
    2: stop("No entrez identifier can be mapped by input data based on the feature_id_type.\nPlease consider to use correct feature_id_type, orgAnn or annotatedPeak\n",
           call. = FALSE)
    1: addGeneIDs(annotatedPeak, "org.Mm.eg.db", c("symbol"))
    > sessionInfo()
    R version 2.15.0 (2012-03-30)
    Platform: i386-pc-mingw32/i386 (32-bit)

    locale:
    [1] LC_COLLATE=Italian_Italy.1252  LC_CTYPE=Italian_Italy.1252  LC_MONETARY=Italian_Italy.1252  LC_NUMERIC=C
    [5] LC_TIME=Italian_Italy.1252

    attached base packages:
    [1] grid  stats  graphics  grDevices  utils  datasets  methods  base

    other attached packages:
     [1] targetscan.Mm.eg.db_0.5.0  BiocInstaller_1.4.7   org.Mm.eg.db_2.7.1                   ChIPpeakAnno_2.4.0
     [5] limma_3.12.1               org.Hs.eg.db_2.7.1    GO.db_2.7.1                          RSQLite_0.11.1
     [9] DBI_0.2-5                  AnnotationDbi_1.18.1  BSgenome.Ecoli.NCBI.20080805_1.3.17  BSgenome_1.24.0
    [13] GenomicRanges_1.8.7        Biostrings_2.24.1     IRanges_1.14.4                       multtest_2.12.0
    [17] Biobase_2.16.0             biomaRt_2.12.0        BiocGenerics_0.2.0                   gplots_2.11.0
    [21] MASS_7.3-19                KernSmooth_2.23-8     caTools_1.13                         bitops_1.0-4.1
    [25] gdata_2.11.0               gtools_2.7.0

    loaded via a namespace (and not attached):
    [1] RCurl_1.91-1.1  splines_2.15.0  stats4_2.15.0  survival_2.36-14  tools_2.15.0  XML_3.9-4.1

miRNA GO BSgenome ChIPpeakAnno • 2.0k views

updated 12.3 years ago by Julie Zhu ★ 4.3k • written 12.3 years ago by Paolo Kunderfranco ▴ 350

Julie Zhu ★ 4.3k

@julie-zhu-3596

Last seen 13 months ago

United States

Paolo,

Could you please send us a few rows of miRNAs in annotatedPeaks? Thanks!

Best regards,

Julie

________________________________________
From: bioconductor-bounces@r-project.org [bioconductor-bounces@r-project.org] on behalf of Paolo Kunderfranco [paolo.kunderfranco@gmail.com]
Sent: Friday, July 27, 2012 5:50 AM
To: bioconductor at r-project.org
Subject: [BioC] ChIPpeakAnno to find peaks nearest to miRNA

12.3 years ago • Julie Zhu ★ 4.3k

Hi Paolo,

addGeneIDs fails because the org database does not contain any information for ENSMUSG00000089245. In this case it is better to use biomaRt to get the annotation. Please try:

    library(ChIPpeakAnno)  # provides condenseMatrixByColnames()
    library(biomaRt)       # provides useMart() and getBM()

    # Collect the non-missing, non-empty Ensembl gene IDs of the annotated features
    feature_ids <- unique(annotatedPeak$feature)
    feature_ids <- feature_ids[!is.na(feature_ids)]
    feature_ids <- feature_ids[feature_ids != ""]

    # Query Ensembl for the miRNA identifiers of those genes
    mart <- useMart(biomart="ensembl", dataset="mmusculus_gene_ensembl")
    IDs2Add <- getBM(attributes=c("ensembl_gene_id", "mirbase_transcript_name", "mirbase_id", "mirbase_accession", "external_gene_id"),
                     filters="ensembl_gene_id", values=feature_ids, mart=mart)

    # Collapse genes that were returned more than once into a single row each
    duplicated_ids <- IDs2Add[duplicated(IDs2Add[,"ensembl_gene_id"]), "ensembl_gene_id"]
    if(length(duplicated_ids) > 0){
        IDs2Add.duplicated <- IDs2Add[IDs2Add[,"ensembl_gene_id"] %in% duplicated_ids,]
        IDs2Add.duplicated <- condenseMatrixByColnames(as.matrix(IDs2Add.duplicated), "ensembl_gene_id")
        IDs2Add <- IDs2Add[!(IDs2Add[,"ensembl_gene_id"] %in% duplicated_ids),]
        IDs2Add <- rbind(IDs2Add, IDs2Add.duplicated)
    }

Then merge the useful information into annotatedPeak.

If you have any questions, please let me know.

Yours sincerely,

Jianhong Ou

jianhong.ou at umassmed.edu

On Jul 27, 2012, at 9:57 AM, Zhu, Lihua (Julie) wrote:

> Paolo,
>
> Could you please send us a few rows of miRNAs in annotatedPeaks? Thanks!

12.3 years ago • Ou, Jianhong ★ 1.3k
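
The reply above ends with "merge the useful information to the annotatedPeak" without showing that step. The following is only a minimal sketch of one way to do it in base R, not the author's code. It assumes the annotated peaks have been converted with as.data.frame() as in the question, so that they carry `peak` and `feature` columns, and that the lookup table has the `ensembl_gene_id` and `mirbase_id` columns requested from getBM(). All rows except the single peak/feature pair quoted in the question are hypothetical placeholders, including every mirbase_id value. For the sake of a dependency-free example, duplicates are collapsed with aggregate() rather than with condenseMatrixByColnames().

```r
# Stand-in for as.data.frame(annotatedPeak); only the first row's peak and
# feature come from the thread, the other rows are hypothetical.
peaks <- data.frame(
  peak    = c("MACS_peak_109", "MACS_peak_200", "MACS_peak_300"),
  feature = c("ENSMUSG00000089245", "ENSMUSG_HYPOTHETICAL_1", NA),
  stringsAsFactors = FALSE
)

# Stand-in for the getBM() result. The mirbase_id values are placeholders,
# and one gene is listed twice to mimic the duplicated rows handled above.
IDs2Add <- data.frame(
  ensembl_gene_id = c("ENSMUSG00000089245",
                      "ENSMUSG_HYPOTHETICAL_1",
                      "ENSMUSG_HYPOTHETICAL_1"),
  mirbase_id      = c("placeholder-mir-a",
                      "placeholder-mir-b",
                      "placeholder-mir-c"),
  stringsAsFactors = FALSE
)

# One row per gene: join the distinct values of a duplicated gene with ";".
collapsed <- aggregate(
  mirbase_id ~ ensembl_gene_id,
  data = IDs2Add,
  FUN  = function(x) paste(unique(x), collapse = ";")
)

# Left join: every peak is kept; peaks without a matching gene get NA.
annotated <- merge(
  peaks, collapsed,
  by.x = "feature", by.y = "ensembl_gene_id",
  all.x = TRUE, sort = FALSE
)

print(annotated)
```

Using all.x = TRUE keeps peaks whose feature has no entry in the lookup table, which matters here because not every Ensembl miRNA gene is guaranteed to come back with a miRBase identifier.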

Hello,

OK, perfect, it is working fine now.

Thanks again for your precious help,

Paolo

12.3 years ago • Paolo Kunderfranco ▴ 350