Identifying Drosophila miRNA targets
Fiona ▴ 70
@fiona-5790
Last seen 8.9 years ago
United Kingdom
Hi everyone,

I am working with mRNA data from Affy 'drosophila2' arrays and miRNA data from Affy 'mirna3' arrays. I have identified a list of differentially expressed mRNAs and miRNAs. I'm having a bit of trouble with some downstream analyses and I'm hoping someone might be able to offer some help.

I would like to use my list of differentially expressed miRNAs to access online databases (e.g. miRBase, microRNA.org) and extract the names of all the potential target mRNAs. Then I'd like to use this list of mRNAs to look through my mRNA expression data. I'm aware of packages like 'RmiR' and 'microRNA' which have built-in functions for finding miRNA targets, but as far as I can tell, 'RmiR' uses miRNA databases for humans only and 'microRNA' works with human and mouse data only. So is there a package I am unaware of (or another application of 'RmiR'/'microRNA' that I am unaware of) for looking at Drosophila data?

So far I have also considered the 'biomaRt' package to see if its database query function can help me, but I haven't had much luck. For instance, if I try an example list of miRNAs:

    mirna <- c("dme-miR-1002", "dme-miR-312", "dme-miR-973")
    library(biomaRt)
    ensembl <- useMart("ensembl", dataset = "dmelanogaster_gene_ensembl")
    getBM(attributes = "mirbase_accession", filters = "mirbase_id", values = mirna, mart = ensembl)

then 'logical(0)' is returned, as if there are no records for those miRNAs - but by searching the database manually I know the records are there.

Alternatively I can try:

    miRNA <- getBM(c("mirbase_accession", "mirbase_id", "ensembl_gene_id", "start_position", "chromosome_name"),
                   filters = c("with_mirbase"), values = list(T), mart = ensembl)

which returns a table of various bits of information on miRNAs, but I cannot adapt this command to look only at my list of miRNAs of interest (i.e. the 'mirna' vector above).

I've included the sessionInfo() output for these at the bottom of the email, but I suspect my problem has more to do with the fact that I'm not going about this the right way (as opposed to a problem with package versions, coding etc.). I'm not even sure that 'biomaRt' will give me the information I eventually want (the target mRNAs of these miRNAs); I was just trying it out to see what it was capable of in terms of querying these databases. So I apologise for the vagueness. Since I haven't managed to get very far by myself it's difficult to be more specific, but I'd really appreciate it if anyone could offer some advice, even just to point me in the direction of a useful package which might have gone unnoticed by me.

Many thanks,

Fiona

Dr Fiona C Ingleby
Postdoctoral Research Fellow
University of Sussex
Email: F.Ingleby@sussex.ac.uk
Website: fionaingleby.weebly.com

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] biomaRt_2.14.0 affy_1.36.1 Biobase_2.18.0 BiocGenerics_0.4.0

loaded via a namespace (and not attached):
[1] affyio_1.26.0 BiocInstaller_1.8.3 grid_2.15.2 lattice_0.20-14 Matrix_1.0-11 MCMCglmm_2.17
[7] preprocessCore_1.20.0 RCurl_1.95-4.1 tools_2.15.2 XML_3.95-0.2 zlibbioc_1.4.0
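On restricting the 'with_mirbase' table to a list of interest: once the full table has been retrieved, it can be subset in plain R instead of through a mart filter. The sketch below uses a small made-up data frame in place of the real getBM() result (the gene IDs are placeholders, not Ensembl records). It compares IDs in lower case on the assumption, not confirmed anywhere in this thread, that the mart may hold hairpin-style names such as "dme-mir-312" while the array reports mature-style names such as "dme-miR-312".

```r
# Stand-in for the table returned by the 'with_mirbase' query.
# All rows are invented for illustration only.
mirna_table <- data.frame(
  mirbase_id      = c("dme-mir-312", "dme-mir-1002", "dme-mir-8"),
  ensembl_gene_id = c("GENE_A", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# miRNAs of interest, as in the question
mirna <- c("dme-miR-1002", "dme-miR-312", "dme-miR-973")

# Compare in lower case so "miR" and "mir" spellings are treated alike
keep <- tolower(mirna_table$mirbase_id) %in% tolower(mirna)
hits <- mirna_table[keep, ]

# IDs from the list of interest that have no row in the table
missing <- mirna[!tolower(mirna) %in% tolower(mirna_table$mirbase_id)]

print(hits)
print(missing)
```

With real data, `mirna_table` would be the data frame returned by the second getBM() call above. Listing `missing` makes it easy to see whether an empty result comes from ID spelling or from records that are truly absent.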
miRNA affy microRNA • 1.4k views
@james-w-macdonald-5106
Last seen 8 hours ago
United States
Hi Fiona,

I have a function called mirna2mrna (yeah, I know, lame function name...) in my affycoretools package that does this, based on the Sanger microcosm targets data that you can download here:

http://www.ebi.ac.uk/enright-srv/microcosm/cgi-bin/targets/v5/download.pl

There is also a function makeHmap() that will create a heatmap of the miRNA/mRNA pairs, where the color of each cell is based on the correlation between the two RNA species (with the intent to show negative correlations, indicating that the miRNA is hypothetically causing premature degradation of the mRNA).

I think the help pages for these two functions are reasonable, but please let me know if you have any questions.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
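The negative-correlation idea behind the heatmap can be illustrated with base R alone. This is only a minimal sketch on simulated data and is not the code used by makeHmap(); the matrix names, row names and values are all invented, and one mRNA is deliberately constructed to be anti-correlated with one miRNA.

```r
set.seed(1)
samples <- paste0("s", 1:12)

# Simulated expression matrices: features in rows, samples in columns
mirna_expr <- matrix(rnorm(2 * 12), nrow = 2,
                     dimnames = list(c("miR_a", "miR_b"), samples))
mrna_expr <- matrix(rnorm(3 * 12), nrow = 3,
                    dimnames = list(c("gene_1", "gene_2", "gene_3"), samples))

# Make gene_1 fall as miR_a rises, mimicking a repressed target
mrna_expr["gene_1", ] <- 5 - 2 * mirna_expr["miR_a", ]

# cor() works on columns, so transpose to put samples in rows.
# The result has one row per miRNA and one column per mRNA.
cors <- cor(t(mirna_expr), t(mrna_expr))
print(round(cors, 2))
```

Both matrices must have their samples in the same column order for the correlations to be meaningful, so it is worth checking that the sample names of the two data sets line up before calling cor().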
Hi Jim,

Thanks very much for pointing that out - it seems mirna2mrna is exactly what I was after, I don't know how I managed to overlook it.

I'm a bit puzzled about the results I'm getting, however, so if you have a minute to think this through I'd be really grateful. The help pages are pretty clear, and I've managed to get the function to run with my data without any problems, but I'm getting 'named list()' as output. This might simply suggest that there are no correlations between the miRNAs and mRNAs in my data (?). But I'm not convinced, and I'm wondering if I've done something wrong somewhere along the way (I'm looking at 39 differentially expressed miRNAs along with 2638 differentially expressed mRNAs, so I'd be surprised if there were none that correlate with each other).

I'm wondering if I'm doing something daft like using RNA IDs in the wrong format (which might be one explanation for getting 0 matches returned from the database?). At the moment I'm just taking character vectors directly from the ExpressionSet. So I have 2 ExpressionSets, each representing only the probes which are significantly differentially expressed in each dataset - I've called these sigmRNA (2638 x 12 samples) and sigmiRNA (39 x 12 samples) for mRNA and miRNA respectively.

> featureNames(sigmRNA)
 [1] "1622906_at" "1622915_at" "1622917_a_at" "1622920_at" "1622926_at" "1622932_s_at" "1622935_at" "1622940_at" "1622946_at"
[10] "1622952_at" "1622956_at" "1622959_at" "1622960_at" "1622965_s_at" "1622974_at" "1622975_at" "1622978_at" "1622992_at"
[19] "1623002_at" "1623004_a_at" "1623008_at" "1623019_a_at" "1623022_at" "1623025_at" "1623026_a_at" "1623030_at" "1623031_a_at"

and so on for 2638 entries.

> featureNames(sigmiRNA)
[1] "dme-miR-1002_st" "dme-miR-1004_st" "dme-miR-1017_st" "dme-miR-124_st" "dme-miR-2500_st" "dme-miR-286_st"
[7] "dme-miR-2a_st" "dme-miR-306_st" "dme-miR-310_st" "dme-miR-311_st" "dme-miR-312_st" "dme-miR-313_st"

etc. So I'm using mirna2mrna like this:

    test <- mirna2mrna(miRNAids = featureNames(sigmiRNA),
                       miRNAannot = "v5.txt.drosophila_melanogaster",  # downloaded from the EBI website and saved in the working directory
                       mRNAids = featureNames(sigmRNA),
                       orgPkg = "org.Dm.eg.db", chipPkg = "drosophila2.db",
                       sanger = TRUE, miRNAcol = NULL, mRNAcol = NULL, transType = "ensembl")

and then I get:

> test
named list()

I've put the sessionInfo() output at the bottom of the email. I also looked through the source code on the Bioconductor code search website, pulled out the 'convertIDs' function, and ran it as an independent function on the lists of RNAs to see what it was doing, but I can't see anything that looks odd to me - it removes the '_st'/'_at' as I expected.

So I'm a bit stuck. I'm sure I've misunderstood something, but can't pick out what it is myself. I suppose it's totally possible that the analysis is fine and there are just no correlations between the miRNAs and mRNAs of interest in my data - but I thought I would check. If you (or anyone) has any ideas, I'd really appreciate the help.

Thanks again,

Fiona

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] drosophila2.db_2.8.1 org.Dm.eg.db_2.8.0 RSQLite_0.11.2 DBI_0.2-5 AnnotationDbi_1.20.7 Biobase_2.18.0
[7] BiocGenerics_0.4.0

loaded via a namespace (and not attached):
[1] IRanges_1.16.6 parallel_2.15.2 stats4_2.15.2 tools_2.15.2
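When a mapping step returns an empty result, a quick overlap count between the cleaned probe IDs and the IDs in the annotation table can show whether the formats agree at all. The following is a rough sketch in base R; it is not the convertIDs function from affycoretools, and `annot_ids` is a hypothetical stand-in for the miRNA column of the downloaded target file.

```r
# A few miRNA probe IDs in the style shown above
mirna_probes <- c("dme-miR-1002_st", "dme-miR-124_st", "dme-miR-2a_st")

# Strip the trailing "_st" that the array adds to each miRNA name
mirna_ids <- sub("_st$", "", mirna_probes)

# Hypothetical miRNA names as they might appear in an annotation file
annot_ids <- c("dme-miR-124", "dme-miR-2a", "dme-miR-8")

# Which cleaned IDs are present in the annotation, and which are not
matched   <- intersect(mirna_ids, annot_ids)
unmatched <- setdiff(mirna_ids, annot_ids)

print(matched)
print(unmatched)
```

If `matched` is empty for the real data, the problem lies in the ID format. If it is well populated, the miRNA side is probably fine and the mRNA or transcript IDs are the next thing to check.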
Hi Fiona,

It's not you. The function is set up with the assumption that the annotation file uses Ensembl IDs, not FlyBase IDs. I'll take a look and see if I can make the requisite changes.

Best,

Jim
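One way to see which ID style a column holds is to classify a sample of its values by pattern. This is a rough heuristic sketch in base R; the example IDs and the two regular expressions are assumptions made for illustration (annotation-symbol style such as "CG1234-RA" versus FlyBase accession style such as "FBgn0000001"), not values taken from the target file.

```r
# Hypothetical IDs of the kinds that might appear in a transcript column
ids <- c("CG1234-RA", "CG5678", "FBgn0000001", "1622906_at")

# Label each ID by the first pattern it matches
id_style <- ifelse(grepl("^CG[0-9]+(-R[A-Z]+)?$", ids), "flybase_cg",
            ifelse(grepl("^FB(gn|tr)[0-9]+$", ids), "fb_accession",
                   "other"))

# Tabulate how many IDs fall into each style
print(table(id_style))
```

Running the same classification on the IDs the function expects and on the IDs in the file makes a mismatch between the two visible before any mapping is attempted.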
Hi Fiona, Probably the easiest way to do this is to convert the flybase_cg ids to ensembl IDs. ## read sanger data in ## there is some weird cruft in line 4685, best to just remove the thirteenth column dat <- read.table("v5.txt.drosophila_melanogaster", sep = "\t", stringsAsFactors = FALSE)[,-13] library(drosophila.db) ## map flybase_cg IDs to ensembl x <- select(org.Dm.eg.db, gsub("-[A-Z]+", "",dat[,12]), c("ENSEMBL"), "FLYBASECG") ## there are some duplicates here, but I don't think it will matter ## merge back together and write back out dat$merge <- gsub("-[A-Z]+", "",dat[,12]) dat2 <- merge(dat, x, by.x="merge", by.y=1, all.x = TRUE) write.table(dat2, "v5.txt.drosophila_melanogaster2", sep = "\t", col.names = FALSE, row.names = FALSE, quote = FALSE) ## note that I say the file is not sanger, and then tell mirna2mrna() which columns to use. test <- mirna2mrna(miRNA, "v5.txt.drosophila_melanogaster2", mRNA, "org.Dm.eg.db","drosophila2.db", FALSE, 2,14) With the truncated mRNA and miRNA probe IDs you give below, I get no mappings, but I assume you have way more mRNA transcripts. Let me know if this works for you. Best, Jim On 3/29/2013 8:09 AM, Fiona Ingleby wrote: > Hi Jim, > > Thanks very much for pointing that out - it seems mirna2mrna is > exactly what I was after, I don't know how I managed to overlook it?. > > I'm a bit puzzled about the results I'm getting, however, and so if > you have a minute to think this through then I'd be really > grateful. The help pages are pretty clear, and so I've managed to get > the function to run with my data without any problems?.but I'm getting > 'named list()' as output. Which might simply suggest that there are no > correlations between the miRNAs and mRNAs in my data (?). 
But I'm not > convinced and I'm wondering if I've done something wrong somewhere > along the way (I'm looking at 39 differentially expressed miRNAs along > with 2638 differentially expressed mRNAs, so I'd be surprised if there > were none that correlate with each other). > > I'm wondering if I'm doing something daft like using RNA IDs in the > wrong format (which might be one explanation for getting 0 matches > returned from the database?). At the moment I'm just taking character > vectors directly from the ExpressionSet. So I have 2 ExpressionSets, > each representing only the probes which are significantly > differentially expressed in each dataset - I've called these sigmRNA > (2638 x 12 samples) and sigmiRNA (39 x 12 samples) for mRNA and miRNA > respectively. > > >featureNames(sigmRNA) > [1] "1622906_at" "1622915_at" "1622917_a_at" "1622920_at" > "1622926_at" "1622932_s_at" "1622935_at" "1622940_at" "1622946_at" > [10] "1622952_at" "1622956_at" "1622959_at" "1622960_at" > "1622965_s_at" "1622974_at" "1622975_at" "1622978_at" "1622992_at" > [19] "1623002_at" "1623004_a_at" "1623008_at" "1623019_a_at" > "1623022_at" "1623025_at" "1623026_a_at" "1623030_at" "1623031_a_at" > > ?and so on for 2638 entries. > > >featureNames(sigmiRNA) > [1] "dme-miR-1002_st" "dme-miR-1004_st" "dme-miR-1017_st" > "dme-miR-124_st" "dme-miR-2500_st" "dme-miR-286_st" > [7] "dme-miR-2a_st" "dme-miR-306_st" "dme-miR-310_st" > "dme-miR-311_st" "dme-miR-312_st" "dme-miR-313_st" > > ?etc. So I'm using mirna2mrna like this: > > test<-mirna2mrna(miRNAids=featureNames(sigmiRNA), > miRNAannot="v5.txt.drosophila_melanogaster", #downloaded from the > rbi website and saved in the working directory > mRNAids=featureNames(sigmRNA), > orgPkg="org.Dm.eg.db",chipPkg="drosophila2.db", > sanger=T,miRNAcol=NULL,mRNAcol=NULL,transType="ensembl") > > and then I get: > > > test > named list() > > I've put the sessionInfo() output at the bottom of the email. 
I also looked through the source code on the Bioconductor code search
> website, pulled out the 'convertIDs' function, and ran this as an
> independent function on the lists of RNAs to check to see what it was
> doing, but I can't see anything that looks odd to me - it removes the
> '_st'/'_at' as I expected.
>
> So I'm a bit stuck. I'm sure I've misunderstood something, but can't
> pick out what it is myself. I suppose it's totally possible that the
> analysis is fine and there are just no correlations between the miRNAs
> and mRNAs of interest in my data - but I thought I would check. If you
> (or anyone) has any ideas, I'd really appreciate the help.
>
> Thanks again,
>
> Fiona
>
> Dr Fiona C Ingleby
>
> Postdoctoral Research Fellow
> University of Sussex
>
> Email: F.Ingleby at sussex.ac.uk
> Website: fionaingleby.weebly.com
> Tel: +44(0)1273678559
>
> > sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] drosophila2.db_2.8.1 org.Dm.eg.db_2.8.0 RSQLite_0.11.2
> DBI_0.2-5 AnnotationDbi_1.20.7 Biobase_2.18.0
> [7] BiocGenerics_0.4.0
>
> loaded via a namespace (and not attached):
> [1] IRanges_1.16.6 parallel_2.15.2 stats4_2.15.2 tools_2.15.2
>
> On 28 Mar 2013, at 16:43, James W. MacDonald <jmacdon at uw.edu> wrote:
>
>> Hi Fiona,
>>
>> I have a function called mirna2mrna (yeah, I know, lame function
>> name...) in my affycoretools package that does this, based on the
>> sanger microcosm targets data that you can download here:
>>
>> http://www.ebi.ac.uk/enright-srv/microcosm/cgi-bin/targets/v5/download.pl
>>
>> there is also a function makeHmap() that will create a heatmap with
>> the miRNA/mRNA pairs, where the color of the cells is based on the
>> correlation between the two RNA species (with the intent to show
>> negative correlations, indicating that the miRNA is hypothetically
>> causing premature degradation of the mRNA).
>>
>> I think the help pages for these two functions are reasonable, but
>> please let me know if you have any questions.
>>
>> Best,
>>
>> Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
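For anyone following the thread, here is a minimal base-R sketch of the general idea discussed above: strip the array suffixes from the probe IDs, keep only the predicted miRNA/target pairs that are present in both expression sets, and correlate each pair across samples. It does not call mirna2mrna() or makeHmap() and is not their implementation. The expression values, the gene names, the `targets` table and the `strip_suffix()` helper are all made up for illustration, and it assumes the predicted targets have already been mapped to the same IDs as the mRNA rows.

```r
# Toy expression matrices (hypothetical values): rows are probesets,
# columns are the same six samples measured on both array types.
samples <- paste0("S", 1:6)
mirna_expr <- rbind(
  "dme-miR-312_st" = c(1, 2, 3, 4, 5, 6),
  "dme-miR-973_st" = c(2, 1, 4, 3, 6, 5)
)
mrna_expr <- rbind(
  "geneA_at" = c(6, 5, 4, 3, 2, 1),
  "geneB_at" = c(1, 3, 2, 5, 4, 6),
  "geneC_at" = c(3, 3, 4, 4, 5, 5)
)
colnames(mirna_expr) <- colnames(mrna_expr) <- samples

# Hypothetical helper: drop a trailing '_st' or '_at' so that IDs can be
# matched against a target table.
strip_suffix <- function(ids) sub("_(st|at)$", "", ids)
rownames(mirna_expr) <- strip_suffix(rownames(mirna_expr))
rownames(mrna_expr) <- strip_suffix(rownames(mrna_expr))

# Hypothetical predicted miRNA/target pairs, already mapped to the mRNA IDs.
targets <- data.frame(
  mirna = c("dme-miR-312", "dme-miR-312", "dme-miR-973", "dme-miR-1002"),
  mrna  = c("geneA", "geneB", "geneC", "geneA"),
  stringsAsFactors = FALSE
)

# Keep only pairs where both members are present in the expression data.
keep <- targets$mirna %in% rownames(mirna_expr) &
  targets$mrna %in% rownames(mrna_expr)
pairs <- targets[keep, ]

# Pearson correlation across samples for each retained pair.
pairs$cor <- mapply(
  function(m, g) cor(mirna_expr[m, ], mrna_expr[g, ]),
  pairs$mirna, pairs$mrna,
  USE.NAMES = FALSE
)

# Sort so that the most negative correlations come first.
pairs[order(pairs$cor), ]
```

If `pairs` comes back empty on real data, a first thing to check is whether the IDs on the two sides really match after suffix stripping, before concluding that there are no correlated pairs.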