This is great! This seems to be running fine now and the results are
making more sense. I hoped the problem was something to do with not
having all my lists of RNAs in the right format, but working out how
to code the conversion was beyond me, so thanks very much for taking
the time to work that through.
Thanks!
Fiona
Dr Fiona C Ingleby
Postdoctoral Research Fellow
University of Sussex
Email: F.Ingleby@sussex.ac.uk
Website: fionaingleby.weebly.com
Tel: +44(0)1273678559
On 29 Mar 2013, at 22:11, James W. MacDonald <jmacdon@uw.edu> wrote:
> Hi Fiona,
>
> Probably the easiest way to do this is to convert the flybase_cg IDs
> to Ensembl IDs.
>
> ## read sanger data in
> ## there is some weird cruft in line 4685, best to just remove the
> ## thirteenth column
> dat <- read.table("v5.txt.drosophila_melanogaster", sep = "\t",
>                   stringsAsFactors = FALSE)[,-13]
> library(drosophila2.db)
> ## map flybase_cg IDs to ensembl
> x <- select(org.Dm.eg.db, gsub("-[A-Z]+", "", dat[,12]),
>             c("ENSEMBL"), "FLYBASECG")
> ## there are some duplicates here, but I don't think it will matter
> ## merge back together and write back out
> dat$merge <- gsub("-[A-Z]+", "", dat[,12])
> dat2 <- merge(dat, x, by.x = "merge", by.y = 1, all.x = TRUE)
> write.table(dat2, "v5.txt.drosophila_melanogaster2", sep = "\t",
>             col.names = FALSE, row.names = FALSE, quote = FALSE)
>
> ## note that I say the file is not sanger, and then tell
> ## mirna2mrna() which columns to use.
> test <- mirna2mrna(miRNA, "v5.txt.drosophila_melanogaster2", mRNA,
>                    "org.Dm.eg.db", "drosophila2.db", FALSE, 2, 14)
>
> With the truncated mRNA and miRNA probe IDs you give below, I get no
> mappings, but I assume you have way more mRNA transcripts.
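
The conversion step in the code above can be tried on toy data without any Bioconductor packages. This is only a sketch: the transcript IDs and the lookup table are made-up stand-ins for column 12 of the target file and for the result of select(), not real FlyBase or Ensembl records.

```r
# Hypothetical transcript IDs, standing in for column 12 of the target file.
transcript_ids <- c("CG1234-RA", "CG1234-RB", "CG5678-RA", "CG9999-RC")

# Drop the transcript suffix ("-RA", "-RB", ...) to get gene-level CG IDs.
gene_ids <- gsub("-[A-Z]+", "", transcript_ids)

# Hypothetical lookup table, standing in for the select() result.
lookup <- data.frame(
  FLYBASECG = c("CG1234", "CG5678"),
  ENSEMBL   = c("FBgn0000001", "FBgn0000002"),
  stringsAsFactors = FALSE
)

targets <- data.frame(transcript = transcript_ids, merge = gene_ids,
                      stringsAsFactors = FALSE)

# all.x = TRUE keeps rows that have no mapping; their ENSEMBL value is NA.
mapped <- merge(targets, lookup, by.x = "merge", by.y = "FLYBASECG",
                all.x = TRUE)
print(mapped)
```

Rows left with NA in the ENSEMBL column are the ones that cannot contribute any mRNA match later on, so counting them gives a quick idea of how much of the file survives the conversion.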
>
> Let me know if this works for you.
>
> Best,
>
> Jim
>
>
>
> On 3/29/2013 8:09 AM, Fiona Ingleby wrote:
>> Hi Jim,
>>
>> Thanks very much for pointing that out - it seems mirna2mrna is
>> exactly what I was after, I don't know how I managed to overlook it.
>>
>> I'm a bit puzzled about the results I'm getting, however, so if you
>> have a minute to think this through I'd be really grateful. The help
>> pages are pretty clear, and I've managed to get the function to run
>> with my data without any problems, but I'm getting 'named list()' as
>> output. That might simply suggest that there are no correlations
>> between the miRNAs and mRNAs in my data (?). But I'm not convinced,
>> and I'm wondering if I've done something wrong somewhere along the
>> way (I'm looking at 39 differentially expressed miRNAs along with
>> 2638 differentially expressed mRNAs, so I'd be surprised if there
>> were none that correlate with each other).
>>
>> I'm wondering if I'm doing something daft like using RNA IDs in the
>> wrong format (which might be one explanation for getting 0 matches
>> returned from the database?). At the moment I'm just taking character
>> vectors directly from the ExpressionSet. So I have 2 ExpressionSets,
>> each representing only the probes which are significantly
>> differentially expressed in each dataset - I've called these sigmRNA
>> (2638 x 12 samples) and sigmiRNA (39 x 12 samples) for mRNA and miRNA
>> respectively.
>>
>> > featureNames(sigmRNA)
>>  [1] "1622906_at"   "1622915_at"   "1622917_a_at" "1622920_at"
>>      "1622926_at"   "1622932_s_at" "1622935_at"   "1622940_at"
>>      "1622946_at"
>> [10] "1622952_at"   "1622956_at"   "1622959_at"   "1622960_at"
>>      "1622965_s_at" "1622974_at"   "1622975_at"   "1622978_at"
>>      "1622992_at"
>> [19] "1623002_at"   "1623004_a_at" "1623008_at"   "1623019_a_at"
>>      "1623022_at"   "1623025_at"   "1623026_a_at" "1623030_at"
>>      "1623031_a_at"
>>
>> and so on for 2638 entries.
>>
>> > featureNames(sigmiRNA)
>> [1] "dme-miR-1002_st" "dme-miR-1004_st" "dme-miR-1017_st"
>>     "dme-miR-124_st"  "dme-miR-2500_st" "dme-miR-286_st"
>> [7] "dme-miR-2a_st"   "dme-miR-306_st"  "dme-miR-310_st"
>>     "dme-miR-311_st"  "dme-miR-312_st"  "dme-miR-313_st"
>>
>> etc. So I'm using mirna2mrna like this:
>>
>> test <- mirna2mrna(miRNAids = featureNames(sigmiRNA),
>>     ## annotation file downloaded from the EBI website and saved in
>>     ## the working directory
>>     miRNAannot = "v5.txt.drosophila_melanogaster",
>>     mRNAids = featureNames(sigmRNA),
>>     orgPkg = "org.Dm.eg.db", chipPkg = "drosophila2.db",
>>     sanger = TRUE, miRNAcol = NULL, mRNAcol = NULL, transType = "ensembl")
>>
>> and then I get:
>>
>> > test
>> named list()
>>
>> I've put the sessionInfo() output at the bottom of the email. I
>> also looked through the source code on the Bioconductor code search
>> website, pulled out the 'convertIDs' function, and ran this as an
>> independent function on the lists of RNAs to check to see what it was
>> doing, but I can't see anything that looks odd to me - it removes the
>> '_st'/'_at' as I expected.
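
One way to tell "no targets exist" apart from "the IDs never match" is to check the overlap between the cleaned IDs and the annotation directly. The sketch below does this for the miRNA side only; the probe IDs are from the output above, but annot_mirna is a made-up stand-in for the miRNA column of the annotation file, and sub() here only imitates the suffix removal described above rather than calling convertIDs itself.

```r
# Probe IDs as they come off the array.
probe_ids <- c("dme-miR-1002_st", "dme-miR-124_st", "dme-miR-2a_st")

# Strip the trailing "_st" so the names look like miRBase identifiers.
mirna_ids <- sub("_st$", "", probe_ids)

# Hypothetical miRNA names, standing in for the annotation file's column.
annot_mirna <- c("dme-miR-1002", "dme-miR-124", "dme-miR-7")

# Which of our IDs appear in the annotation at all?
found <- mirna_ids %in% annot_mirna
print(table(found))

# IDs with no counterpart in the annotation.
print(mirna_ids[!found])
```

If the miRNA side overlaps well, the same check on the mRNA side (gene IDs from the chip package against the gene column of the file) is the next place to look.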
>>
>> So I'm a bit stuck. I'm sure I've misunderstood something, but
>> can't pick out what it is myself. I suppose it's totally possible that
>> the analysis is fine and there are just no correlations between the
>> miRNAs and mRNAs of interest in my data - but I thought I would check.
>> If you (or anyone) has any ideas, I'd really appreciate the help.
>>
>> Thanks again,
>>
>> Fiona
>>
>> Dr Fiona C Ingleby
>>
>> Postdoctoral Research Fellow
>> University of Sussex
>>
>> Email: F.Ingleby@sussex.ac.uk
>> Website: fionaingleby.weebly.com
>> Tel: +44(0)1273678559
>>
>> > sessionInfo()
>> R version 2.15.2 (2012-10-26)
>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>
>> locale:
>> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] drosophila2.db_2.8.1 org.Dm.eg.db_2.8.0 RSQLite_0.11.2
>>     DBI_0.2-5 AnnotationDbi_1.20.7 Biobase_2.18.0
>> [7] BiocGenerics_0.4.0
>>
>> loaded via a namespace (and not attached):
>> [1] IRanges_1.16.6 parallel_2.15.2 stats4_2.15.2 tools_2.15.2
>>
>>
>>
>> On 28 Mar 2013, at 16:43, James W. MacDonald <jmacdon@uw.edu> wrote:
>>
>>> Hi Fiona,
>>>
>>> I have a function called mirna2mrna (yeah, I know, lame function
>>> name...) in my affycoretools package that does this, based on the
>>> sanger microcosm targets data that you can download here:
>>>
>>> http://www.ebi.ac.uk/enright-srv/microcosm/cgi-bin/targets/v5/download.pl
>>>
>>> There is also a function makeHmap() that will create a heatmap
>>> with the miRNA/mRNA pairs, where the color of the cells is based on
>>> the correlation between the two RNA species (with the intent to show
>>> negative correlations, indicating that the miRNA is hypothetically
>>> causing premature degradation of the mRNA).
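
The idea of scoring miRNA/mRNA pairs by correlation can be illustrated in base R. This is a rough sketch and not the makeHmap() implementation: the feature names and expression values are invented, and rows are features and columns are samples, as in an ExpressionSet's exprs() matrix.

```r
# Hypothetical expression matrices: rows are features, columns are samples.
mirna_mat <- rbind(
  mirA = c(1, 2, 3, 4, 5, 6),
  mirB = c(6, 5, 4, 3, 2, 1)
)
mrna_mat <- rbind(
  geneX = c(12, 10, 8, 6, 4, 2),
  geneY = c(1, 3, 2, 5, 4, 6)
)

# cor() correlates columns, so transpose to get samples in rows.
# The result is a miRNA x mRNA matrix of Pearson correlations.
cors <- cor(t(mirna_mat), t(mrna_mat))
print(round(cors, 2))

# Pairs with a negative correlation are the candidates of interest.
neg_pairs <- which(cors < 0, arr.ind = TRUE)
print(neg_pairs)
```

In practice only pairs that are also predicted targets would be kept, which is the job of the mapping step; the correlation alone says nothing about targeting.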
>>>
>>> I think the help pages for these two functions are reasonable, but
>>> please let me know if you have any questions.
>>>
>>> Best,
>>>
>>> Jim
>>>
>>>
>>>
>>> On 3/28/2013 12:30 PM, Fiona Ingleby wrote:
>>>> Hi everyone,
>>>>
>>>> I am working with mRNA data from Affy 'drosophila2' arrays and
>>>> miRNA data from Affy 'mirna3' arrays. I have identified a list of
>>>> differentially expressed mRNAs and miRNAs. I'm having a bit of trouble
>>>> with some downstream analyses and I'm hoping someone might be able to
>>>> offer some help.
>>>>
>>>> I would like to use my list of differentially expressed miRNAs to
>>>> access online databases (e.g. miRBase, microRNA.org) and extract the
>>>> names of all the potential target mRNAs. Then I'd like to use this
>>>> list of mRNAs to look through my mRNA expression data. I'm aware of
>>>> packages like 'RmiR' and 'microRNA' which have built-in functions for
>>>> finding miRNA targets, but as far as I can tell, 'RmiR' uses miRNA
>>>> databases for humans only and 'microRNA' works with human and mouse
>>>> data only. So is there a package I am unaware of (or another
>>>> application of 'RmiR'/'microRNA' that I am unaware of) for looking at
>>>> drosophila data?
>>>>
>>>> So far I have also considered the 'biomaRt' package to see if the
>>>> database query function on there can help me, but I haven't had much
>>>> luck. For instance, if I try an example list of miRNAs:
>>>>
>>>> mirna<-c("dme-miR-1002","dme-miR-312","dme-miR-973")
>>>> library(biomaRt)
>>>> ensembl<-useMart("ensembl",dataset="dmelanogaster_gene_ensembl")
>>>> getBM(attributes="mirbase_accession", filters="mirbase_id",
>>>>       values=mirna, mart=ensembl)
>>>>
>>>> then 'logical(0)' is returned, as if there are no records for
>>>> those miRNAs - but by searching the database manually I know the
>>>> records are there.
>>>>
>>>> Alternatively I can try:
>>>>
>>>> miRNA <- getBM(c("mirbase_accession", "mirbase_id",
>>>>                  "ensembl_gene_id", "start_position", "chromosome_name"),
>>>>                filters = c("with_mirbase"), values = list(TRUE),
>>>>                mart = ensembl)
>>>>
>>>> which returns a table of various bits of information on miRNAs,
>>>> but I cannot adapt this command to just look at my list of miRNAs of
>>>> interest (i.e. the 'mirna' vector above). I've included the
>>>> sessionInfo() output for these at the bottom of the email, but I
>>>> suspect my problem is more to do with the fact I'm not going about
>>>> this the right way (as opposed to a problem with package versions and
>>>> coding etc.). I'm not even sure that using 'biomaRt' will give me the
>>>> information I eventually want (the target mRNAs of these miRNAs), I
>>>> was just trying it out, to see what it was capable of in terms of
>>>> querying these databases. So I apologise for the vagueness. Since I
>>>> haven't managed to get very far by myself then it's difficult to be
>>>> more specific, but I'd really appreciate it if anyone could offer some
>>>> advice, even just to point me in the direction of a useful package
>>>> which might have gone unnoticed by me.
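
Restricting a table like the one from the second getBM() call to a vector of interest can be done after the query with %in%. The sketch below uses a made-up data frame in place of the real query result, so it runs without a network connection. The lower-case "mir" spelling in the table and the case-insensitive comparison are assumptions: a capitalisation difference between mature and precursor names is one possible reason for an empty result, but that has not been checked against Ensembl here.

```r
mirna <- c("dme-miR-1002", "dme-miR-312", "dme-miR-973")

# Hypothetical table, standing in for the result of the second getBM() call.
all_mirna <- data.frame(
  mirbase_id      = c("dme-mir-1002", "dme-mir-312", "dme-mir-8", "dme-mir-973"),
  ensembl_gene_id = c("FBgn0000011", "FBgn0000012", "FBgn0000013", "FBgn0000014"),
  stringsAsFactors = FALSE
)

# Exact matching finds nothing here because of the "miR" vs "mir" spelling.
print(sum(all_mirna$mirbase_id %in% mirna))

# Case-insensitive matching (an assumption about how the names differ).
keep <- tolower(all_mirna$mirbase_id) %in% tolower(mirna)
subset_mirna <- all_mirna[keep, ]
print(subset_mirna)
```

This only narrows the annotation table to the miRNAs of interest; it does not by itself provide their target mRNAs.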
>>>>
>>>> Many thanks,
>>>>
>>>> Fiona
>>>>
>>>> Dr Fiona C Ingleby
>>>> Postdoctoral Research Fellow
>>>> University of Sussex
>>>> Email: F.Ingleby@sussex.ac.uk
>>>> Website: fionaingleby.weebly.com
>>>>
>>>>
>>>>> sessionInfo()
>>>> R version 2.15.2 (2012-10-26)
>>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>>>
>>>> locale:
>>>> [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>>>>
>>>> attached base packages:
>>>> [1] stats graphics grDevices utils datasets methods base
>>>>
>>>> other attached packages:
>>>> [1] biomaRt_2.14.0 affy_1.36.1 Biobase_2.18.0 BiocGenerics_0.4.0
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] affyio_1.26.0 BiocInstaller_1.8.3 grid_2.15.2
>>>>     lattice_0.20-14 Matrix_1.0-11 MCMCglmm_2.17
>>>> [7] preprocessCore_1.20.0 RCurl_1.95-4.1 tools_2.15.2
>>>>     XML_3.95-0.2 zlibbioc_1.4.0
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor@r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>> --
>>> James W. MacDonald, M.S.
>>> Biostatistician
>>> University of Washington
>>> Environmental and Occupational Health Sciences
>>> 4225 Roosevelt Way NE, # 100
>>> Seattle WA 98105-6099
>>>
>>>
>>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>