Jack Zhu
------ Forwarded Message
From: Sean Davis <sdavis2@mail.nih.gov>
Date: Fri, 13 Mar 2009 10:44:30 -0400
To: Wacek Kusnierczyk <waclaw.marcin.kusnierczyk@idi.ntnu.no>
Cc: Jack Zhu <zhujack@mail.nih.gov>
Subject: Re: GEO metadata
On Fri, Mar 13, 2009 at 10:25 AM, Wacek Kusnierczyk
<waclaw.marcin.kusnierczyk@idi.ntnu.no> wrote:
> Hi Jack and Sean (I've concatenated your two emails into one),
>
> Jack Zhu wrote:
>> > Hi Wacek,
>> >
>> > Thanks for your interest in using GEOmetadb. I could give you a more
>> > specific path if I knew a little more about your tasks with GEO data,
>> > since GEOmetadb provides several ways to find and retrieve GEO data. As a
>> > general approach, you can use the GEOmetadb package, with the power of
>> > SQL, to find the right GEO datasets and then use the GEOquery package to
>> > download the data or load it into the R environment.
>> >
> Thanks a lot for your prompt answer.
>
> The issue I am dealing with is that I want to find data sets (or series)
> where there is an interesting expression profile for a small set of
> chosen genes, and then retrieve those data sets in toto. So here the
> filter condition is (in addition to preliminary filtering by organism
> etc., easily done with GEOmetadb) the expression profile.
>
>
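
The preliminary filtering by organism mentioned above could look roughly like the following sketch. It only builds the SQL text. The table and column names (`gse_gpl`, `gpl.organism`) are assumptions about the GEOmetadb schema and should be checked against the downloaded file (for example with `dbListFields(con, "gpl")`), and `build_gse_query` is a hypothetical helper, not part of GEOmetadb.

```r
# Hypothetical helper: build a query for series (GSE) whose platform organism
# and summary text match. Schema names are assumed, not verified here.
build_gse_query <- function(organism, keyword, limit = 5) {
  # Double any single quotes so the values are safe inside SQL string literals.
  esc <- function(x) gsub("'", "''", x, fixed = TRUE)
  paste0(
    "select distinct gse.gse, gse.title ",
    "from gse ",
    "join gse_gpl on gse.gse = gse_gpl.gse ",
    "join gpl on gse_gpl.gpl = gpl.gpl ",
    "where gpl.organism = '", esc(organism), "' ",
    "and gse.summary like '%", esc(keyword), "%' ",
    "limit ", as.integer(limit)
  )
}

sql <- build_gse_query("Homo sapiens", "breast cancer")
cat(sql, "\n")
# With an open connection `con` to GEOmetadb.sqlite:
# rs <- dbGetQuery(con, sql)
```

Building the string separately makes it easy to inspect the query before running it against the SQLite file.
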
> Now, to get at the expression profile, can I use GEOquery to get just
> profiles, or do I need to download whole data sets and only then examine
> profiles, possibly discarding large amounts of downloaded data? As far
> as I can see, GEOquery allows me to download and parse whole experiment
> files; what about individual gene profiles?
This is only available via Eutils or Entrez. GEOquery downloads and parses
data into objects that work well with Bioconductor.

>
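
To make the Eutils route concrete, here is a minimal sketch that only constructs an E-utilities `esearch` URL against the `geoprofiles` Entrez database. The helper name `geoprofiles_search_url` and the example search term are illustrative assumptions. The exact field tags to use in the term should be checked against the GEO Profiles query documentation.

```r
# Hypothetical helper: build an esearch URL for the GEO Profiles database.
geoprofiles_search_url <- function(term, retmax = 20) {
  base <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
  paste0(base,
         "?db=geoprofiles",
         "&term=", utils::URLencode(term, reserved = TRUE),
         "&retmax=", as.integer(retmax))
}

# Example term (an assumption; adjust to your genes and organism).
url <- geoprofiles_search_url("TP53 AND Homo sapiens[Organism]")
print(url)
# The XML result could then be fetched with, e.g., readLines(url).
```

This returns matching profile IDs only, so the full data sets would still need to be downloaded separately once the interesting series are identified.
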
> Can I access the GEO db with an sql client directly? I failed to find
> information on this.
No. In fact, the data at GEO are not in a database. Only the metadata,
etc., are in the database there.

>
>
> Sean Davis wrote:
>> > Hi, Wacek. Let me know if you have any problems using GEOquery. Also,
>> > feel free to ask questions like this on the Bioconductor mailing list,
>> > as then others benefit from the questions and answers. As a matter of
>> > fact, would it be OK if Jack forwards his answer to the list?
>> >
>> >
>
> Thanks for support. I'm investigating the ways of using GEOquery for my
> tasks. As far as I can see, it's very helpful.
>
> It is ok to forward the exchange to the mailing lists where relevant. I
> need to subscribe myself.
>
> Cheers,
> vQ
>
>
>> > This is an example:
>> >
>> > ## example of finding all GSEs having 'breast cancer' in the summary:
>> >
>> >
>>> >> source("http://bioconductor.org/biocLite.R")
>>> >> biocLite("GEOmetadb")
>>> >> library(GEOmetadb)
>>> >> getSQLiteFile()
>>> >> con <- dbConnect(SQLite(), "GEOmetadb.sqlite")
>>> >> rs <- dbGetQuery(con, paste("select gse, title from gse where summary",
>>> >>                             "like '%breast cancer%' limit 5"))
>> >
>> >
>>> >> dbDisconnect(con)
>>> >>
>> >
>> > ## Using functions in GEOquery to download or parse data into R objects
>> > ## getGEO            Get a GEO object from NCBI or file
>> > ## getGEOfile        Download a file from GEO soft file to the local machine
>> > ## getGEOSuppFiles   Get Supplemental Files from GEO
>> > ## biocLite("GEOquery")
>> >
>> >
>>> >> library(GEOquery)
>>> >> myGSEMatrix <- getGEO(rs$gse[1], GSEMatrix = TRUE)
>>> >>
>> > ## using a loop to get multiple
>> >
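
The loop mentioned above might be sketched as follows. `fetch_all_gse` is a hypothetical helper, not part of GEOquery. It assumes `rs` is the query result from the earlier example and, by default, calls `getGEO` for each accession, catching failures so that one bad download does not stop the rest.

```r
# Hypothetical helper: fetch several GSE accessions one by one.
# `fetch` defaults to GEOquery::getGEO but can be swapped (e.g. for testing).
fetch_all_gse <- function(accessions,
                          fetch = function(acc) GEOquery::getGEO(acc, GSEMatrix = TRUE)) {
  results <- vector("list", length(accessions))
  names(results) <- accessions
  for (acc in accessions) {
    # Record NULL on failure instead of aborting the whole loop.
    res <- tryCatch(fetch(acc), error = function(e) {
      message("Failed to fetch ", acc, ": ", conditionMessage(e))
      NULL
    })
    # Wrap in list() so a NULL result keeps its slot rather than deleting it.
    results[acc] <- list(res)
  }
  results
}

# Usage, assuming `rs` from the query above:
# gse_list <- fetch_all_gse(rs$gse)
```

Passing the download function as an argument keeps the loop itself independent of the network, which makes it easier to try out on a few accessions first.
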
>> > Hope this helps. Sean Davis might have more tips for you. Please let us
>> > know if you have any questions.
>> >
>> > Jack
>> >
>> >
>> > On 3/13/09 7:17 AM, "Wacek Kusnierczyk"
>> > <waclaw.marcin.kusnierczyk@idi.ntnu.no> wrote:
>> >
>> >
>>> >> Dear Jack Zhu,
>>> >>
>>> >> I am using your excellent GEOmetadb Bioconductor package, it's very
>>> >> useful.
>>> >>
>>> >> I wonder whether you could give me a hint on how to access the GEO
>>> >> data -- both data and metadata -- directly, if possible. As you
>>> >> remarked yourself, using eutils is not always convenient, and the only
>>> >> programmatic access to the expression data I found is through
>>> >> downloading a file by ftp.
>>> >>
>>> >> My issue is that I'd like to scan through a large set of experiments
>>> >> where selected genes have interesting expression patterns, and only then
>>> >> retrieve the whole relevant data sets. As far as I can see, eutils give
>>> >> access to the metadata only, and while it is possible to browse
>>> >> expression profiles on the GEO profiles site, they are not accessible
>>> >> programmatically.
>>> >>
>>> >> Best regards,
>>> >> Wacek
>>> >>
>>> >>
>
>
> --
> ------------------------------------------------------------------------------
> Wacek Kusnierczyk, MD PhD
>
> Email: waku@idi.ntnu.no
> Phone: +47 73591875, +47 72574609
>
> Department of Computer and Information Science (IDI)
> Faculty of Information Technology, Mathematics and Electrical Engineering
> (IME)
> Norwegian University of Science and Technology (NTNU)
> Sem Saelands vei 7, 7491 Trondheim, Norway
> Room itv303
>
> Bioinformatics & Gene Regulation Group
> Department of Cancer Research and Molecular Medicine (IKM)
> Faculty of Medicine (DMF)
> Norwegian University of Science and Technology (NTNU)
> Laboratory Center, Erling Skjalgsons gt. 1, 7030 Trondheim, Norway
> Room 231.05.060
>
> ------------------------------------------------------------------------------
>
------ End of Forwarded Message