Wacek Kusnierczyk
Hello,
The getGEO function from GEOquery parses GEO SOFT files. With one
particular GSE file (GSE13638), it took over 15 minutes on my
not-so-crappy machine to parse the file (a local file, download time
excluded). I've written a simple parser in Perl; parsing the same
file and storing the data in a nested hash/array structure takes ca. 2
seconds. I'm pretty sure getGEO does more essential processing to
organize the data into a GSE object, but still, there seems to be an
incredibly inefficient implementation underneath.
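To make the comparison concrete, here is a minimal sketch of a line-oriented SOFT parser of the kind described. It is an illustration in Python, not the Perl script mentioned above, and the function name parse_soft is hypothetical. It assumes the usual SOFT line prefixes: ^ starts a new entity, ! carries an attribute, # describes a table column, and tab-separated table rows sit between the *_table_begin and *_table_end markers.

```python
from collections import defaultdict


def parse_soft(lines):
    """Parse SOFT-formatted lines into a list of nested dicts (one per entity)."""
    entities = []      # one dict per ^ENTITY block, in file order
    current = None     # entity currently being filled
    in_table = False   # True between *_table_begin and *_table_end
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue
        if line.startswith("^"):
            # e.g. "^SAMPLE = GSM1" opens a new entity
            kind, _, name = line[1:].partition("=")
            current = {
                "type": kind.strip(),
                "id": name.strip(),
                "attributes": defaultdict(list),  # keys may repeat
                "columns": {},
                "table": [],
            }
            entities.append(current)
            in_table = False
        elif current is None:
            continue  # ignore anything before the first entity line
        elif line.startswith("!"):
            key, _, value = line[1:].partition("=")
            key = key.strip()
            if key.lower().endswith("_table_begin"):
                in_table = True
            elif key.lower().endswith("_table_end"):
                in_table = False
            else:
                current["attributes"][key].append(value.strip())
        elif line.startswith("#"):
            # column description, e.g. "#VALUE = normalized signal"
            key, _, value = line[1:].partition("=")
            current["columns"][key.strip()] = value.strip()
        elif in_table:
            # table rows are kept as lists of strings; the first row is the header
            current["table"].append(line.split("\t"))
    return entities
```

Passing an open file handle for the local SOFT file as lines reads it in a single pass, with no per-row conversion or object construction. That is roughly the amount of work a 2-second parse corresponds to.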
I haven't looked at the source code yet, but here's a question: what
is the likely reason getGEO is so slow? Is it the parsing itself, or
rather wrapping the data into the appropriate structure? Where should
I start looking for code to improve?
vQ