Elizabeth Purdom
Hi,
Lately, my biomaRt queries have been taking much longer than they did
before (last spring/early summer).
For example, I frequently pull down the following information:
library(biomaRt)

mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# Attributes to retrieve for each gene
WHAT <- c("ensembl_gene_id", "ensembl_transcript_id",
          "chromosome_name",
          "strand",
          "exon_chrom_start",
          "exon_chrom_end",
          "rank",
          "ensembl_exon_id",
          "gene_biotype")
# `gene` is a vector of Ensembl gene IDs produced by earlier processing
anno <- getBM(WHAT, mart = mart,
              filters = "ensembl_gene_id", values = unique(gene),
              verbose = FALSE)
where I sometimes filter by gene (as in the code above) and otherwise
just bring everything down. It used to be so fast to bring everything
down that I automatically reran the code each time to make sure I was
current. (When I say fast, I don't have numbers, but I think much less
than 15 minutes.) Now it's very slow: the above code hasn't finished
after an hour. Unfortunately, I don't know how many genes it is, because
the list is the result of processing something and I don't know the
results a priori. This has been my experience several times now (both
with and without filtering on gene ID), and I know someone else in my
department has experienced the same thing without any change of code.
So my question is: is there something about this set of attributes that
now makes the combination very slow when it wasn't before? Would
dropping one or more specific attributes speed it up?
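One way to check whether a single attribute is responsible would be to
time the same query on a small subset of genes, leaving out one
attribute each time. The following is only a minimal sketch under that
assumption: time_without_each and time_biomart_attributes are
hypothetical helper names (not part of biomaRt), and the timings depend
on server load, so they are at best a rough indication.

```r
# Time a query once per attribute, each time leaving that attribute out.
# `run_query` is a hypothetical callback taking a character vector of
# attribute names; its return value is ignored, only elapsed time is kept.
time_without_each <- function(attrs, run_query) {
  vapply(attrs, function(dropped) {
    kept <- setdiff(attrs, dropped)
    unname(system.time(run_query(kept))["elapsed"])
  }, numeric(1))
}

# Hypothetical wrapper around the real query (requires biomaRt and
# network access). `genes` should be a small subset, e.g. head(unique(gene), 50).
time_biomart_attributes <- function(mart, genes, attrs) {
  time_without_each(attrs, function(kept) {
    biomaRt::getBM(kept, mart = mart,
                   filters = "ensembl_gene_id", values = genes)
  })
}
```

With the objects defined above, time_biomart_attributes(mart,
head(unique(gene), 50), WHAT) would return a named vector of elapsed
seconds, one entry per omitted attribute. An entry that is much smaller
than the rest would suggest that the corresponding attribute is the
expensive one.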
Thanks,
Elizabeth
--
Elizabeth Purdom
Assistant Professor
Department of Statistics
UC, Berkeley
Evans Hall, Rm 433
epurdom at stat.berkeley.edu
(510) 642-6154 (office)
(510) 642-7892 (fax)