non-BiomaRt die
Guest User ★ 13k
@guest-user-4897
Last seen 10.3 years ago
I tried to retrieve ensembl_gene_id and go_term for my Arabidopsis thaliana genes from my gene_name list:

> head(gene_name)
  gene_short_name
1 ANAC001
2 DCL1
3 MIR838A
4 AT1G01073
5 IQD18
6 AT1G01115
> unimart = useMart("plants_mart_20", dataset="athaliana_eg_gene")
> getBM(attributes=c("ensembl_gene_id", "go_accession"), filters=c("ensembl_gene_id"), values=gene_name, mart=unimart)

but got the following error. I could not figure it out. Is it an error on my side or from the BioMart server?

-- output of sessionInfo():

Error in getBM(attributes = c("ensembl_gene_id", "go_accession"), filters = c("ensembl_gene_id"), :
  Query ERROR: caught BioMart::Exception: non-BioMart die(): not well-formed (invalid token) at line 1, column 21728, byte 21728 at /usr/lib/perl5/XML/Parser.pm line 187

-- Sent via the guest posting facility at bioconductor.org.
@stephen-turner-4916
Last seen 6.4 years ago
United States

I found a bunch of weird characters in my gene names and had the same problem (chicken gene names from Ensembl that had -, /, ., ', and other oddities). Fixed it with a regex that keeps only purely alphanumeric names:

gene_name <- gene_name[grep("^[A-Za-z0-9]+$", gene_name, perl=TRUE)]
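
Since that filter silently discards every name containing anything other than ASCII letters and digits, it may be worth looking at what gets dropped before overwriting the vector. A minimal sketch, assuming gene_name is a plain character vector; the sample values and the names is_clean, kept and dropped are made up for illustration:

```r
# Hypothetical sample; real names would come from your own gene list.
gene_name <- c("ANAC001", "DCL1", "MIR838A", "AT1G01073", "BAD/NAME", "X-1", "")

# TRUE for names made up only of ASCII letters and digits.
is_clean <- grepl("^[A-Za-z0-9]+$", gene_name)

# Inspect what the filter would discard before discarding it.
dropped <- gene_name[!is_clean]
kept    <- gene_name[is_clean]

print(dropped)
```

Some legitimate gene names contain hyphens or dots, so the dropped names may need cleaning up or a separate lookup rather than removal.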

Dan Tenenbaum ★ 8.2k
@dan-tenenbaum-4256
Last seen 6 months ago
United States
----- Original Message -----
> From: "Waqasuddin Khan [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, waqasuddin at picb.ac.cn
> Sent: Friday, January 24, 2014 6:42:06 PM
> Subject: [BioC] non-BiomaRt die
>
> I tried to retrieve ensembl_gene_id and go_term for my arabidopsis
> thaliana gene from my gene_name list:
>
> > head(gene_name)
>   gene_short_name
> 1 ANAC001
> 2 DCL1
> 3 MIR838A
> 4 AT1G01073
> 5 IQD18
> 6 AT1G01115
>
> > unimart = useMart("plants_mart_20",dataset="athaliana_eg_gene")
>
> > getBM(attributes=c("ensembl_gene_id", "go_accession"),filters=c("ensembl_gene_id"),values=gene_name,mart=unimart)
>
> but got the following error? I did not figure it out? is it an error
> from my side or from the biomart server?
>
> -- output of sessionInfo():
>
> Error in getBM(attributes = c("ensembl_gene_id", "go_accession"),
> filters = c("ensembl_gene_id"), :
> Query ERROR: caught BioMart::Exception: non-BioMart die():
> not well-formed (invalid token) at line 1, column 21728, byte 21728
> at /usr/lib/perl5/XML/Parser.pm line 187

This error is happening on the server side. I know this because it is a Perl error and there is no Perl on the client side.

The question is why it is happening. My guess is that there is an invalid item in your gene_name vector. It could be a blank line, something that is too long, or invalid characters.

Things to try:

tools::showNonASCII(gene_name)   # non-ASCII characters?
max(nchar(gene_name))            # length of the longest gene name
which(nchar(gene_name)==0)       # which lines are blank?

Dan
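
The checks above can be combined into one pass over the identifiers. This is only a sketch under assumptions: the head(gene_name) output in the question suggests gene_name may be a data frame with a gene_short_name column, so the column is pulled out as a character vector first. The sample data and the names ids, blank_idx, non_ascii and xml_special are hypothetical. The check for XML-special characters is an extra guess prompted by the "not well-formed" message from the XML parser; whether such characters are actually the cause here is not confirmed.

```r
# Hypothetical stand-in for the poster's data; the real object is not shown in full.
gene_name <- data.frame(
  gene_short_name = c("ANAC001", "DCL1", "", "AT1G01073", "caf\u00e9", "A&B"),
  stringsAsFactors = FALSE
)

# Pull the column out as a plain character vector.
ids <- as.character(gene_name$gene_short_name)

# Positions of empty entries.
blank_idx <- which(nchar(ids) == 0)

# Positions of entries with any character outside printable ASCII.
non_ascii <- which(grepl("[^\\x20-\\x7E]", ids, perl = TRUE))

# Positions of entries with characters that are special in XML (assumed check).
xml_special <- which(grepl("[&<>\"']", ids))

# Length of the longest identifier.
longest <- max(nchar(ids))

print(list(blank = blank_idx, non_ascii = non_ascii,
           xml_special = xml_special, longest = longest))
```

Any positions reported can then be inspected with ids[blank_idx], ids[non_ascii] and so on before the vector is passed to getBM().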