possible bug in getBM{biomaRt}
@teresa-colombo-3383
Dear list,

it looks like there may be a bug in the function 'getBM' affecting the use of the attribute 'validated':

    > library(biomaRt)
    > mart = useMart("ensembl")
    > ensembl = useDataset("hsapiens_gene_ensembl", mart = mart)
    > library(hgu133plus2.db)
    > EID <- toTable(hgu133plus2ENTREZID)
    > I <- sample(dim(EID)[1], 100)
    > p <- EID[I, "probe_id"]
    > head(p)
    [1] "205953_at"   "214718_at"   "233633_at"   "241572_at"   "224704_at"
    [6] "221510_s_at"
    > Q <- getBM(attributes = c("affy_hg_u133_plus_2", "entrezgene", "validated"),
    +            filters = "affy_hg_u133_plus_2", mart = ensembl, values = p)
      V1
    1 Query ERROR: caught BioMart::Exception::Usage: Attributes from multiple attribute pages are not allowed
    Error in getBM(attributes = c("affy_hg_u133_plus_2", "entrezgene", "validated"), :
      Number of columns in the query result doesn't equal number of attributes in query. This is probably an internal error, please report.

The above error message disappears when running the same query after having removed "validated" from the list of attributes:

    > Q <- getBM(attributes = c("affy_hg_u133_plus_2", "entrezgene"),
    +            filters = "affy_hg_u133_plus_2", mart = ensembl, values = p)
    > head(Q)
      affy_hg_u133_plus_2 entrezgene
    1        1552538_a_at         NA
    2        1552538_a_at     221458
    3        1554485_s_at     140738
    4        1555097_a_at       5737
    5          1564015_at         NA
    6        1564198_a_at     118611

    > sessionInfo()
    R version 2.8.1 (2008-12-22)
    i486-pc-linux-gnu

    locale:
    LC_CTYPE=it_IT.UTF-8;LC_NUMERIC=C;LC_TIME=it_IT.UTF-8;LC_COLLATE=it_IT.UTF-8;LC_MONETARY=C;LC_MESSAGES=it_IT.UTF-8;LC_PAPER=it_IT.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=it_IT.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] tools     stats     graphics  grDevices utils     datasets  methods
    [8] base

    other attached packages:
    [1] hgu133plus2.db_2.2.5 RSQLite_0.7-1        DBI_0.2-4
    [4] AnnotationDbi_1.4.3  Biobase_2.2.2        biomaRt_1.16.0

    loaded via a namespace (and not attached):
    [1] RCurl_0.94-1 XML_2.1-0

Best wishes
teresa

Teresa Colombo, PhD Student
Prof G. Macino Lab
Dept of Cellular Biotechnology and Hematology
'Sapienza' University of Rome
Phone/Fax: +39 06 4457731
@rhoda-kinsella-3200
Hi Teresa,

The Ensembl mart attributes are organised into five separate categories, or "pages". In the martview interface these are: Features, Homologs, Structures, Sequences and Variations. At present it is not possible to mix attributes from more than one page in a single query; doing so produces the error message you received. In your query the "validated" attribute comes from the "Variations" page, while the other attributes come from the "Features" page.

The way around this is to perform two separate queries: one to retrieve the Features attributes and one to retrieve the Variations attribute. You may also be able to link two separate datasets, one for the "validated" part of the query and the other for the Features part, and pull out all the information you need that way. I'm not sure how this is done using biomaRt, but perhaps someone else on the mailing list can help you with it.

I hope that helps,
Regards,
Rhoda

Rhoda Kinsella Ph.D.
Ensembl Bioinformatician, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
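The page restriction described in this reply can be tried out offline. The sketch below is not a biomaRt call: it uses a small hand-written table whose page labels are the martview category names from the reply, and a hypothetical helper, `common_pages()`, that reports the pages on which all requested attributes occur. The placement of "ensembl_gene_id" on both pages follows what is said later in this thread. In recent biomaRt releases `listAttributes()` should return a comparable table with a `page` column, although the page labels there may differ from the names used here; treat that as an assumption to check against your installed version.

```r
# Toy attribute table (hand-written for illustration, NOT retrieved from BioMart).
# Page labels are the martview category names mentioned in the thread.
attr_pages <- data.frame(
  name = c("affy_hg_u133_plus_2", "entrezgene", "ensembl_gene_id",
           "validated", "ensembl_gene_id"),
  page = c("Features", "Features", "Features",
           "Variations", "Variations"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pages on which ALL requested attributes are available.
# An empty result means the attributes cannot be combined in one query.
common_pages <- function(attrs, tab) {
  pages_per_attr <- lapply(attrs, function(a) tab$page[tab$name == a])
  Reduce(intersect, pages_per_attr)
}

# The original query mixes Features and Variations attributes: no common page.
mixed <- common_pages(c("affy_hg_u133_plus_2", "entrezgene", "validated"),
                      attr_pages)

# "validated" together with "ensembl_gene_id" stays on a single page.
single <- common_pages(c("validated", "ensembl_gene_id"), attr_pages)

print(mixed)
print(single)
```

An empty result for the first call corresponds to the "Attributes from multiple attribute pages are not allowed" error; a non-empty result means the attributes can go into one `getBM` call.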
Dear Rhoda, Teresa

if I try that suggestion, by:

    library("biomaRt")
    mart = useMart("ensembl")
    ensembl = useDataset("hsapiens_gene_ensembl", mart = mart)

    library("hgu133plus2.db")
    EID = toTable(hgu133plus2ENTREZID)
    set.seed(0xbadbeef)
    I = sample(nrow(EID), 100)
    p = EID[I, "probe_id"]

    Q1 = getBM(attributes = c("affy_hg_u133_plus_2", "entrezgene"),
               filters = "affy_hg_u133_plus_2", mart = ensembl, values = p)
    Q2 = getBM(attributes = c("affy_hg_u133_plus_2", "validated"),
               filters = "affy_hg_u133_plus_2", mart = ensembl, values = p)

I get:

    Error in getBM(attributes = c("affy_hg_u133_plus_2", "validated"), filters = "affy_hg_u133_plus_2", :
      Query ERROR: caught BioMart::Exception::Usage: Attributes from multiple attribute pages are not allowed

(Note that this is the same error that Teresa encountered; more recent versions of biomaRt just report it more informatively.)

On the other hand, if I only do

    Q3 = getBM(attributes = c("validated"),
               filters = "affy_hg_u133_plus_2", mart = ensembl, values = p)

this returns successfully, but Q3 has 27 rows and there seems to be no way to find out which of the filter values each row belongs to, i.e., it is useless.

So what's the best way to proceed when one wants to query across multiple pages, as in Teresa's example?

Best wishes
Wolfgang

Wolfgang Huber, EMBL-EBI, http://www.ebi.ac.uk/huber

    > sessionInfo()
    R version 2.10.0 Under development (unstable) (2009-04-02 r48271)
    x86_64-unknown-linux-gnu

    locale:
    LC_CTYPE=C;LC_NUMERIC=C;LC_TIME=C;LC_COLLATE=C;LC_MONETARY=C;LC_MESSAGES=la_AU.UTF-8;LC_PAPER=C;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=C;LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices datasets  utils     methods   base

    other attached packages:
    [1] hgu133plus2.db_2.2.11 RSQLite_0.7-1         DBI_0.2-4
    [4] AnnotationDbi_1.5.23  Biobase_2.3.11        biomaRt_1.99.8
    [7] fortunes_1.3-6

    loaded via a namespace (and not attached):
    [1] RCurl_0.94-1 XML_2.3-0
Hi Wolfgang and Teresa,

In the query below, Q1 and Q2 are still accessing the same mart (mart=ensembl), so this will try to get the attributes from multiple pages of the same mart. You will need to set up two distinct mart datasets (mart1 and mart2) and, as Ensembl is gene centric, you will also have to include something for the two datasets to link on (e.g. Ensembl gene ID or Ensembl transcript ID, as these are present on the two pages that you are trying to access). I think you need to do something like this:

    > library(biomaRt)
    > mart1 = useMart("ensembl")
    > mart2 = useMart("ensembl")
    > ensembl1 = useDataset("hsapiens_gene_ensembl", mart = mart1)
    Checking attributes and filters ... ok
    > ensembl2 = useDataset("hsapiens_gene_ensembl", mart = mart2)
    Checking attributes and filters ... ok
    > getBM(attributes = c("affy_hg_u133_plus_2", "entrezgene", "ensembl_gene_id"),
    +       filters = "affy_hg_u133_plus_2", mart = ensembl1, values = "205953_at")
      affy_hg_u133_plus_2 entrezgene ensembl_gene_id
    1           205953_at       9860 ENSG00000198799
    > getBM(attributes = c("validated", "ensembl_gene_id"),
    +       filters = "affy_hg_u133_plus_2", mart = ensembl2, values = "205953_at")
                                     validated ensembl_gene_id
    1                                          ENSG00000198799
    2                                   hapmap ENSG00000198799
    3  cluster,freq,submitter,doublehit,hapmap ENSG00000198799
    4                              freq,hapmap ENSG00000198799
    5            cluster,freq,submitter,hapmap ENSG00000198799
    6                    freq,doublehit,hapmap ENSG00000198799
    7                        cluster,doublehit ENSG00000198799
    8                                  cluster ENSG00000198799
    9                      cluster,freq,hapmap ENSG00000198799
    10           cluster,freq,doublehit,hapmap ENSG00000198799
    11                  cluster,freq,doublehit ENSG00000198799
    12                                    freq ENSG00000198799
    13                               doublehit ENSG00000198799
    14                          cluster,hapmap ENSG00000198799
    15                   freq,submitter,hapmap ENSG00000198799

I am sure that there is a more elegant way of doing the linking using biomaRt. I hope this helps, but if anything is not clear, please don't hesitate to get in touch.

Regards,
Rhoda

Rhoda Kinsella Ph.D.
Ensembl Bioinformatician, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Dear Rhoda & Teresa

thanks! That clarifies it: the two query results are linked by "ensembl_gene_id". There is no need to open two different connections in biomaRt, so example code including merging and some clean up could look like:

    library("biomaRt")
    mart = useMart("ensembl")
    ensembl = useDataset("hsapiens_gene_ensembl", mart = mart)

    p = c("200641_s_at", "229411_at", "223376_s_at")

    Q1 = getBM(attributes = c("affy_hg_u133_plus_2", "entrezgene", "ensembl_gene_id"),
               filters = "affy_hg_u133_plus_2", mart = ensembl, values = p)
    Q2 = getBM(attributes = c("validated", "ensembl_gene_id"),
               filters = "affy_hg_u133_plus_2", mart = ensembl, values = p)

    ## clean up: ##
    Q1 = subset(Q1, !is.na(entrezgene))
    Q = merge(Q1, Q2)

Best wishes
Wolfgang

Wolfgang Huber, EMBL-EBI, http://www.ebi.ac.uk/huber
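The clean-up and merge step in the code above can be tried without a BioMart connection. The sketch below is only an illustration under stated assumptions: `Q1` and `Q2` are small hand-built data frames standing in for the two `getBM` results. The first `Q1` row and the three `validated` values are copied from the output shown earlier in the thread; the second `Q1` row, with a missing Entrez ID, is made up to show what the clean-up removes.

```r
# Stand-in for the Features query result (hand-built, not fetched from BioMart).
Q1 <- data.frame(
  affy_hg_u133_plus_2 = c("205953_at", "205953_at"),
  entrezgene          = c(9860L, NA),   # the NA row is made up for illustration
  ensembl_gene_id     = c("ENSG00000198799", "ENSG00000198799"),
  stringsAsFactors    = FALSE
)

# Stand-in for the Variations query result: several rows per gene.
Q2 <- data.frame(
  validated        = c("", "hapmap", "cluster,freq,hapmap"),
  ensembl_gene_id  = "ENSG00000198799",
  stringsAsFactors = FALSE
)

# Clean up: drop probe mappings without an Entrez gene ID.
Q1 <- subset(Q1, !is.na(entrezgene))

# Link the two results on the shared column. Naming the key explicitly
# makes the join column obvious; merge() would also find it by default.
Q <- merge(Q1, Q2, by = "ensembl_gene_id")

print(Q)
```

Because one gene can carry many `validated` values, the merged table has one row per combination of probe and `validated` entry, so it is usually longer than `Q1`.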