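Dear Steffen,
I am downloading some snoRNA sequences using biomaRt, and the column names of the resulting data.frame appear to be swapped: the column labelled ensembl_gene_id holds the sequences and the column labelled gene_exon_intron holds the gene IDs.
My code and session information are copied below.
Thanks for your package.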
David
> library (biomaRt)
>
> mart <- useDataset ("hsapiens_gene_ensembl", mart = useMart ("ensembl"))
>
> mydat <- getBM (c ("ensembl_gene_id", "gene_exon_intron"),
+ filters = "biotype",
+ values = "snoRNA",
+ mart = mart)
>
> mydat[1:3,]
ensembl_gene_id
1 CAGCCCTAAAATGGAAAAAATTTAAAATTACTTAGACAATGTGATGTCATCAAAGGAACCCTAAGTAA
2 GGGTGGTGATGAGAACCTTGTATTCTTCTGAAGAGAGGTGATGACTTAAAAACCATGCTCAATAGGATTACACTTAGGCCG
3 TCATCAGGTGGGATAATCCTTACCTGTTCCTCGTTTTGGAGGGCAGATAGAACAGGATAATTGGAGTTTGCATGATCCATGATTAATGTCTCTGTGTAATCAGGACTTGCAAACTCTGATTGTTCATATCTGAT
gene_exon_intron
1 ENSG00000201209
2 ENSG00000200801
3 ENSG00000199713
>
> sessionInfo ()
R version 3.2.0 (2015-04-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.2 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=es_ES.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] biomaRt_2.24.0
loaded via a namespace (and not attached):
[1] IRanges_2.2.1 DBI_0.3.1 parallel_3.2.0
[4] RCurl_1.95-4.6 Biobase_2.28.0 AnnotationDbi_1.30.1
[7] RSQLite_1.0.0 S4Vectors_0.6.0 BiocGenerics_0.14.0
[10] GenomeInfoDb_1.4.0 stats4_3.2.0 bitops_1.0-6
[13] XML_3.98-1.1
>
Hi Thomas,
This definitely looks like a case where the columns aren't returned in the same order they were requested, so biomaRt's default column naming ends up labelling them incorrectly. I've got a forked version of the package here (https://github.com/grimbough/biomaRt) which tries to match the correct attribute to each column when you set
bmHeader=TRUE
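
In the meantime, if you'd rather not install the fork, the labels can be repaired after the query. The snippet below is only a rough sketch of that idea and is not part of biomaRt: `fix_swapped_columns` is a hypothetical helper name, it assumes a two-column result, and it assumes that human Ensembl gene IDs match `ENSG` followed by 11 digits. It works out which column actually holds the IDs and relabels accordingly, using toy data with shortened sequences in place of a real query.

```r
# Hypothetical helper (NOT a biomaRt function): relabel a two-column result
# so that the column really holding Ensembl gene IDs is named id_col.
fix_swapped_columns <- function(df, id_col = "ensembl_gene_id",
                                seq_col = "gene_exon_intron") {
  stopifnot(is.data.frame(df), ncol(df) == 2)
  # Assumption: human Ensembl gene IDs look like "ENSG" plus 11 digits
  is_id <- vapply(df, function(x) all(grepl("^ENSG[0-9]{11}$", x)),
                  logical(1))
  if (sum(is_id) != 1) stop("could not identify a unique ID column")
  # Rename by content, not by the order in which attributes were requested
  names(df)[is_id] <- id_col
  names(df)[!is_id] <- seq_col
  # Return the columns in the requested order: IDs first, sequences second
  df[, c(id_col, seq_col)]
}

# Toy data mimicking the mislabelled output (sequences shortened)
mydat <- data.frame(
  ensembl_gene_id  = c("CAGCCCTAAAATGG", "GGGTGGTGATGAGA"),
  gene_exon_intron = c("ENSG00000201209", "ENSG00000200801"),
  stringsAsFactors = FALSE
)
fixed <- fix_swapped_columns(mydat)
```

Checking the content instead of trusting the column position means the helper leaves a correctly labelled result unchanged, so it should be safe to apply whether or not the swap occurred.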