Hi,
I'm using biomaRt to extract amino-acid sequences from Ensembl together with the APPRIS transcript status. I am using the getBM function, as I couldn't find a way to add the APPRIS status using getSequence. However, I get an error when I request (besides the sequence) both the SwissProt ID and the APPRIS status, but not when I request only one of them.
I get an error with:
mart <- biomaRt::useMart(biomart="ENSEMBL_MART_ENSEMBL",
                         dataset="mmusculus_gene_ensembl",
                         host="www.ensembl.org")
biomaRt::getBM(values="Q80YE7",
               filters="uniprot_gn",
               attributes=c("uniprotswissprot", "peptide", "transcript_appris"),
               mart=mart)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
  line 2 did not have 3 elements
but not with (edited output):
> biomaRt::getBM(values="Q80YE7",
                 filters="uniprot_gn",
                 attributes=c("uniprotswissprot", "peptide"),
                 mart=mart)
  uniprotswissprot                    peptide
1           Q80YE7 MTVFRQENVDDYYDTGEELGSGQ...
2                            XYENKTDVILILELR*
3           Q80YE7 MTVFRQENVDDYYDTGEELGSGQ...
> biomaRt::getBM(values="Q80YE7",
                 filters="uniprot_gn",
                 attributes=c("peptide", "transcript_appris"),
                 mart=mart)
                     peptide transcript_appris
1 MTVFRQENVDDYYDTGEELGSGQ...        principal1
2       Sequence unavailable
3           XYENKTDVILILELR*
4 MTVFRQENVDDYYDTGEELGSGQ...
> biomaRt::getBM(values="Q80YE7",
                 filters="uniprot_gn",
                 attributes=c("uniprotswissprot", "transcript_appris"),
                 mart=mart)
  uniprotswissprot transcript_appris
1           Q80YE7
2           Q80YE7        principal1
3
I am not quite sure what to do next; any help is appreciated.
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS: /home/eblanc/R/R-3.5.1/lib/libRblas.so
LAPACK: /home/eblanc/R/R-3.5.1/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.18         AnnotationDbi_1.42.1 magrittr_1.5
 [4] BiocGenerics_0.26.0  hms_0.4.2            progress_1.2.0
 [7] IRanges_2.14.12      bit_1.1-14           R6_2.2.2
[10] rlang_0.2.2          httr_1.3.1           stringr_1.3.1
[13] blob_1.1.1           tools_3.5.1          parallel_3.5.1
[16] Biobase_2.40.0       DBI_1.0.0            bit64_0.9-7
[19] digest_0.6.17        assertthat_0.2.0     crayon_1.3.4
[22] S4Vectors_0.18.3     bitops_1.0-6         curl_3.2
[25] RCurl_1.95-4.11      biomaRt_2.36.1       memoise_1.1.0
[28] RSQLite_2.1.1        stringi_1.2.4        compiler_3.5.1
[31] prettyunits_1.0.2    stats4_3.5.1         XML_3.98-1.16
[34] pkgconfig_2.0.2