ROka
@roka-11670
Hi, I am trying to create a custom BSgenome package for the maize genome. I keep getting an error from XVector that I do not know how to solve.
Here is what I do and what I get:
> library(BSgenome)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, cbind, colnames, do.call,
    duplicated, eval, evalq, Filter, Find, get, grep, grepl,
    intersect, is.unsorted, lapply, lengths, Map, mapply, match,
    mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position,
    rank, rbind, Reduce, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:base’:

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: Biostrings
Loading required package: XVector
Loading required package: rtracklayer
> forgeBSgenomeDataPkg("seed_Zea_mays.AGPv4.txt")
Creating package in ./BSgenome.Zmays.EnsemblPlants.AGPv4r32
Loading '1' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.1.fa' ... DONE
Loading '2' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.2.fa' ... DONE
Loading '3' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.3.fa' ... DONE
Loading '4' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.4.fa' ... DONE
Loading '5' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.5.fa' ... DONE
Loading '6' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.6.fa' ... DONE
Loading '7' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.7.fa' ... DONE
Loading '8' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.8.fa' ... DONE
Loading '9' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.9.fa' ... DONE
Loading '10' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.10.fa' ... DONE
Loading '1' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.1.fa' ... DONE
Loading '2' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.2.fa' ... DONE
Loading '3' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.3.fa' ... DONE
Loading '4' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.4.fa' ... DONE
Loading '5' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.5.fa' ... DONE
Loading '6' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.6.fa' ... DONE
Loading '7' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.7.fa' ... DONE
Loading '8' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.8.fa' ... DONE
Loading '9' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.9.fa' ... DONE
Loading '10' sequence from FASTA file '~/Documents/ref/AGPv4/Zea_mays.AGPv4.dna.chromosome.10.fa' ... DONE
Error in XVector:::new_XVectorList_from_list_of_XVector(tmp_class, x) :
  all elements in 'x' must be DNAString objects
This is the seed file content:
Package: BSgenome.Zmays.EnsemblPlants.AGPv4r32
Title: Zea mays (EnsemblPlants AGPv4 release 32)
Description: Zea mays full genome as provided by EnsemblPlants (AGPv4, release 32)
Version: 4.32
organism: Zea mays
common_name: maize
provider: EnsemblPlants
provider_version: 4.32
release_date: Aug. 2016
release_name: AGPv4
source_url: ftp://ftp.ensemblgenomes.org/pub/release-32/plants/fasta/zea_mays/dna/
organism_biocview: Zea_mays
BSgenomeObjname: Zmays
seqs_srcdir: ~/Documents/ref/AGPv4
seqfiles_prefix: Zea_mays.AGPv4.dna.chromosome.
seqfiles_suffix: .fa
seqnames: paste(c(1:9, 10, paste(c(1:9, 10), sep="")), sep="")
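For reference, the seqnames expression from the seed file can be evaluated on its own in plain R to see which names it produces. The sketch below only uses base R. It shows that the expression yields each chromosome name twice, which matches the log above where every FASTA file is loaded two times. Whether the duplicated names are what triggers the XVector error is not confirmed here, and the variable `seqnames_unique` is a hypothetical alternative added for illustration, not something taken from this thread.

```r
# The seqnames expression, copied verbatim from the seed file
seqnames <- paste(c(1:9, 10, paste(c(1:9, 10), sep="")), sep="")

# How many names does the expression produce, and are any repeated?
length(seqnames)              # total number of sequence names
anyDuplicated(seqnames) > 0   # TRUE when at least one name occurs twice
table(seqnames)               # count of occurrences per name

# Hypothetical alternative (an assumption for illustration only):
# one name per chromosome file, with no repeats
seqnames_unique <- as.character(1:10)
anyDuplicated(seqnames_unique) > 0
```

If a single name per chromosome is what is intended, an expression such as `as.character(1:10)` in the seqnames field would be one way to write it, under the assumption that the field accepts any R expression evaluating to a character vector.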
The versions I use are:
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.3 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] BSgenome_1.40.1      rtracklayer_1.32.2   Biostrings_2.40.2
[4] XVector_0.12.1       GenomicRanges_1.24.3 GenomeInfoDb_1.8.7
[7] IRanges_2.6.1        S4Vectors_0.10.3     BiocGenerics_0.18.0

loaded via a namespace (and not attached):
[1] XML_3.98-1.4               Rsamtools_1.24.0
[3] bitops_1.0-6               GenomicAlignments_1.8.4
[5] zlibbioc_1.18.0            BiocParallel_1.6.6
[7] tools_3.3.1                Biobase_2.32.0
[9] RCurl_1.95-4.8             SummarizedExperiment_1.2.3
Does anyone know how to fix the problem?
Thank you in advance!
I never could make it work with reading the sequence files once, but your solution worked! Thank you!