David Iles
Hi,
I need to re-map the probe sequences of the Affymetrix Bovine genome
array to a recent draft sequence of the sheep genome (please, don't
ask why...). As a first step, I successfully created a new BSgenome
package from a seed file, listing the individual chromosomes as
'seqnames' and the two multiple-sequence FASTA files of unmapped
sequences (unmapped_scaffolds and unmapped_contigs) as 'mseqnames',
as per the forgeBSgenomeDataPkg vignette (see session info below).
When calling the matchPDict() function to map the probe sequences to
the + and - strands of individual chromosomes, all went smoothly, but
the following error occurred with multiple sequences:
> runAnConScaff(bt.probes.all, outfile="bt.probes.2.oarv3.1.unmapped.txt")
Target: strand + of Oar v3.1 sequence unmapped_scaffolds, unmapped_contigs
>>> Finding all hits in strand + of sequence unmapped_scaffolds ...
Error in matchPDict(pdict, subject) :
  please use vmatchPDict() when 'subject' is an XStringSet object (multiple sequence)
So I edited my script to call vmatchPDict() instead, with the
following result:
> runAnConScaff(bt.probes.all, outfile="bt.probes.2.oarv3.1.unmapped.txt")
Target: strand + of Oar v3.1 sequence unmapped_scaffolds, unmapped_contigs
>>> Finding all hits in strand + of sequence unmapped_scaffolds ...
Error in .local(pdict, subject, max.mismatch, min.mismatch, with.indels, :
  vmatchPDict() is not ready yet, sorry
I can work around this by splitting the multiple-sequence files into
many small FASTA files, each containing a single sequence, but is
vmatchPDict() likely to be ready in the not-too-distant future?
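
An in-memory version of the same workaround might avoid the extra FASTA files: loop over the elements of the multiple-sequence object and call matchPDict() on one DNAString at a time. The following is only a sketch under assumptions. The toy `scaffolds` and `probes` objects and the helper `match_one()` are hypothetical stand-ins, not part of my script, and only the + strand is shown.

```r
library(Biostrings)

# Hypothetical toy data standing in for the unmapped scaffolds and the probes.
scaffolds <- DNAStringSet(c(scaf1 = "ACGTACGTTTGACCA",
                            scaf2 = "GGGTTGACCCACGTA"))
probes <- DNAStringSet(c(p1 = "ACGTA", p2 = "TTGAC", p3 = "CCCCC"))

# Preprocess the probes once; all patterns here have the same width.
pdict <- PDict(probes)

# Hypothetical helper: match every probe against the i-th sequence only.
match_one <- function(i) {
  # [[ ]] extracts a single DNAString, which matchPDict() accepts as 'subject'.
  hits <- matchPDict(pdict, scaffolds[[i]])
  starts <- startIndex(hits)   # list with one element per probe
  n <- lengths(starts)         # number of hits per probe (0 when none)
  data.frame(seqname = rep(names(scaffolds)[i], sum(n)),
             probe   = rep(names(probes), n),
             start   = unlist(starts, use.names = FALSE),
             stringsAsFactors = FALSE)
}

# One table of + strand hits across all sequences in the set.
hits_plus <- do.call(rbind, lapply(seq_along(scaffolds), match_one))
```

The - strand could presumably be handled the same way by passing reverseComplement(scaffolds[[i]]) as the subject and converting the reported positions back to + strand coordinates.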
Many thanks
Dr David Iles
School of Biology
University of Leeds
Leeds LS2 9JT
d.e.iles at leeds.ac.uk
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] BSgenome.Oaries.ISGC.Oarv3.1 BSgenome_1.26.1              Biostrings_2.26.2
[4] GenomicRanges_1.10.5         IRanges_1.16.4               BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] parallel_2.15.2 stats4_2.15.2 tools_2.15.2
>