Version 1.18.3 of Rsamtools introduced support for parsing a PEDIGREE header line from a BCF file in the function .bcfHeaderAsSimpleList; however, I'm no longer able to parse any of the VCFs I have with a PEDIGREE header. My VCFs have headers such as
PEDIGREE=<Derived=ID1,Original=ID2>
If I use the function readVcf (from the VariantAnnotation package) I get the following error:
Error in FUN(X[[1L]], ...) : subscript out of bounds
traceback() reveals that this error originates from .bcfHeaderAsSimpleList. Is this a bug in Rsamtools, or am I misunderstanding how the PEDIGREE header should be used? I've taken the PEDIGREE line directly from the VCF file format specification.
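For anyone trying to reproduce this, here is a minimal sketch that writes a tiny VCF containing the PEDIGREE line above and attempts to read it. The file contents, the sample names ID1 and ID2, and the genome label "hg19" are assumptions made up for illustration, not taken from my real data, and whether the error appears will depend on the installed Rsamtools version.

```r
# Hypothetical minimal VCF with a PEDIGREE header line (contents are illustrative only).
header_cols <- c("#CHROM", "POS", "ID", "REF", "ALT", "QUAL",
                 "FILTER", "INFO", "FORMAT", "ID1", "ID2")
record <- c("1", "100", ".", "A", "G", "50", "PASS", ".", "GT", "0/1", "0/0")

vcf_lines <- c(
  "##fileformat=VCFv4.2",
  "##PEDIGREE=<Derived=ID1,Original=ID2>",
  "##FORMAT=<ID=GT,Number=1,Type=String,Description=\"Genotype\">",
  paste(header_cols, collapse = "\t"),
  paste(record, collapse = "\t")
)

vcf_path <- tempfile(fileext = ".vcf")
writeLines(vcf_lines, vcf_path)

# Only attempt the read if VariantAnnotation is installed; keep the error
# message (if any) instead of stopping, so it can be inspected afterwards.
result <- NULL
if (requireNamespace("VariantAnnotation", quietly = TRUE)) {
  result <- tryCatch(
    VariantAnnotation::readVcf(vcf_path, genome = "hg19"),
    error = function(e) conditionMessage(e)
  )
}
```

If the read fails, result holds the error message as a character string; otherwise it holds the parsed VCF object.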
Regards,
Jonathan
I think there are a couple of problems here. SAMPLE was not parsed correctly, and it looks like your SAMPLE lines are not valid. As per the specs, the value of 'Description' should be enclosed in quotes, with a semicolon separating the two values. Maybe the quotes got mangled or this was a cut and paste error? If not, please show me where they came from.
I took these sample lines
##SAMPLE=<ID=Blood,Genomes=Germline,Mixture=1.,Description="Patient germline genome">
##SAMPLE=<ID=TissueSample,Genomes=Germline;Tumor,Mixture=.3;.7,Description="Patient germline genome;Patient tumor genome">
from page 18 of the 4.2 specs, http://samtools.github.io/hts-specs/VCFv4.2.pdf, and added them to the extdata/ex2.vcf sample file in VariantAnnotation.
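To make the expected layout concrete, here is a rough base R sketch that splits a structured header line into its key/value pairs, treating a quoted value as a single unit. The function name parse_meta_line is hypothetical and this is not how Rsamtools parses headers internally; it only illustrates the quoting and semicolon conventions described above.

```r
# Hypothetical helper: split a "##KEY=<k1=v1,k2=\"v2\">" line into its fields.
parse_meta_line <- function(line) {
  m <- regmatches(line, regexec("^##([^=]+)=<(.*)>$", line))[[1]]
  if (length(m) != 3L) stop("not a structured header line: ", line)
  body <- m[3]
  # Each pair is key=value; a value is either a quoted string (which may
  # contain commas) or a run of characters up to the next comma.
  pattern <- '[^=,]+=("[^"]*"|[^,]*)'
  pairs <- regmatches(body, gregexpr(pattern, body, perl = TRUE))[[1]]
  keys <- sub("=.*$", "", pairs)
  vals <- sub("^[^=]+=", "", pairs)
  vals <- gsub('^"|"$', "", vals)  # drop the enclosing quotes
  names(vals) <- keys
  list(type = m[2], fields = vals)
}

sample_line <- paste0(
  "##SAMPLE=<ID=TissueSample,Genomes=Germline;Tumor,Mixture=.3;.7,",
  "Description=\"Patient germline genome;Patient tumor genome\">"
)
parsed <- parse_meta_line(sample_line)

# Multi-valued fields are separated by semicolons.
genomes <- strsplit(parsed$fields[["Genomes"]], ";", fixed = TRUE)[[1]]
descriptions <- strsplit(parsed$fields[["Description"]], ";", fixed = TRUE)[[1]]
```

Here genomes and descriptions each have two elements, one per genome in the mixture. A Description value without its enclosing quotes would still split on commas in this sketch, which is one way a mangled line can end up with the wrong number of fields.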
With Rsamtools 1.19.52 and VariantAnnotation 1.13.48:
Valerie
Yes, that appears to be a cut and paste error: my actual VCF does contain correctly formatted lines. Sorry about that. Thanks for the fixes.
Jonathan