Hi,
I'm trying to write a VCF file using the VariantAnnotation package.
Some of my sites are physically phased and therefore have the GATK PGT and PID FORMAT fields (see VCF comment records below):
##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another">
##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
For downstream analyses I need that only the VCF records which are physically phased to have these FORMAT fields but all other VCF records no to have that.
I can't seem to be able to set this using the geno(out.vcf)$PGT and geno(out.vcf)$PID commands - they seem only to be able to assign these fields to either all records in the VCF or none.
Any attempt to subset these gives the error:
Error in geno(out.vcf)$PGT[idx, 1] = NULL :
number of items to replace is not a multiple of replacement length
Help would be appreciated.