seqVCF2GDS Error Converting VCF to GDS file
@winstondunnmd-21581

Dear Bioconductor:

I am a student in SISG Module 17 and used the following code to convert my VCF file to a GDS file:

vcffile <- "data/72S1.vcf.gz"
gdsfile <- "data/72S1.gds"
seqVCF2GDS(vcffile, gdsfile, fmt.import="GT", storage.option="LZMA_RA", verbose=FALSE)

The VCF file was generated from human whole-exome sequencing (WES) data on basespace.illumina.com using Illumina's Enrichment app. It contains a single patient.

I received the following error message.

Error in seqVCF2GDS(vcffile, gdsfile, fmt.import = "GT", storage.option = "LZMA_RA", :
  INFO ID 'GMAF' (Number=A) should have 0 value(s), but receives 1.
FILE: C:\Users\winst\Documents\data\72S1.vcf.gz
LINE: 160, COLUMN: 8, RefMinor;GMAF=C|0.04812;phyloP=-1.165;CSQT=1|DDX11L1|ENST00000456328|downstream_gene_variant,1|WASH7P|ENST00000438504|intron_variant&non_coding_transcript_variant

Please help.

Winston Dunn

@stephanie-m-gogarten-5121
University of Washington

seqVCF2GDS is strict about VCF files conforming to the VCF standard. In this case, the header line for "GMAF" declares "Number=A", which means there should be one value per alternate allele. The file appears to contain a row with no alternate allele (so seqVCF2GDS expects 0 values), yet that row still provides a value for "GMAF". You may be able to fix this just by modifying the header, either in the VCF file itself or by saving a separate file containing only the header and editing that copy instead. You can then pass the alternate header to seqVCF2GDS:

hdr <- seqVCF_Header("revised_header.vcf")
gdsfile <- seqVCF2GDS(vcffile, gdsfile, header=hdr)
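
For the "separate file with just the header" route, here is a minimal base-R sketch of one way to produce such a file. The function name `write_revised_header` and its arguments are hypothetical, not part of SeqArray. It assumes that header lines start with "#", and that relaxing the tag to `Number=.` (an unspecified number of values) is acceptable for your analysis.

```r
# Hypothetical helper (not part of SeqArray): copy the header lines of a VCF
# into a separate file, changing the Number field of one INFO tag on the way.
write_revised_header <- function(vcf_path, out_path,
                                 info_id = "GMAF", new_number = ".") {
  con <- gzfile(vcf_path, open = "rt")  # gzfile() can also read plain text
  on.exit(close(con))

  header <- character()
  repeat {
    line <- readLines(con, n = 1)
    # Stop at end of file or at the first data line
    if (length(line) == 0 || !startsWith(line, "#")) break
    header <- c(header, line)
  }

  # Only touch the ##INFO line whose ID matches info_id
  target <- startsWith(header, paste0("##INFO=<ID=", info_id, ","))
  header[target] <- sub("Number=[^,>]+", paste0("Number=", new_number),
                        header[target])

  writeLines(header, out_path)
  invisible(header)
}
```

Calling `write_revised_header("data/72S1.vcf.gz", "revised_header.vcf")` should leave a header-only file that can be passed to `seqVCF_Header()` as in the snippet above. It is worth opening the result to confirm the GMAF line reads as intended before converting.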

Thank you, Stephanie! Illumina BaseSpace provides two apps for generating VCF files, "Enrichment" and "BWA Enrichment", and they cost exactly the same. When I generated the VCF files with BWA Enrichment, the problem did not occur.

zhengx ▴ 30
@zhengx-7950
United States

You can directly modify the header in R:

hdr <- seqVCF_Header("data/72S1.vcf.gz")
hdr$info$Number[hdr$info$ID == "GMAF"] <- "."

gdsfile <- seqVCF2GDS(vcffile, gdsfile, header=hdr)
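
Before relaxing the header, it can help to see how many records actually disagree with `Number=A`. The following is a rough base-R sketch, not a SeqArray feature: the function name `find_number_a_mismatches` is hypothetical. It assumes a tab-delimited VCF with ALT in column 5 and INFO in column 8, and it reads the whole file into memory, so it suits small files such as a single-sample exome.

```r
# Hypothetical helper (not part of SeqArray): list data lines where an INFO
# tag carries a different number of values than there are ALT alleles.
find_number_a_mismatches <- function(vcf_path, info_id = "GMAF") {
  con <- gzfile(vcf_path, open = "rt")
  on.exit(close(con))
  lines <- readLines(con)

  # Data lines are the non-empty lines that do not start with '#'
  data_idx <- which(!startsWith(lines, "#") & nzchar(lines))
  fields <- strsplit(lines[data_idx], "\t", fixed = TRUE)
  alt  <- vapply(fields, function(f) f[5], character(1))
  info <- vapply(fields, function(f) f[8], character(1))

  # A missing ALT is written as "." and counts as zero alternate alleles
  n_alt <- ifelse(alt == ".", 0L, lengths(strsplit(alt, ",", fixed = TRUE)))

  # Count the comma-separated values of the tag on each line
  n_val <- vapply(strsplit(info, ";", fixed = TRUE), function(entries) {
    hit <- entries[startsWith(entries, paste0(info_id, "="))]
    if (length(hit) == 0) return(NA_integer_)  # tag absent on this line
    value <- sub("^[^=]*=", "", hit[1])
    length(strsplit(value, ",", fixed = TRUE)[[1]])
  }, integer(1))

  bad <- !is.na(n_val) & n_val != n_alt
  data.frame(line = data_idx[bad], n_alt = n_alt[bad], n_values = n_val[bad])
}
```

For example, `find_number_a_mismatches("data/72S1.vcf.gz")` returns one row per offending record, with `line` giving its position in the file. If only rows without an alternate allele show up, that supports treating the header declaration as the problem, not the data.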