scanVcf: FORMAT 'GT' not found
seth redmond ▴ 70
@seth-redmond-5037
Last seen 10.2 years ago
I keep running into an error in my VCF files but can't seem to pinpoint where the problem is. The file has a number of missing genotypes, but nothing that should be causing any problems, I don't think, and it passes vcf-validator without any problem. Completely unremarkable code and the head of the file are below.

Has anyone encountered this before? Or does anyone have suggestions as to what the issue might be?

thanks

-s

> filename <- "tmpvcf.vcf.gz"
> vcftab <- TabixFile(filename, index = paste(filename, "tbi", sep="."))
> vcfScan <- scanVcf(filename)
trace: scanVcf(filename)
trace: scanVcf(con)
Error: scanVcf: record 1 field 1 FORMAT 'GT' not found
  path: tmpvcf.vcf.gz

bash-3.2$ vcf-validator tmpvcf.vcf.gz
The header tag 'reference' not present. (Not required but highly recommended.)
The header tag 'contig' not present for CHROM=2R. (Not required but highly recommended.)
The header tag 'contig' not present for CHROM=3L. (Not required but highly recommended.)

##fileformat=VCFv4.1
##samtoolsVersion=0.1.18 (r982:295)
##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">
##INFO=<ID=DP4,Number=4,Type=Integer,Description="# high-quality ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
##FORMAT=<ID=DP4,Number=4,Type=Integer,Description="# high-quality ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
##INFO=<ID=MQ,Number=1,Type=Integer,Description="Root-mean-square mapping quality of covering reads">
##INFO=<ID=FQ,Number=1,Type=Float,Description="Phred probability of all samples being the same">
##INFO=<ID=AF1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele frequency (assuming HWE)">
##INFO=<ID=AC1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele count (no HWE assumption)">
##INFO=<ID=G3,Number=3,Type=Float,Description="ML estimate of genotype frequencies">
##INFO=<ID=HWE,Number=1,Type=Float,Description="Chi^2 based HWE test P-value based on G3">
##INFO=<ID=CLR,Number=1,Type=Integer,Description="Log ratio of genotype likelihoods with and without the constraint">
##INFO=<ID=UGT,Number=1,Type=String,Description="The most probable unconstrained genotype configuration in the trio">
##INFO=<ID=CGT,Number=1,Type=String,Description="The most probable constrained genotype configuration in the trio">
##INFO=<ID=PV4,Number=4,Type=Float,Description="P-values for strand bias, baseQ bias, mapQ bias and tail distance bias">
##INFO=<ID=PC2,Number=2,Type=Integer,Description="Phred probability of the nonRef allele frequency in group1 samples being larger (,smaller) than in group2.">
##INFO=<ID=PCHI2,Number=1,Type=Float,Description="Posterior weighted chi^2 P-value for testing the association between group1 and group2 samples.">
##INFO=<ID=QCHI2,Number=1,Type=Integer,Description="Phred scaled PCHI2.">
##INFO=<ID=PR,Number=1,Type=Integer,Description="# permutations yielding a smaller PCHI2.">
##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GL,Number=3,Type=Float,Description="Likelihoods for RR,RA,AA genotypes (R=ref,A=alt)">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="# high-quality bases">
##FORMAT=<ID=SP,Number=1,Type=Integer,Description="Phred-scaled strand bias P-value">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
##source_20121102.1=./vcf-merge -s Fd03_high.vcf.gz Fd03_low.vcf.gz Fd03_zero.vcf.gz
##sourceFiles_20121102.1=0:Fd03_high.vcf.gz,1:Fd03_low.vcf.gz,2:Fd03_zero.vcf.gz
##INFO=<ID=SF,Number=.,Type=String,Description="Source File (index to sourceFiles, f when filtered)">
##INFO=<ID=AC,Number=.,Type=Integer,Description="Allele count in genotypes">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Fd03_high.vcf Fd03_low.vcf Fd03_zero.vcf
2R 23990061 . G A 152.33 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=3,0,2,4;DP=9;FQ=18.1;MQ=35;PV4=0.17,1,1,1;SF=0,1,2;VDB=0.0474 GT:DP4:GQ:DP:PL 0/1:3,0,2,4:48:9:121,0,45 0/1:1,3,6,5:90:15:212,0,87 0/1:2,3,7,5:99:17:214,0,103
2R 23990067 . G A 32.80 . AC1=1;AC=2;AF1=0.5;AN=4;DP4=4,1,2,3;DP=10;FQ=64.8;MQ=35;PV4=0.52,0.022,1,1;SF=0,1,2;VDB=0.0297 GT:DP4:GQ:DP:PL 0/1:4,1,2,3:95:10:92,0,106 .:6,8,2,1:.:17:20,.,. 0/1:8,8,1,4:59:21:56,0,255
2R 23990070 . T C 109.67 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=3,0,3,4;DP=11;FQ=10.4;MQ=35;PV4=0.2,0.091,1,1;SF=0,1,2;VDB=0.0474 GT:DP4:GQ:DP:PL 0/1:3,0,3,4:40:10:104,0,37 0/1:2,3,6,6:99:17:152,0,103 0/1:2,4,7,9:95:22:163,0,92
2R 23990073 . T C 100.33 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=3,0,3,4;DP=12;FQ=16.1;MQ=35;PV4=0.2,0.025,1,1;SF=0,1,2;VDB=0.0504 GT:DP4:GQ:DP:PL 0/1:3,0,3,4:46:10:101,0,43 0/1:2,3,6,5:99:16:134,0,103 0/1:2,4,7,9:99:22:156,0,113
2R 23990083 . T G 99.92 . AC1=1;AC=2;AF1=0.4995;AN=4;DP4=3,3,3,0;DP=10;FQ=3.02;MQ=38;PV4=0.46,5.9e-05,0.23,1;SF=0,1,2;VDB=0.0426 GT:GQ:DP4:DP:PL .:.:3,3,3,0:9:27,.,. 0/1:38:2,1,6,8:17:165,0,35 0/1:81:1,4,8,10:23:190,0,78
2R 23990100 . A C 114.67 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=4,2,3,1;DP=10;FQ=68;MQ=39;PV4=1,0.41,0.38,0.041;SF=0,1,2;VDB=0.0386 GT:DP4:GQ:DP:PL 0/1:4,2,3,1:98:10:95,0,141 0/1:4,5,3,6:99:18:167,0,172 0/1:4,6,3,6:99:19:172,0,185
2R 23990108 . T A 21.40 . AC1=1;AC=1;AF1=0.5;AN=2;DP4=5,2,3,2;DP=12;FQ=24;MQ=39;PV4=1,3.8e-05,1,1;SF=0,1,2;VDB=0.0075 GT:DP4:GQ:DP:PL 0/1:5,2,3,2:54:12:51,0,146 .:8,6,0,3:.:17:16,.,. .:5,10,1,2:.:18:1,.,.
2R 23990114 . C T 113.00 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=6,3,4,1;DP=14;FQ=81;MQ=40;PV4=1,1,0.24,1;SF=0,1,2;VDB=0.0523 GT:DP4:GQ:DP:PL 0/1:6,3,4,1:99:14:108,0,181 0/1:4,4,3,5:99:16:166,0,147 0/1:3,4,2,7:99:16:155,0,158
2R 23990116 . A T 20.25 . AC1=1;AC=1;AF1=0.4871;AN=2;DP4=8,3,2,1;DP=14;FQ=-14.2;MQ=40;PV4=1,6e-05,0.093,0.25;SF=0,1,2;VDB=0.0282 GT:GQ:DP4:DP:PL .:.:8,3,2,1:14:13,.,. 0/1:40:4,9,4,1:18:38,0,204 .:.:5,10,1,1:17:0,.,.
2R 23990120 . G C 189.67 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=4,2,6,3;DP=15;FQ=103;MQ=40;PV4=1,1,0.026,1;SF=0,1,2;VDB=0.0532 GT:DP4:GQ:DP:PL 0/1:4,2,6,3:99:15:188,0,130 0/1:0,3,8,7:19:18:252,0,16 0/1:2,5,4,8:99:19:219,0,134
2R 23990143 . A C 190.67 . AC1=2;AC=6;AF1=1;AN=6;DP4=0,0,6,4;DP=11;FQ=-57;MQ=43;SF=0,1,2;VDB=0.0436 GT:DP4:GQ:DP:PL 1/1:0,0,6,4:57:10:248,30,0 1/1:0,0,3,6:51:9:212,27,0 1/1:0,0,2,7:51:9:211,27,0
2R 23990147 . A T 15.36 . AC1=1;AC=1;AF1=0.5;AN=2;DP4=5,6,2,1;DP=15;FQ=27;MQ=39;PV4=1,0.25,1,1;SF=0,1,2;VDB=0.0352 GT:DP4:GQ:DP:PL 0/1:5,6,2,1:57:14:54,0,230 .:7,5,0,2:.:14:15,.,. .:7,6,0,2:.:15:24,.,.
2R 23990163 . G A 38.03 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=2,2,2,3;DP=14;FQ=44;MQ=43;PV4=1,4e-05,0.44,0.19;SF=0,1,2;VDB=0.0532 GT:DP4:GQ:DP:PL 0/1:2,2,2,3:74:9:71,0,106 0/1:0,1,4,1:20:6:66,0,17 0/1:0,2,4,1:51:7:67,0,48
2R 23990164 . T C 24.03 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=4,5,2,3;DP=14;FQ=22;MQ=41;PV4=1,0.00033,1,0.056;SF=0,1,2;VDB=0.0532 GT:DP4:GQ:DP:PL 0/1:4,5,2,3:52:14:49,0,164 0/1:3,2,4,1:56:10:53,0,77 0/1:1,4,4,1:63:10:60,0,96
2R 23990171 . T C 74.67 . AC1=1;AC=3;AF1=0.5;AN=6;DP4=4,5,3,4;DP=16;FQ=71;MQ=41;PV4=1,6.1e-07,0.1,1;SF=0,1,2;VDB=0.0532 GT:DP4:GQ:DP:PL 0/1:4,5,3,4:99:16:98,0,194 0/1:4,2,6,1:99:13:100,0,131 0/1:5,3,3,4:99:15:116,0,173
2R 23990190 . C A 27.34 . AC1=1;AC=1;AF1=0.4997;AN=2;DP4=4,6,2,2;DP=14;FQ=4.77;MQ=43;PV4=1,2.3e-09,1,0.15;SF=0,1,2;VDB=0.0352 GT:DP4:GQ:DP:PL 0/1:4,6,2,2:28:14:30,0,225 .:8,1,0,1:.:10:0,.,. .:12,5,2,0:.:19:0,.,.
2R 23990198 . G T 26.67 . AC1=0;AC=1;AF1=0;AN=2;DP4=6,7,2,0;DP=15;FQ=-28;MQ=44;PV4=0.47,0.0016,1,0.052;SF=0,1,2;VDB=0.0260 GT:GQ:DP4:DP:PL .:.:6,7,2,0:15:0,.,. .:.:6,1,1,0:8:3,.,. 0/1:55:10,2,5,1:18:52,0,200
@valerie-obenchain-4275
Last seen 2.9 years ago
United States
Hi Seth,

What version of VariantAnnotation are you using? Please provide the output of sessionInfo().

I think there is a spacing problem in the file. Are there true tabs between each field? Test using just the first line of the file so you can easily see and modify the tabs.

I can't reproduce your error with the file output you posted, but I may be modifying the format as I cut and paste. If looking at the spacing does not solve the problem, please attach a small subset of the file, maybe just the first 5 rows.

Valerie

On 12/03/2012 03:16 AM, seth redmond wrote:
> I keep running into an error in my VCF files but can't seem to pinpoint where the problem is. [...]
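The check for true tabs between fields can also be done outside R. The following is a minimal Python sketch, not part of VariantAnnotation; the function name `field_report` is hypothetical. It counts tab-separated fields on the `#CHROM` line and the first few records, and flags any line that ends in a tab.

```python
import gzip


def field_report(path, max_lines=5):
    """Return (line_number, n_tab_fields, has_trailing_tab) tuples for the
    #CHROM line and the first records of a gzip- or bgzip-compressed VCF.

    field_report is a hypothetical helper for illustration only.
    """
    report = []
    with gzip.open(path, "rt") as handle:
        for number, line in enumerate(handle, start=1):
            if line.startswith("##"):
                continue  # meta-information lines are not tab-delimited
            text = line.rstrip("\r\n")
            n_fields = len(text.split("\t"))
            report.append((number, n_fields, text.endswith("\t")))
            if len(report) >= max_lines:
                break
    return report
```

Calling `field_report("tmpvcf.vcf.gz")` on the file from the question would be expected to show the same field count on every line. A count of 1 suggests spaces where tabs should be, and a `#CHROM` line with one more field than the records (or a `True` in the last position) points to a trailing tab.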
Urgh, yeah, I'd checked the tabs between the columns a hundred times, but I hadn't checked for trailing tabs in the header. Thanks for the nudge.

-s

On 3 Dec 2012, at 18:20, Valerie Obenchain wrote:
> [...] I think there is a spacing problem in the file - are there true tabs between each field? [...]
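For anyone hitting the same error, trailing tabs in the header can be removed with a few lines of Python. This is a rough sketch under the assumption that only header lines (those starting with `#`) are affected, as in this thread; `strip_trailing_tabs` is a hypothetical name, not a function from any VCF library.

```python
import gzip


def strip_trailing_tabs(src, dst):
    """Copy a gzip-compressed VCF from src to dst, removing trailing tabs
    from header lines (lines starting with '#') only.

    Returns the number of header lines that were changed.
    strip_trailing_tabs is a hypothetical helper for illustration only.
    """
    changed = 0
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        for line in fin:
            if line.startswith("#"):
                body = line.rstrip("\n")
                cleaned = body.rstrip("\t")
                if cleaned != body:
                    changed += 1
                # Re-attach whatever line ending the original had.
                line = cleaned + line[len(body):]
            fout.write(line)
    return changed
```

Note that Python's `gzip` module writes ordinary gzip rather than BGZF, so the output would still need to be recompressed with `bgzip` and re-indexed with `tabix -p vcf` before it can be used with `TabixFile`.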