Using variantAnnotation to put together genotype and consequence information
0
0
Entering edit mode
@ramirobarrantes-7796
Last seen 10 weeks ago
United States

I am trying to put together genotype and consequence information from a vcf file. I am using the VariantAnnotation package. I am not totally clear if there is a better way but this is what I am doing:

#Get consequence information
vcf <- readVcf(tempFile, genome="grc38")
csq <- data.frame(parseCSQToGRanges(vcf, VCFRowID=rownames(vcf)))

#get genotype information
genotypeInfo <- c("GT", "DP", "AD")
genotype <- geno(vcf)[,genotypeInfo]
genotype <- do.call(cbind,lapply(1:length(genotype), function(g) {
    x <- genotypeInfo[i]
    data.frame(genotype[[g]])
 }))
colnames(genotype) <- genotypeInfo

csq$id <- paste0(csq$CHROM,":",csq$POS,"_",csq$REF,"/",csq$ALT)
csq[,genotypeInfo] <- genotype[csq$id,]

I know this is not the best, in particular, problems come where there is more than one allele for the same reference (i.e. GT=1/2). In that case, what I see happening in my example is that not all the genotypes are defined in the geno(vcf) command. For example for

chr3 121476455 . TACAC TACACAC,T ...
GT:GQ:GQX:DPI:AD:ADF:ADR:FT:PL 1/2:76:19:36:2,10,9:2,10,9:0,0,0:PASS:237,91,77,105,0,108

Then the genotype appears as:

chr3:121476455_TACAC/TACACAC 1/2 NA 2, 10, 9

but the one corresponding to T/TACACAC does not appear anywhere. So I am wondering if I am making this too complicated?

Thank you in advance.

variantAnn VariantAnnotation • 366 views
ADD COMMENT

Login before adding your answer.

Traffic: 847 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6