filtering a VCF file based on genotype
Bogdan ▴ 670
@bogdan-2367
Last seen 14 months ago
Palo Alto, CA, USA

Dear all, I would appreciate a piece of advice:

Using the VariantAnnotation package, how should I filter a VCF file in order to exclude the GERMLINE and TUMOR samples that have a GENOTYPE of "./"?

For example, when we try to look at the GENOTYPES in the VCF file, we get the following errors:

> geno(vcf)$GT[,"TUMOR"]
Error in geno(vcf)$GT[, "TUMOR"] : subscript out of bounds
> geno(vcf)$GT[,"NORMAL"]
Error in geno(vcf)$GT[, "NORMAL"] : subscript out of bounds

 

Thank you!

variantannotation • 1.9k views
@valerie-obenchain-4275
Last seen 2.9 years ago
United States

You can use filterVcf to remove rows from a vcf file that meet certain criteria. (The function does not filter by sample/column.)
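As a rough illustration of what row filtering on genotypes means, here is a minimal base-R sketch. The toy matrix stands in for geno(vcf)$GT, and the sample names "tumor_01" and "normal_01" are made up. It assumes missing genotypes are encoded as ".", "./." or ".|.". It also shows why the calls in the question most likely fail: indexing a matrix by a column name that is not in colnames() gives "subscript out of bounds", so it is worth checking the actual sample names first.

```r
# Toy genotype matrix standing in for geno(vcf)$GT
# (rows = variants, columns = samples). Sample names are hypothetical.
gt <- matrix(
  c("0/1", "./.", "1/1", "0/0",   # tumor_01
    "0/0", "0/1", "./.", "0/0"),  # normal_01
  ncol = 2,
  dimnames = list(paste0("var", 1:4), c("tumor_01", "normal_01"))
)

# gt[, "TUMOR"] would fail with "subscript out of bounds" because no
# column has that name; inspect the real sample names before indexing.
colnames(gt)
has_tumor <- "TUMOR" %in% colnames(gt)

# Assumed encodings of a missing genotype call
missing_gt <- c(".", "./.", ".|.")

# %in% drops the matrix dimensions, so restore them before rowSums()
is_missing <- matrix(gt %in% missing_gt, nrow = nrow(gt))

# Keep only the variants (rows) where every sample has a called genotype
keep <- rowSums(is_missing) == 0
gt_called <- gt[keep, , drop = FALSE]
```

With a real VCF object the same logical vector could be used to subset rows, e.g. vcf[keep, ], since VCF objects support matrix-style subsetting.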

From the ?filterVcf man page:

There are up to two passes. In the first pass, unparsed lines are passed to 'prefilters' for filtering, e.g., searching for a fixed character string. In the second pass lines successfully passing 'prefilters' are parsed into 'VCF' instances and made available for further filtering. One or both of 'prefilter' and 'filter' can be present.

So the 'prefilter' would use grep on each line of the vcf file (similar to "lowCoverageExomeSNP" on the man page) and the 'filter' would use VCF class extractors to get at the data and perform some sort of matching (similar to "VTisSNP" on the man page).
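The second-pass 'filter' approach could look roughly like the sketch below. It is modeled on the ?filterVcf example and is not a tested recipe for the file in the question. The filter name allCalled is made up, the input is the chr22.vcf.gz example file shipped with VariantAnnotation, "hg19" is the genome label used on the man page, and missing genotypes are assumed to be encoded as ".", "./." or ".|." (or NA).

```r
library(VariantAnnotation)  # also attaches Rsamtools (TabixFile) and S4Vectors (FilterRules)

# Example file from the package; substitute your own bgzipped, tabix-indexed VCF
fl <- system.file("extdata", "chr22.vcf.gz", package = "VariantAnnotation")

# Hypothetical filter: TRUE for variants where every sample has a called genotype.
# x is a parsed VCF chunk, so the usual extractors such as geno() are available.
allCalled <- function(x) {
  gt <- geno(x)$GT
  bad <- is.na(gt) | gt == "." | gt == "./." | gt == ".|."
  rowSums(bad) == 0
}
filt <- FilterRules(list(allCalled = allCalled))

# Read the file in chunks and write the rows that pass to a new VCF file
tbx <- TabixFile(fl, yieldSize = 10000)
out <- filterVcf(tbx, "hg19", tempfile(fileext = ".vcf"), filters = filt)

# The filtered file can be read back in the usual way
vcf_called <- readVcf(out, "hg19")
```

To require calls only in particular samples, gt could be restricted to those columns inside the filter function, using the names reported by colnames(geno(vcf)$GT).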


Thank you very much, Valerie! Great to hear from you.
