Import of vcf with VariantAnnotation
4
0
Entering edit mode
Anna N. • 0
@anna-n-6889
Last seen 9.9 years ago
France

Hello again,

 

after installing succesfully the packages and reproducing all examples, I would like to analyze my own mutation calls.

I have a problem importing my vcf file - sorry if these are naive questions but I am really new here.

 

> fl=readVcf("Ac216_H69_1_A549_mutect_confident_eff.vcf", "hg19")
Error in FUN(X[[2L]], ...) : subscript out of bounds
In addition: Warning message:
In .bcfHeaderAsSimpleList(header) :
  duplicate keys in header will be forced to unique rownames
> traceback()
29: lapply(X = X, FUN = FUN, ...)
28: FUN(X[[1L]], ...)
27: FUN(X[[1L]], ...)
26: lapply(elt, sapply, "[[", 2)
25: lapply(elt, sapply, "[[", 2)
24: FUN(X[[3L]], ...)
23: lapply(X = split(X, group), FUN = FUN, ...)
22: tapply(keyval, tags, function(elt) {
        keys <- lapply(elt, sapply, "[[", 1)
        vals0 <- lapply(elt, sapply, "[[", 2)
        vals <- Map("names<-", vals0, keys)
        cols <- unique(unlist(keys))
        entries <- Map(function(k) as.vector(sapply(vals, "[", k)), 
            cols)
        desc <- which("DESCRIPTION" == toupper(names(entries)))
        if (1L == length(desc)) 
            entries[[desc]] <- gsub("\"", "", entries[[desc]])
        id <- which("ID" == toupper(names(entries)))
        if (length(id) > 0L) {
            if (any(duplicated(entries[[id]]))) 
                warning("duplicate ID's in header will be forced to unique ", 
                    "rownames")
            df <- DataFrame(entries[-id], row.names = make.unique(entries[[id]]))
        }
        else {
            df <- DataFrame(entries)
        }
        df
    })
21: tapply(keyval, tags, function(elt) {
        keys <- lapply(elt, sapply, "[[", 1)
        vals0 <- lapply(elt, sapply, "[[", 2)
        vals <- Map("names<-", vals0, keys)
        cols <- unique(unlist(keys))
        entries <- Map(function(k) as.vector(sapply(vals, "[", k)), 
            cols)
        desc <- which("DESCRIPTION" == toupper(names(entries)))
        if (1L == length(desc)) 
            entries[[desc]] <- gsub("\"", "", entries[[desc]])
        id <- which("ID" == toupper(names(entries)))
        if (length(id) > 0L) {
            if (any(duplicated(entries[[id]]))) 
                warning("duplicate ID's in header will be forced to unique ", 
                    "rownames")
            df <- DataFrame(entries[-id], row.names = make.unique(entries[[id]]))
        }
        else {
            df <- DataFrame(entries)
        }
        df
    })
20: .bcfHeaderAsSimpleList(header)
19: scanBcfHeader(bf)
18: scanBcfHeader(bf)
17: (function (file, mode) 
    {
        bf <- open(BcfFile(file, character(0), ...))
        on.exit(close(bf))
        scanBcfHeader(bf)
    })(dots[[1L]][[1L]])
16: mapply(FUN = f, ..., SIMPLIFY = FALSE)
15: .Method(..., f = f)
14: eval(expr, envir, enclos)
13: eval(.dotsCall, env)
12: eval(.dotsCall, env)
11: standardGeneric("Map")
10: Map(function(file, mode) {
        bf <- open(BcfFile(file, character(0), ...))
        on.exit(close(bf))
        scanBcfHeader(bf)
    }, file, ...)
9: scanBcfHeader(file, ...)
8: scanBcfHeader(file, ...)
7: scanVcfHeader(file)
6: scanVcfHeader(file)
5: .scanVcfToVCF(scanVcf(file, param = param, row.names = row.names), 
       file, genome, param)
4: .readVcf(file, genome, param = ScanVcfParam(), row.names = row.names, 
       ...)
3: .local(file, genome, param, ...)
2: readVcf("Ac216_H69_1_A549_mutect_confident_eff.vcf", "hg19")
1: readVcf("Ac216_H69_1_A549_mutect_confident_eff.vcf", "hg19")


 

variantannotation somaticsignatures • 3.6k views
ADD COMMENT
1
Entering edit mode
Anna N. • 0
@anna-n-6889
Last seen 9.9 years ago
France

Hello Valerie,

 

The file is in my working directory, even if I include the full pathway I have the same problems...

I am sending you the file in your email

Thank you very much

Anna

ADD COMMENT
1
Entering edit mode

Thanks for sending the file.

scanVcfHeader() was not recognizing the (relatively) new header fields 'SAMPLE' and 'PEDIGREE'. 

The problem originated in Rsamtools so for the fix you'll need both Rsamtools 1.19.4 and VariantAnnotation 1.13.4.

Thanks for reporting this bug!

Valerie

ADD REPLY
0
Entering edit mode
@valerie-obenchain-4275
Last seen 2.9 years ago
United States

Hi Anna,

What version of VariantAnnotation are you using? Please provide the output of

sessionInfo()

It might be helpful to inspect the header information and look for duplicate variable names in info and geno fields. For example,

> fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
> hdr <- scanVcfHeader(fl)
> hdr
class: VCFHeader
samples(3): NA00001 NA00002 NA00003
meta(5): fileformat fileDate source reference phasing
fixed(1): FILTER
info(6): NS DP ... DB H2
geno(4): GT GQ DP HQ
> info(hdr)
DataFrame with 6 rows and 3 columns
        Number        Type                 Description
   <character> <character>                 <character>
NS           1     Integer Number of Samples With Data
DP           1     Integer                 Total Depth
AF           A       Float            Allele Frequency
AA           1      String            Ancestral Allele
DB           0        Flag dbSNP membership, build 129
H2           0        Flag          HapMap2 membership

Are you able to read the file with scanVcf()? Assuming the file isn't too big for available memory.

scanVcf(fl)

 

Valerie

 

ADD COMMENT
0
Entering edit mode
Anna N. • 0
@anna-n-6889
Last seen 9.9 years ago
France

 Hi Valerie,

no I am not able to read the file with scanVcf(fl) and it is not a big file (325 kb)
I am pasting the sessionInfo()


> fl=readVcf("Ac216_H69_1_A549_mutect_confident_eff.vcf", "hg19")
Error in FUN(X[[2L]], ...) : subscript out of bounds
In addition: Warning message:
In .bcfHeaderAsSimpleList(header) :
  duplicate keys in header will be forced to unique rownames
> scanVcf(fl)
Error in scanVcf(fl) : 
  erreur d'évaluation de l'argument 'file' lors de la sélection d'une méthode pour la fonction 'scanVcf' : Error: object 'fl' not found
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252    LC_MONETARY=French_France.1252 LC_NUMERIC=C                  
[5] LC_TIME=French_France.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] VariantAnnotation_1.12.2 Rsamtools_1.18.1         Biostrings_2.34.0        XVector_0.6.0            GenomicRanges_1.18.1    
[6] GenomeInfoDb_1.2.2       IRanges_2.0.0            S4Vectors_0.4.0          BiocGenerics_0.12.0     

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.28.0    base64enc_0.1-2         BatchJobs_1.4           BBmisc_1.7              Biobase_2.26.0         
 [6] BiocParallel_1.0.0      biomaRt_2.22.0          bitops_1.0-6            brew_1.0-6              BSgenome_1.34.0        
[11] checkmate_1.5.0         codetools_0.2-9         DBI_0.3.1               digest_0.6.4            fail_1.2               
[16] foreach_1.4.2           GenomicAlignments_1.2.0 GenomicFeatures_1.18.1  iterators_1.0.7         RCurl_1.95-4.3         
[21] RSQLite_1.0.0           rtracklayer_1.26.1      sendmailR_1.2-1         stringr_0.6.2           tools_3.1.1            
[26] XML_3.98-1.1            zlibbioc_1.12.0        
 

>

ADD COMMENT
0
Entering edit mode

'fl' should be the path to your file.

fl <- "Ac216_H69_1_A549_mutect_confident_eff.vcf" ## include the full path

Did you try looking for duplicate variable names in the header?

hdr <- scanVcfHeader(fl)

names(info(hdr))

If these hints don't help I can take a look at the file. Can you share it off-line (vobencha@fhcrc.org) or maybe I can download it from somewhere?

Valerie

ADD REPLY
0
Entering edit mode
Anna N. • 0
@anna-n-6889
Last seen 9.9 years ago
France

Thank you for all the help!

I am waiting to update my packages and start over

Anna

 

ADD COMMENT

Login before adding your answer.

Traffic: 740 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6