a filter works alone but not with other filters
@timotheeflutre-6727

I wrote the following filter to discard variants for which fewer than 20% of samples have at least 10 reads:

filterDp <- function(x, min.dp=10, min.prop.dp=0.2){
  dp <- geno(x)$DP
  (rowSums(dp >= min.dp) / ncol(dp) < min.prop.dp)
}

When I use it alone, it works well:

filters <- FilterRules(list(dp=filterDp))
filterVcf(file=tabix.file, genome="test", destination=out.file,
              index=TRUE, filters=filters, param=vcf.params, verbose=TRUE)
starting filter
filtering 1202 records
completed filtering
compressing and indexing '...'

However, when I combine it with others, it fails:

filterBiall <- function(x){
  (elementLengths(alt(vcf)) > 1)
}
filterSnv <- function(x){
  (! isSNV(x))
}
filters <- FilterRules(list(dp=filterDp,
                            biall=filterBiall,
                            snv=filterSnv))
filterVcf(file=tabix.file, genome="test", destination=out.file,
              index=TRUE, filters=filters, param=vcf.params, verbose=TRUE)
starting filter
filtering 1202 records
Error in extractROWS(x, eval(filter, x)) :
  error in evaluating the argument 'i' in selecting a method for function 'extractROWS': Error in eval(filter, x) : filter rule evaluated to inconsistent length:

Moreover, the error changes depending on the order of the filters:

filters <- FilterRules(list(dp=filterDp,
                            biall=filterBiall,
                            snv=filterSnv))
filterVcf(file=tabix.file, genome="test", destination=out.file,
              index=TRUE, filters=filters, param=vcf.params, verbose=TRUE)
starting filter
filtering 1202 records
Error in extractROWS(x, eval(filter, x)) :
  error in evaluating the argument 'i' in selecting a method for function 'extractROWS': Error in rowSums(dp >= min.dp)
  'x' must be an array of at least two dimensions

Do you have any idea?

@martin-morgan-1513

The problem is that some of the filters remove all variants, and the remaining filters are not robust to receiving zero variants:

> filters[[1]](vcf[FALSE,])
Error in rowSums(dp >= min.dp) : 
  'x' must be an array of at least two dimensions
> filters[[2]](vcf[FALSE,])
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[15] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[29] FALSE
> filters[[3]](vcf[FALSE,])
logical(0)

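The second filter is incorrect because it references 'vcf' instead of its argument 'x', so it returns one value per record of the global object even when its input is empty. The first filter fails on empty input because it evaluates something like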

> rowSums(matrix(0, 0, 2) > 1)
Error in rowSums(matrix(0, 0, 2) > 1) : 
  'x' must be an array of at least two dimensions

Neat, eh? (thanks for your earlier suggestion about updating the document to use isSNV(); the vignette was written several years ago).
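
The behaviour of the second filter can be reproduced without a VCF file. Here is a minimal base R sketch; the vector `vcf` and the functions `usesGlobal` and `usesArgument` are hypothetical stand-ins for illustration, not VariantAnnotation objects:

```r
# Stand-in for the VCF object in the global environment (29 records).
vcf <- seq_len(29)

# Mirrors the bug: the body ignores `x` and reads the global `vcf`.
usesGlobal <- function(x) {
  vcf > 100
}

# Uses its argument, so the result length follows the input.
usesArgument <- function(x) {
  x > 100
}

empty <- vcf[FALSE]           # zero records, analogous to vcf[FALSE, ]
length(usesGlobal(empty))     # 29
length(usesArgument(empty))   # 0
```

In the filter from the question, the equivalent change is to call alt(x) rather than alt(vcf).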


I see... I managed to solve my problem by doing this:

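filterDp <- function(x, min.dp=10, min.prop.dp=0.2){
  if(nrow(x) == 0){
    # No variants left: return an empty logical vector instead of calling rowSums()
    logical(0)
  } else {
    dp <- geno(x)$DP
    # Per variant: is the proportion of samples with at least min.dp reads below min.prop.dp?
    (rowSums(dp >= min.dp) / ncol(dp) < min.prop.dp)
  }
}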

Could you also mention this in the vignette? That would be great. Thanks a lot!
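
One way to catch this kind of problem before combining filters might be to run each filter on both the full input and a zero-row subset, as in the answer above. Here is a minimal sketch; `checkFilter`, `robust` and `usesGlobal` are hypothetical names, and a plain depth matrix is used as a stand-in for a VCF object:

```r
# Hypothetical helper: TRUE when `f` returns a logical vector with one entry
# per row, both for the full input and for a zero-row subset of it.
checkFilter <- function(f, x) {
  isValid <- function(input) {
    res <- tryCatch(f(input), error = function(e) NULL)
    is.logical(res) && length(res) == nrow(input)
  }
  isValid(x) && isValid(x[FALSE, , drop = FALSE])
}

# Stand-in data: read depths for 3 variants (rows) x 2 samples (columns).
dp <- matrix(c(12, 3, 15, 20, 1, 9), nrow = 3)

# Guards against empty input, like the fixed filterDp.
robust <- function(m) {
  if (nrow(m) == 0) return(logical(0))
  rowSums(m >= 10) / ncol(m) < 0.2
}

# Ignores its argument and reads the global `dp`, like the original filterBiall.
usesGlobal <- function(m) dp[, 1] > 10

checkFilter(robust, dp)      # TRUE
checkFilter(usesGlobal, dp)  # FALSE
```

The same idea should carry over to real filters by passing a VCF object, since vcf[FALSE,] is what the answer uses to build the empty case.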
