derfinder failing in mergeResults()
1
0
Entering edit mode
@jessicahekman-6877
Last seen 9.3 years ago
United States

I'm trying to get derfinder working with my RNA-seq reads which are aligned to dog (Canis familiaris 3.1) since its update to work with the recent bumphunter update to support non-human data.

I have dog transcripts in a "tx" object which I believe is built appropriately (I've used it successfully in the past).

First:

 

for (chr in chroms ) {
  currentChrom <- paste("chr", chr, sep="")
  print(currentChrom)
  dir.create(currentChrom)
  results <- analyzeChr(chr = currentChrom, filteredCov[[currentChrom]], models, mc.cores=cores, groupInfo = groups, cutoffFstat = fstat, cutoffType = 'manual', nPermute = perms, seeds = 19731107 + seq_len(perms), lowMemDir = file.path(tempdir(), currentChrom), BPPARAM.custom = MulticoreParam(workers = cores), verbose = TRUE, runAnnotation = TRUE, writeOutput = TRUE, returnOutput = TRUE, subject=tx)
}

This gives the error:

 

> Error in annotateNearest(regions$regions, subject) :
>  arguments must be both IRanges or GRanges

 

My tx object is a GRanges. I found that this workaround got me past this problem:

 

for (chr in chroms ) {
  currentChrom <- paste("chr", chr, sep="")
  print(currentChrom)
  dir.create(currentChrom)
  results <- analyzeChr(chr = currentChrom, filteredCov[[currentChrom]], models, mc.cores=cores, groupInfo = groups, cutoffFstat = fstat, cutoffType = 'manual', nPermute = perms, seeds = 19731107 + seq_len(perms), lowMemDir = file.path(tempdir(), currentChrom), BPPARAM.custom = MulticoreParam(workers = cores), verbose = TRUE, runAnnotation = FALSE, writeOutput = TRUE, returnOutput = TRUE, subject=tx)
  annotation <- annotateNearest(results$regions$regions, tx)
  save(annotation, file = file.path(currentChrom, 'annotation.Rdata'))
}


Now analyzeChr() completes. However:

mergeResults(chrs=paste("chr", chroms, sep=""), prefix=derResults,
   genomicState = genomicState$fullGenome,
   optionsStats = results$optionsStats)

2014-12-21 11:11:25 mergeResults: Saving options used
2014-12-21 11:11:25 Loading chromosome chr1
...
2014-12-21 11:11:32 Loading chromosome chr38
2014-12-21 11:11:32 Loading chromosome chrX
Error in `$<-.data.frame`(`*tmp*`, "name", value = character(0)) :
  replacement has 0 rows, data has 204530
Calls: mergeResults -> $<- -> $<-.data.frame
Execution halted

I'm not sure how to approach this one. Help?

> devtools::session_info()
Session info-------------------------------------------------------------------
 setting  value                       
 version  R version 3.1.1 (2014-07-10)
 system   x86_64, linux-gnu           
 ui       X11                         
 language (EN)                        
 collate  en_US.UTF-8                 
 tz       America/Chicago             

Packages-----------------------------------------------------------------------
 package           * version  date       source        
 acepack             1.3.3.3  2013-05-03 CRAN (R 3.1.1)
 AnnotationDbi       1.28.1   2014-10-30 Bioconductor  
 base64enc           0.1.2    2014-06-26 CRAN (R 3.1.1)
 BatchJobs           1.5      2014-10-30 CRAN (R 3.1.1)
 BBmisc              1.8      2014-10-30 CRAN (R 3.1.1)
 Biobase             2.26.0   2014-10-15 Bioconductor  
 BiocGenerics        0.12.1   2014-11-19 Bioconductor  
 BiocParallel        1.0.0    2014-11-11 Bioconductor  
 biomaRt             2.22.0   2014-10-15 Bioconductor  
 Biostrings          2.34.1   2014-12-15 Bioconductor  
 bitops              1.0.6    2013-08-17 CRAN (R 3.0.2)
 brew                1.0.6    2011-04-13 CRAN (R 3.1.1)
 bumphunter          1.6.0    2014-12-17 Bioconductor  
 checkmate           1.5.1    2014-12-14 CRAN (R 3.1.1)
 cluster             1.15.3   2014-09-04 CRAN (R 3.1.1)
 codetools           0.2.9    2014-08-21 CRAN (R 3.1.1)
 DBI                 0.3.1    2014-09-24 CRAN (R 3.1.1)
 derfinder         * 1.0.10   2014-12-21 Bioconductor  
 derfinderHelper     1.0.4    2014-11-10 Bioconductor  
 devtools            1.6.1    2014-10-07 CRAN (R 3.1.1)
 digest              0.6.7    2014-12-20 CRAN (R 3.1.1)
 doRNG               1.6      2014-03-07 CRAN (R 3.1.1)
 fail                1.2      2013-09-19 CRAN (R 3.1.1)
 foreach             1.4.2    2014-04-11 CRAN (R 3.1.1)
 foreign             0.8.61   2014-03-28 CRAN (R 3.1.1)
 Formula             1.1.2    2014-07-13 CRAN (R 3.1.1)
 GenomeInfoDb        1.2.3    2014-11-17 Bioconductor  
 GenomicAlignments   1.2.1    2014-11-10 Bioconductor  
 GenomicFeatures     1.18.3   2014-12-17 Bioconductor  
 GenomicFiles        1.2.0    2014-10-15 Bioconductor  
 GenomicRanges       1.18.3   2014-11-19 Bioconductor  
 Hmisc               3.14.6   2014-11-22 CRAN (R 3.1.1)
 IRanges             2.0.1    2014-12-15 Bioconductor  
 iterators           1.0.7    2014-04-11 CRAN (R 3.1.1)
 lattice             0.20.29  2014-04-04 CRAN (R 3.1.1)
 latticeExtra        0.6.26   2013-08-15 CRAN (R 3.1.1)
 locfit              1.5.9.1  2013-04-20 CRAN (R 3.0.2)
 Matrix              1.1.4    2014-06-15 CRAN (R 3.1.1)
 matrixStats         0.12.2   2014-12-07 CRAN (R 3.1.1)
 nnet                7.3.8    2014-03-28 CRAN (R 3.1.1)
 pkgmaker            0.22     2014-05-14 CRAN (R 3.1.1)
 qvalue              1.40.0   2014-10-15 Bioconductor  
 R.methodsS3         1.6.1    2014-01-05 CRAN (R 3.1.1)
 RColorBrewer        1.1.2    2014-12-07 CRAN (R 3.1.1)
 RCurl               1.95.4.5 2014-12-06 CRAN (R 3.1.1)
 registry            0.2      2012-01-24 CRAN (R 3.1.1)
 rngtools            1.2.4    2014-03-06 CRAN (R 3.1.1)
 rpart               4.1.8    2014-03-28 CRAN (R 3.1.1)
 Rsamtools           1.18.2   2014-11-11 Bioconductor  
 RSQLite             1.0.0    2014-10-25 CRAN (R 3.1.1)
 rstudioapi          0.1      2014-03-27 CRAN (R 3.1.1)
 rtracklayer         1.26.2   2014-11-11 Bioconductor  
 S4Vectors           0.4.0    2014-10-15 Bioconductor  
 sendmailR           1.2.1    2014-09-21 CRAN (R 3.1.1)
 stringr             0.6.2    2012-12-06 CRAN (R 3.0.2)
 survival            2.37.7   2014-01-22 CRAN (R 3.1.1)
 XML                 3.98.1.1 2013-06-20 CRAN (R 3.0.2)
 xtable              1.7.4    2014-09-12 CRAN (R 3.1.1)
 XVector             0.6.0    2014-10-15 Bioconductor  
 zlibbioc            1.12.0   2014-10-15 Bioconductor  

 

 

Jessica

derfinder • 2.7k views
ADD COMMENT
0
Entering edit mode

OK, after updating, I now get past this error in derfinder -- analyzeChr() now runs. Thanks so much!

I now get an error in mergeResults():

 

> mergeResults(chrs=paste("chr", chroms, sep=""), prefix=derResults,
+    genomicState = genomicState$fullGenome,
+    optionsStats = results$optionsStats)
2015-01-23 14:35:40 mergeResults: Saving options used
2015-01-23 14:35:40 Loading chromosome chr1
2015-01-23 14:35:40 Loading chromosome chr2
2015-01-23 14:35:41 Loading chromosome chr3
2015-01-23 14:35:41 Loading chromosome chr4
2015-01-23 14:35:41 Loading chromosome chr5
2015-01-23 14:35:41 Loading chromosome chr6
2015-01-23 14:35:41 Loading chromosome chr7
2015-01-23 14:35:41 Loading chromosome chr8
2015-01-23 14:35:41 Loading chromosome chr9
2015-01-23 14:35:42 Loading chromosome chr10
2015-01-23 14:35:42 Loading chromosome chr11
2015-01-23 14:35:42 Loading chromosome chr12
2015-01-23 14:35:42 Loading chromosome chr13
2015-01-23 14:35:42 Loading chromosome chr14
2015-01-23 14:35:42 Loading chromosome chr15
2015-01-23 14:35:42 Loading chromosome chr16
2015-01-23 14:35:42 Loading chromosome chr17
2015-01-23 14:35:42 Loading chromosome chr18
2015-01-23 14:35:43 Loading chromosome chr19
2015-01-23 14:35:43 Loading chromosome chr20
2015-01-23 14:35:43 Loading chromosome chr21
2015-01-23 14:35:43 Loading chromosome chr22
2015-01-23 14:35:43 Loading chromosome chr23
2015-01-23 14:35:43 Loading chromosome chr24
2015-01-23 14:35:43 Loading chromosome chr25
2015-01-23 14:35:43 Loading chromosome chr26
2015-01-23 14:35:43 Loading chromosome chr27
2015-01-23 14:35:43 Loading chromosome chr28
2015-01-23 14:35:43 Loading chromosome chr29
2015-01-23 14:35:43 Loading chromosome chr30
2015-01-23 14:35:43 Loading chromosome chr31
2015-01-23 14:35:44 Loading chromosome chr32
2015-01-23 14:35:44 Loading chromosome chr33
2015-01-23 14:35:44 Loading chromosome chr34
2015-01-23 14:35:44 Loading chromosome chr35
2015-01-23 14:35:44 Loading chromosome chr36
2015-01-23 14:35:44 Loading chromosome chr37
2015-01-23 14:35:44 Loading chromosome chr38
2015-01-23 14:35:44 Loading chromosome chrX
Error in `$<-.data.frame`(`*tmp*`, "name", value = character(0)) :
  replacement has 0 rows, data has 38053


I'm happy to start a new post on this issue if that's helpful.

 

> devtools::session_info()
Session info -------------------------------------------------------------------
 setting  value                       
 version  R version 3.1.2 (2014-10-31)
 system   x86_64, linux-gnu           
 ui       X11                         
 language (EN)                        
 collate  en_US.UTF-8                 
 tz       America/Chicago             

Packages -----------------------------------------------------------------------
 package           * version  date       source        
 acepack           * 1.3-3.3  2013-05-03 CRAN (R 3.1.1)
 AnnotationDbi     * 1.28.1   2014-10-30 Bioconductor  
 base64enc         * 0.1-2    2014-06-26 CRAN (R 3.1.1)
 BatchJobs         * 1.5      2014-10-30 CRAN (R 3.1.1)
 BBmisc            * 1.8      2014-10-30 CRAN (R 3.1.1)
 Biobase           * 2.26.0   2014-10-15 Bioconductor  
 BiocGenerics      * 0.12.1   2014-11-19 Bioconductor  
 BiocParallel        1.0.0    2014-11-11 Bioconductor  
 biomaRt           * 2.22.0   2015-01-03 Bioconductor  
 Biostrings        * 2.34.1   2014-12-15 Bioconductor  
 bitops            * 1.0-6    2013-08-17 CRAN (R 3.0.2)
 brew              * 1.0-6    2011-04-13 CRAN (R 3.1.1)
 bumphunter        * 1.6.0    2014-12-17 Bioconductor  
 checkmate         * 1.5.1    2014-12-14 CRAN (R 3.1.1)
 cluster           * 1.15.3   2014-09-04 CRAN (R 3.1.2)
 codetools         * 0.2-9    2014-08-21 CRAN (R 3.1.2)
 DBI               * 0.3.1    2014-09-24 CRAN (R 3.1.1)
 derfinder           1.0.10   2014-12-21 Bioconductor  
 derfinderHelper   * 1.0.4    2014-11-10 Bioconductor  
 devtools          * 1.7.0    2015-01-17 CRAN (R 3.1.1)
 digest            * 0.6.8    2014-12-31 CRAN (R 3.1.1)
 doRNG             * 1.6      2014-03-07 CRAN (R 3.1.1)
 fail              * 1.2      2013-09-19 CRAN (R 3.1.1)
 foreach           * 1.4.2    2014-04-11 CRAN (R 3.1.1)
 foreign           * 0.8-61   2014-03-28 CRAN (R 3.1.2)
 Formula           * 1.2-0    2015-01-20 CRAN (R 3.1.1)
 GenomeInfoDb      * 1.2.3    2014-11-17 Bioconductor  
 GenomicAlignments * 1.2.1    2014-11-10 Bioconductor  
 GenomicFeatures   * 1.18.3   2014-12-17 Bioconductor  
 GenomicFiles      * 1.2.0    2014-10-15 Bioconductor  
 GenomicRanges     * 1.18.4   2015-01-10 Bioconductor  
 Hmisc             * 3.14-6   2014-11-22 CRAN (R 3.1.1)
 IRanges           * 2.0.1    2014-12-15 Bioconductor  
 iterators         * 1.0.7    2014-04-11 CRAN (R 3.1.1)
 lattice           * 0.20-29  2014-04-04 CRAN (R 3.1.2)
 latticeExtra      * 0.6-26   2013-08-15 CRAN (R 3.1.1)
 locfit            * 1.5-9.1  2013-04-20 CRAN (R 3.0.2)
 Matrix            * 1.1-4    2014-06-15 CRAN (R 3.1.2)
 matrixStats       * 0.12.2   2014-12-07 CRAN (R 3.1.1)
 nnet              * 7.3-8    2014-03-28 CRAN (R 3.1.2)
 pkgmaker          * 0.22     2014-05-14 CRAN (R 3.1.1)
 qvalue            * 1.40.0   2014-10-15 Bioconductor  
 R.methodsS3       * 1.6.1    2014-01-05 CRAN (R 3.1.1)
 RColorBrewer      * 1.1-2    2014-12-07 CRAN (R 3.1.1)
 RCurl             * 1.95-4.5 2014-12-06 CRAN (R 3.1.1)
 registry          * 0.2      2012-01-24 CRAN (R 3.1.1)
 rngtools          * 1.2.4    2014-03-06 CRAN (R 3.1.1)
 rpart             * 4.1-8    2014-03-28 CRAN (R 3.1.2)
 Rsamtools         * 1.18.2   2014-11-11 Bioconductor  
 RSQLite           * 1.0.0    2014-10-25 CRAN (R 3.1.1)
 rstudioapi        * 0.2      2014-12-31 CRAN (R 3.1.1)
 rtracklayer       * 1.26.2   2014-11-11 Bioconductor  
 S4Vectors         * 0.4.0    2014-10-15 Bioconductor  
 sendmailR         * 1.2-1    2014-09-21 CRAN (R 3.1.1)
 stringr           * 0.6.2    2012-12-06 CRAN (R 3.0.2)
 survival          * 2.37-7   2014-01-22 CRAN (R 3.1.2)
 XML               * 3.98-1.1 2013-06-20 CRAN (R 3.0.2)
 xtable            * 1.7-4    2014-09-12 CRAN (R 3.1.1)
 XVector           * 0.6.0    2014-10-15 Bioconductor  
 zlibbioc          * 1.12.0   2014-10-15 Bioconductor

 

ADD REPLY
0
Entering edit mode

Hmm -- looking at that session_info() it looks like in fact bumphunter and derfinder are the same version as previously. (So why did analyzeChr() start working?) Looking into it...

ADD REPLY
0
Entering edit mode

I'm at a loss. Seems like those two packages are the same version as previously, yet derfinder is behaving differently on the same data -- and still not working, just differently. Help?

ADD REPLY
1
Entering edit mode
@lcolladotor
Last seen 1 day ago
United States

As of version 1.1.17 available from BioC-devel, `derfinder` now works well with `bumphunter` version >= 1.7.3 These versions of the packages now allow users to work with non-human annotations. 

A few key things to look at are:

  • bumphunter::matchGenes()
  • the 'txdb' argument in derfinder::analyzeChr()

An example of how to  use bumphunter::matchGenes() with Canis familiaris is available at https://github.com/lcolladotor/bumphunter/blob/master/tests/testthat/test_annotation.R#L29-L76

Some details: `derfinder` 1.1.17 requires `bumphunter` 1.1.6 which should soon be available from BioC-devel. This `bumphunter` version corrects for a few minor bugs.

ADD COMMENT
0
Entering edit mode

Hooray!

I installed the latest R-devel (2015-03-19) and installed bioc-devel and derfinder (1.1.17) and bumphunter (1.7.6).

I am testing using your tests here:

https://github.com/lcolladotor/bumphunter/blob/master/tests/testthat/test_annotation.R#L29-L76

The first set (Annotate Nearest) passes -- excellent.

The second set (Annotate Transcripts) says:

Loading required package: DoesNotExist
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘DoesNotExist’

 

Of course it's true that there is no such package -- is there something I should install to get this to work?

ADD REPLY
0
Entering edit mode
Onward without waiting for testing to work!

library(derfinder)
library(GenomicFeatures)
library(BiocParallel)

cores <- 10
perms <- 1
fstat <- 10
options(species = 'canis_familiaris')

chroms <- c(1:38, 'X')
files <- rawFiles(datadir="~/transcriptome/pituitary/aligned/cf3-N3-ensembl/", samplepatt="tophat-N3-ensembl-*")
tameIds <- read.table("/home/hekman2/Dropbox/Research/transcriptome/pituitary/notes/samples-pit-tame.txt")
tameFileNames <- lapply(tameIds, function(id) { return(paste("tophat-N3-ensembl-", id, sep="")) } )

groups <- as.factor(sapply(names(files), function(filename) { ifelse(grepl(filename, tameFileNames), "tame", "aggr") }))
fullCov <- fullCoverage(files, chrs = chroms)
sampleDepths <- sampleDepth(collapseFullCoverage(fullCov), 1) # only worthwhile if we analyzed all chromosomes
models <- makeModels(sampleDepths, testvars = groups)
filteredCov <- lapply(fullCov, filterData, cutoff = 2)

dogTx <- makeTranscriptDbFromBiomart(biomart="ensembl",
                                 dataset="cfamiliaris_gene_ensembl",
                                 transcript_ids=NULL,
                                 circ_seqs=DEFAULT_CIRC_SEQS,
                                 filters="",
                                 id_prefix="ensembl_",
                                 host="www.biomart.org",
                                 port=80,
                                 miRBaseBuild=NA)
originalWd <- getwd()
setwd(file.path(originalWd, derResults))
currentChrom <- "chr1"

analyzeChr(chr = currentChrom, filteredCov[[currentChrom]], models, mc.cores=cores, groupInfo = groups, cutoffFstat = fstat, cutoffType = 'manual', nPermute = perms, seeds = 19731107 + seq_len(perms), lowMemDir = file.path(tempdir(), currentChrom), BPPARAM.custom = MulticoreParam(workers = cores), verbose = TRUE, runAnnotation = TRUE, writeOutput = TRUE, returnOutput = FALSE, txdb=dogTx)


This runs without complaint, except that analyzeChr seems to just freeze during f-stat calculation. "top" tells me that R is still running but using 0% of processor, so I don't think it's doing much. I tried the without BPPARAM.custom set, and there was no difference (I was wondering if it was having trouble assigning workers or something). Any ideas?

2015-03-23 16:51:51 analyzeChr: Pre-processing the coverage data
2015-03-23 16:52:03 preprocessCoverage: splitting the data
2015-03-23 16:53:24 analyzeChr: Calculating statistics
2015-03-23 16:53:24 calculateStats: calculating the F-statistics

Jessica

ADD REPLY
0
Entering edit mode

Hi Jessica,

This is really a new question. Can you post it as such please? Otherwise this thread will get hard to follow.

I also need your session information because BiocParallel has recently changed and maybe something broke down.

Cheers,

Leo

ADD REPLY
0
Entering edit mode
ADD REPLY
0
Entering edit mode

Hi Jessica,

The second set of tests is not failing. Simply, that set of tests includes https://github.com/lcolladotor/bumphunter/blob/master/tests/testthat/test_annotation.R#L50 which expects an error. Running that line of code generates a warning message (see below) but doesn't mean that anything is broken.

> test_that('Annotate Transcripts', {
+     expect_that(attributes(can_genes)$description, equals('annotatedTranscripts'))
+     expect_that(attributes(can_genes_NA)$description, equals('annotatedTranscripts'))
+     expect_that(colnames(mcols(can_genes)), equals(colnames((mcols(can_genes_NA)))))
+     expect_that(ranges(can_genes), equals(ranges(can_genes_NA)))
+     expect_that(can_genes_NA$Gene, equals(Rle(NA, length(can_genes_NA))))
+     expect_that(can_genes_NA$Refseq, equals(Rle(NA, length(can_genes_NA))))
+     expect_that(annotateTranscripts(can_txdb, by = 'gene', annotationPackage = 'DoesNotExist', requireAnnotation = TRUE), throws_error())
+     expect_that(sum(can_genes$CSE < can_genes$CSS, na.rm = TRUE), equals(0))
+ })
Loading required package: DoesNotExist
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘DoesNotExist’
> 

You can learn more about testing R packages at http://r-pkgs.had.co.nz/tests.html but you don't really need to know about this for your analysis. It's mostly meant for R package developers.

 

Cheers,

Leo

 

ADD REPLY
0
Entering edit mode

Makes sense. Thanks for the explanation!

ADD REPLY

Login before adding your answer.

Traffic: 694 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6