Hello,
after I have successfully performed the summarizeOverlaps function and DEseq2 a couple of times on my computer the following problem occurred.
the complete code:
dir<-"/Users/philipp/Desktop/Bioinformatics/bamfiles" sampleTable <- read.delim("/Users/philipp/Desktop/Bioinformatics/sample_table.csv",header=TRUE, sep=";") register(SerialParam()) filenames <- file.path(dir,paste0(sampleTable$FileName)) bamfiles <- BamFileList(filenames, yieldSize=10000000) gtffile <- file.path("/Users/philipp/Desktop/Bioinformatics/gtf_files/Drosophila_melanogaster.BDGP6.81.ERCC.gtf") txdb <- makeTxDbFromGFF(gtffile, format="gtf", circ_seqs=character()) ebg <- exonsBy(txdb, by="gene") se <- summarizeOverlaps(features=ebg, reads=bamfiles, mode="Union", singleEnd=TRUE, ignore.strand=TRUE )
------------- return:
Error in extractROWS(x, i) : Problem(s) found when testing validity of SortedByQueryHits object returned by subsetting operation: 'queryHits(x)' must be sorted. Make sure to use a subscript that results in a valid SortedByQueryHits object. > sessionInfo() R version 3.3.1 (2016-06-21) Platform: x86_64-apple-darwin13.4.0 (64-bit) Running under: OS X 10.11 (El Capitan) locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] parallel stats4 stats graphics grDevices utils datasets [8] methods base other attached packages: [1] GenomicFeatures_1.24.5 BiocParallel_1.6.6 [3] plyr_1.8.4 org.Hs.eg.db_3.3.0 [5] topGO_2.24.0 SparseM_1.7 [7] GO.db_3.3.0 graph_1.50.0 [9] genefilter_1.54.2 ggplot2_2.1.0 [11] RColorBrewer_1.1-2 pheatmap_1.0.8 [13] biomaRt_2.28.0 AnnotationDbi_1.34.4 [15] GenomicAlignments_1.8.4 Rsamtools_1.24.0 [17] Biostrings_2.40.2 XVector_0.12.1 [19] BiocInstaller_1.22.3 DESeq2_1.12.4 [21] SummarizedExperiment_1.2.3 Biobase_2.32.0 [23] GenomicRanges_1.24.2 GenomeInfoDb_1.8.3 [25] IRanges_2.6.1 S4Vectors_0.10.3 [27] BiocGenerics_0.18.0 loaded via a namespace (and not attached): [1] Rcpp_0.12.6 bitops_1.0-6 tools_3.3.1 [4] zlibbioc_1.18.0 rpart_4.1-10 annotate_1.50.0 [7] RSQLite_1.0.0 gtable_0.2.0 lattice_0.20-33 [10] Matrix_1.2-6 DBI_0.5 gridExtra_2.2.1 [13] rtracklayer_1.32.2 cluster_2.0.4 locfit_1.5-9.1 [16] nnet_7.3-12 grid_3.3.1 data.table_1.9.6 [19] XML_3.98-1.4 survival_2.39-5 foreign_0.8-66 [22] latticeExtra_0.6-28 Formula_1.2-1 geneplotter_1.50.0 [25] matrixStats_0.50.2 Hmisc_3.17-4 scales_0.4.0 [28] splines_3.3.1 xtable_1.8-2 colorspace_1.2-6 [31] acepack_1.3-3.3 RCurl_1.95-4.8 munsell_0.4.3 [34] chron_2.3-47
Notably the problem does not occur when copy pasting everything into Rstudio on a different computer and also copying the bamfiles as they are from one computer to the other one.
I have updated Rstudio and R to the latest version and reinstalled all packages required. I have asked google and three bioinformaticians and where still not able to solve the problem.
I would be very happy for any help.
Thanks
Philipp
hi,
I added the function's package name "GenomicAlignments" as a tag, which will notify the maintainers.
If you can't find a quick solution, you can also use alternative counting tools, for example the featureCounts() function from the Rsubread Bioconductor package.
A starting point is to make sure that your packages are current and correct --
BiocInstaller::biocValid()
. It would also help to add the output oftraceback()
to your question.Hey, Thanks to both of you. In the following you can have a look at the output of the two commands recommended. I will test Rsubread as alternative soon in case can't fix it.
Cheers
> BiocInstaller::biocValid()
[1] TRUE
> traceback()
16: stop(e)
15: value[[3L]](cond)
14: tryCatchOne(expr, names, parentenv, handlers[[1L]])
13: tryCatchList(expr, classes, parentenv, handlers)
12: tryCatch({
FUN(...)
}, error = handle_error)
11: withCallingHandlers({
tryCatch({
FUN(...)
}, error = handle_error)
}, warning = handle_warning)
10: FUN(X[[i]], ...)
9: lapply(X, FUN, ...)
8: bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = BPPARAM)
7: bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = BPPARAM)
6: bplapply(setNames(seq_along(reads), names(reads)), function(i,
FUN, reads, features, mode, ignore.strand, inter.feature,
param, preprocess.reads) {
bf <- reads[[i]]
.countWithYieldSize(FUN, features, bf, mode, ignore.strand,
inter.feature, param, preprocess.reads, ...)
}, FUN, reads, features, mode = match.fun(mode), ignore.strand = ignore.strand,
inter.feature = inter.feature, param = param, preprocess.reads = preprocess.reads,
...)
5: bplapply(setNames(seq_along(reads), names(reads)), function(i,
FUN, reads, features, mode, ignore.strand, inter.feature,
param, preprocess.reads) {
bf <- reads[[i]]
.countWithYieldSize(FUN, features, bf, mode, ignore.strand,
inter.feature, param, preprocess.reads, ...)
}, FUN, reads, features, mode = match.fun(mode), ignore.strand = ignore.strand,
inter.feature = inter.feature, param = param, preprocess.reads = preprocess.reads,
...)
4: .dispatchBamFiles(features, reads, mode, ignore.strand, inter.feature = inter.feature,
singleEnd = singleEnd, fragments = fragments, param = param,
preprocess.reads = preprocess.reads, ...)
3: .local(features, reads, mode, ignore.strand, ...)
2: summarizeOverlaps(features = ebg, reads = bamfiles, mode = "Union",
singleEnd = TRUE, ignore.strand = TRUE)
1: summarizeOverlaps(features = ebg, reads = bamfiles, mode = "Union",
singleEnd = TRUE, ignore.strand = TRUE)
It's difficult to narrow this down without a reproducible example. Do you have this problem with all bam files or can you narrow it down to just one? Does changing the yieldSize have any effect? I can take a closer look if you can provide your file(s) and annotation in dropbox; send link to valerie.obenchain@roswellpark.org.
Valerie