Hello,
I'm trying to use "summarizeOverlaps" to count reads from a BAM file, with annotation from a GFF3 file, following a standard procedure:
library(GenomicFeatures)
library(GenomicAlignments)
txdb <- makeTxDbFromGFF(refgff3, format = "gff3", circ_seqs = character())  # or gtf
ebg <- exonsBy(txdb, by = "gene")
bamfile <- BamFile(readsBAM)
se <- summarizeOverlaps(features = ebg, reads = bamfile, mode = "Union",
                        singleEnd = FALSE, ignore.strand = TRUE, fragments = TRUE)
This last step returns the following error :
Error in .normargSeqlevels(seqnames) : supplied 'seqlevels' cannot contain duplicated sequence names
It seems clear that there must be some duplicates either in the BAM file or in the exonsBy object. However, every check I could think of, such as duplicated(ebg), which(duplicated(names(ebg))), etc., reported no duplicates.
=> Hence my question: how can I narrow down which object contains the duplicated names?
- in particular, which object is examined just before the error occurs ?
- maybe there is another way to use the 'duplicated' command that would work ?
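Since the error message mentions 'seqlevels', the duplicates are presumably among the sequence (chromosome/contig) names rather than among the gene names, which would explain why duplicated(names(ebg)) finds nothing. A minimal sketch of how one might check both sides follows; `dup_values` and `report_duplicate_seqnames` are hypothetical helper names introduced only for illustration, and the sketch assumes that scanBamHeader() can read the header even when it contains repeated names.

```r
# Return the values that occur more than once in a character vector.
dup_values <- function(x) unique(x[duplicated(x)])

# Hypothetical helper (not part of any package): list the duplicated
# sequence names on each side of the overlap.
#   features : the GRangesList returned by exonsBy()
#   bam_path : path to the BAM file (e.g. readsBAM)
report_duplicate_seqnames <- function(features, bam_path) {
  # Read the sequence names directly from the BAM header (@SQ lines).
  # For a character path, scanBamHeader() returns one list element per file.
  header <- Rsamtools::scanBamHeader(bam_path)[[1]]
  list(
    features = dup_values(GenomeInfoDb::seqlevels(features)),
    bam      = dup_values(names(header$targets))
  )
}
```

With the objects above this would be called as report_duplicate_seqnames(ebg, readsBAM); a non-empty `bam` element would point to the BAM header, a non-empty `features` element to the annotation. To see which function was running just before the error, calling traceback() immediately after the failed summarizeOverlaps() call prints the call stack.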
Thanks a lot in advance for your answers
PS: for completeness, here is the output of sessionInfo():
R version 3.3.3 (2017-03-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.2 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=de_DE.UTF-8
 [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
 [1] rPython_0.0-6 RJSONIO_1.3-0 GenomicAlignments_1.8.4 Rsamtools_1.24.0 Biostrings_2.40.2
 [6] XVector_0.12.1 SummarizedExperiment_1.2.3 GenomicFeatures_1.24.5 AnnotationDbi_1.34.4 Biobase_2.32.0
[11] GenomicRanges_1.24.3 GenomeInfoDb_1.8.7 IRanges_2.6.1 S4Vectors_0.10.3 BiocGenerics_0.18.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.15 zlibbioc_1.18.0 BiocParallel_1.6.6 bit_1.1-12 rlang_0.2.0 blob_1.1.0 tools_3.3.3
 [8] DBI_0.7 bit64_0.9-7 digest_0.6.15 tibble_1.4.2 rtracklayer_1.32.2 bitops_1.0-6 biomaRt_2.28.0
[15] RCurl_1.95-4.10 memoise_1.1.0 RSQLite_2.0 pillar_1.1.0 XML_3.98-1.10 pkgconfig_2.0.1