Hello, I am trying to do motif discovery in a ChIP-seq dataset. I am using the rGADEM package for de novo motif discovery, but the GADEM() function returns an error. My data consist of 576 regions in BED format, which I convert to a GRanges object via makeGRangesFromDataFrame(). To provide a reproducible example, I shortened my data frame to 20 regions, and the error remains. Any suggestions on how to solve this?
data
seqnames start end
1 chr6 29723590 29723790
2 chr14 103334312 103334512
3 chr1 150579030 150579230
4 chr7 76358527 76358727
5 chr6 11537891 11538091
6 chr14 49893256 49893456
7 chr5 179623200 179623400
8 chr1 228082831 228083031
9 chr12 93441644 93441844
10 chr10 3784776 3784976
11 chr3 183635833 183636033
12 chr7 975301 975501
13 chr12 123364510 123364710
14 chr1 1615578 1615778
15 chr1 36156320 36156520
16 chr14 55051781 55051981
17 chr8 11867697 11867897
18 chr22 38706135 38706335
19 chr6 44265256 44265456
20 chr1 185316658 185316858
The code I use
library(GenomicRanges)
library(IRanges)
dataRange <- makeGRangesFromDataFrame(data)
# merge overlapping or adjacent peaks
dataRange <- reduce(dataRange)
# resize each peak to 50 bp around its center
dataRange_resized <- resize(dataRange, width = 50, fix = "center")
library(rGADEM)
library(BSgenome.Hsapiens.UCSC.hg38)
novel_motifs <- GADEM(dataRange_resized, seed = 2, nmotifs = 10, genome = Hsapiens)
Error in .Call2("C_solve_user_SEW", refwidths, start, end, width, translate.negative.coord, :
solving row 4: 'allow.nonnarrowing' is FALSE and the supplied start (185316733) is > refwidth + 1
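For anyone reproducing this, a minimal sanity check might look like the sketch below. It rebuilds the first three regions from the table above so that it runs on its own, checks that each resized range ends within the chromosome length reported by seqlengths(Hsapiens), and extracts the sequences directly with getSeq(). The object names gr, chrom_len, out_of_bounds, and seqs are placeholders introduced for illustration and are not part of the original code.

```r
library(GenomicRanges)
library(BSgenome.Hsapiens.UCSC.hg38)

# First three regions from the table above, rebuilt so the sketch is self-contained
data <- data.frame(
  seqnames = c("chr6", "chr14", "chr1"),
  start    = c(29723590, 103334312, 150579030),
  end      = c(29723790, 103334512, 150579230)
)

# Same preprocessing as in the question: merge, then resize to 50 bp
gr <- makeGRangesFromDataFrame(data)
gr <- resize(reduce(gr), width = 50, fix = "center")

# Look up the hg38 length of the chromosome each range sits on
chrom_len <- seqlengths(Hsapiens)[as.character(seqnames(gr))]

# TRUE for any range that runs past the end of its chromosome
out_of_bounds <- end(gr) > chrom_len
print(table(out_of_bounds))

# Extract the sequences directly, bypassing GADEM()
seqs <- getSeq(Hsapiens, gr)
print(width(seqs))
```

If the check passes and getSeq() succeeds, the coordinates themselves are probably valid for hg38, which would point to how GADEM() handles the GRanges input rather than to the data. Passing the extracted DNAStringSet to GADEM() instead of the GRanges object is one thing to try, but that is an assumption and has not been confirmed in this thread.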
sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=Dutch_Netherlands.1252 LC_CTYPE=Dutch_Netherlands.1252 LC_MONETARY=Dutch_Netherlands.1252
[4] LC_NUMERIC=C LC_TIME=Dutch_Netherlands.1252
attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] BSgenome.Hsapiens.UCSC.hg38_1.4.4 rGADEM_2.42.0 seqLogo_1.60.0
[4] BSgenome_1.62.0 rtracklayer_1.54.0 Biostrings_2.62.0
[7] XVector_0.34.0 GenomicRanges_1.46.1 GenomeInfoDb_1.30.1
[10] IRanges_2.28.0 S4Vectors_0.32.4 BiocGenerics_0.40.0
loaded via a namespace (and not attached):
[1] pillar_1.8.1 compiler_4.1.3 restfulr_0.0.15 BiocManager_1.30.18
[5] MatrixGenerics_1.6.0 bitops_1.0-7 tools_4.1.3 zlibbioc_1.40.0
[9] tibble_3.1.8 lifecycle_1.0.1 lattice_0.20-45 pkgconfig_2.0.3
[13] rlang_1.0.4 Matrix_1.5-1 DBI_1.1.3 DelayedArray_0.20.0
[17] cli_3.4.1 rstudioapi_0.14 yaml_2.3.5 parallel_4.1.3
[21] GenomeInfoDbData_1.2.7 dplyr_1.0.9 generics_0.1.3 vctrs_0.4.1
[25] tidyselect_1.1.2 glue_1.6.2 Biobase_2.54.0 R6_2.5.1
[29] fansi_1.0.3 XML_3.99-0.10 BiocParallel_1.28.3 purrr_0.3.4
[33] magrittr_2.0.3 Rsamtools_2.10.0 matrixStats_0.62.0 GenomicAlignments_1.30.0
[37] assertthat_0.2.1 SummarizedExperiment_1.24.0 utf8_1.2.2 RCurl_1.98-1.8
[41] crayon_1.5.2 rjson_0.2.21 BiocIO_1.4.0
Hi!
I'm facing the same issue. Have you managed to fix it?