I have an AAStringSet containing the names and sequences of proteins of interest.
I am trying to find how many matches to the patterns DDVF, DEVF, EDVF or EEVF this set contains. I have tried the code below, but I only get errors saying that the input needs to be a vector and not an AAStringSet object.
pattern <- c("DDVF", "DEVF", "EDVF", "EEVF")
# str_detect(string = prot_interest, pattern = pattern)
sapply(getSeq(prot_interest, names(prot_interest)), str_detect, pattern)
Any idea how I can do this?
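For concreteness, here is a minimal sketch of the kind of result I am after. It assumes a toy AAStringSet (the names `protA`, `protB`, `protC` and their sequences are invented for illustration, as are the variable names `motifs`, `seqs`, `total_hits` and `per_motif`). It relies on two things I believe to be true: `as.character()` turns an AAStringSet into a named character vector that stringr accepts, and `vcountPattern()` from Biostrings counts one pattern across every sequence of a set. This may not be the recommended way to do it.

```r
library(Biostrings)
library(stringr)

# Toy stand-in for the real data; names and sequences are invented.
prot_interest <- AAStringSet(c(
  protA = "MKDDVFLLEEVF",
  protB = "MKKLLA",
  protC = "AEDVFDEVF"
))
motifs <- c("DDVF", "DEVF", "EDVF", "EEVF")

# stringr works on character vectors, so convert the AAStringSet first.
seqs <- as.character(prot_interest)

# The four motifs collapse to one regex: D or E, D or E, then VF.
# Gives the total number of motif hits per protein.
total_hits <- str_count(seqs, "[DE][DE]VF")
names(total_hits) <- names(seqs)

# Per-motif counts with Biostrings directly, no conversion needed.
# Rows are proteins, columns are motifs (assumes more than one sequence,
# otherwise sapply() returns a plain vector instead of a matrix).
per_motif <- sapply(motifs, vcountPattern, subject = prot_interest)
rownames(per_motif) <- names(prot_interest)
```

With this layout, `sum(total_hits > 0)` would give the number of proteins containing at least one motif, and `colSums(per_motif)` the number of hits per motif across the whole set.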
Here is the session info, if needed:
R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22631)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8
[2] LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
time zone: Europe/Brussels
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] rWSBIM1322_0.3.2
[2] lubridate_1.9.4
[3] forcats_1.0.0
[4] stringr_1.5.1
[5] dplyr_1.1.4
[6] purrr_1.0.2
[7] readr_2.1.5
[8] tidyr_1.3.1
[9] tibble_3.2.1
[10] tidyverse_2.0.0
[11] ggplot2_3.5.1
[12] BSgenome.Dmelanogaster.UCSC.dm2_1.4.0
[13] BSgenome_1.70.2
[14] rtracklayer_1.62.0
[15] BiocIO_1.12.0
[16] GenomicRanges_1.54.1
[17] Biostrings_2.70.3
[18] GenomeInfoDb_1.38.8
[19] XVector_0.42.0
[20] IRanges_2.36.0
[21] S4Vectors_0.40.2
[22] BiocGenerics_0.48.1
loaded via a namespace (and not attached):
[1] SummarizedExperiment_1.32.0 gtable_0.3.6
[3] rjson_0.2.23 xfun_0.49
[5] bslib_0.8.0 Biobase_2.62.0
[7] lattice_0.21-9 tzdb_0.4.0
[9] vctrs_0.6.5 tools_4.3.2
[11] bitops_1.0-8 generics_0.1.3
[13] parallel_4.3.2 fansi_1.0.6
[15] highr_0.11 pkgconfig_2.0.3
[17] Matrix_1.6-1.1 lifecycle_1.0.4
[19] GenomeInfoDbData_1.2.11 compiler_4.3.2
[21] Rsamtools_2.18.0 munsell_0.5.1
[23] codetools_0.2-19 htmltools_0.5.8.1
[25] sass_0.4.9 RCurl_1.98-1.16
[27] yaml_2.3.10 pillar_1.9.0
[29] crayon_1.5.3 jquerylib_0.1.4
[31] BiocParallel_1.36.0 DelayedArray_0.28.0
[33] cachem_1.1.0 abind_1.4-8
[35] tidyselect_1.2.1 digest_0.6.37
[37] stringi_1.8.4 restfulr_0.0.15
[39] fastmap_1.2.0 grid_4.3.2
[41] colorspace_2.1-1 cli_3.6.2
[43] SparseArray_1.2.4 magrittr_2.0.3
[45] S4Arrays_1.2.1 XML_3.99-0.17
[47] utf8_1.2.4 withr_3.0.2
[49] scales_1.3.0 timechange_0.3.0
[51] rmarkdown_2.29 matrixStats_1.4.1
[53] hms_1.1.3 evaluate_1.0.1
[55] knitr_1.49 rlang_1.1.4
[57] glue_1.7.0 pkgload_1.4.0
[59] rstudioapi_0.17.1 jsonlite_1.8.9
[61] R6_2.5.1 MatrixGenerics_1.14.0
[63] GenomicAlignments_1.38.2 zlibbioc_1.48.2