Hello everyone!
I have been using the TCGAbiolinks package for the last couple of years to access RNA-seq data for the TCGA-LAML project. Very recently, I noticed that I could no longer use GDCquery to retrieve counts quantified by HTSeq (although I had been able to do so for years), and was prompted to use "STAR - Counts" instead.
When I updated my code to reflect this, I got the following error:
RNA_query <- GDCquery(project = "TCGA-LAML",
                      data.category = "Transcriptome Profiling",
                      data.type = "Gene Expression Quantification",
                      workflow.type = "STAR - Counts",
                      experimental.strategy = "RNA-Seq")
GDCdownload(RNA_query)
RNA_counts <- GDCprepare(RNA_query, summarizedExperiment = FALSE)
|=====================================================================|100% Completed after 35 s
Error in `vectbl_as_col_location()`:
! Can't subset columns past the end.
ℹ Locations 2, 3, and 4 don't exist.
ℹ There is only 1 column.
Run `rlang::last_error()` to see where the error occurred.
There were 50 or more warnings (use warnings() to see the first 50)
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.2.1
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.8 purrr_0.3.4
[5] readr_2.1.2 tidyr_1.2.0 tibble_3.1.6 ggplot2_3.3.5
[9] tidyverse_1.3.1 edgeR_3.36.0 limma_3.50.1 TCGAbiolinks_2.22.4
loaded via a namespace (and not attached):
[1] colorspace_2.0-3 ggsignif_0.6.3 rjson_0.2.21
[4] ellipsis_0.3.2 XVector_0.34.0 fs_1.5.2
[7] GenomicRanges_1.46.1 rstudioapi_0.13 ggpubr_0.4.0
[10] bit64_4.0.5 lubridate_1.8.0 AnnotationDbi_1.56.2
[13] fansi_1.0.3 xml2_1.3.3 splines_4.1.2
[16] R.methodsS3_1.8.1 cachem_1.0.6 knitr_1.38
[19] jsonlite_1.8.0 Rsamtools_2.10.0 broom_0.7.12
[22] km.ci_0.5-2 dbplyr_2.1.1 png_0.1-7
[25] R.oo_1.24.0 compiler_4.1.2 httr_1.4.2
[28] backports_1.4.1 assertthat_0.2.1 Matrix_1.4-1
[31] fastmap_1.1.0 lazyeval_0.2.2 cli_3.2.0
[34] prettyunits_1.1.1 tools_4.1.2 gtable_0.3.0
[37] glue_1.6.2 GenomeInfoDbData_1.2.7 rappdirs_0.3.3
[40] Rcpp_1.0.8.3 carData_3.0-5 Biobase_2.54.0
[43] cellranger_1.1.0 vctrs_0.4.0 Biostrings_2.62.0
[46] rtracklayer_1.54.0 xfun_0.30 rvest_1.0.2
[49] lifecycle_1.0.1 restfulr_0.0.13 ensembldb_2.18.4
[52] rstatix_0.7.0 XML_3.99-0.9 zlibbioc_1.40.0
[55] zoo_1.8-9 scales_1.1.1 hms_1.1.1
[58] MatrixGenerics_1.6.0 ProtGenerics_1.26.0 parallel_4.1.2
[61] SummarizedExperiment_1.24.0 AnnotationFilter_1.18.0 yaml_2.3.5
[64] curl_4.3.2 memoise_2.0.1 gridExtra_2.3
[67] KMsurv_0.1-5 downloader_0.4 biomaRt_2.50.3
[70] stringi_1.7.6 RSQLite_2.2.11 S4Vectors_0.32.4
[73] BiocIO_1.4.0 GenomicFeatures_1.46.5 BiocGenerics_0.40.0
[76] filelock_1.0.2 BiocParallel_1.28.3 GenomeInfoDb_1.30.1
[79] rlang_1.0.2 pkgconfig_2.0.3 matrixStats_0.61.0
[82] bitops_1.0-7 TCGAbiolinksGUI.data_1.14.0 lattice_0.20-45
[85] GenomicAlignments_1.30.0 bit_4.0.4 tidyselect_1.1.2
[88] plyr_1.8.7 magrittr_2.0.3 R6_2.5.1
[91] IRanges_2.28.0 generics_0.1.2 DelayedArray_0.20.0
[94] DBI_1.1.2 withr_2.5.0 haven_2.4.3
[97] pillar_1.7.0 survival_3.3-1 KEGGREST_1.34.0
[100] abind_1.4-5 RCurl_1.98-1.6 modelr_0.1.8
[103] crayon_1.5.1 car_3.0-12 survMisc_0.5.5
[106] utf8_1.2.2 BiocFileCache_2.2.1 tzdb_0.3.0
[109] progress_1.2.2 readxl_1.4.0 locfit_1.5-9.5
[112] grid_4.1.2 data.table_1.14.2 blob_1.2.2
[115] reprex_2.0.1 digest_0.6.29 xtable_1.8-4
[118] R.utils_2.11.0 stats4_4.1.2 munsell_0.5.0
[121] survminer_0.4.9
Any insight on why this is happening and how I may approach fixing the issue would be greatly appreciated!
Thanks, it works now. At first I had some trouble after the initial download, but restarting R seems to have fixed it. Thanks for your hard work!
DELETED: I provided some misleading info and don't want to waste anybody's time. I will get back to this once I have tried some more things.