Dear community,
I got an error when using TCGAquery_recount2 from TCGAbiolinks. The code and the error are shown below. I searched for the error but could not figure out what causes it or how to fix it. Here is a related thread, https://support.bioconductor.org/p/125586/ , but I did not find an answer there.
Could you give me a clue as to why this happens? Thanks a lot!
Lung_recount2 <- TCGAquery_recount2(project = "tcga", tissue = "lung")
downloading Range Summarized Experiment for: lung
Error in load(url(con)) : cannot read from connection
In addition: Warning message:
In load(url(con)) :
URL 'http://idies.jhu.edu/recount/data/v2/TCGA/rse_gene_lung.Rdata': Timeout of 60 seconds was reached
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Mojave 10.14.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] grid parallel stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] lsa_0.73.2 SnowballC_0.7.0 rgr_1.1.15 fastICA_1.2-2 MASS_7.3-53.1 limma_3.46.0
[7] survminer_0.4.9 ggpubr_0.4.0 stringr_1.4.0 survival_3.2-10 pvclust_2.2-0 factoextra_1.0.7
[13] FactoMineR_2.4 pheatmap_1.0.12 DESeq2_1.30.1 ggplot2_3.3.3 VennDiagram_1.6.20 futile.logger_1.4.3
[19] cowplot_1.1.1 SummarizedExperiment_1.20.0 Biobase_2.50.0 GenomicRanges_1.42.0 GenomeInfoDb_1.26.4 IRanges_2.24.1
[25] S4Vectors_0.28.1 BiocGenerics_0.36.0 MatrixGenerics_1.2.1 matrixStats_0.58.0 data.table_1.14.0 dplyr_1.0.5
[31] plyr_1.8.6 readxl_1.3.1 TCGAbiolinks_2.18.0
loaded via a namespace (and not attached):
[1] backports_1.2.1 BiocFileCache_1.14.0 splines_4.0.3 BiocParallel_1.24.1 digest_0.6.27 htmltools_0.5.1.1
[7] fansi_0.4.2 magrittr_2.0.1 memoise_2.0.0 cluster_2.1.1 remotes_2.2.0 openxlsx_4.2.3
[13] readr_1.4.0 annotate_1.68.0 R.utils_2.10.1 askpass_1.1 prettyunits_1.1.1 colorspace_2.0-0
[19] blob_1.2.1 rvest_1.0.0 rappdirs_0.3.3 ggrepel_0.9.1 haven_2.3.1 xfun_0.22
[25] crayon_1.4.1 RCurl_1.98-1.3 jsonlite_1.7.2 genefilter_1.72.1 zoo_1.8-9 glue_1.4.2
[31] gtable_0.3.0 zlibbioc_1.36.0 XVector_0.30.0 DelayedArray_0.16.2 car_3.0-10 abind_1.4-5
[37] scales_1.1.1 futile.options_1.0.1 DBI_1.1.1 rstatix_0.7.0 Rcpp_1.0.6 xtable_1.8-4
[43] progress_1.2.2 flashClust_1.01-2 foreign_0.8-81 bit_4.0.4 km.ci_0.5-2 DT_0.17
[49] htmlwidgets_1.5.3 httr_1.4.2 RColorBrewer_1.1-2 ellipsis_0.3.1 pkgconfig_2.0.3 XML_3.99-0.6
[55] R.methodsS3_1.8.1 dbplyr_2.1.0 locfit_1.5-9.4 utf8_1.2.1 tidyselect_1.1.0 rlang_0.4.10
[61] AnnotationDbi_1.52.0 munsell_0.5.0 cellranger_1.1.0 tools_4.0.3 cachem_1.0.4 downloader_0.4
[67] generics_0.1.0 RSQLite_2.2.4 broom_0.7.5 fastmap_1.1.0 knitr_1.31 bit64_4.0.5
[73] zip_2.1.1 survMisc_0.5.5 purrr_0.3.4 formatR_1.8 R.oo_1.24.0 leaps_3.1
[79] xml2_1.3.2 biomaRt_2.46.3 compiler_4.0.3 curl_4.3 ggsignif_0.6.1 tibble_3.1.0
[85] geneplotter_1.68.0 stringi_1.5.3 TCGAbiolinksGUI.data_1.10.0 forcats_0.5.1 lattice_0.20-41 Matrix_1.3-2
[91] KMsurv_0.1-5 vctrs_0.3.6 pillar_1.5.1 lifecycle_1.0.0 BiocManager_1.30.10 bitops_1.0-6
[97] R6_2.5.0 gridExtra_2.3 rio_0.5.26 lambda.r_1.2.4 assertthat_0.2.1 openssl_1.4.3
[103] withr_2.4.1 GenomeInfoDbData_1.2.4 hms_1.0.0 tidyr_1.1.3 carData_3.0-4 scatterplot3d_0.3-41
I finally fixed it: the cause was a very poor internet connection.
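For anyone hitting the same timeout on a slow connection: the warning above reports a 60-second limit, which matches R's default `timeout` option. A minimal sketch of raising it temporarily is below. The helper name `with_long_timeout` and the 600-second value are illustrative assumptions, and whether TCGAquery_recount2 honors the option is inferred only from the `load(url(con))` call shown in the error message.

```r
# Hypothetical helper: run an expression with a longer download timeout,
# then restore the previous setting when the function exits.
with_long_timeout <- function(expr, seconds = 600) {
  old <- options(timeout = seconds)   # options() returns the previous values
  on.exit(options(old), add = TRUE)   # restore them on exit, even after an error
  expr                                # lazily evaluated here, under the new timeout
}

# Usage (needs TCGAbiolinks and network access, so it is left commented out):
# library(TCGAbiolinks)
# Lung_recount2 <- with_long_timeout(
#   TCGAquery_recount2(project = "tcga", tissue = "lung")
# )
```

Setting `options(timeout = 600)` once at the top of the session has the same effect without the helper, but the change then stays in place for the whole session.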
Thank you so much!
Kevin Blighe, I have another question, unrelated to this error.
I'd like to use the TCGA lung data. After downloading, how can I distinguish the LUAD samples from the LUSC samples? I looked at the colData with colData(Lung_TCGA_recount2$tcga_lung)$project, but every value is TCGA. Thanks a lot!
Hey, it seems to be in the following column:
Also take a look at the gdc_cases.project.project_id column.
Thanks a lot!
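To illustrate the gdc_cases.project.project_id suggestion, here is a small sketch of separating LUAD from LUSC samples by that column. The data frame is a made-up stand-in for the real colData, and the object names in the commented lines (`Lung_recount2`, `rse_luad`, `rse_lusc`) are assumptions based on the code earlier in this thread.

```r
# Toy stand-in for as.data.frame(colData(Lung_recount2$tcga_lung));
# the three rows are invented for illustration only.
sample_info <- data.frame(
  project = c("TCGA", "TCGA", "TCGA"),
  gdc_cases.project.project_id = c("TCGA-LUAD", "TCGA-LUSC", "TCGA-LUAD"),
  stringsAsFactors = FALSE
)

# Count samples per project to confirm that both cohorts are present.
project_counts <- table(sample_info$gdc_cases.project.project_id)

# Logical masks that pick out each cohort.
is_luad <- sample_info$gdc_cases.project.project_id == "TCGA-LUAD"
is_lusc <- sample_info$gdc_cases.project.project_id == "TCGA-LUSC"

# With the real object (needs SummarizedExperiment), the same masks
# subset the columns of the RangedSummarizedExperiment:
# rse <- Lung_recount2$tcga_lung
# rse_luad <- rse[, colData(rse)$gdc_cases.project.project_id == "TCGA-LUAD"]
# rse_lusc <- rse[, colData(rse)$gdc_cases.project.project_id == "TCGA-LUSC"]
```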
Kevin Blighe, sorry, one more question: which column holds the TCGA sample barcodes, e.g. "TCGA-B2-3924-01B-03R-A277-07"? How can I get the information in this format?
In fact, "gdc_cases.samples.portions.analytes.aliquots.submitter_id" (values like TCGA-62-A471-01A-12R..) is the most informative column I can find. But I would still like the full name or barcode as used in TCGA-LUAD.
Thanks a lot!
I am not sure about that. Perhaps TCGAutils has a way to do it. Otherwise, you could retrieve the biotab metadata from the GDC (for LUAD), import it into R, and use that.
Thanks a lot, the information from "gdc_cases.samples.portions.analytes.aliquots.submitter_id" should be enough. Thanks a lot!
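For reference, a full aliquot barcode such as the one quoted above can be split into its fields with base R. This is only a sketch: `parse_tcga_barcode` is a hypothetical helper written for this thread, not a function from TCGAbiolinks or TCGAutils, and it assumes complete seven-field aliquot barcodes.

```r
# Hypothetical helper: split full TCGA aliquot barcodes of the form
# TCGA-TSS-Participant-SampleVial-PortionAnalyte-Plate-Center into fields.
parse_tcga_barcode <- function(barcode) {
  parts <- strsplit(barcode, "-", fixed = TRUE)
  rows <- lapply(parts, function(p) {
    stopifnot(length(p) == 7)  # only full aliquot barcodes are handled
    data.frame(
      barcode     = paste(p, collapse = "-"),
      patient     = paste(p[1:3], collapse = "-"),  # patient-level ID
      tss         = p[2],                           # tissue source site
      participant = p[3],
      sample_type = substr(p[4], 1, 2),             # e.g. "01" = primary solid tumor
      vial        = substr(p[4], 3, 3),
      portion     = substr(p[5], 1, 2),
      analyte     = substr(p[5], 3, 3),             # e.g. "R" = RNA
      plate       = p[6],
      center      = p[7],
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

fields <- parse_tcga_barcode("TCGA-B2-3924-01B-03R-A277-07")
```

With the real data, the input would be the gdc_cases.samples.portions.analytes.aliquots.submitter_id column of the colData, provided its values are complete barcodes.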