Hello,
I am trying to analyze the 10x Genomics "1.3 Million Brain Cells from E18 Mice" dataset in R (https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.3.0/1M_neurons). Because of the size of the data, I used the graph-based clustering results, which contain 60 clusters, and tested just one cluster. I get an error when running PCA on the SingleCellExperiment object.
Please find my code below:
library(restfulSE)
library(SingleCellExperiment)
library(Seurat)
library(scater)  # provides runPCA()
library(BiocParallel)
BiocParallel::register(BiocParallel::MulticoreParam(workers=12))
####################################### processed result using graph clustering ########################
## set path
fpath = '/home/bsharmi6/NA_TF_project/scRNAseq_1million/'
## read clustering csv file
cluster.df = read.csv('/home/bsharmi6/NA_TF_project/scRNAseq_1million/analysis/clustering/graphclust/clusters.csv', header = TRUE)
## read 10x
##https://bioconductor.org/packages/release/bioc/vignettes/zinbwave/inst/doc/intro.html
my10x = se1.3M()
## select a cluster
iclust = 21
## get cluster indices
cluster.df_i = cluster.df[cluster.df$Cluster %in% iclust,]
## reduce my10x to cluster
my10x_iclust = my10x[, colData(my10x)$Barcode %in% cluster.df_i$Barcode]
## create sc object
sc <- as(my10x_iclust, "SingleCellExperiment")
## runPCA
sc <- runPCA(sc, exprs_values = "counts")
I get an error at the PCA step:
Error in curl::curl_fetch_memory(url, handle = handle) :
Failed to connect to hsdshdflab.hdfgroup.org port 80: Connection refused
I get the following error if I try to create a Seurat object bypassing the PCA step:
seurat <- as.Seurat(sc, data = NULL)
Error in curl::curl_fetch_memory(url, handle = handle) :
Failed to connect to hsdshdflab.hdfgroup.org port 80: Connection refused
Calls: as.Seurat ... request_fetch -> request_fetch.write_memory -> <anonymous>
Execution halted
In addition, I get an error in the filtering step on the my10x_iclust object:
filter <- DelayedMatrixStats::rowSums2(assay(my10x_iclust)>5)>5
Error in curl::curl_fetch_memory(url, handle = handle) :
Recv failure: Connection reset by peer
Calls: <anonymous> ... request_fetch -> request_fetch.write_memory -> <anonymous>
Execution halted
```
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /apps/easybuild/software/pegasus-sandybridge/OpenBLAS/0.3.1-GCC-7.3.0-2.30/lib/libopenblas_sandybridgep-r0.3.1.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] restfulSEData_1.4.0         ExperimentHub_1.8.0
 [3] AnnotationHub_2.14.5        loomR_0.2.1.9000
 [5] hdf5r_1.2.0                 R6_2.4.0
 [7] scater_1.10.1               dplyr_0.8.1
 [9] zinbwave_1.4.2              biomaRt_2.38.0
[11] ggplot2_3.1.1               magrittr_1.5
[13] scRNAseq_1.8.0              Seurat_3.0.1
[15] TENxGenomics_0.0.27         Matrix_1.2-14
[17] BiocFileCache_1.6.0         dbplyr_1.2.2
[19] SingleCellExperiment_1.4.1  restfulSE_1.4.1
[21] SummarizedExperiment_1.12.0 DelayedArray_0.8.0
[23] BiocParallel_1.16.6         matrixStats_0.54.0
[25] Biobase_2.42.0              GenomicRanges_1.34.0
[27] GenomeInfoDb_1.18.2         IRanges_2.16.0
[29] S4Vectors_0.20.1            BiocGenerics_0.28.0

loaded via a namespace (and not attached):
  [1] copula_0.999-19.1             bigrquery_1.1.1
  [3] plyr_1.8.4                    igraph_1.2.4.1
  [5] lazyeval_0.2.2                splines_3.5.1
  [7] pspline_1.0-18                listenv_0.7.0
  [9] digest_0.6.19                 foreach_1.4.4
 [11] htmltools_0.3.6               viridis_0.5.1
 [13] GO.db_3.7.0                   gdata_2.18.0
 [15] memoise_1.1.0                 cluster_2.0.7-1
 [17] ROCR_1.0-7                    limma_3.38.3
 [19] annotate_1.60.1               globals_0.12.4
 [21] stabledist_0.7-1              R.utils_2.8.0
 [23] prettyunits_1.0.2             colorspace_1.3-2
 [25] blob_1.1.1                    rappdirs_0.3.1
 [27] ggrepel_0.8.1                 crayon_1.3.4
 [29] RCurl_1.95-4.11               jsonlite_1.6
 [31] genefilter_1.64.0             iterators_1.0.10
 [33] survival_2.44-1.1             zoo_1.8-6
 [35] ape_5.3                       glue_1.3.1
 [37] gtable_0.3.0                  zlibbioc_1.28.0
 [39] XVector_0.22.0                Rhdf5lib_1.4.3
 [41] future.apply_1.2.0            HDF5Array_1.10.1
 [43] scales_1.0.0                  mvtnorm_1.0-10
 [45] edgeR_3.24.3                  DBI_1.0.0
 [47] bibtex_0.4.2                  Rcpp_1.0.1
 [49] metap_1.1                     viridisLite_0.3.0
 [51] xtable_1.8-2                  progress_1.2.0
 [53] reticulate_1.12               bit_1.1-14
 [55] rsvd_1.0.1                    SDMTools_1.1-221.1
 [57] rhdf5client_1.4.1             tsne_0.1-3
 [59] glmnet_2.0-16                 htmlwidgets_1.3
 [61] httr_1.4.0                    gplots_3.0.1.1
 [63] RColorBrewer_1.1-2            ica_1.0-2
 [65] pkgconfig_2.0.2               XML_3.98-1.15
 [67] R.methodsS3_1.7.1             locfit_1.5-9.1
 [69] softImpute_1.4                tidyselect_0.2.5
 [71] rlang_0.3.4                   reshape2_1.4.3
 [73] later_0.7.4                   AnnotationDbi_1.44.0
 [75] munsell_0.5.0                 tools_3.5.1
 [77] RSQLite_2.1.1                 ggridges_0.5.1
 [79] stringr_1.4.0                 yaml_2.2.0
 [81] npsurv_0.4-0                  bit64_0.9-7
 [83] fitdistrplus_1.0-14           caTools_1.17.1.2
 [85] purrr_0.3.2                   RANN_2.6.1
 [87] pbapply_1.4-0                 future_1.13.0
 [89] nlme_3.1-137                  mime_0.6
 [91] R.oo_1.22.0                   compiler_3.5.1
 [93] beeswarm_0.2.3                plotly_4.9.0
 [95] curl_3.3                      png_0.1-7
 [97] interactiveDisplayBase_1.20.0 lsei_1.2-0
 [99] tibble_2.1.2                  pcaPP_1.9-73
[101] stringi_1.4.3                 gsl_2.1-6
[103] lattice_0.20-35               pillar_1.4.1
[105] ADGofTest_0.3                 BiocManager_1.30.4
[107] Rdpack_0.11-0                 lmtest_0.9-37
[109] data.table_1.12.2             cowplot_0.9.4
[111] bitops_1.0-6                  irlba_2.3.3
[113] gbRd_0.4-11                   httpuv_1.4.5
[115] promises_1.0.1                KernSmooth_2.23-15
[117] gridExtra_2.3                 vipor_0.4.5
[119] codetools_0.2-15              MASS_7.3-50
[121] gtools_3.8.1                  assertthat_0.2.1
[123] rhdf5_2.26.2                  rjson_0.2.20
[125] withr_2.1.2                   sctransform_0.2.0
[127] GenomeInfoDbData_1.2.0        hms_0.4.2
[129] grid_3.5.1                    tidyr_0.8.3
[131] DelayedMatrixStats_1.4.0      Rtsne_0.15
[133] numDeriv_2016.8-1             shiny_1.1.0
[135] ggbeeswarm_0.6.0
```
This cluster is not very big (27998 genes and 18919 cells), so I am wondering why it is failing. If I use the 20k cells randomly sampled by 10X, I have no problem creating the Seurat object. Can someone please let me know how to solve this problem?
Thank you very much
I can't offer much guidance, but if I try going to hsdshdflab.hdfgroup.org in a browser it times out too, suggesting this is an issue with that server rather than either the R package or your code.
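
Since every failing call ends with curl trying to reach hsdshdflab.hdfgroup.org, one way to confirm this from within R is to probe the server before running anything that touches the data. Below is a minimal sketch, assuming the curl package is installed (it appears in the sessionInfo above). `server_reachable` is a hypothetical helper written for illustration, not part of restfulSE or rhdf5client, and the URL is simply the host and port from the error message.

```r
# Hypothetical helper, not part of restfulSE or rhdf5client.
# Returns TRUE if `url` sends back any HTTP response within `timeout_sec`
# seconds, and FALSE if the connection fails or times out.
server_reachable <- function(url, timeout_sec = 10) {
  # nobody = TRUE requests headers only, so no payload is downloaded.
  h <- curl::new_handle(nobody = TRUE, timeout = timeout_sec)
  tryCatch({
    curl::curl_fetch_memory(url, handle = h)
    TRUE
  }, error = function(e) FALSE)
}

# Host and port taken from the error message in the question.
hsds_url <- "http://hsdshdflab.hdfgroup.org"

if (!server_reachable(hsds_url)) {
  message("Server not reachable from this machine; runPCA() and as.Seurat() will likely fail the same way.")
}
```

If this returns FALSE on the cluster but the same URL loads from another network, the cause may be a local firewall or proxy rather than the server itself.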