Convert Seurat Object to SummarizedExperiment or ExpressionSet
Bine ▴ 50
@bine-23912
Last seen 6 months ago
UK

Good afternoon,

Can anyone advise me on how to convert a Seurat object to a SummarizedExperiment or ExpressionSet? Unfortunately, I need this format for a specific piece of software.

Thank you very much!

SummarizedExperiment ExpressionSet • 2.7k views
@james-w-macdonald-5106
Last seen 1 day ago
United States

Have you looked at the help pages for Seurat?


Also, you almost surely don't want a SummarizedExperiment and for sure you don't want an ExpressionSet. You want a SingleCellExperiment object.


No, I want a SummarizedExperiment or ExpressionSet. I need it in this format.


A SingleCellExperiment IS a SummarizedExperiment, with added features required for scRNA-Seq analyses. The basic SummarizedExperiment object is meant for bulk RNA-Seq or microarray data, and doesn't have things like a reducedDims slot. I suppose you could just pull out things that map from a SingleCellExperiment to a SummarizedExperiment, but I am 99% sure there isn't a coercion method, so you are likely on your own if you really have to use a SummarizedExperiment.
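As a rough sketch of "pulling out the things that map", one option is to rebuild a plain SummarizedExperiment from the assays, row data and column data of the SingleCellExperiment. The toy object below and its values are hypothetical stand-ins for real data, and this is only one possible approach, not an official conversion route:

```r
library(SummarizedExperiment)
library(SingleCellExperiment)

# Toy SingleCellExperiment standing in for the real data (hypothetical values)
counts <- matrix(c(0, 3, 1, 7, 2, 0, 5, 4, 1, 0, 2, 6), nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))
sce <- SingleCellExperiment(
  assays = list(counts = counts, logcounts = log2(counts + 1)),
  colData = DataFrame(group = c("control", "treated", "control", "treated"))
)

# Rebuild a plain SummarizedExperiment from the parts both classes share;
# SingleCellExperiment-only content such as reducedDims is dropped
se <- SummarizedExperiment(
  assays  = assays(sce),
  rowData = rowData(sce),
  colData = colData(sce)
)
se
```

Anything specific to SingleCellExperiment (reduced dimensions, alternative experiments) is not carried over, which matches the caveat in the reply above.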


It's because I want to run the GSVA software on my single-cell data (I am treating cells as samples). It requires the input in one of these formats:

A matrix of expression values with genes corresponding to rows and samples corresponding to columns.

An ExpressionSet object; see package Biobase.

A SummarizedExperiment object, see package SummarizedExperiment.

I know there is a package called scGSVA, but it doesn't have the full functionality I have in the GSVA package.

GSVA: https://bioconductor.statistik.tu-dortmund.de/packages/3.16/bioc/vignettes/GSVA/inst/doc/GSVA.html#6_Example_applications


Hi, GSVA maintainer here. The current release version of GSVA (1.50.2), running on Bioconductor 3.18 (released in October 2023), has started supporting SingleCellExperiment objects, and that support will improve in the next release of the package, due on May 1st, 2024. See help(gsvaParam) and class ? GsvaExprData. You can use sessionInfo() to check the version of GSVA that you're using, and if you are unsure about how to update the software, please consult https://bioconductor.org/install.


That is great news! Thank you. However, we only have R version 4.1, so I believe we can only run Bioconductor version 3.14: https://bioconductor.org/about/release-announcements/

Is there an easier way I haven't thought of?


Well, you can extract the matrix of normalized expression values from the Seurat object and provide that matrix to the gsva() function, assuming that such a matrix fits in the main memory of your hardware. If it is too large, you can try to filter out lowly-expressed genes, which is good practice anyway, and also try converting that matrix into a sparse matrix with Matrix::Matrix(x, sparse=TRUE), because I think that version of GSVA should already be able to accept dgCMatrix objects. In any case, for GSVA or any other Bioconductor package, it is strongly recommended to work with the latest release of the software, not only to be able to use the latest features, but also because bugfixes and documentation improvements happen throughout releases.
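A minimal sketch of the filtering and sparse-conversion steps, using a small hand-made matrix in place of the normalized values from the Seurat object. The gene and cell names and the min_cells threshold are hypothetical; with real data the matrix would typically come from something like Seurat::GetAssayData(), whose exact arguments depend on the Seurat version:

```r
library(Matrix)

# Toy normalized expression matrix: genes in rows, cells in columns
expr <- rbind(
  geneA = c(0,   0,   0,   0,   0,   0),
  geneB = c(1.2, 0,   0,   0,   0,   0),
  geneC = c(0.8, 1.5, 0,   2.1, 0,   0.4),
  geneD = c(0,   2.0, 1.1, 0,   0.9, 0)
)
colnames(expr) <- paste0("cell", 1:6)

# Keep genes detected in at least min_cells cells (illustrative threshold)
min_cells <- 2
keep <- rowSums(expr > 0) >= min_cells
expr_filtered <- expr[keep, , drop = FALSE]

# Convert to a sparse representation to reduce memory use
expr_sparse <- Matrix::Matrix(expr_filtered, sparse = TRUE)
class(expr_sparse)
```

The appropriate filtering threshold depends on the data set; the value used here is only for illustration.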


Thank you very much. I converted it to a matrix and used the gsva() function. However, I don't know how to deal with the metadata then. I have two groups of patients (e.g. control / treated). Thanks for all your input!


In addition to the above comment: I have now installed RStudio locally with the appropriate R version. I was able to run gsva on a SingleCellExperiment and got these results:

class: SingleCellExperiment 
dim: 2 4697 
metadata(0):
assays(1): es
rownames(2): chr3_p_21 chr3_p_25
rowData names(0):
colnames(4697): AAACCTGCACGGCTAC-1 AAACCTGCAGTATCTG-1 ... TTTGTCATCGTCACGG-1 TTTGTCATCTTATCTG-1
colData names(9): orig.ident nCount_RNA ... copykat_pred ident
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):

I can't find any information on how to extract my enrichment scores from a SingleCellExperiment. Can you please clarify? I would like to make a heatmap with the two copykat_pred metadata categories included, similar to this one:

library(RColorBrewer)
subtypeOrder <- c("Proneural", "Neural", "Classical", "Mesenchymal")
sampleOrderBySubtype <- sort(match(gbm_es$subtype, subtypeOrder),
                             index.return=TRUE)$ix
subtypeXtable <- table(gbm_es$subtype)
subtypeColorLegend <- c(Proneural="red", Neural="green",
                        Classical="blue", Mesenchymal="orange")
geneSetOrder <- c("astroglia_up", "astrocytic_up", "neuronal_up",
                  "oligodendrocytic_up")
geneSetLabels <- gsub("_", " ", geneSetOrder)
hmcol <- colorRampPalette(brewer.pal(10, "RdBu"))(256)
hmcol <- hmcol[length(hmcol):1]

heatmap(exprs(gbm_es)[geneSetOrder, sampleOrderBySubtype], Rowv=NA,
        Colv=NA, scale="row", margins=c(3,5), col=hmcol,
        ColSideColors=rep(subtypeColorLegend[subtypeOrder],
                          times=subtypeXtable[subtypeOrder]),
        labCol="",
        labRow=paste(toupper(substring(geneSetLabels, 1,1)),
                     substring(geneSetLabels, 2), sep=""),
        cexRow=2, main=" \n ")
par(xpd=TRUE)
text(0.23,1.21, "Proneural", col="red", cex=1.2)
text(0.36,1.21, "Neural", col="green", cex=1.2)
text(0.47,1.21, "Classical", col="blue", cex=1.2)
text(0.62,1.21, "Mesenchymal", col="orange", cex=1.2)
mtext("Gene sets", side=4, line=0, cex=1.5)
mtext("Samples          ", side=1, line=4, cex=1.5)

https://bioconductor.org/packages/devel/bioc/vignettes/GSVA/inst/doc/GSVA.html#6_Example_applications


A SingleCellExperiment object inherits from SummarizedExperiment, so many, or maybe all, of the getter methods for a SummarizedExperiment object work on a SingleCellExperiment object. The GSVA scores are returned in an assay called es. Since this is the first and only assay in your returned object, you can retrieve the GSVA scores with the following instruction:

es <- assay(x)

where x should be the name of the SingleCellExperiment object. Regarding the metadata: if it was part of the input SingleCellExperiment object, then it should also be part of the output object, and the same principle I described before applies again. You should be able to retrieve it by writing:

phenodata <- colData(x)

I recommend taking a look at the vignettes of the SummarizedExperiment and SingleCellExperiment packages to get acquainted with how to pull out the different bits of data and metadata that these types of object store.
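Putting the two getters together, a heatmap of the scores ordered and coloured by a metadata column might be sketched as follows. The toy object stands in for the gsva() output; the scores are random, and the copykat_pred values ("aneuploid", "diploid") and the colours are assumptions made for illustration:

```r
library(SummarizedExperiment)

# Toy stand-in for the object returned by gsva(): one assay "es" plus colData
set.seed(7)
scores <- matrix(rnorm(2 * 6), nrow = 2,
                 dimnames = list(c("chr3_p_21", "chr3_p_25"),
                                 paste0("cell", 1:6)))
x <- SummarizedExperiment(
  assays  = list(es = scores),
  colData = DataFrame(copykat_pred = c("aneuploid", "diploid", "aneuploid",
                                       "diploid", "diploid", "aneuploid"))
)

es   <- assay(x, "es")                 # enrichment scores: gene sets x cells
pred <- as.character(x$copykat_pred)   # same as colData(x)$copykat_pred

# Order the cells by category and assign one colour per category
ord        <- order(pred)
predColors <- c(aneuploid = "red", diploid = "blue")
sideColors <- unname(predColors[pred[ord]])

heatmap(es[, ord], Rowv = NA, Colv = NA, scale = "row",
        ColSideColors = sideColors, labCol = rep("", ncol(es)),
        margins = c(3, 8))
```

Because a SingleCellExperiment inherits from SummarizedExperiment, the same assay() and colData() calls should apply to the real output object.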


Thank you so much. This is extremely helpful. I have been searching for a while now.

Apologies for one final question: when running this

gbmPar <- gsvaParam(SingleCellExperiment, gs, maxDiff=FALSE)

which data slot is it taking from the SingleCellExperiment? I believe I should input normalised data (which should be the logcounts slot, i.e. the data slot in the Seurat object) into gsvaParam. Is it taking the normalised data?


Try:

gbmPar <- gsvaParam(SingleCellExperiment, gs, maxDiff=FALSE, assay="logcounts")

If you do not specify an assay, it will by default use the first one, i.e. assays(SingleCellExperiment)[[1]].
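To see which assay would be picked up by default, it can help to inspect the assay names and compare the first assay with the one selected by name. The toy object below is a hypothetical stand-in for the real data:

```r
library(SingleCellExperiment)

# Toy object with two assays, in the order counts, logcounts
counts <- matrix(c(0, 3, 1, 7, 2, 0), nrow = 2,
                 dimnames = list(c("g1", "g2"), c("c1", "c2", "c3")))
sce <- SingleCellExperiment(
  assays = list(counts = counts, logcounts = log2(counts + 1))
)

assayNames(sce)                 # order of the assays in the object

first <- assays(sce)[[1]]       # what "the first assay" refers to
norm  <- assay(sce, "logcounts")  # explicit selection by name

all.equal(first, counts)        # here the first assay is the raw counts
```

In an object laid out like this one, relying on the default would pass raw counts rather than normalized values, which is why naming the assay explicitly is the safer choice.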


Thank you. I tried this but get the following warning, even though I have a logcounts assay:

Warning message:
In gsvaParam(T3, gs, maxDiff = FALSE, assay = "logcounts") :
  argument assay='logcounts' ignored since exprData has no assayNames()


Oops, sorry, that's unexpected... but I haven't tested on objects from Seurat::as.SingleCellExperiment() so far...

Could you please let me know the output of:

assayNames(T3)

and

sessionInfo()

And is that data set publicly available so I could use it for testing? Thanks!


assayNames(T3)
[1] "counts"    "logcounts"

 sessionInfo()
R version 4.3.3 (2024-02-29 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default


locale:
[1] LC_COLLATE=English_United Kingdom.utf8  LC_CTYPE=English_United Kingdom.utf8   
[3] LC_MONETARY=English_United Kingdom.utf8 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.utf8    

time zone: Europe/Madrid
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] RColorBrewer_1.1-3          GSVA_1.50.2                 BiocManager_1.30.22        
 [4] SingleCellExperiment_1.24.0 SummarizedExperiment_1.32.0 Biobase_2.62.0             
 [7] GenomicRanges_1.54.1        GenomeInfoDb_1.38.8         IRanges_2.36.0             
[10] S4Vectors_0.40.2            BiocGenerics_0.48.1         MatrixGenerics_1.14.0      
[13] matrixStats_1.2.0          

loaded via a namespace (and not attached):
 [1] SparseArray_1.2.4         bitops_1.0-7              RSQLite_2.3.6             lattice_0.22-5           
 [5] sparseMatrixStats_1.14.0  grid_4.3.3                fastmap_1.1.1             blob_1.2.4               
 [9] Matrix_1.6-5              GSEABase_1.64.0           AnnotationDbi_1.64.1      graph_1.80.0             
[13] DBI_1.2.2                 httr_1.4.7                HDF5Array_1.30.1          XML_3.99-0.16.1          
[17] Biostrings_2.70.3         codetools_0.2-19          abind_1.4-5               cli_3.6.2                
[21] rlang_1.1.3               crayon_1.5.2              XVector_0.42.0            bit64_4.0.5              
[25] cachem_1.0.8              DelayedArray_0.28.0       S4Arrays_1.2.1            beachmat_2.18.1          
[29] tools_4.3.3               parallel_4.3.3            BiocParallel_1.36.0       annotate_1.80.0          
[33] memoise_2.0.1             Rhdf5lib_1.24.2           GenomeInfoDbData_1.2.11   rsvd_1.0.5               
[37] vctrs_0.6.5               R6_2.5.1                  png_0.1-8                 rhdf5_2.46.1             
[41] zlibbioc_1.48.2           KEGGREST_1.42.0           BiocSingular_1.18.0       bit_4.0.5                
[45] irlba_2.3.5.1             ScaledMatrix_1.10.0       Rcpp_1.0.12               rhdf5filters_1.14.1      
[49] xtable_1.8-4              DelayedMatrixStats_1.24.0 compiler_4.3.3            RCurl_1.98-1.14  

The dataset is not publicly available.

Thanks a lot!


Sorry again -- improved support for sparse matrices and SingleCellExperiment is currently under development and hence a bit of a moving target. I have found that this issue is already fixed in what is currently the development version of GSVA but not in the release (1.50.2) that you are using. We have just uploaded the fix as version 1.50.3 and it is now on its way to the Bioconductor release.

If you want to use it immediately, you can try installing it from GitHub using:

install.packages("remotes")
library(remotes)
install_github("rcastelo/GSVA", ref="RELEASE_3_18")

and let us know if it still doesn't work for you.


Thanks, I will try it ASAP. Do you think using counts or logcounts for 1 sample (one individual) will change the gsva results much?


The GSVA method expects input data to be normalized and needs at least 10 samples. If you just have one sample, you may try using the ssGSEA method.


Okay, but the results I am seeing are already what I would expect (I am using it to double-check some results from copykat). Is there a manual for the ssGSEA method? Thanks again for all your support today.


You should ask in the official support mailing list for ssGSEA, which I believe is this Google Group. Please search first to see whether they have already answered your question, and try to be very precise about what you want to calculate.

This thread is now too long and became off-topic with respect to your original question, so if you have further questions about GSVA please write a new post and make sure the term GSVA is included as a tag. Thanks for using GSVA!


Yes, I have.


Maybe take a look at Seurat::as.SingleCellExperiment()
