Hello.
I am trying to calculate per-cell TPM for all the genes in my single-cell dataset in order to find cells that might ectopically express genes that are not supposed to be there.
So far, I have found the gene lengths and divided my count matrix by them to get the RPK.
My plan is to read the RPK back into the sce object as an assay named "rpk", then calculate the TPM from the "rpk" assay and read the TPM back in as a TPM assay.
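To make the plan concrete, here is a minimal sketch of the calculation on a toy sparse matrix. The object names (counts, gene_length_bp) and all the numbers are made up for illustration, and it assumes gene lengths in base pairs given in the same order as the rows of the count matrix. It only needs the Matrix package and keeps everything sparse.

```r
library(Matrix)

# Toy stand-ins (hypothetical): 3 genes x 4 cells, made-up gene lengths in bp.
counts <- Matrix(c(10,  0, 5, 0,
                    0, 20, 0, 4,
                   30,  0, 0, 6),
                 nrow = 3, byrow = TRUE, sparse = TRUE)
gene_length_bp <- c(1000, 2000, 3000)

# RPK: scale each gene (row) by its own length in kilobases.
# Left-multiplying by a diagonal matrix rescales rows and keeps the result sparse.
rpk <- Diagonal(x = 1000 / gene_length_bp) %*% counts

# TPM: scale each cell (column) so that its RPK values sum to one million.
# Cells with no counts get a scale factor of 0 to avoid dividing by zero.
rpk_total <- colSums(rpk)
cell_scale <- ifelse(rpk_total > 0, 1e6 / rpk_total, 0)
tpm <- rpk %*% Diagonal(x = cell_scale)
```

Because the per-gene and per-cell scaling are both diagonal multiplications, the same two steps could in principle be applied to each half of the data separately, since the TPM of a cell depends only on that cell's own column.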
This has been difficult because my files are so large.
Right now, because I have ~145,000 cells, I have two separate RPK files, since R cannot accommodate a dgCMatrix that large. I am trying to do the following to add the RPK dgCMatrix objects to the sce as an assay.
This is what I tried:
assay(sce,i="rpk")[,1:72573] <- first_half_rpk
assay(sce,i="rpk")[,72574:145146] <- second_half_rpk
This is the result:
'assay(<SingleCellExperiment>, i="character", ...)' invalid subscript 'i'
'rpk' not in names(assays(<SingleCellExperiment>))
I understand that 'rpk' is not yet in names(assays(<SingleCellExperiment>)), but I've looked in several places, and this is the code I found for adding assay information. I also looked for examples of other people adding assays, but most of what I found involved applying one uniform calculation across all the count data, which is not how TPM is calculated, since the gene length is different for every gene. I have seen tpm used as an assay in other contexts, so I know that it is possible. I searched the other questions on the forum but had trouble finding anything that relates directly to this. Thanks!