I have a SingleCellExperiment object, and no matter what I do, when I run `normalize(filtered.sce)` I get the error: `size factors should be positive real numbers`.
It is my understanding that even though `computeSumFactors()` coerces size factors to positive values by default if necessary, that does not guarantee `normalize()` will run without error.
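For reference, here is a sketch of the behaviour I mean, assuming a scran version where `computeSumFactors()` exposes a `positive=` argument (the default may differ between versions):

```r
# Ask the deconvolution method to coerce any negative size factor
# estimates to positive values (behaviour depends on the scran version).
filtered.sce <- computeSumFactors(filtered.sce, positive=TRUE)
summary(sizeFactors(filtered.sce))
```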
I have done many things to my pancreas dataset (Segerstolpe et al., 2016) in terms of QC after starting with the 1308 high-quality cells specified in the metadata. Nothing seems to be working:
`libsize.drop <- isOutlier(sce$total_counts, nmads=3, type="lower", log=TRUE)`
`feature.drop <- isOutlier(sce$total_features_by_counts, nmads=3, type="lower", log=TRUE)`
`spike.drop <- isOutlier(sce$pct_counts_ERCC, nmads=3, type="higher")`
- Together, these three methods removed 62, 73, and 143 cells, respectively, from the original 1308. This seems to be a lot.
- After defining `ave.raw.counts <- calcAverage(sce, use_size_factors=FALSE)`, I reduced the `sce` object down to the genes with `ave.raw.counts >= 1`, which is about 14,000 of the original 25,000 genes.
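For clarity, this is roughly how I combined the cell and gene filters above (a sketch using the variables defined earlier):

```r
# Drop cells flagged as outliers on any of the three QC metrics,
# then drop low-abundance genes.
keep.cells <- !(libsize.drop | feature.drop | spike.drop)
keep.genes <- ave.raw.counts >= 1
filtered.sce <- sce[keep.genes, keep.cells]
```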
When running `filtered.sce <- computeSumFactors(filtered.sce)`, it completes WITHOUT any warning about negative size factor estimates.
However, when running the following two commands, I get a warning and then an error:
`filtered.sce <- computeSpikeFactors(filtered.sce, type="ERCC", general.use=FALSE)`
- Warning message: zero spike-in counts during spike-in normalization
`filtered.sce <- normalize(filtered.sce)`
- Error in .local(object, ...): size factors should be positive real numbers
I even tried filtering by `keep <- ave.raw.counts >= 50` just to see if there was any way I could get it to work, but my final error during normalization was still `size factors should be positive real numbers`.
I would appreciate any help as to why this may be happening. I can also provide any more information that is required. Thank you so much.
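In case it helps with diagnosis, here is what I can check on my end (a sketch, assuming the older scater/scran API used above, where `sizeFactors()` takes a spike-in type):

```r
# Deconvolution size factors from computeSumFactors():
summary(sizeFactors(filtered.sce))

# Spike-in size factors from computeSpikeFactors(); any zeros here
# (cells with zero counts across all ERCC spike-ins) would explain
# the "size factors should be positive real numbers" error.
summary(sizeFactors(filtered.sce, "ERCC"))
table(sizeFactors(filtered.sce, "ERCC") == 0)
```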
This is extremely helpful, thank you so much. I've also batched `isOutlier()` by individual now. Had a quick question: does "remove cells with zero spike-in size factors" mean "remove cells whose read count for every spike-in is zero"?
If so, I was looking at the code you linked to. Your
`for.hvg <- sce.emtab[,sizeFactors(sce.emtab, "ERCC") > 0 & sce.emtab$Donor!="AZ"]`
line seems to be accomplishing this? Doing the same with my sce object (specifically, `filtered.sce.spike <- filtered.sce[,sizeFactors(filtered.sce,"ERCC") > 0]`) results in `filtered.sce.spike` having zero columns (zero cells). I had defined 72 spike-ins earlier. Am I missing something simple here? Perhaps there is a way I need to denote spike-ins that I have not done properly?

Yes.
You probably filtered them out in your `calcAverage` filtering step. I would suggest not filtering explicitly, but rather using `subset.row` to filter within each function as needed. See comments here.

Hi Aaron,
I am having a similar issue with CITE-Seq data. I have one control "Ig" and I am trying to perform control-based normalization as suggested in the OSCA book. When I do the following:
`controls <- grep("Ig", rownames(altExp(sce)))`
`sf.control <- librarySizeFactors(altExp(sce), subset_row=controls)`
`sce <- logNormCounts(sce, use.altexps=TRUE)`
I get the error, since ~2000 cells have zero counts for the control antibody:
`summary(sf.control)`
`Min. 1st Qu. Median Mean 3rd Qu. Max.`
`0.0000 0.4953 0.9906 1.0000 1.4859 6.4389`
I saw in the OSCA book that the control size factors are also zero for some cells. My question is: how should I use control-based normalization then? I calculated the median size factors and made a scatter plot against the control factors, and they don't correlate at all.
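For reference, this is roughly how I made that comparison (a sketch; assumes scater's `medianSizeFactors()` and the `sf.control` vector computed above):

```r
# Compare median-based size factors against the control-based ones;
# cells with zero control counts are dropped before the log-log plot.
sf.med <- medianSizeFactors(altExp(sce))
nonzero <- sf.control > 0
plot(sf.med[nonzero], sf.control[nonzero], log="xy",
     xlab="Median-based size factors", ylab="Control-based size factors")
```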