How to aggregate pseudobulks: Normalization & Log-Transformation
Tadeoye ▴ 20
@98d490f8
Last seen 8 months ago
United States

I am working on a single-cell data analysis project and need to aggregate single-cell data into pseudobulks as input for GSVA. GSVA only accepts a gene × subject matrix, so the cells of each subject have to be collapsed into a single profile first. I have come across two different approaches to this aggregation and am unsure which one to use.

In a recent paper by Blanchard et al., the data were normalized and log-transformed before aggregation: the authors first computed averages of the normalized gene expression profiles using ACTIONet, and from these obtained aggregated expression profiles for each individual and cell type. On the other hand, a single-cell tutorial suggests aggregating the raw counts first and then normalizing and log-transforming the result. The order matters to me because the Gaussian kernel I intend to use in GSVA expects continuous expression data on a logarithmic scale, such as RNA-seq log-CPMs, log-RPKMs, or log-TPMs.
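To make the two orderings concrete, here is a minimal sketch on a made-up toy matrix. It assumes a plain log2-CPM transform with a prior count of 1 as the normalization; the function names (`log_cpm`, `aggregate_then_normalize`, `normalize_then_aggregate`) and the counts are hypothetical, and this is neither ACTIONet's procedure nor the tutorial's exact code.

```python
import numpy as np

def log_cpm(counts, prior_count=1.0):
    """log2 counts-per-million for a genes x columns matrix (assumed normalization)."""
    lib_size = counts.sum(axis=0, keepdims=True)
    return np.log2(counts / lib_size * 1e6 + prior_count)

def aggregate_then_normalize(counts, subject_ids):
    """Sum raw counts over each subject's cells, then log-CPM the pseudobulk."""
    subjects = np.unique(subject_ids)
    pseudobulk = np.column_stack(
        [counts[:, subject_ids == s].sum(axis=1) for s in subjects]
    )
    return subjects, log_cpm(pseudobulk)

def normalize_then_aggregate(counts, subject_ids):
    """log-CPM every cell first, then average the log values per subject."""
    subjects = np.unique(subject_ids)
    per_cell = log_cpm(counts)
    averaged = np.column_stack(
        [per_cell[:, subject_ids == s].mean(axis=1) for s in subjects]
    )
    return subjects, averaged

# Toy data: 3 genes (rows) x 6 cells (columns), 3 cells per subject.
counts = np.array(
    [
        [0, 3, 1, 0, 5, 2],
        [10, 0, 4, 7, 0, 1],
        [2, 2, 0, 1, 3, 0],
    ],
    dtype=float,
)
subject_ids = np.array(["A", "A", "A", "B", "B", "B"])

subjects, sum_first = aggregate_then_normalize(counts, subject_ids)
_, log_first = normalize_then_aggregate(counts, subject_ids)

print("subjects:", subjects)
print("aggregate, then log-CPM:\n", sum_first)
print("log-CPM, then average:\n", log_first)
```

Both functions return a gene × subject matrix on a log scale, which is the shape GSVA needs, but the values generally differ: the logarithm of a sum is not the mean of the logarithms, and cells with zero counts for a gene pull the per-cell average down.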

Which approach should I take: normalize and log-transform the data before aggregation, or aggregate the raw counts before normalization? I would greatly appreciate any guidance or insights on this matter.

pseudobulk scRNAseq GSVA