Normalization of RNA seq data
@isabelpieterse-23295

Currently, I am analysing scRNA-seq data. As far as I know, several normalization methods are available for differential gene expression analysis. In my case, however, I have a predefined set of genes (n=530), and I want to compare the expression of these genes between different cell types (inter-sample comparison).

To my (very basic) knowledge, accounting for the total number of reads is important for between-sample comparisons, so CPM should do the trick. But I do not know whether I should normalize in other ways as well. Accounting for gene length does not seem necessary, since the genes are the same in every condition. The samples can come from different subjects, but all the data are from one dataset.
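For readers who want the arithmetic spelled out, here is a minimal sketch of CPM scaling in plain Python. It is only an illustration: the gene names and counts are made up, and `cpm` is a hypothetical helper, not a function from any package.

```python
def cpm(counts):
    """Scale one cell's raw counts so that they sum to one million."""
    total = sum(counts.values())  # library size: all counts in this cell
    if total == 0:
        raise ValueError("cell has no counts")
    return {gene: c * 1e6 / total for gene, c in counts.items()}


# Two hypothetical cells with the same composition but a 10x depth difference.
shallow = {"GENE_A": 10, "GENE_B": 30, "GENE_C": 60}
deep = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 600}

cpm_shallow = cpm(shallow)
cpm_deep = cpm(deep)

for gene in shallow:
    print(gene, cpm_shallow[gene], cpm_deep[gene])
```

After scaling, both cells report the same value for each gene, which is the depth correction that CPM is meant to provide. It does not address other sources of variation between cells.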

Thank you for your help.

RNA seq cpm • 888 views
Kevin Blighe ★ 4.0k
@kevin
Republic of Ireland

Despite the number of genes, I see no reason not to use a standard package for this process, e.g., Seurat or scran.

Kevin


Thank you so much for your answer! Forgive my ignorance, but I feel like these packages were designed for a fundamentally different question. I'm not doing a DE analysis; I already have a predefined set of genes that I want to compare. So currently I am summing the CPM values of these genes and comparing this summed expression between cell types. But I'm wondering whether I'm overlooking some normalization that I should perform on these counts.
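The summing approach described here could be sketched roughly as follows. This is a toy illustration, not the poster's actual code: the cell-type labels, gene names, counts, and the helper `gene_set_cpm` are all hypothetical, and averaging the per-cell scores within each cell type is an assumption about how the comparison is made.

```python
from collections import defaultdict
from statistics import mean


def gene_set_cpm(counts, gene_set):
    """Summed CPM of the genes in gene_set for a single cell."""
    total = sum(counts.values())  # library size uses ALL genes, not only the set
    in_set = sum(c for gene, c in counts.items() if gene in gene_set)
    return in_set * 1e6 / total


# Hypothetical cells: (cell type label, raw counts per gene).
cells = [
    ("T cell", {"GENE_A": 5, "GENE_B": 15, "GENE_C": 80}),
    ("T cell", {"GENE_A": 10, "GENE_B": 10, "GENE_C": 180}),
    ("B cell", {"GENE_A": 40, "GENE_B": 10, "GENE_C": 50}),
]
gene_set = {"GENE_A", "GENE_B"}  # stands in for the predefined 530 genes

scores = defaultdict(list)
for cell_type, counts in cells:
    scores[cell_type].append(gene_set_cpm(counts, gene_set))

# Assumed summary: mean gene-set score per cell type.
mean_score = {cell_type: mean(values) for cell_type, values in scores.items()}
print(mean_score)
```

Note that the denominator is the cell's total count over all genes. Dividing by the total of the gene set alone would force every cell's score to one million and erase the signal.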


Sure, but allow me to put my answer another way: I see no problem in using a standardised workflow for your data, such as those provided by the scran or Seurat authors. With either, you can normalise your raw UMI counts (if that's what you have?), deal with batch effects, low-count genes, mitochondrial artefacts, etc., and, ultimately, transform the data towards an approximately normal distribution, suitable for whichever parametric downstream statistical test you want to use.

If you are content to just calculate the CPM values manually, then that is fine, but a seasoned reviewer will criticise you for not dealing with the known sources of bias in a scRNA-seq analysis.

How your scRNA-seq wet-lab protocol was conducted is important, as is the count method used.
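As a rough idea of what the "normalise, then transform" step in the reply above can look like, here is a plain-Python sketch of a simple log-normalisation: scale each cell by its total count, multiply by a scale factor, and take log(1 + x). The scale factor of 10,000 and the natural log are assumptions chosen for illustration, and `log_normalize` is a hypothetical helper. This is not the code of Seurat or scran, and it does not reproduce scran's pooled size factors, batch correction, or any filtering.

```python
import math


def log_normalize(counts, scale_factor=10_000):
    """Depth-scale one cell's counts, then apply log(1 + x)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cell has no counts")
    return {
        gene: math.log1p(c / total * scale_factor)
        for gene, c in counts.items()
    }


# Hypothetical cell with one unobserved gene and one highly expressed gene.
cell = {"GENE_A": 0, "GENE_B": 1, "GENE_C": 99}
normalized = log_normalize(cell)
print(normalized)
```

Zero counts stay at zero, while large values are compressed, which is why log-transformed values are usually a more comfortable input for parametric tests than raw CPM.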
