Hi,
I created a RangedSummarizedExperiment object with the create_rse() function of the recount3 package (project == "LIHC" & file_source == "tcga"). As a next step, I would like to extract the read counts for all available genes and use them for normalization and transformation with the estimateSizeFactors() and vst() functions of DESeq2.
If I understand correctly, after calling create_rse() I only have raw base-pair coverage counts stored in the "raw_counts" assay, which is not what I need.
The quick start guide says:
Using transform_counts() you can scale the counts and assign them to the “counts” assays slot to use them in downstream packages such as DESeq2 and limma.
So it looks like transform_counts() is the function that gives me the gene read counts I need for my further analyses. But what about the compute_read_counts() function? What is the difference between transform_counts() and compute_read_counts()?
Thanks in advance.
Mario
Hi Leo,
thanks for the clarification. As I continued to work with the counts generated by transform_counts(), I was wondering whether I need to repeat all analyses with the counts generated by compute_read_counts(), or whether the results would be the same in the end.
If I understand it correctly, the counts generated by transform_counts() are already normalized for sequencing depth, while the counts from compute_read_counts() are not. So if I normalize both the scaled (transform_counts) and the unscaled (compute_read_counts) counts with the DESeq2 estimateSizeFactors() function, the size factors for the scaled counts should not need to account for any differences in sequencing depth, while the size factors for the unscaled counts should. Shouldn't I then expect the normalized counts from the two approaches to be roughly the same? I hope you get my idea.
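To make the idea concrete, here is a small base R sketch on made-up numbers. It assumes that both count matrices are just per-sample rescalings of the same underlying matrix (that is my reading, not something I have verified in the recount3 code), and median_of_ratios() is my own simplified re-implementation of the median-of-ratios idea behind estimateSizeFactors(), not the DESeq2 function itself. All variable names and values are hypothetical.

```r
# Toy check in base R (no DESeq2 or recount3 needed); all numbers are made up.
set.seed(42)
n_genes <- 500
n_samples <- 6

# Hypothetical "unscaled" counts: gene-level signal times per-sample depth,
# with some multiplicative noise. Everything is strictly positive.
gene_level <- rgamma(n_genes, shape = 2, rate = 0.01) + 1
depth <- c(0.5, 0.8, 1, 1.5, 2, 3)
noise <- matrix(rlnorm(n_genes * n_samples, sdlog = 0.3), n_genes, n_samples)
unscaled <- outer(gene_level, depth) * noise

# Hypothetical "scaled" counts: each sample rescaled to the same column total.
target <- 4e7
scaled <- sweep(unscaled, 2, target / colSums(unscaled), "*")

# Median-of-ratios size factors (simplified; assumes no zero counts).
median_of_ratios <- function(counts) {
  log_geo_means <- rowMeans(log(counts))
  apply(counts, 2, function(col) exp(median(log(col) - log_geo_means)))
}

sf_unscaled <- median_of_ratios(unscaled)
sf_scaled <- median_of_ratios(scaled)

norm_unscaled <- sweep(unscaled, 2, sf_unscaled, "/")
norm_scaled <- sweep(scaled, 2, sf_scaled, "/")

# If the per-sample scaling is fully absorbed by the size factors, the two
# normalized matrices differ only by a single constant factor.
ratio <- norm_scaled / norm_unscaled
range(ratio)
```

In this toy setting the minimum and maximum of ratio coincide (up to floating-point error), so the two normalized matrices differ only by one global constant. With real data I would expect small deviations, for example if the counts are rounded before normalization.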
My overall aim is to obtain DESeq2-normalized, variance-stabilized (vst) expression estimates for correlation analyses.
Best Mario