clustering RNA-Seq data and performing gene set enrichment analysis


Julie Leonard ▴ 110

@julie-leonard-5222

Last seen 10.2 years ago

1) Is RNA-Seq data even appropriate for "standard" cluster analysis, given its discrete nature? What normalization should be done beforehand? We tend to perform length and TMM normalization of our data.

2) If we cluster RNA-Seq data, take a gene list from a cluster (e.g. all genes in that cluster), and then want to perform gene set enrichment analysis on that list, is Fisher's Exact Test by itself OK, or do we need to account for gene length (e.g. use goseq)?

I know that RNA-Seq data has the bias that longer genes tend to be called differentially expressed more often, due to greater statistical power. Is the issue here that longer genes yield more reads, which gives lower variance and therefore higher power to detect differences? I am wondering whether this difference in variance between long and short genes would also affect the results of clustering.

Thanks,
-Julie
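To make question 2 concrete, here is a minimal sketch of what "just using Fisher's Exact Test" on a cluster's gene list amounts to: a one-sided over-representation test, computed as a hypergeometric upper tail. It is written in Python with only the standard library; the function name `fisher_enrichment_pvalue` and the gene identifiers are hypothetical and purely illustrative. Note that this plain test treats every gene as equally likely to land in the cluster, so it makes no correction for gene length, which is exactly the concern raised above.

```python
from math import comb


def fisher_enrichment_pvalue(n_universe, n_in_set, n_in_cluster, n_overlap):
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= n_overlap), where X is hypergeometric: the number of
    gene-set members seen when n_in_cluster genes are drawn without
    replacement from a universe of n_universe genes, n_in_set of which
    belong to the gene set. Gene length is NOT taken into account.
    """
    max_overlap = min(n_in_set, n_in_cluster)
    total = comb(n_universe, n_in_cluster)
    # math.comb returns 0 when k > n, so impossible tables contribute nothing.
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_in_cluster - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy data: 10 expressed genes, a 5-gene annotation set,
# and a 5-gene cluster that shares 4 genes with the set.
universe = {f"gene{i}" for i in range(1, 11)}
gene_set = {"gene1", "gene2", "gene3", "gene4", "gene5"}
cluster = {"gene1", "gene2", "gene3", "gene4", "gene6"}

p_value = fisher_enrichment_pvalue(
    n_universe=len(universe),
    n_in_set=len(gene_set & universe),
    n_in_cluster=len(cluster & universe),
    n_overlap=len(gene_set & cluster),
)
print(f"over-representation p-value: {p_value:.4f}")
```

The universe here is assumed to be the set of genes that went into the clustering, not the whole genome; the choice of background changes the p-value regardless of whether a length correction is applied.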

Normalization Clustering • 1.2k views

12.6 years ago Julie Leonard ▴ 110


Did you ever get an answer?

9.2 years ago nickbern92 • 0
