I am planning to perform WGCNA analysis on my transcriptomics data, which includes 200 samples and approximately 49,000 genes. The primary goal is to investigate the associations between gene modules and disease status. However, I have encountered some confusion after reviewing several forum posts, and I would like to confirm a few points regarding the workflow.
Through preliminary analysis (PCA plots), I have observed that gender appears to have an effect on the transcriptomic data. Given this, I would like to ask whether it would make sense to first normalize the count data using DESeq2 and VST before performing WGCNA. Specifically, I intend to use the following code:
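library(DESeq2)
library(WGCNA)

# Intercept-only design: gender is not modelled at this stage
dds <- DESeqDataSetFromMatrix(countData = raw_counts,
                              colData = metadata,
                              design = ~ 1)
dds <- DESeq(dds)
# VST-transformed matrix, transposed so that samples are rows and genes are columns
exposures_Normalized <- t(assay(vst(dds)))
# Signed network built with biweight midcorrelation
modules_data <- blockwiseModules(exposures_Normalized,
                                 power = 8,
                                 networkType = "signed",
                                 TOMType = "signed",
                                 corType = "bicor")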
After performing clustering, I plan to subset the data by gender and examine the correlation between the resulting gene modules and disease traits. I would appreciate any feedback on whether this approach is appropriate or if there are better strategies for handling gender effects in WGCNA.
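To make the per-gender step concrete, here is a rough sketch of what I have in mind. It is only an illustration under assumptions: `MEs` stands in for the module eigengenes (e.g. `modules_data$MEs`), the `Gender` and `Disease` column names are hypothetical, the helper `module_trait_by_gender` is made up for this post, and the toy data are simulated purely so the snippet runs on its own.

```r
# Simulated stand-ins (NOT real data) so the snippet is self-contained
set.seed(1)
n <- 40
metadata <- data.frame(
  Gender  = rep(c("F", "M"), each = n / 2),  # assumed column name
  Disease = rep(c(0, 1), times = n / 2)      # assumed column name, 0/1 coded
)
# Hypothetical module eigengenes: samples in rows, modules in columns
MEs <- data.frame(MEblue = rnorm(n), MEbrown = rnorm(n))

# Correlate each module eigengene with disease status within one gender
module_trait_by_gender <- function(MEs, metadata, gender) {
  keep <- metadata$Gender == gender
  sapply(MEs[keep, , drop = FALSE], function(me) {
    ct <- cor.test(me, metadata$Disease[keep])  # Pearson correlation test
    c(cor = unname(ct$estimate), p = ct$p.value)
  })
}

res_F <- module_trait_by_gender(MEs, metadata, "F")
res_M <- module_trait_by_gender(MEs, metadata, "M")
```

Each result is a small matrix with one column per module and two rows (`cor` and `p`), so the female and male results can be compared side by side.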
Alternatively, should I include Gender in the DESeq2 design instead of using ~1?
Thank you for your time and insights!