Hello,
I recently used WGCNA to analyze a 15-sample set (7 cases and 8 controls), and it appears to have worked swimmingly. The analysis yielded a module highly correlated with disease status (0.7), and the GO results are highly consistent with previous literature for this disease. Looking forward to in vitro validation experiments!
Looking back on the theory of WGCNA, I am unclear as to why we need to raise the similarity matrix to a power to approximate scale-free topology.
I see in the WGCNA manual that raising the similarity matrix to a power is useful because raw data is noisy and studies often have a limited sample size. I also see in the 2005 paper a discussion of why soft thresholding is better than hard thresholding (it avoids the loss of information and the arbitrary choice of a threshold for deciding whether a pair of genes is connected).
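To make the hard versus soft distinction concrete, here is a minimal Python sketch (not WGCNA's own R code). The function names, the toy similarity values, the cutoff `tau = 0.80`, and the power `beta = 6` are all assumptions chosen for illustration, and zeroing the diagonal is just a convenience so that self-connections do not count.

```python
import numpy as np

def hard_threshold_adjacency(similarity, tau):
    """Unweighted network: a pair is connected only if similarity >= tau."""
    adjacency = (similarity >= tau).astype(float)
    np.fill_diagonal(adjacency, 0.0)  # ignore self-connections (illustrative choice)
    return adjacency

def soft_threshold_adjacency(similarity, beta):
    """Weighted network: connection strength is similarity raised to beta."""
    adjacency = similarity ** beta
    np.fill_diagonal(adjacency, 0.0)  # ignore self-connections (illustrative choice)
    return adjacency

# Toy absolute-correlation matrix for three genes (made-up values).
similarity = np.array([
    [1.00, 0.79, 0.81],
    [0.79, 1.00, 0.30],
    [0.81, 0.30, 1.00],
])

hard = hard_threshold_adjacency(similarity, tau=0.80)
soft = soft_threshold_adjacency(similarity, beta=6)

print(hard)
print(np.round(soft, 4))
```

With the hard cutoff, the 0.79 pair is dropped entirely while the 0.81 pair is kept, even though the two correlations are nearly identical. With the power transform, both pairs keep similar weights and only the weak 0.30 pair is pushed close to zero.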
I see that the mean connectivity for my dataset is quite high if I raise the similarity matrix to a power of 1 (that is, leave it as is). As I understand it, this would violate scale-free topology, which is characterized by few nodes with high connectivity and many nodes with low connectivity.
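The effect of the power on mean connectivity can be explored on simulated data. The sketch below is a rough Python illustration, not a reimplementation of WGCNA's `pickSoftThreshold`: the simulated expression matrix (15 samples, 200 genes, one 40-gene module), the 10-bin histogram, and the helper names are all assumptions, and the fit is a plain least-squares line of log10 p(k) against log10 k.

```python
import numpy as np

def connectivity(similarity, beta):
    """Per-gene connectivity: row sums of similarity**beta, excluding self."""
    adjacency = similarity ** beta
    np.fill_diagonal(adjacency, 0.0)
    return adjacency.sum(axis=1)

def scale_free_fit(k, n_bins=10):
    """R^2 and slope of log10 p(k) regressed on log10 k (simplified binning)."""
    counts, edges = np.histogram(k, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = (counts > 0) & (centers > 0)  # empty bins have no defined log
    log_k = np.log10(centers[keep])
    log_p = np.log10(counts[keep] / counts.sum())
    slope, intercept = np.polyfit(log_k, log_p, 1)
    residual = log_p - (slope * log_k + intercept)
    total = log_p - log_p.mean()
    r_squared = 1.0 - (residual ** 2).sum() / (total ** 2).sum()
    return r_squared, slope

# Simulated data (hypothetical): 15 samples, 200 genes, one co-expressed module.
rng = np.random.default_rng(0)
expression = rng.normal(size=(15, 200))
expression[:, :40] += 2.0 * rng.normal(size=(15, 1))  # shared signal for 40 genes

# Absolute Pearson correlation between genes (columns are genes).
similarity = np.abs(np.corrcoef(expression, rowvar=False))

results = []
for beta in (1, 2, 4, 6, 8, 12):
    k = connectivity(similarity, beta)
    r_squared, slope = scale_free_fit(k)
    results.append((beta, k.mean(), r_squared, slope))
    print(f"beta={beta:2d}  mean k={k.mean():8.3f}  R^2={r_squared:6.3f}  slope={slope:6.2f}")
```

Because every absolute correlation lies between 0 and 1, mean connectivity can only fall as the power increases. With only 15 samples, even unrelated genes show sizeable chance correlations, which is one reason the power-1 network looks so densely connected.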
However, if metabolic networks display scale-free topology, why do we need to transform the similarity matrix by raising it to a power at all, rather than the raw data reflecting that topology on its own?
I feel I am missing something.
Thanks!
Dave Brohawn
Does anybody have an updated link to the FAQ page Peter referenced here?
Sorry, there is no permanent home for the tutorials yet, but they are provisionally available as a Dropbox download at this link:
https://www.dropbox.com/scl/fo/4vqfiysan6rlurfo2pbnk/h?rlkey=3za224679kv8ulbpl5s9mwrkt&dl=0
Thanks Peter!