WGCNA package : identification of hub genes
bharata1803 ▴ 60

I have a research problem that I want to solve. Basically, I want to find the important genes in each module generated by the WGCNA algorithm.

I define important genes as hub genes, that is, the genes with the highest connectivity. From some papers I have read, the calculation is to sum the edge weights of each node, sort the nodes from the highest sum to the lowest, and select the top 1%, 5%, or 10%.
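To make the calculation concrete, here is a minimal sketch of what I have in mind. It uses base R and random toy data instead of the tutorial data, and it assumes an unsigned network where the edge weight is |cor|^power. The names `connectivity`, `topFraction`, `nHub` and `hubGenes` are my own and not part of WGCNA.

```r
# Toy expression data: 20 samples (rows) x 50 genes (columns), purely illustrative
set.seed(1)
datExpr <- matrix(rnorm(20 * 50), nrow = 20,
                  dimnames = list(NULL, paste0("gene", 1:50)))

softPower <- 6

# Assumed unsigned adjacency: a_ij = |cor(x_i, x_j)|^softPower
adj <- abs(cor(datExpr))^softPower
diag(adj) <- 0  # do not count a gene's connection to itself

# Weighted connectivity: sum of the edge weights of each gene
connectivity <- rowSums(adj)

# Sort from highest to lowest and keep the top 10% (assumed fraction)
topFraction <- 0.10
nHub <- ceiling(topFraction * length(connectivity))
hubGenes <- names(sort(connectivity, decreasing = TRUE))[seq_len(nHub)]
hubGenes
```

With the real data, `adj` would presumably be replaced by the matrix returned by `adjacency(datExpr, power = softPower)`.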

Once I have the list of important genes, I need to find the modules in the network generated by WGCNA and map the hub genes to those modules. That way, I will have the modules and the important genes for each module.
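The mapping step could look roughly like the sketch below, again in base R with toy data. The vector `moduleColors` is a hypothetical stand-in for the module labels that WGCNA would produce, and ranking genes inside each module (`hubsWithinModule`) is only an alternative I am considering, not something taken from the tutorial.

```r
# Toy data: 20 samples x 60 genes, purely illustrative
set.seed(2)
datExpr <- matrix(rnorm(20 * 60), nrow = 20,
                  dimnames = list(NULL, paste0("gene", 1:60)))
adj <- abs(cor(datExpr))^6
diag(adj) <- 0

# Hypothetical module assignment: one label per gene
moduleColors <- rep(c("blue", "brown", "turquoise"), each = 20)
names(moduleColors) <- colnames(datExpr)

# Network-wide hub genes (top 10% by summed edge weight)
topFraction <- 0.10
connectivity <- rowSums(adj)
nHub <- ceiling(topFraction * length(connectivity))
hubGenes <- names(sort(connectivity, decreasing = TRUE))[seq_len(nHub)]

# Map the hub genes to their modules
hubsByModule <- split(hubGenes, moduleColors[hubGenes])

# Alternative: rank genes inside each module, using only within-module edges
hubsWithinModule <- lapply(split(names(moduleColors), moduleColors), function(genes) {
  kWithin <- rowSums(adj[genes, genes, drop = FALSE])
  n <- ceiling(topFraction * length(genes))
  names(sort(kWithin, decreasing = TRUE))[seq_len(n)]
})
hubsByModule
hubsWithinModule
```

The first approach can leave a module without any hub gene, while the second always returns some genes for every module.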

To do that, I am working through the basic WGCNA tutorials, because I am new to this package. I followed the tutorials at https://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/.

In tutorial 2b, I have followed the steps up to the calculation of the Topological Overlap Matrix (TOM). Below is the code:

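library(WGCNA)

# Soft-thresholding power
softPower = 6;
# Adjacency matrix computed from the expression data
adjacency = adjacency(datExpr, power = softPower);
# Topological overlap matrix and the corresponding dissimilarity
TOM = TOMsimilarity(adjacency);
dissTOM = 1-TOM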

It seems that this is the part where the network is generated, in the form of an adjacency matrix.

My questions are:

1. Which matrix should be used to calculate hub genes: adjacency or dissTOM? I noticed that the dissTOM matrix contains values above 0.99. Is that expected?

2. Is it better to first apply a cutoff that decides whether two genes are connected, before calculating the weight sum used to determine hub genes? If a pair of genes has a weight below the cutoff, I set it to 0; otherwise I set it to 1. That way, I only need to count the 1s for each gene to determine the hub genes.
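For question 2, this is a minimal sketch of the thresholded variant next to the weighted one, in base R with toy data. The `cutoff` value of 0.01 is an arbitrary assumption for illustration, and `binaryAdj`, `degree` and `weighted` are my own names.

```r
# Toy data: 20 samples x 50 genes, purely illustrative
set.seed(3)
datExpr <- matrix(rnorm(20 * 50), nrow = 20,
                  dimnames = list(NULL, paste0("gene", 1:50)))
adj <- abs(cor(datExpr))^6
diag(adj) <- 0

# Hard threshold: weights below the cutoff become 0, all others become 1
cutoff <- 0.01  # arbitrary value, chosen only for this example
binaryAdj <- (adj >= cutoff) * 1

# Unweighted degree: number of 1s per gene
degree <- rowSums(binaryAdj)

# Weighted connectivity for comparison: sum of the original edge weights
weighted <- rowSums(adj)

# Top candidates under each definition
head(sort(degree, decreasing = TRUE))
head(sort(weighted, decreasing = TRUE))
```

The two rankings do not have to agree, because the thresholded version treats every edge above the cutoff as equally strong.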

Thank you very much.

wgcna • 2.6k views