I am trying to run a WGCNA analysis on microarray data. The raw data were processed as "MAS5.0 Scaled 100 generated expression value". For the preprocessing step, I am seeing three kinds of correction:
- Filter genes
- Quantile normalization
- Outlier sample removal
So my confusion is: in which order should these be performed, and what is the rationale behind that order?
A brief description of each step:
The first is quantile normalization of the samples using the normalizeBetweenArrays() function of the limma package, because the density plot of each sample, drawn with limma::plotDensities(), shows intra-group variation. This figure shows the density plot of samples.
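To make the normalization step concrete, here is a minimal base R sketch of what quantile normalization does to a genes-by-samples matrix. It is not the limma implementation: `quantile_normalize` is a hypothetical helper written only for illustration, the toy matrix is simulated, and ties are broken by order of appearance, which may differ from how limma handles them.

```r
# Toy data (simulated, not my real data): 100 genes in rows, 4 samples in columns.
set.seed(1)
expr <- matrix(rnorm(400, mean = 8, sd = 2), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("s", 1:4)))

# Hypothetical helper: force every sample to share the same distribution.
quantile_normalize <- function(x) {
  sorted <- apply(x, 2, sort)                        # sort each sample separately
  target <- rowMeans(sorted)                         # reference distribution
  ranks  <- apply(x, 2, rank, ties.method = "first") # rank of each gene per sample
  out    <- apply(ranks, 2, function(r) target[r])   # map ranks onto the reference
  dimnames(out) <- dimnames(x)
  out
}

expr_qn <- quantile_normalize(expr)

# After normalization, every sample has exactly the same set of values.
same_dist <- apply(expr_qn, 2, function(col) {
  isTRUE(all.equal(sort(col), sort(expr_qn[, 1])))
})
```

Because each sample ends up with an identical distribution, the density curves should overlap completely afterwards, which is the effect I am hoping for in the density plot.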
Outlier samples would be removed using, for example, hierarchical clustering of the samples or PCA. Gene filtering is meant to reduce the number of genes, for example by keeping only those genes whose expression level is greater than the median value in at least two samples, since there are more than 20,000 genes.
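The two steps just described could be sketched in base R roughly as follows. This is only an illustration on simulated data: I am assuming that "the median value" means the overall median of the whole expression matrix, and the object names (`expr`, `cutoff`, `keep`, `sample_tree`) are made up for the example.

```r
# Toy data (simulated): 200 genes in rows, 6 samples in columns.
set.seed(1)
expr <- matrix(rnorm(1200, mean = 8, sd = 2), nrow = 200,
               dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))

# Gene filter: keep genes above the overall median in at least two samples.
# Assumption: the cutoff is the median of all values in the matrix.
cutoff <- median(expr)
keep <- rowSums(expr > cutoff) >= 2
expr_filt <- expr[keep, , drop = FALSE]

# Outlier check 1: hierarchical clustering of samples.
# dist() works on rows, so the matrix is transposed to put samples in rows.
sample_tree <- hclust(dist(t(expr_filt)), method = "average")
plot(sample_tree, main = "Sample clustering")

# Outlier check 2: PCA of samples; an outlier would sit far from the rest.
pca <- prcomp(t(expr_filt), center = TRUE, scale. = FALSE)
pc_scores <- pca$x[, 1:2]
```

Here the filter is applied before the clustering, but that ordering is exactly what I am unsure about.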
I am not interested in the goodSamplesGenes() function of the WGCNA package, because it is not helping: it reports that everything is okay.