Advice on normalization with metagenomics data
David ▴ 860
@david-3335
Last seen 6.7 years ago

Hello,

I'm using the metagenomeSeq package to normalize my 16S data for one experiment with several samples. Below are the log2 boxplots of the data after normalization (I normalized at the genus level). It looks like the normalization has worked pretty well. I also checked the data with qqnorm (second graph below), and in that plot the data do not look very normal. I still see a lot of values close to zero (but > 0); I assume these are basically singletons.

I'm just wondering if this is what you would expect from such metagenomics data, and whether I can apply tests that assume normality (such as ANOVA, for example) to compare my groups, or whether I should stick to the methods suggested in metagenomeSeq for group comparisons. How can I check that my data have been properly normalized? Thanks for your advice.

 

metagenome normalization metagenomics • 2.2k views
@james-w-macdonald-5106
Last seen 2 hours ago
United States

Are you thinking that 'normalize' should make your data normally distributed? If so, that's not the case. All a normalization is intended to do is remove as much technical variability between samples as possible, so you can then compare between samples without picking up uninteresting things about how the data were processed.
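
To make that concrete, here is a minimal sketch in Python with made-up numbers. It uses plain total-sum scaling, which is an assumption chosen for simplicity and is not the cumulative-sum scaling that metagenomeSeq applies; the genus counts, `scale_to_depth`, and the target of 1000 are all hypothetical. Two samples with the same composition but different sequencing depth become comparable after scaling, and nothing about the shape of the distribution changes.

```python
# Illustrative only: total-sum scaling, NOT metagenomeSeq's cumulative-sum scaling.
# Hypothetical genus-level counts for two samples with identical composition,
# where sample_b was sequenced three times as deeply as sample_a.
sample_a = {"Bacteroides": 120, "Prevotella": 60, "Faecalibacterium": 20}
sample_b = {genus: 3 * count for genus, count in sample_a.items()}


def scale_to_depth(counts, target=1000):
    """Rescale counts so that they sum to `target` (hypothetical helper)."""
    total = sum(counts.values())
    return {genus: count * target / total for genus, count in counts.items()}


norm_a = scale_to_depth(sample_a)
norm_b = scale_to_depth(sample_b)

# Raw counts differ by the depth factor; scaled values agree.
for genus in sample_a:
    print(genus, sample_a[genus], sample_b[genus], norm_a[genus], norm_b[genus])
```

The raw columns differ only because of depth, which is the kind of technical variability a normalization is meant to remove.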

In other words, the Q-Q plot that you show is to be expected. Count data are not normally distributed, and ecological count data tend to be zero inflated (meaning you get lots of zeros, which may indicate that the species in question wasn't there, or maybe that it was there, but you just didn't count it). The statistics that metagenomeSeq uses are intended to work correctly, given those limitations of the data, whereas a 'regular' linear model is not.
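
As a rough illustration of why such data fail a normality check, the sketch below simulates zero-inflated counts in Python. Everything in it is an assumption made up for the example: the 50% absence probability, the log-normal abundances, and the helper names `simulate_count` and `skewness`. It is not how metagenomeSeq models the data.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def simulate_count(p_absent=0.5, mu=2.0, sigma=1.0):
    """One hypothetical taxon count: zero with probability p_absent,
    otherwise a right-skewed positive abundance truncated to an integer."""
    if rng.random() < p_absent:
        return 0
    return int(rng.lognormvariate(mu, sigma))


def skewness(values):
    """Third central moment divided by variance**1.5 (0 for symmetric data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


counts = [simulate_count() for _ in range(2000)]

zero_fraction = counts.count(0) / len(counts)
mean_count = sum(counts) / len(counts)
median_count = sorted(counts)[len(counts) // 2]  # middle element of the sorted data

print(f"fraction of zeros: {zero_fraction:.2f}")
print(f"mean: {mean_count:.2f}, median: {median_count}")
print(f"skewness: {skewness(counts):.2f}")
```

For normally distributed data the mean and median would roughly coincide and the skewness would be near zero. Here a large share of the values are exactly zero and the rest form a long right tail, which is the pattern that shows up as curvature in a Q-Q plot.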
 

David ▴ 860
@david-3335
Last seen 6.7 years ago

Thanks James,

Thanks so much for the clarification. I think I understand the meaning of zero inflated now. I had come across the term but needed some clarification. I guess I should move forward with methods that do not assume normality, starting with those that metagenomeSeq provides.

How do you know if the normalization has worked properly?
