Advice on analyzing NGS data for many genes or intergenic peaks across many conditions
Xiaohui Wu ▴ 280
@xiaohui-wu-4141
Last seen 10.3 years ago
Hi all,

I have NGS data from rice: 30 libraries from different conditions, about 60 million tags in total, each tag 20 nt long. I have filtered these down to a set of genes and intergenic regions (both called "peaks" here), about 20,000 peaks in total. So far I have come up with the following ideas:

1) Compute the correlation of peak expression between each pair of libraries (here expression means the normalized tag count).

2) Cluster peaks or libraries based on peak expression, for example with the heatmap function in R.

3) Measure the fluctuation (or deviation) of each peak across the 30 libraries, to find which peaks have consistent expression and which have fluctuating expression.
** Is there an effective way to calculate something like this? Is the standard deviation (sd) or the coefficient of variation (sd/mean) enough?

4) Find DE peaks between each pair of libraries, or between each pair of clusters of libraries, then use GO to compare the functions of the different sets of DE peaks.
** Here I tend toward using clusters of libraries to reduce the number of comparisons. Do you think I can treat the libraries in the same cluster as replicates and then use a DE package like edgeR or DESeq to find DE peaks?

5) Compute relative peak usage among the libraries.
** I have no idea how to calculate this. I think that simply using (expression of the peak in one library) / (total expression of that peak in all libraries) is not suitable here, because some peaks are expressed at a much lower level than others, and the formula does not reflect that.

Any ideas are appreciated. Thank you!

Regards,
Xiaohui
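To make ideas 1, 3 and 5 concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy counts, the peak names, the counts-per-million normalization, and the log2(x + 1) transform before correlation are not taken from the post. The per-peak usage fraction is exactly the ratio described in idea 5; reporting the mean expression next to it is only one possible way to keep the absolute level visible, not an established fix.

```python
import math

# Hypothetical toy data: raw tag counts per peak in three libraries.
raw_counts = {
    "peak_A": [100, 110, 90],
    "peak_B": [10, 200, 50],
    "peak_C": [890, 690, 860],
}

def to_cpm(table):
    """Normalize each library (column) to counts per million."""
    peaks = list(table)
    n_libs = len(table[peaks[0]])
    totals = [sum(table[p][j] for p in peaks) for j in range(n_libs)]
    return {p: [table[p][j] * 1e6 / totals[j] for j in range(n_libs)]
            for p in peaks}

def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def coefficient_of_variation(values):
    """Sample sd divided by the mean; NaN when the mean is zero."""
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        return float("nan")
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return math.sqrt(var) / mean

def relative_usage(values):
    """Fraction of a peak's total expression found in each library."""
    total = sum(values)
    return [v / total if total else 0.0 for v in values]

expr = to_cpm(raw_counts)

# Idea 1: correlation between two libraries on log2(CPM + 1) values.
lib1 = [math.log2(expr[p][0] + 1) for p in expr]
lib2 = [math.log2(expr[p][1] + 1) for p in expr]
r_lib1_lib2 = pearson(lib1, lib2)

# Idea 3: per-peak fluctuation across libraries.
cv = {p: coefficient_of_variation(v) for p, v in expr.items()}

# Idea 5: per-peak usage fractions, kept alongside the mean expression
# so that weakly expressed peaks can still be told apart.
usage = {p: relative_usage(v) for p, v in expr.items()}
mean_cpm = {p: sum(v) / len(v) for p, v in expr.items()}

for p in expr:
    print(p, round(cv[p], 3), [round(u, 3) for u in usage[p]], round(mean_cpm[p]))
```

With real data the same calculations would be run over all 20,000 peaks and 30 libraries; sorting peaks by the coefficient of variation then separates consistent from fluctuating peaks.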
GO edgeR DESeq • 942 views