Question

basic stats for methylC-seq type of methylation data



Steve Shen ▴ 330

@steve-shen-3743

Last seen 10.3 years ago

Dear All,

I would really appreciate it if someone could help me with this basic statistical problem or offer some suggestions. I have a set of bisulphite methylation data (methylC-seq) at single-base resolution. The mapping information for each base includes the coverage (2-20X) and, if the base is a C, the frequency of methylation. The summary for a sliding window of, say, 500 bp will be the percentage of methylated C observed over base coverage at each position. For example:

Sample A:

    Index, start, end, strand, methylC_observed, positions, coverage, C_type
    window1, 12297500, 12298000, +, 1/3/0/5/1/2, 12297573/12297779/12297631/12287774/12297854/12297958, 6/5/4/10/7/15, C/CG/CG/CHG/C/C
    window2, ...

Sample B:

    Index, start, end, strand, methylC_observed, positions, coverage, type
    window1, 12297500, 12298000, +, 3/0/3/0/1/0, 12297573/12297779/12297631/12287774/12297854/12297958, 12/9/11/10/3/5, C/CG/CG/CHG/C/C
    window2, ...

I understand that each base should be treated differently depending on its type (C, CG, CHG and so on). Setting aside the C type for now, the real problems for me are:

1) how to summarize the methylated C,
2) how to do normalization,
3) more importantly, how to compare sample A and sample B on a window-by-window basis, and what statistics can be applied.

Any help and suggestions would be much appreciated.

Thanks in advance,
Steve
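To make the window summary described in the post concrete, here is a minimal Python sketch that parses one window record in the layout shown in the example (methylated counts, positions, coverage and C type as slash-separated lists) and computes the per-position methylation fraction plus one possible pooled summary per window. The names `Window`, `parse_window`, `per_position_fraction` and `pooled_fraction` are hypothetical, and pooling counts over the window is only an assumed way of summarizing; it does not address normalization or the A-versus-B comparison asked about.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Window:
    """One sliding-window record (hypothetical container, not from the post)."""
    index: str
    start: int
    end: int
    strand: str
    methylated: List[int]   # methylated-C read counts per position
    positions: List[int]    # genomic coordinate of each C
    coverage: List[int]     # total reads covering each position
    c_type: List[str]       # sequence context per position (C, CG, CHG, ...)


def parse_window(line: str) -> Window:
    """Parse a comma-separated record whose list fields are slash-separated."""
    fields = [f.strip() for f in line.split(",")]
    index, start, end, strand, meth, pos, cov, ctype = fields
    return Window(
        index, int(start), int(end), strand,
        [int(x) for x in meth.split("/")],
        [int(x) for x in pos.split("/")],
        [int(x) for x in cov.split("/")],
        ctype.split("/"),
    )


def per_position_fraction(w: Window) -> List[Optional[float]]:
    """Methylated reads divided by coverage at each position (None if uncovered)."""
    return [m / c if c > 0 else None for m, c in zip(w.methylated, w.coverage)]


def pooled_fraction(w: Window) -> Optional[float]:
    """Assumed window summary: total methylated reads over total coverage."""
    total = sum(w.coverage)
    return sum(w.methylated) / total if total > 0 else None


# Records copied from the example in the post.
SAMPLE_A = ("window1, 12297500, 12298000, +, 1/3/0/5/1/2, "
            "12297573/12297779/12297631/12287774/12297854/12297958, "
            "6/5/4/10/7/15, C/CG/CG/CHG/C/C")
SAMPLE_B = ("window1, 12297500, 12298000, +, 3/0/3/0/1/0, "
            "12297573/12297779/12297631/12287774/12297854/12297958, "
            "12/9/11/10/3/5, C/CG/CG/CHG/C/C")

window_a = parse_window(SAMPLE_A)
window_b = parse_window(SAMPLE_B)

print(per_position_fraction(window_a))
print(pooled_fraction(window_a), pooled_fraction(window_b))
```

Pooling weights each position by its coverage, so deeply covered sites dominate the window value; averaging the per-position fractions instead would weight every site equally. Which of the two is more appropriate depends on the analysis and is left open here.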

Coverage • 786 views

13.8 years ago • Steve Shen ▴ 330