Hello,
I downloaded CNA data from TCGA (GDC). It comes pre-segmented (Level 3) by CBS, using the DNAcopy library from Bioconductor. I am currently analyzing the data, but I cannot find a way to eliminate noise in the form of very short segments that do not match the much longer segments around them. For example, I have three consecutive segments on chromosome 2: the first has 122511 probes with segment mean 0.0235, the second has 3 probes with segment mean -1.5194, and the third has 9606 probes with segment mean 0.0224. Short segments like this (low number of probes) that differ drastically from their much longer neighbors are all over my TCGA data, and I do not know how to remove them properly after segmentation, since the data only comes already segmented. I have read up on pruning methods via dynamic programming and square mean, but they seem to take place prior to segmentation. I would appreciate any help you are willing to give; I am lost and don't know what to do next.
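To make the question concrete, here is a minimal Python sketch of the kind of post-segmentation filter I have in mind. It is not an established method and not part of DNAcopy. The function name `merge_short_segments`, the field names, and both thresholds (`min_probes=10`, `max_flank_diff=0.1`) are my own assumptions, and the start/end coordinates in the example are made up; only the probe counts and segment means come from my data. I am also not sure whether a filter like this would throw away real focal events.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    num_probes: int
    seg_mean: float


def merge_short_segments(segments: List[Segment],
                         min_probes: int = 10,
                         max_flank_diff: float = 0.1) -> List[Segment]:
    """Absorb short segments into their two flanking segments.

    Assumes `segments` is sorted by chromosome and start position.
    A segment is absorbed only if it has fewer than `min_probes` probes,
    both neighbors are on the same chromosome and are not short themselves,
    and the neighbors' means differ by at most `max_flank_diff`.
    """
    merged: List[Segment] = []
    i = 0
    while i < len(segments):
        seg = segments[i]
        if merged and i + 1 < len(segments):
            left, right = merged[-1], segments[i + 1]
            is_short = seg.num_probes < min_probes
            same_chrom = left.chrom == seg.chrom == right.chrom
            flanks_long = (left.num_probes >= min_probes
                           and right.num_probes >= min_probes)
            flanks_agree = abs(left.seg_mean - right.seg_mean) <= max_flank_diff
            if is_short and same_chrom and flanks_long and flanks_agree:
                # Probe-weighted mean of the two flanks only; the short
                # segment's own mean is treated as noise and ignored.
                flank_probes = left.num_probes + right.num_probes
                mean = (left.seg_mean * left.num_probes
                        + right.seg_mean * right.num_probes) / flank_probes
                merged[-1] = Segment(left.chrom, left.start, right.end,
                                     flank_probes + seg.num_probes, mean)
                i += 2  # skip the short segment and the right flank
                continue
        merged.append(seg)
        i += 1
    return merged


# Probe counts and means are from my chromosome 2 example;
# the coordinates are placeholders, not real positions.
example = [
    Segment("2", 1, 1000, 122511, 0.0235),
    Segment("2", 1001, 1100, 3, -1.5194),
    Segment("2", 1101, 2000, 9606, 0.0224),
]
cleaned = merge_short_segments(example)
print(cleaned)
```

On the three segments above this would collapse them into a single segment spanning all 132120 probes. Is something along these lines reasonable, or is there a standard way to do this on already-segmented data?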
Thank you