I am investigating rare exonic variants in WES data from cases and controls using the SKAT-O method with GENESIS. I want to use a sliding-window approach, defining a window size and shift with the following code:
# make the window iterator object
iterator <- SeqVarWindowIterator(seqData, windowSize=10000, windowShift=500, verbose=FALSE)
I ran the analysis on chromosome 11 as a test. Although all the windows are exactly 10000 bp long, the steps between consecutive windows are not 500 and vary from one window to the next. Here are the first 10 windows from the results:
#   chr  start   end     windows shift
1   11   188001  198000    3001
2   11   201001  211000   -8499
3   11   202501  212500   -7999
4   11   204501  214500   -4999
5   11   209501  219500   -8499
6   11   211001  221000   -8499
7   11   212501  222500   -7999
8   11   214501  224500   13001
9   11   237501  247500   23001
10  11   270501  280500   -9499
If I set a window shift of 500, why are the steps so different? Some regions seem not to be covered (but maybe there are no exons in those regions?), and some seem to be covered multiple times.
Thank you for your help!
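The key line from the help page is "Only windows containing unique sets of variants are kept." In your case, the shifts between the final set of windows differ because the variants in your GDS file are not uniformly distributed across the chromosome. Windows that would only repeat the variant set of another window are not kept, so regions with few variants end up with large gaps between windows, while regions with many variants keep several overlapping windows.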
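To make this concrete, here is a minimal base-R sketch (it does not call SeqVarTools or GENESIS). The first part takes the window starts from the table in the question and looks at the start-to-start steps, rather than the end-to-next-start gaps shown in the question's last column. The second part is a toy model of the filtering, not the package's actual implementation: the variant positions are made up, and dropping empty windows and keeping the first window of each duplicate set are assumptions made for illustration.

```r
# Part 1: window starts copied from the 10 rows shown in the question
starts <- c(188001, 201001, 202501, 204501, 209501,
            211001, 212501, 214501, 237501, 270501)
window_size  <- 10000
window_shift <- 500

# Distance between consecutive window starts
start_steps <- diff(starts)
print(start_steps)

# Check whether every step is a whole multiple of the requested shift
print(all(start_steps %% window_shift == 0))

# Part 2: toy model of "only windows with unique sets of variants are kept"
# Hypothetical variant positions (NOT from the real data), clustered like exons
variant_pos <- c(1200, 1350, 9800, 30500)

# Candidate windows laid out every window_shift bp
cand_starts <- seq(1, 40001, by = window_shift)

# Describe each candidate window by the indices of the variants it contains
keys <- vapply(cand_starts, function(s) {
  inside <- which(variant_pos >= s & variant_pos < s + window_size)
  paste(inside, collapse = ",")
}, character(1))

# Assumption: drop empty windows and repeats of an already-seen variant set
keep <- keys != "" & !duplicated(keys)
kept_starts <- cand_starts[keep]
print(kept_starts)
print(diff(kept_starts))
```

In the toy model the surviving windows are unevenly spaced even though every candidate was generated 500 bp apart, which is the same pattern as in the chromosome 11 output.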