Hello!
I am using SNPRelate to calculate Fst for sliding windows. There are two things that I cannot find information about.
(1) Suppose I pass a set of samples in a specific order, together with their corresponding populations, for example:
> samps
[1] "H07750-L1" "H07754-L1" "H07760-L1" "H07775" "H07762-L1" "H07782-L1"
[7] "H07758-L1" "H07792-L1" "H07793-L1" "H07742-L1" "H07751-L1" "H07784"
[13] "H07746-L1" "H07767-L1" "H07781-L1" "H07741-L1" "H07779-L1" "H07748-L1"
[19] "H07778" "H07773-L1"
> pops
[1] pop1 pop1 pop1 pop1 pop1 pop1 pop1 pop1 pop1 pop1 pop2 pop2 pop2 pop2 pop2 pop2 pop2 pop2 pop2 pop2
Levels: pop1 pop2
After running the command:
res <- snpgdsSlidingWindow(genofile, winsize = 500000, shift = 250000, FUN = "snpgdsFst", sample.id = samps, population = pops, method = "W&C84")
The order of the samples is changed (sorted) in the output:
> res$sample.id
[1] "H07741-L1" "H07742-L1" "H07746-L1" "H07748-L1" "H07750-L1" "H07751-L1"
[7] "H07754-L1" "H07758-L1" "H07760-L1" "H07762-L1" "H07767-L1" "H07773-L1"
[13] "H07775" "H07778" "H07779-L1" "H07781-L1" "H07782-L1" "H07784"
[19] "H07792-L1" "H07793-L1"
I'm not sure what this means:
- Is this the order in which samples are assigned to the argument population? --> not desired
- Or does res$sample.id just show the samples that were used, while they were still assigned to population as originally intended?
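For reference, this is the kind of check I have in mind: a minimal base R sketch, using a small made-up subset of my IDs and a hand-written stand-in for res$sample.id, that maps each returned ID back to the population I supplied for it. It only shows what my input implies; it says nothing about what snpgdsFst does internally.

```r
# Toy inputs in the order I would pass them (hypothetical subset of my IDs)
samps <- c("H07750-L1", "H07754-L1", "H07751-L1", "H07746-L1")
pops  <- factor(c("pop1", "pop1", "pop2", "pop2"))

# Stand-in for res$sample.id: the same IDs in a different order
returned <- c("H07746-L1", "H07750-L1", "H07751-L1", "H07754-L1")

# Position of each returned ID in my original input vector
idx <- match(returned, samps)

# Populations re-aligned to the returned order
lookup <- data.frame(sample.id = returned, population = pops[idx])
print(lookup)
```

With the real objects I would replace `returned` with res$sample.id and use the full samps and pops vectors.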
(2) Finally, how is the Fst score of a window calculated? Is it the arithmetic mean of all Fst scores within the window?
Thanks in advance