Hi,
I have been reading the GENESIS tutorial and wanted to confirm whether the program matches the scanIDs of the genotype and phenotype data during the analysis.
Or does the phenotype data need to be in the same order as the genotype data when I input it into the program?
There are two places where the genotype and phenotype data need to be matched, and the answer is different for each.
1) In the creation of the GenotypeData object (or SeqVarData object), the genotypes and phenotypes need to be in the same order. The code checks that the scanID column in a ScanAnnotationDataFrame (or sample.id column in an AnnotatedDataFrame, for SeqVarData) is identical to the sample.id node in the GDS file. If they are not identical, you will get an error when trying to create the object.
2) The first step in an association test is fitNullModel, which only uses the phenotype data, and thus there are no requirements for the ordering of samples in this function. When the null model is combined with a genotype data object in the association test (assocTestSingle or assocTestAggregate), the samples will be matched on scanID/sample.id.
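For case 1, if your phenotype table is not already in GDS order, one way to reorder it before creating the object is with match(). This is only a minimal base R sketch: the sample IDs, the height column, and the variable names are made up for illustration, and in practice gds_ids would be read from your GDS file (for example with getScanID in GWASTools or seqGetData(gds, "sample.id") in SeqArray).

```r
# Hypothetical sample IDs, in the order they are stored in the GDS file.
gds_ids <- c("S3", "S1", "S2")

# Hypothetical phenotype table, listed in a different order.
pheno <- data.frame(
  sample.id = c("S1", "S2", "S3"),
  height = c(170, 165, 180),
  stringsAsFactors = FALSE
)

# Every GDS sample needs a phenotype row, otherwise match() would return NA.
stopifnot(all(gds_ids %in% pheno$sample.id))

# match() gives, for each GDS ID, the row of pheno that holds that ID.
pheno_ordered <- pheno[match(gds_ids, pheno$sample.id), ]
rownames(pheno_ordered) <- NULL

# Same kind of check as in case 1: the IDs must now be identical and in order.
stopifnot(identical(pheno_ordered$sample.id, gds_ids))
```

The reordered data frame is what you would then wrap in a ScanAnnotationDataFrame (or AnnotatedDataFrame). If the GDS file contains samples with no phenotype record, you would need to add rows with missing values for them rather than drop them, since the IDs have to be identical.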