HOW to FIX - PCAir excludes all my SNPs because there is no chromosome information
1
0
Entering edit mode
gabry.scata ▴ 10
@372e99bd
Last seen 4 weeks ago
Australia

Hi, I am just trying now to run PCAir and PCRelate. However, I ran into a problem that I am not sure how to fix.

When I run pcair(), I get the following error (highlighted in bold):

Using kinobj and divobj to partition samples into unrelated and related sets
Working with 425 samples
Identifying relatives for each sample using kinship threshold 0.0441941738241592
Identifying pairs of divergent samples using divergence threshold -0.0441941738241592
Partitioning samples into unrelated and related sets...
Unrelated Set: 375 Samples 
Related Set: 50 Samples
Performing PCA on the Unrelated Set...
Principal Component Analysis (PCA) on genotypes:
**Excluding 8,685 SNPs on non-autosomes**
**Error in .InitFile2(cmd = "Principal Component Analysis (PCA) on genotypes:",  : 
  There is no SNP!**

The code I used is below:

# GDS file obtained from a genlight object (dartR package) with the dartR::gl2gds" function. Then GDS file converted into gdsobj with the code below (from GDWAS tools package):

mydataset_geno <- GdsGenotypeReader(filename = "C:/documents/mydataset_GDS.gds")

class(mydataset_geno)       # [1] "GdsGenotypeReader" attr(,"package") [1] "GWASTools"

# object without annotation
    mydataset_genoData <- GenotypeData(mydataset_geno)

    str(mydataset_genoData)

Formal class 'GenotypeData' [package "GWASTools"] with 3 slots
  ..@ data     :Formal class 'GdsGenotypeReader' [package "GWASTools"] with 15 slots
  .. .. ..@ snpIDvar     : chr "snp.id"
  .. .. ..@ chromosomeVar: chr "snp.chromosome"
  .. .. ..@ positionVar  : chr "snp.position"
  .. .. ..@ scanIDvar    : chr "sample.id"
  .. .. ..@ genotypeVar  : chr "genotype"
  .. .. ..@ alleleVar    : chr "snp.allele"
  .. .. ..@ autosomeCode : int [1:22] 1 2 3 4 5 6 7 8 9 10 ...
  .. .. ..@ XchromCode   : int 23
  .. .. ..@ YchromCode   : int 25
  .. .. ..@ XYchromCode  : int 24
  .. .. ..@ MchromCode   : int 26
  .. .. ..@ genotypeDim  : chr "snp,scan"
  .. .. ..@ missingValue : int 3
  .. .. ..@ filename     : chr "C:/documents/mydataset"| __truncated__
  .. .. ..@ handler      :List of 5
  .. .. .. ..$ filename: chr "C:/documents/mydataset_GDS.gds_LocusC"| __truncated__
  .. .. .. ..$ id      : int 1
  .. .. .. ..$ ptr     :<externalptr> 
  .. .. .. ..$ root    : 'gdsn.class' raw [1:20] 08 00 00 00 ...
  .. .. .. ..$ readonly: logi TRUE
  .. .. .. ..- attr(*, "class")= chr "gds.class"
  ..@ snpAnnot : NULL
  ..@ scanAnnot: NULL

# PCAir
mydataset_pca_PCAir <- pcair(
                 mydataset_genoData, 
                 kinobj = mydataset_kingMat,
                 kin.thresh=2^(-9/2), # = 0.04 = 3rd degree kinship threshold, which corresponds to first cousins (actually 1stcousins kinship= 1/16 - Weir 2006 & 0.06>0.04)
                 divobj = mydataset_kingMat,
                 div.thresh=-2^(-9/2))

Using kinobj and divobj to partition samples into unrelated and related sets
Working with 425 samples
Identifying relatives for each sample using kinship threshold 0.0441941738241592
Identifying pairs of divergent samples using divergence threshold -0.0441941738241592
Partitioning samples into unrelated and related sets...
Unrelated Set: 375 Samples 
Related Set: 50 Samples
Performing PCA on the Unrelated Set...
Principal Component Analysis (PCA) on genotypes:
**Excluding 8,685 SNPs on non-autosomes
Error in .InitFile2(cmd = "Principal Component Analysis (PCA) on genotypes:",  : 
  There is no SNP!**

How to make sure that all SNPs are used in PCAir and not only Autosomes (as my SNPs have no chromosome position being a non-model species with no genome)?

Thank you for any help! Gabriella

**sessionInfo( ):**
R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.utf8  LC_CTYPE=English_Australia.utf8    LC_MONETARY=English_Australia.utf8
[4] LC_NUMERIC=C                       LC_TIME=English_Australia.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GENESIS_2.26.0      GWASTools_1.42.1    Biobase_2.56.0      BiocGenerics_0.42.0 InRelate_0.1.0     
 [6] remotes_2.4.2       fstcore_0.9.12      radiator_1.2.8      OutFLANK_0.2        LEA_3.8.0          
[11] vcfR_1.12.0         stockR_1.0.74       qvalue_2.28.0       SNPRelate_1.30.1    dartR_2.0.4        
[16] adegenet_2.1.7      ade4_1.7-19         plotrix_3.8-2       forcats_0.5.1       stringr_1.4.1      
[21] dplyr_1.0.9         purrr_0.3.4         readr_2.1.2         tidyr_1.2.0         tibble_3.1.7       
[26] tidyverse_1.3.1     ggplot2_3.4.0       plyr_1.8.7          SeqArray_1.36.3     gdsfmt_1.32.0      

loaded via a namespace (and not attached):
  [1] utf8_1.2.2             R.utils_2.12.0         tidyselect_1.1.2       RSQLite_2.3.2         
  [5] BiocParallel_1.30.4    grid_4.2.1             combinat_0.0-8         StAMPP_1.6.3          
  [9] GWASExactHW_1.01       devtools_2.4.3         munsell_0.5.0          codetools_0.2-18      
 [13] withr_2.5.0            colorspace_2.0-3       fst_0.9.8              pegas_1.1             
 [17] knitr_1.39             rstudioapi_0.13        stats4_4.2.1           labeling_0.4.2        
 [21] RgoogleMaps_1.4.5.3    logistf_1.26.0         GenomeInfoDbData_1.2.8 bit64_4.0.5           
 [25] farver_2.1.1           gap.datasets_0.0.5     rprojroot_2.0.3        vctrs_0.5.1           
 [29] generics_0.1.3         xfun_0.31              R6_2.5.1               doParallel_1.0.17     
 [33] GenomeInfoDb_1.32.2    fields_14.0            bitops_1.0-7           cachem_1.0.6          
 [37] reshape_0.8.9          assertthat_0.2.1       promises_1.2.0.1       scales_1.2.0          
 [41] vroom_1.5.7            pinfsc50_1.2.0         gtable_0.3.0           formula.tools_1.7.1   
 [45] processx_3.7.0         spam_2.9-0             sandwich_3.0-2         MatrixModels_0.5-0    
 [49] rlang_1.0.6            calibrate_1.7.7        splines_4.2.1          rgdal_1.5-32          
 [53] hexbin_1.28.2          broom_1.0.0            BiocManager_1.30.18    reshape2_1.4.4        
 [57] modelr_0.1.8           backports_1.4.1        httpuv_1.6.5           tools_4.2.1           
 [61] usethis_2.1.6          ellipsis_0.3.2         raster_3.5-21          RColorBrewer_1.1-3    
 [65] DNAcopy_1.70.0         sessioninfo_1.2.2      Rcpp_1.0.9             zlibbioc_1.42.0       
 [69] RCurl_1.98-1.7         ps_1.7.1               prettyunits_1.1.1      viridis_0.6.2         
 [73] zoo_1.8-12             S4Vectors_0.34.0       haven_2.5.0            cluster_2.1.3         
 [77] fs_1.5.2               magrittr_2.0.3         SeqVarTools_1.34.0     data.table_1.14.2     
 [81] SparseM_1.81           genetics_1.3.8.1.3     lmtest_0.9-40          reprex_2.0.1          
 [85] mvtnorm_1.1-3          pkgload_1.3.0          hms_1.1.1              patchwork_1.1.1       
 [89] mime_0.12              xtable_1.8-4           readxl_1.4.0           IRanges_2.30.0        
 [93] gridExtra_2.3          compiler_4.2.1         mice_3.14.0            maps_3.4.0            
 [97] crayon_1.5.1           gdistance_1.3-6        R.oo_1.25.0            htmltools_0.5.2       
[101] mgcv_1.8-40            later_1.3.0            tzdb_0.3.0             lubridate_1.8.0       
[105] DBI_1.1.3              dbplyr_2.2.1           PopGenReport_3.0.7     MASS_7.3-57           
[109] Matrix_1.4-1           permute_0.9-7          cli_3.4.1              R.methodsS3_1.8.2     
[113] gdata_2.18.0.1         parallel_4.2.1         dotCall64_1.0-1        igraph_1.3.2          
[117] GenomicRanges_1.48.0   pkgconfig_2.0.3        sp_1.5-0               terra_1.5-34          
[121] versions_0.3           xml2_1.3.3             foreach_1.5.2          XVector_0.36.0        
[125] rvest_1.0.2            quantsmooth_1.62.0     callr_3.7.1            digest_0.6.29         
[129] vegan_2.6-2            Biostrings_2.64.0      cellranger_1.1.0       operator.tools_1.6.3  
[133] curl_4.3.2             gap_1.2.3-6            quantreg_5.93          shiny_1.7.1           
[137] gtools_3.9.3           lifecycle_1.0.3        nlme_3.1-157           dismo_1.3-5           
[141] jsonlite_1.8.0         seqinr_4.2-16          viridisLite_0.4.0      fansi_1.0.3           
[145] pillar_1.7.0           lattice_0.20-45        GGally_2.1.2           survival_3.3-1        
[149] fastmap_1.1.0          httr_1.4.3             pkgbuild_1.3.1         glue_1.6.2            
[153] mmod_1.3.3             png_0.1-7              iterators_1.0.14       bit_4.0.4             
[157] stringi_1.7.8          blob_1.2.3             memoise_2.0.1          ape_5.6-2
GENESIS autosome-only=FALSE? PCAir • 767 views
ADD COMMENT
1
Entering edit mode
@james-w-macdonald-5106
Last seen 1 day ago
United States

Add autosome.only = FALSE to your call. Note that pcair has an ellipsis argument (...), which allows you to pass in arbitrary arguments to underlying functions. If you check ?pcair and look at that argument description, it says you can pass arguments to snpgdsPCA, and if you look at ?snpgdsPCA, one of the arguments is autosome.only, which does exactly as you might expect.

ADD COMMENT

Login before adding your answer.

Traffic: 573 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6