GENESIS pcrelate() write gds file doesn't work
@ogiannakopoulou-20154
Last seen 4.4 years ago

Hello,

I'm trying to save the output of pcrelate in GDS format by setting write.to.gds=TRUE in my command, but it doesn't work. I get the following error: Error in .local(gdsobj, ...) : unused argument (write.to.gds = TRUE)

I have installed GENESIS following the Bioconductor instructions:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("GENESIS", version = "3.8")

However, when I run sessionInfo() in R 2.5.0, it shows that the attached package is GENESIS_2.12.4. I don't know if that makes any difference.

With write.to.gds=FALSE the command works fine, but I have quite a large dataset, so I would like to save the output as GDS.

I would appreciate any advice.

Best regards, Olga

genesis pcrelate gds • 2.7k views

Is that really R 2.5.0? That is 2007!


I just noticed that reply. It was a typo: I was using R 3.5.

@stephanie-m-gogarten-5121
Last seen 4 months ago
University of Washington

The write.to.gds argument is no longer an option for pcrelate, because the output format has changed. Previously, the kinship and IBD sharing coefficients were returned (or written to GDS) as NxN matrices, but this information is now returned as a pairwise table, which can be transformed into a matrix (with options for sparsity) with the function pcrelateToMatrix. If you then want to save that matrix in a GDS file, you can use the mat2gds function, but that will result in a much larger file than the sparse format provided by the Matrix package.
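To illustrate the idea of turning a pairwise table into a sparse matrix, here is a minimal sketch using only the Matrix package. It is not the code inside pcrelateToMatrix; the toy tables, the thresh cut-off and the variable names are all made up for the example, and the diagonal uses the standard self-kinship (1 + f)/2 as an assumption.

```r
library(Matrix)  # recommended package that ships with R

# Toy pairwise table (hypothetical values): one row per pair of samples
kin_btwn <- data.frame(
  ID1 = c("A", "A", "B"),
  ID2 = c("B", "C", "C"),
  kin = c(0.25, 0.001, 0.12),
  stringsAsFactors = FALSE
)

# Toy per-sample table (hypothetical values): inbreeding coefficient f
kin_self <- data.frame(
  ID = c("A", "B", "C"),
  f = c(0, 0.02, 0),
  stringsAsFactors = FALSE
)

ids <- kin_self$ID
thresh <- 0.01  # illustrative cut-off: pairs below it are stored as zero

# Keep only pairs at or above the cut-off, then append the diagonal entries
keep <- kin_btwn$kin >= thresh
i <- c(match(kin_btwn$ID1[keep], ids), seq_along(ids))
j <- c(match(kin_btwn$ID2[keep], ids), seq_along(ids))
x <- c(kin_btwn$kin[keep], 0.5 * (1 + kin_self$f))

# symmetric = TRUE stores one triangle only; all pairs here have i <= j
kin_mat <- sparseMatrix(
  i = i, j = j, x = x,
  dims = c(length(ids), length(ids)),
  dimnames = list(ids, ids),
  symmetric = TRUE
)
```

A sparse matrix like kin_mat can be written to disk with saveRDS, which keeps the sparse representation.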

We have been working with a dataset of ~100,000 samples, and found that the previous version of pcrelate was unable to handle that many samples in any reasonable amount of time (and writing incrementally to GDS files was part of the problem). The best solution for very large sample sizes seems to be to run sample blocks in parallel. Currently that is not documented (as we're still working it out), but the next release of GENESIS will have options for using the functions that are currently internal to pcrelate independently for best performance in large datasets.


Thank you, Stephanie, for the swift reply. I am a new user and I think I found that option in an older GENESIS vignette. Just to confirm, the two pcrelate outputs in the current version are the data.frames kinBtwn and kinSelf, right?

Many thanks again for the help, Olga


Yes, that is correct.


Thanks, Stephanie, for the confirmation. I'm working on a cluster with limits on running time, so I'm having some problems setting up the pipeline, and I was wondering if you could give me any advice. After some attempts, I have managed to run the PC-Relate command on my dataset and I have saved the two data.frames (mypcrelate$kinBtwn and mypcrelate$kinSelf). I'm not sure how to call pcrelateToMatrix with these two files as input, though.

I'm interested in using the PC-Relate output as input for PC-AiR. Since I'm not sure how to recreate the mypcrelate matrix, I have tried to create a KINGmat from these files using "kingToMatrix", passing the two files as .kin0 and .kin. However, it didn't work; it gives me the error "Error in FUN(X[[i]], ...) : Input is empty or only contains BOM or terminal control characters". Any advice would be more than welcome, since I am really keen on running the GENESIS pipeline on my non-European dataset for both PCs and relatedness estimation.


pcrelateToMatrix takes the entire output object of pcrelate (the list of two data.frames) as its first argument:

pcmat <- pcrelateToMatrix(mypcrelate)
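
Since the whole object is what gets passed on, one way to carry it between cluster jobs is to save the entire list in a single file rather than the two data.frames separately. The sketch below uses a made-up stand-in for the pcrelate output and a temporary file path; both are assumptions for illustration only.

```r
# Stand-in for the list returned by pcrelate(); the values are hypothetical
mypcrelate <- list(
  kinBtwn = data.frame(ID1 = "A", ID2 = "B", kin = 0.25),
  kinSelf = data.frame(ID = c("A", "B"), f = c(0, 0.02))
)

# End of the first job: write the whole object to one file
rds_file <- tempfile(fileext = ".rds")  # hypothetical path
saveRDS(mypcrelate, rds_file)

# Later job: reload it with both data.frames still in one list
restored <- readRDS(rds_file)
names(restored)
```

saveRDS also keeps attributes such as the object's class, so an object reloaded this way should behave like the original when passed to other functions.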

Thanks for the amazing support


For technical reasons I have to use R 3.4.1 and GENESIS_2.8.1 instead of the new version, so I have to adjust my script to the older version, but I am stuck. I have calculated the KING-robust estimates using the snpgdsIBDKING function from the SNPRelate package. However, when I try to create the KING matrix in this way:

KINGmat <- king2mat(file.kin0 = king$IBS0, file.kin = king$kinship, iids = iids)

it gives the error "Error in read.table(file.kin0, header = TRUE) : 'file' must be a character string or connection".

Many thanks again for the great help.


You only need to use king2mat to import results from the command-line version of KING (where file.kin0 and file.kin are the paths to text files output by that software). If you are using snpgdsIBDKING, you already have the matrix in king$kinship.


Many thanks for the help. I had used king$kinship, but initially it was not working because the colnames and rownames were different. I changed them and it is working fine now.
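
For anyone hitting the same issue, the renaming step might look like the sketch below. The king list is a toy stand-in with hypothetical values, shaped like the sample.id and kinship elements that snpgdsIBDKING returns, and iids is assumed to hold the same sample IDs in the order the downstream analysis expects.

```r
# Toy stand-in for the snpgdsIBDKING result; the values are hypothetical
king <- list(
  sample.id = c("S1", "S2", "S3"),
  kinship = matrix(
    c(0.5,  0.25, 0,
      0.25, 0.5,  0.01,
      0,    0.01, 0.5),
    nrow = 3
  )
)

# The kinship matrix has no dimnames here, so label it with the sample IDs
KINGmat <- king$kinship
dimnames(KINGmat) <- list(king$sample.id, king$sample.id)

# Hypothetical ID vector in the order the rest of the pipeline uses
iids <- c("S3", "S1", "S2")

# Reorder rows and columns by name so they line up with iids
KINGmat <- KINGmat[iids, iids]
```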
