Question

GENESIS: PCAIR: Error in acc(object, NL[cnode]) : unmatched node provided

0

Entering edit mode

GENOMIC_region • 0

@genomic_region-13050

Last seen 4.2 years ago

Hello,

I'm using a small data of 10 people to work with GENESIS and generate relatedness using PC-relate function. I generate .gds file, create kinship using KING. When generating principal components I get error Error in acc(object, NL[cnode]) : unmatched node provided

R 3.4.2

################--Code begins

bed<-"selected.bed" fam<-"selected.fam" #10 people in sample data bim<-"selected.bim"

snpgdsBED2GDS(bed, fam, bim, "9040.gds") snpgdsSummary("9040.gds")

# read in GDS data geno <- GdsGenotypeReader(filename = "9040.gds") # create a GenotypeData class object genoData <- GenotypeData(geno)

iids <- getScanID(genoData) #head(iids) - looks good

KINGmat <- king2mat(file.kin0 = "small_data.kin0", iids = iids)

KINGmat[1:5,1:5] ## looks good. has colnames and rownames

mypcair <- pcair(genoData = genoData, kinMat = KINGmat, divMat = KINGmat) ##Error ###################--Code ends

Error:
Partitioning Samples into Related and Unrelated Sets...
Error in acc(object, NL[cnode]) : unmatched node provided

Version of packages used: GWASTools_1.24.1, Biobase_2.38.0, BiocGenerics_0.24.0, GENESIS_2.10.0, SNPRelate_1.14.0, gdsfmt_1.14.1, KING 2.1.2

Perhaps it's not able to find unrelated due to small sample size?

genesis pcair • 2.1k views

ADD COMMENT • link 7.0 years ago GENOMIC_region • 0

1

Entering edit mode

It took a while to chase down where this error is coming from. It turns out to be in the graph package:

> showMethods("acc", classes="graph", includeDefs=TRUE)
Function: acc (package graph)
object="graph", index="character"
function (object, index) 
{
    nN <- numNodes(object)
    nNames <- nodes(object)
    nIndex <- length(index)
    whN <- match(index, nNames)
    if anyis.na(whN))) 
        stop("unmatched node provided")
    rval <- vector("list", length = nIndex)
    names(rval) <- nNames[whN]
    for (i in seq_len(nIndex)) {
        marked <- rep(0, nN)
        distv <- rep(0, nN)
        names(distv) <- nNames
        distx <- 1
        names(marked) <- nNames
        nmkd <- 0
        marked[index[i]] <- 1
        done <- FALSE
        while (!done) {
            minds <- nNames[marked == 1]
            for (node in minds) {
                avec <- adj(object, node)[[1]]
                avec <- avec[marked[avec] == 0]
                marked[avec] <- 1
                distv[avec] <- distx
            }
            marked[minds] <- 2
            distx <- distx + 1
            newmk <- sum(marked == 1)
            if (newmk == 0) 
                done <- TRUE
        }
        marked[index[i]] <- 0
        rval[[i]] <- distv[marked == 2]
    }
    return(rval)
}

My best guess is that your sample names are somehow not matching up with the expected values. If you could share your data so I have a reproducible example, I could look into it further. If that's not an option, you could try seeting options(error=recover), which will help you figure out what is going on at the point in the code when that error is thrown. The above method is called from within pcairPartition (which is called from within pcair), so it might be easier to diagnose the problem if you run pcairPartition on its own.

ADD REPLY • link 6.9 years ago Stephanie M. Gogarten ▴ 890

1

Entering edit mode

You comment that the problem might be due to the small sample size may be correct. Can you try running the code with a larger dataset?

ADD REPLY • link 6.9 years ago Stephanie M. Gogarten ▴ 890

0

Entering edit mode

Thank you for your reply.

Yes, it works with larger sample size. Kinship I had with ten individuals had all negative values.

ADD REPLY • link 6.9 years ago GENOMIC_region • 0

0

Entering edit mode

Thank you so so much for digging into this. :)

ADD REPLY • link 6.9 years ago GENOMIC_region • 0

0

Entering edit mode

Hi Stephanie,

I am having a similar problem with the pcair function but it looks like the problem is kinMat$kinship matrix obtained from snpgdsIBDKING() has NaN values. Do you know why this happens? or how to overcome it?

ADD REPLY • link 6.5 years ago roll • 0

0

Entering edit mode

Hello,

I played around with some sample data, and I have an idea of why you are observing NaN values in your kinMat$kinship matrix obtained from snpgdsIBDKING(). If I have individuals in my data set who have no observed heterozygous genotype calls at any of the variants being used in the test, then I get an NaN kinship estimate out for that individual with all other pairs.

This makes sense, as the KING robust kinship estimator divides by the number of heterozygous calls observed for each individual, and dividing by 0 is obviously NaN.

I'm not sure how many variants you're using in your analysis, but a couple thoughts on how to fix this issue:

Check to see that you in fact have individuals with no het calls in the variants being used by snpgdsIBDKING()
Use more variants so everyone has observed het calls
Lower your MAF threshold to include more variants in the analysis

If you are using lots of variants and still have individuals with no het calls, that certainly seems odd to me. You may want to dig into that further if that's the case.

Matt

ADD REPLY • link 6.5 years ago mconomos ▴ 70

0

Entering edit mode

Hi Matt,

Thanks. I just checked but all genotypes have het calls in my vcf.

ADD REPLY • link 6.5 years ago roll • 0

0

Entering edit mode

Sorry, maybe what I was saying wasn't clear (or I'm misunderstanding what you said): It's not that every genotype needs to have a het call, it's that every individual needs to have a het call at some variants being used in your analysis. If that is true, then I'll try to dig some more.

A couple follow up questions:

How many variants are going into your analysis? You can get this by looking at the length of the snp.id vector included in the output of snpgdsIBDKING.
How many values in your kinship matrix are NaN? (what proportion?) Are they primarily in pairs with certain individual(s)? i.e. are certain rows or columns all NaN?

Matt

ADD REPLY • link 6.5 years ago mconomos ▴ 70