Hello everyone,
I have 17,000 variables (SNV frequencies, many of them zero) for 40 patients. Each patient is labeled by his or her response to a treatment: 13 responses, 27 no-responses. I want to extract a subset of SNVs with strong predictive power.
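For reference, here is a simulated stand-in with the same shape as my data, so the code below is reproducible (the values and sparsity pattern are placeholders, not my real data):

set.seed(1)
n <- 40; p <- 17000
## Sparse non-negative "frequencies": many exact zeros, as in my SNV matrix
x <- matrix(rbinom(n * p, 1, 0.1) * runif(n * p), nrow = n, ncol = p)
y <- factor(c(rep("response", 13), rep("noresponse", 27)))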
Because the set of variables is so large, there are strong correlations among them, which is why I am considering the adaptive lasso. I used the glmnet R package with ridge-estimated initial coefficients and the following R code:
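To be explicit about what I mean by adaptive lasso with ridge initial estimates (the standard formulation; my code uses $\gamma = 1$):

$$\hat\beta = \arg\min_{\beta}\left\{-\ell(\beta) + \lambda \sum_{j=1}^{p} w_j\,|\beta_j|\right\}, \qquad w_j = \frac{1}{|\hat\beta_j^{\,\mathrm{ridge}}|^{\gamma}},$$

where $\ell(\beta)$ is the binomial log-likelihood; in glmnet the $w_j$ enter through penalty.factor.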
library(cvTools)
library(glmnet)
library(doParallel)            ## needed because cv.glmnet is called with parallel = TRUE
registerDoParallel(cores = 2)  ## register a foreach backend, otherwise glmnet runs sequentially

err.test.response <- c()
err.test.noresponse <- c()
nbiters <- 50

for (i in 1:nbiters) {
  ## Outer cross-validation: k random folds
  kflds <- 8
  flds <- cvFolds(length(y), K = kflds)
  pred.test <- c()   ## predicted classes
  class.test <- c()  ## true classes
  for (j in 1:kflds) {
    ## Test indices for fold j: cvFolds stores the random permutation in
    ## $subsets, so observations must be looked up through it, not via $which alone
    test.idx <- flds$subsets[flds$which == j, 1]
    ## Train
    x.train <- x[-test.idx, ]
    y.train <- y[-test.idx]
    ## Test
    x.test <- x[test.idx, ]
    y.test <- y[test.idx]
    ## Adaptive weights vector from an initial ridge fit (gamma = 1)
    cv.ridge <- cv.glmnet(x.train, y.train, family = "binomial", alpha = 0,
                          standardize = FALSE, parallel = TRUE, nfolds = 7)
    w3 <- 1 / abs(matrix(coef(cv.ridge, s = cv.ridge$lambda.min)[, 1][2:(ncol(x) + 1)]))^1
    w3[w3[, 1] == Inf] <- 999999999  ## cap infinite weights where the ridge coefficient is zero
    ## Adaptive lasso: ridge-based weights enter through penalty.factor
    cv.lasso <- cv.glmnet(x.train, y.train, family = "binomial", alpha = 1,
                          standardize = FALSE, parallel = TRUE,
                          type.measure = "class", penalty.factor = w3, nfolds = 7)
    ## Prediction on the held-out fold
    pred.test <- c(pred.test, predict(cv.lasso, x.test, s = "lambda.1se", type = "class"))
    class.test <- c(class.test, as.character(y.test))
  }
  ## Per-class prediction error for this repetition
  err.test.noresponse <- c(err.test.noresponse,
                           1 - sum(pred.test == "noresponse" & class.test == "noresponse") /
                               sum(class.test == "noresponse"))  # no-response error vector
  err.test.response <- c(err.test.response,
                         1 - sum(pred.test == "response" & class.test == "response") /
                             sum(class.test == "response"))      # response error vector
}
mean(err.test.noresponse)  ## mean no-response prediction error
mean(err.test.response)    ## mean response prediction error
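One detail I am unsure about: cvFolds draws folds without stratification, so with only 13 responders some outer folds can contain very few (or even zero) of them. A minimal sketch of what I mean by stratified folds (stratified.folds is a hypothetical helper of mine, not a cvTools function):

## Assign each class to folds separately so every fold keeps roughly
## the overall 13/27 response/no-response proportion
stratified.folds <- function(y, k) {
  which.fold <- integer(length(y))
  for (lev in levels(y)) {
    idx <- which(y == lev)
    which.fold[idx] <- sample(rep_len(1:k, length(idx)))
  }
  which.fold
}
fold.id <- stratified.folds(y, k = 8)
## then split with x[fold.id != j, ] / x[fold.id == j, ] instead of the cvFolds bookkeeping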
Is an external cross-validation like this a sound way to evaluate the predictive power of the adaptive lasso on my data?
My results are not conclusive at all: I get mean(err.test.noresponse) = 0.15 and mean(err.test.response) = 0.88, so the model fails to identify the responders. Do you have an idea why my results are so bad, and how I could improve them?
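One direction I have been considering (my own heuristic, not something I have validated) is rebalancing the classes with observation weights in the inner fits, since glmnet accepts a weights argument:

## Inverse-frequency weights: upweight the minority "response" class so both
## classes contribute equally to the binomial loss (assumption on my part)
obs.w <- ifelse(y.train == "response",
                sum(y.train == "noresponse") / sum(y.train == "response"),
                1)
cv.lasso <- cv.glmnet(x.train, y.train, family = "binomial", alpha = 1,
                      standardize = FALSE, type.measure = "class",
                      penalty.factor = w3, weights = obs.w, nfolds = 7)

Would that be a reasonable thing to try here?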
Thanks for your help and your ideas,
Corentin
Nobody has an idea?