Adaptive Lasso and Cross-Validation for SNV selection

Hello everyone,


I have 17,000 variables available (SNV frequencies, with a fair number of zeros) for 40 patients. Each patient is labelled by their response to a treatment: 13 responders and 27 non-responders. I want to extract a subset of SNVs with strong predictive power.
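
For concreteness, the objects x and y used in the code below would look something like this (a simulated sketch only, matching the dimensions above; the real x holds the SNV frequencies and y the clinical responses):

set.seed(1)
n <- 40      ## patients
p <- 17000   ## SNV frequency variables
## x: numeric matrix of frequencies with many zeros, one row per patient
x <- matrix(rbinom(n * p, size = 10, prob = 0.05) / 10, nrow = n, ncol = p)
## y: two-level factor, 13 responders and 27 non-responders
y <- factor(sample(c(rep("response", 13), rep("noresponse", 27))))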

Because of the large number of variables there are strong correlations among them, which is why I'm considering the adaptive lasso. I used the glmnet R package, with ridge regression providing the initial coefficient estimates for the adaptive weights, and the following R code:

library(cvTools)
library(glmnet)
library(doParallel)

## cv.glmnet(..., parallel = TRUE) needs a registered foreach backend
registerDoParallel(cores = 2)

err.test.response <- c()
err.test.noresponse <- c()
nbiters <- 50

for(i in 1:nbiters){
  ## k folds
  kflds <- 8
  flds <- cvFolds(length(y), K = kflds)
 
  pred.test <- c() ## predicted classes
  class.test <- c() ## real classes
 
  for(j in 1:kflds){
    ## cvFolds stores a random permutation in $subsets; $which gives the fold of each permuted index
    idx.test <- flds$subsets[flds$which == j, 1]
    ## Train
    x.train <- x[-idx.test, ]
    y.train <- y[-idx.test]
    ## Test
    x.test <- x[idx.test, ]
    y.test <- y[idx.test]
    ## Ridge fit to get initial coefficient estimates
    cv.ridge <- cv.glmnet(x.train, y.train, family = 'binomial', alpha = 0, standardize = FALSE,
                          parallel = TRUE, nfolds = 7)
    ## Adaptive weights vector: inverse absolute ridge coefficients (gamma = 1), intercept dropped
    w3 <- 1 / abs(matrix(coef(cv.ridge, s = cv.ridge$lambda.min)[, 1][2:(ncol(x) + 1)]))^1
    w3[w3[, 1] == Inf] <- 999999999  ## cap any infinite weights
    
    ## Adaptive Lasso
    cv.lasso <- cv.glmnet(x.train, y.train, family='binomial', alpha=1, standardize=FALSE,
                          parallel = TRUE, type.measure='class', penalty.factor=w3, nfolds = 7)
    ## Predicted classes on the held-out fold
    pred.test <- c(pred.test, predict(cv.lasso, newx = x.test, s = "lambda.1se", type = "class"))
    class.test <- c(class.test, as.character(y.test))
  }
 
  ## Per-class prediction error for this repeat (1 - recall of each class)
  err.test.noresponse <- c(err.test.noresponse, 1 - sum(pred.test == "noresponse" & class.test == "noresponse")
                           / sum(class.test == "noresponse")) # noresponse error
  err.test.response <- c(err.test.response, 1 - sum(pred.test == "response" & class.test == "response")
                         / sum(class.test == "response")) # response error
}

mean(err.test.noresponse) ## Mean noresponse prediction error
mean(err.test.response) ## Mean response prediction error
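
For reference, a quick way to see the same per-class numbers for a single repeat is a confusion table (just an illustration of what the error formulas above compute):

table(Predicted = pred.test, Actual = class.test)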


Is it reasonable to do an external cross-validation like this (the outer 8-fold split evaluates prediction, while the inner cv.glmnet folds only tune lambda) to evaluate the predictive power of the adaptive lasso on my data?

My results are not conclusive at all: I get mean(err.test.noresponse) = 0.15 and mean(err.test.response) = 0.88, so the model fails to identify the responders. Do you have any idea why the results are so poor and how I could improve this?


Thanks for your help and your ideas,

Corentin

adaptive-lasso snv biomarkers cv glmnet

Nobody has an idea?
