Hello,
I'm using switchBox to find top scoring pairs (TSPs) of genes whose expression predicts a diagnosis of prostate cancer in African American and European American males, based on this paper and its associated dataset:
Paper: https://www.ncbi.nlm.nih.gov/pubmed/18245496
Dataset: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE6956
I'm using the GEOquery package to load the data into R. After separating the ExpressionSet into African Americans and European Americans, I ran switchBox (as well as ktsp) on the African American expression set. With switchBox, the TSP scores for all five pairs came out slightly greater than 1 (about 1.000002 to 1.000003), while with the ktsp package I get a TSP score of 1 for all pairs. Since TSP scores can't be greater than 1, I'm confused about what I'm doing wrong here. Any help is appreciated. Thanks.
Here is the code I used to compute the scores:
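# Load the Bioconductor installer script and the packages
source("https://bioconductor.org/biocLite.R")
library(GEOquery)
library(switchBox)
# Download the series matrix; getGEO returns a list of ExpressionSets
datasets <- getGEO("GSE6956", GSEMatrix = TRUE)
gene_data <- datasets[[1]]
# Keep only the African American samples
AA_eset <- gene_data[, gene_data[["characteristics_ch1"]] == "race: African American"]
# Encode the sample source as 0/1 class labels
# (wrapped in factor() so this also works if the column is character)
AA_label <- as.numeric(factor(AA_eset[["source_name_ch1"]])) - 1
# Train the k-TSP classifier on the African American subset
classifier_AA <- SWAP.KTSP.Train(exprs(AA_eset), phenoGroup = factor(AA_label))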
Here are the results that I got:
$TSPs
     [,1]          [,2]
[1,] "220725_x_at" "208316_s_at"
[2,] "219024_at"   "213977_s_at"
[3,] "210479_s_at" "214755_at"
[4,] "207516_at"   "215212_at"
[5,] "37170_at"    "211815_s_at"
$score
[1] 1.000003 1.000003 1.000003 1.000002 1.000002
$labels
[1] "0" "1"
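
For reference, this is how I understand the basic TSP score, written as a minimal base-R sketch on made-up toy data. The `tsp_score` helper and the toy matrix are my own illustration and are not part of switchBox or the ktsp package, so this may not match what either package computes internally. Because the score is the absolute difference of two proportions, I would expect it never to exceed 1, which is why the values above surprise me.

```r
# Toy data, made up for illustration: 2 genes x 6 samples.
expr <- rbind(
  geneA = c(1, 2, 1, 9, 8, 9),
  geneB = c(5, 6, 7, 3, 2, 4)
)
group <- factor(c(0, 0, 0, 1, 1, 1))

# Hypothetical helper: basic TSP score for the gene pair (i, j),
# |P(X_i < X_j | first class) - P(X_i < X_j | second class)|
tsp_score <- function(x, i, j, g) {
  less <- x[i, ] < x[j, ]                  # per-sample ordering of the pair
  p1 <- mean(less[g == levels(g)[1]])      # proportion in the first class
  p2 <- mean(less[g == levels(g)[2]])      # proportion in the second class
  abs(p1 - p2)
}

score <- tsp_score(expr, "geneA", "geneB", group)
print(score)
```

In this toy case the ordering of the two genes flips completely between the classes, so the pair separates them perfectly and the score reaches its upper bound.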