Question

DEsubs - Error in read.table during DEsubs command

0

Entering edit mode

Talip ▴ 10

@talip-zengin-14290

Last seen 3 months ago

Türkiye

Hi,

I am using DEsubs for subpathway search for my differentially expressed genes. I get error below when running the DEsubs command. I used this code:

if (.Platform[['OS.type']] == 'unix') { 
options('DEsubs_CACHE'=file.path(path.expand("~"), 'DEsubs') ) }

library("DEsubs")

GeneExp_Matrix <- get(load("GeneExp_Matrix.rda"))
sign_genes <- get(load("sign_genes.rda"))

myRankedList <- sign_genes[c('adj.P.Val')]

DEsubs.out <- DEsubs(org='hsa',
                     mRNAexpr=GeneExp_Matrix,
                     mRNAnomenclature='ensembl_gene_id',
                     pathways='All',
                     DEtool=NULL, DEpar=0.05,
                     CORtool='pearson', CORpar=0.6,
                     subpathwayType='community.walktrap',
                     rankedList=myRankedList,
                     verbose=TRUE)

Pruning nodes...Error in read.table(file, dec = ".", sep = " ") : 
'file' must be a character string or connection

I checked the file structures of mine and the results are below:

str(GeneExp_Matrix)
num [1:56830, 1:124] 10139 0 6725 1173 2273 ...
- attr(*, "dimnames")=List of 2
  ..$ : chr [1:56830] "ENSG00000000003" "ENSG00000000005" "ENSG00000000419" "ENSG00000000457" ...
  ..$ : chr [1:124] "TCGA-38-4625-01A-01R-1206-07" "TCGA-38-4625-11A-01R-1758-07" "TCGA-38-4626-01A-01R-1206-07" "TCGA-38-4626-11A-01R-1758-07" ...

str(myRankedList)
'data.frame':    7292 obs. of  1 variable:
$ adj.P.Val: num  5.40e-52 4.91e-51 4.91e-51 3.29e-50 2.76e-49 ...

I checked also the files in toy-data of the package as below:

load(system.file('extdata', 'data.RData', package='DEsubs'))
str(mRNAexpr)
 num [1:6880, 1:8] 0 114 1647 31 128 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:6880] "226" "229" "230" "217" ...
  ..$ : chr [1:8] "Sample1" "Sample2" "Sample3" "Sample4" ...

str(rankedList)
 Named num [1:5023] 0.84 0.129 0.915 0.958 0.442 ...
 - attr(*, "names")= chr [1:5023] "226" "229" "230" "217" ...

Note: The rownames of rankedList in toy-data is Entrez Gene ID but I want to use Ensembl Gene ID.

How can I fix this error?

Thanks for any help.

sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: i686-pc-linux-gnu (32-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=tr_TR.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=tr_TR.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=tr_TR.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=tr_TR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] DEsubs_1.4.0   locfit_1.5-9.1

loaded via a namespace (and not attached):
  [1] jsonlite_1.5               munsell_0.5.0              latticeExtra_0.6-28       
  [4] dplyr_0.7.6                DESeq_1.30.0               pillar_1.1.0              
  [7] pkgconfig_2.0.1            compiler_3.4.3             DBI_0.7                   
 [10] lazyeval_0.2.1             pkgmaker_0.27              tibble_1.4.2              
 [13] R6_2.2.2                   yaml_2.2.0                 graph_1.56.0              
 [16] BiocGenerics_0.24.0        rngtools_1.3.1             data.table_1.11.4         
 [19] annotate_1.56.2            xtable_1.8-2               gdata_2.18.0              
 [22] circlize_0.4.4             tools_3.4.3                stringr_1.3.1             
 [25] shape_1.4.4                bibtex_0.4.2               AnnotationDbi_1.40.0      
 [28] EBSeq_1.18.0               Biobase_2.38.0             registry_0.5              
 [31] foreach_1.4.4              digest_0.6.15              KernSmooth_2.23-15        
 [34] GenomeInfoDb_1.14.0        codetools_0.2-15           withr_2.1.2               
 [37] stats4_3.4.3               devtools_1.13.6            base64enc_0.1-3           
 [40] tidyselect_0.2.4           gplots_3.0.1               genefilter_1.60.0         
 [43] pheatmap_1.0.10            testthat_2.0.0             scales_1.0.0              
 [46] memoise_1.1.0              stringi_1.2.4              doRNG_1.7.1               
 [49] limma_3.34.9               assertthat_0.2.0           GlobalOptions_0.1.0       
 [52] purrr_0.2.5                GenomeInfoDbData_1.0.0     lattice_0.20-35           
 [55] bit64_0.9-7                Rcpp_0.12.18               caTools_1.17.1.1          
 [58] bindrcpp_0.2.2             Hmisc_4.1-1                Formula_1.2-3             
 [61] ggplot2_3.0.0              htmlTable_1.12             grid_3.4.3                
 [64] blob_1.1.0                 GenomicRanges_1.30.3       plyr_1.8.4                
 [67] survival_2.41-3            RBGL_1.54.0                edgeR_3.20.9              
 [70] DelayedArray_0.4.1         acepack_1.4.1              matrixStats_0.54.0        
 [73] rpart_4.1-12               NBPSeq_0.3.0               glue_1.3.0                
 [76] magrittr_1.5               igraph_1.2.2               S4Vectors_0.16.0          
 [79] blockmodeling_0.3.1        SummarizedExperiment_1.8.1 gridExtra_2.3             
 [82] samr_2.0                   htmlwidgets_1.2            checkmate_1.8.5           
 [85] parallel_3.4.3             htmltools_0.3.6            RSQLite_2.0               
 [88] rstudioapi_0.7             knitr_1.20                 nnet_7.3-12               
 [91] gtable_0.2.0               zlibbioc_1.24.0            colorspace_1.3-2          
 [94] geneplotter_1.56.0         cluster_2.0.6              gtools_3.8.1              
 [97] XVector_0.18.0             RCurl_1.95-4.11            DESeq2_1.18.1             
[100] bitops_1.0-6               RColorBrewer_1.1-2         Matrix_1.2-11             
[103] foreign_0.8-69             bit_1.1-12                 doParallel_1.0.11         
[106] IRanges_2.12.0             bindr_0.1.1                reshape2_1.4.3            
[109] XML_3.98-1.9               iterators_1.0.9            splines_3.4.3             
[112] rlang_0.2.2                BiocParallel_1.12.0        backports_1.1.2           
[115] qvalue_2.10.0

DEsubs error • 1.6k views

ADD COMMENT • link updated 6.6 years ago by Panos Balomenos • 0 • written 6.6 years ago by Talip ▴ 10

score 0 · Answer 1 · 2018-08-25

0

Entering edit mode

Panos Balomenos • 0

@panos-balomenos-9147

Last seen 5.1 years ago

Greece/University of Patras

Hi Talip,

The error seems to have originated from passing a data frame as a ranked list rather than a named vector.

I was able to reproduce your error with the package's example as follows:

load(system.file('extdata', 'data.RData', package='DEsubs')) 
DEsubs.run <- DEsubs(
    org='hsa',
    mRNAexpr=mRNAexpr,
    mRNAnomenclature='entrezgene',
    pathways='All',
    DEtool=NULL, DEpar=0.05,
    CORtool='pearson', CORpar=0.7,
    subpathwayType='community.walktrap',
    rankedList=data.frame('adj.P.Val'=rankedList),
    verbose=TRUE)

Pruning nodes...Error in read.table(file, dec = ".", sep = " ") : 
'file' must be a character string or connection

So, in your case, you just have to replace

myRankedList <- sign_genes[c('adj.P.Val')]

with

myRankedList <- setNames(sign_genes[, 'adj.P.Val'], rownames(sign_genes))

, which converts the single column of the data frame to a named vector, assuming the identifiers are stored as rownames.

Best regards,

Panos

ADD COMMENT • link 6.6 years ago Panos Balomenos • 0

0

Entering edit mode

Hi Panos,

Thanks very much for your reply. I changed my code as you advised, it works now. But the DEsubs.out object does not have subnetworks as result. The structure is as below:

> str(myRankedList)
 Named num [1:7292] 5.40e-52 4.91e-51 4.91e-51 3.29e-50 2.76e-49 ...
 - attr(*, "names")= chr [1:7292] "ENSG00000224215" "ENSG00000198358" "ENSG00000204305" "ENSG00000248551" ...

> DEsubs.out <- DEsubs(org='hsa',
+                      mRNAexpr=GeneExp_Matrix,
+                      mRNAnomenclature='ensembl_gene_id',
+                      pathways='All',
+                      DEtool=NULL,
+                      DEpar=0.05,
+                      CORtool='pearson',
+                      CORpar=0.7,
+                      subpathwayType='community.walktrap',
+                      rankedList=myRankedList,
+                      verbose=TRUE)
Pruning nodes...done.
Pruning edges...done.
Performing subpathway analysis ( community.walktrap )...done.

> str(DEsubs.out)
List of 4
 $ org             : chr "hsa"
 $ mRNAnomenclature: chr "ensembl_gene_id"
 $ edgeList        : logi[0 , 1:3]
 $ DEgenes         : Named num(0)
  ..- attr(*, "names")= chr(0)

Is the problem depending on using of Ensembl Gene IDs? Should I use Entrez Gene IDs?

ADD REPLY • link 6.6 years ago Talip ▴ 10

0

Entering edit mode

Hi,

I tried again with entrez gene IDs as below, but subnetworks are not generated.

> str(myRankedList)
Named num [1:4412] 4.91e-51 2.76e-49 3.30e-49 2.43e-48 5.59e-48 ...
- attr(*, "names")= chr [1:4412] "177" "857" "6445" "142683" ...

> DEsubs.out <- DEsubs(org='hsa',
+                      mRNAexpr=GeneExp_Matrix,
+                      mRNAnomenclature='entrezgene',
+                      pathways='All',
+                      DEtool=NULL,
+                      DEpar=0.05,
+                      CORtool='pearson',
+                      CORpar=0.7,
+                      subpathwayType='community.walktrap',
+                      rankedList=myRankedList,
+                      verbose=TRUE)
Pruning nodes...done.
Pruning edges...done.
Performing subpathway analysis ( community.walktrap )...done.

> str(DEsubs.out)
List of 4
$ org             : chr "hsa"
$ mRNAnomenclature: chr "entrezgene"
$ edgeList        : logi[0 , 1:3]
$ DEgenes         : Named num(0)
  ..- attr(*, "names")= chr(0)

Then I tried with DEtool option with voom+limma and edgeR, there was problems for voom+limma but edgeR option worked.

> DEsubs.out <- DEsubs(org='hsa',
+                      mRNAexpr=GeneExp_Matrix,
+                      mRNAnomenclature='ensembl_gene_id',
+                      pathways='All',
+                      DEtool='voom+limma',
+                      DEpar=0.05,
+                      CORtool='pearson',
+                      CORpar=0.7,
+                      subpathwayType='community.walktrap',
+                      rankedList=NULL,
+                      verbose=TRUE)
Pruning nodes...done.
Pruning edges...Error in if (type != 3 && CR[i] * reg > thr) { :
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In cor(exprEdgelistSource[i, ], exprEdgelistDestin[i, ], method = CORtool) :
  the standard deviation is zero
2: In cor(exprEdgelistSource[i, ], exprEdgelistDestin[i, ], method = CORtool) :
  the standard deviation is zero
3: In cor(exprEdgelistSource[i, ], exprEdgelistDestin[i, ], method = CORtool) :
  the standard deviation is zero
4: In cor(exprEdgelistSource[i, ], exprEdgelistDestin[i, ], method = CORtool) :
  the standard deviation is zero
5: In cor(exprEdgelistSource[i, ], exprEdgelistDestin[i, ], method = CORtool) :
  the standard deviation is zero
6: In cor(exprEdgelistSource[i, ], exprEdgelistDestin[i, ], method = CORtool) :
  the standard deviation is zero

> DEsubs.out <- DEsubs(org='hsa',
+                      mRNAexpr=GeneExp_Matrix,
+                      mRNAnomenclature='ensembl_gene_id',
+                      pathways='All',
+                      DEtool='edgeR',
+                      DEpar=0.05,
+                      CORtool='pearson',
+                      CORpar=0.7,
+                      subpathwayType='community.walktrap',
+                      rankedList=NULL,
+                      verbose=TRUE)
Pruning nodes...done.
Pruning edges...done.
Performing subpathway analysis ( community.walktrap )...done.

> str(DEsubs.out)
List of 5
 $ org                           : chr "hsa"
 $ mRNAnomenclature              : chr "ensembl_gene_id"
 $ edgeList                      :'data.frame':    9 obs. of  5 variables:
  ..$ E1      : chr [1:9] "51806" "8674" "1437" "3567" ...
  ..$ E2      : chr [1:9] "816" "8773" "3559" "149233" ...
  ..$ edgeType: num [1:9] 1 1 1 1 1 1 1 1 1
  ..$ pathway : chr [1:9] "04114" "04130" "04630" "04630" ...
  ..$ corr    : num [1:9] 0.987 0.787 0.776 0.753 0.751 ...
 $ DEgenes                       : Named num [1:17] 4.47e-06 2.53e-02 1.28e-02 8.43e-11 2.13e-06 ...
  ..- attr(*, "names")= chr [1:17] "816" "8773" "5294" "2798" ...
 $ subAnalysis.community.walktrap:List of 8
  ..$ sub1: chr [1:3] "940" "5294" "2322"
  ..$ sub2: chr [1:2] "3567" "149233"
  ..$ sub3: chr [1:2] "1437" "3559"
  ..$ sub4: chr [1:2] "51806" "816"
  ..$ sub5: chr [1:2] "114548" "3553"
  ..$ sub6: chr [1:2] "8674" "8773"
  ..$ sub7: chr [1:2] "2796" "2798"
  ..$ sub8: chr [1:2] "405753" "50506"

But I used voom+limma separately and I want to give significant genes with adjusted p values and Ensembl IDs to DEsubs. How can I fix this problem?

ADD REPLY • link 6.6 years ago Talip ▴ 10

score 0 · Answer 2 · 2018-08-31

0

Entering edit mode

Panos Balomenos • 0

@panos-balomenos-9147

Last seen 5.1 years ago

Greece/University of Patras

Hi Talip,

If you update to the latest version 1.6.3, you can supply the adjusted P-values with any type of identifier (from the ones specified in the manual). Give it a couple of days until the necessary changes have been propagated. Meanwhile, you can simply convert the identifiers to Entrez Ids manually:

if ( package.version('DEsubs') < '1.6.3' )
{
    require('org.Hs.eg.db') 
    geneInfo <- AnnotationDbi::select( 
        x=get('org.Hs.eg.db'), keys=names(myRankedList), keytype='ENSEMBL', columns='ENTREZID' )
    map <- setNames( geneInfo[, 'ENTREZID'], geneInfo[, 'ENSEMBL'] )
    names(myRankedList) <- map[names(myRankedList)]   
}

ADD COMMENT • link 6.6 years ago Panos Balomenos • 0

0

Entering edit mode

Thanks for your interest. I am waiting for new version.

ADD REPLY • link 6.6 years ago Talip ▴ 10

0

Entering edit mode

I have tried new version and it works now. Thanks very much for your interest.

ADD REPLY • link 6.5 years ago Talip ▴ 10