bootstrapNullAlternativeModel() returns error
0
0
Entering edit mode
Tobias ▴ 40
@tobias-24288
Last seen 15 months ago
Switzerland

Hi TPP2D package maintainers,

I am trying to analyse data from a 2D-TPP experiment. Unfortunately, the following step returns an error:

> null_model_B2 <- bootstrapNullAlternativeModel(df = preproc_df, params_df = model_params_df, B = 2)
[1] "Warning: You have specificed B < 20, it is recommended to use at least B = 20 in order to obtain reliable results."
  |===================================================================================================| 100%

Error: BiocParallel errors
  1 remote errors, element index: 6885
  0 unevaluated and other errors
  first remote error: subscript out of bounds
In addition: Warning message:
In max(nObs) : no non-missing arguments to max; returning -Inf

The vignette code runs without problems (using the package example data). Here is my session info:

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     L'Ecuyer-CMRG
 Normal:  Inversion
 Sample:  Rejection

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TPP2D_1.10.0 dplyr_1.0.8  readr_2.1.2

loaded via a namespace (and not attached):
 [1] zip_2.2.0           Rcpp_1.0.8.3        pillar_1.7.0        compiler_4.1.2      bitops_1.0-7       
 [6] iterators_1.0.14    tools_4.1.2         bit_4.0.4           lifecycle_1.0.1     tibble_3.1.6       
[11] gtable_0.3.0        pkgconfig_2.0.3     rlang_1.0.2         openxlsx_4.2.5      foreach_1.5.2      
[16] rstudioapi_0.13     DBI_1.1.2           cli_3.2.0           parallel_4.1.2      withr_2.5.0        
[21] stringr_1.4.0       generics_0.1.2      vctrs_0.4.0         hms_1.1.1           bit64_4.0.5        
[26] grid_4.1.2          tidyselect_1.1.2    glue_1.6.2          R6_2.5.1            fansi_1.0.3        
[31] BiocParallel_1.28.3 vroom_1.5.7         limma_3.50.1        tidyr_1.2.0         tzdb_0.3.0         
[36] ggplot2_3.3.5       purrr_0.3.4         magrittr_2.0.3      scales_1.1.1        codetools_0.2-18   
[41] ellipsis_0.3.2      MASS_7.3-56         assertthat_0.2.1    colorspace_2.0-3    utf8_1.2.2         
[46] stringi_1.7.6       RCurl_1.98-1.6      munsell_0.5.0       doParallel_1.0.17   crayon_1.5.1
TPP2D • 745 views
ADD COMMENT
0
Entering edit mode

The input data to the function looks like:

> preproc_df
# A tibble: 314,317 × 16
   representative `Total Peptides` `Total Spectral Co…` clustername temperature experiment label RefCol  conc
   <chr>                     <dbl>                <dbl> <chr>             <dbl> <chr>      <chr> <chr>  <dbl>
 1 GATD3B                       13                   31 GATD3B               37 T1_2       126   128C       5
 2 NBDY                          3                    6 NBDY                 37 T1_2       126   128C       5
 3 UBA6                         31                   52 UBA6                 37 T1_2       126   128C       5
 4 ESYT2                        22                   45 ESYT2                37 T1_2       126   128C       5
 5 UHRF1BP1L                    21                   26 UHRF1BP1L            37 T1_2       126   128C       5
 6 SHTN1                        48                   95 SHTN1                37 T1_2       126   128C       5
 7 TMEM223                       4                    6 TMEM223              37 T1_2       126   128C       5
 8 SLC22A23                      4                    4 SLC22A23             37 T1_2       126   128C       5
 9 MEX3A                         3                    3 MEX3A                37 T1_2       126   128C       5
10 ILVBL                        18                   44 ILVBL                37 T1_2       126   128C       5
# … with 314,307 more rows, and 7 more variables: raw_value <dbl>, raw_rel_value <dbl>, log_conc <dbl>,
#   rel_value <dbl>, value <dbl>, log2_value <dbl>, protein_id <chr>
> model_params_df
# A tibble: 6,885 × 19
   representative clustername  nObs min_qupm max_qupm nCoeffsH0 rssH0 parH0  estimateH0 residualsH0 nCoeffsH1
   <chr>          <chr>       <int>    <dbl>    <dbl>     <int> <dbl> <list> <list>     <list>          <int>
 1 A2M            A2M            50      Inf     -Inf        10 2.66  <dbl>  <dbl [50]> <dbl [50]>         24
 2 A2ML1          A2ML1          20      Inf     -Inf         4 5.06  <dbl>  <dbl [20]> <dbl [20]>         12
 3 AAAS           AAAS           40      Inf     -Inf         8 4.27  <dbl>  <dbl [40]> <dbl [40]>         20
 4 AACS           AACS           50      Inf     -Inf        10 0.427 <dbl>  <dbl [50]> <dbl [50]>         24
 5 AADAT          AADAT          40      Inf     -Inf         8 0.172 <dbl>  <dbl [40]> <dbl [40]>         20
 6 AAGAB          AAGAB          50      Inf     -Inf        10 0.416 <dbl>  <dbl [50]> <dbl [50]>         24
 7 AAK1           AAK1           50      Inf     -Inf        10 3.32  <dbl>  <dbl [50]> <dbl [50]>         24
 8 AAMDC          AAMDC          50      Inf     -Inf        10 0.656 <dbl>  <dbl [50]> <dbl [50]>         24
 9 AAMP           AAMP           50      Inf     -Inf        10 0.324 <dbl>  <dbl [50]> <dbl [50]>         24
10 AAR2           AAR2           50      Inf     -Inf        10 0.489 <dbl>  <dbl [50]> <dbl [50]>         24
# … with 6,875 more rows, and 8 more variables: rssH1 <dbl>, parH1 <list>, estimateH1 <list>,
#   residualsH1 <list>, pEC50H1 <dbl>, slopeH1 <dbl>, pEC50_slopeH1 <dbl>, detected_effectH1 <chr>

Since the error message mentions index 6885, I guess the problematic entry is:

> model_params_df[6885,]
# A tibble: 1 × 19
  representative clustername  nObs min_qupm max_qupm nCoeffsH0 rssH0 parH0  estimateH0  residualsH0 nCoeffsH1
  <chr>          <chr>       <int>    <dbl>    <dbl>     <int> <dbl> <list> <list>      <list>          <int>
1 NA             NA            130      Inf     -Inf        10  5.27 <dbl>  <dbl [130]> <dbl [130]>        24
# … with 8 more variables: rssH1 <dbl>, parH1 <list>, estimateH1 <list>, residualsH1 <list>, pEC50H1 <dbl>,
#   slopeH1 <dbl>, pEC50_slopeH1 <dbl>, detected_effectH1 <chr>

The model does not have a representative or clustername. The corresponding entries in the preproc_df:

> preproc_df[is.na(preproc_df$clustername),]
# A tibble: 130 × 16
   representative `Total Peptides` `Total Spectral Co…` clustername temperature experiment label RefCol  conc
   <chr>                     <dbl>                <dbl> <chr>             <dbl> <chr>      <chr> <chr>  <dbl>
 1 NA                           11                   35 NA                   37 T1_2       126   128C    5   
 2 NA                            8                    9 NA                   37 T1_2       126   128C    5   
 3 NA                           11                   35 NA                   37 T1_2       127N  128C    1   
 4 NA                            8                    9 NA                   37 T1_2       127N  128C    1   
 5 NA                           11                   35 NA                   37 T1_2       127C  128C    0.1 
 6 NA                            8                    9 NA                   37 T1_2       127C  128C    0.1 
 7 NA                           11                   35 NA                   37 T1_2       128N  128C    0.01
 8 NA                            8                    9 NA                   37 T1_2       128N  128C    0.01
 9 NA                           11                   35 NA                   37 T1_2       128C  128C    0   
10 NA                            8                    9 NA                   37 T1_2       128C  128C    0   
# … with 120 more rows, and 7 more variables: raw_value <dbl>, raw_rel_value <dbl>, log_conc <dbl>,
#   rel_value <dbl>, value <dbl>, log2_value <dbl>, protein_id <chr>
ADD REPLY
0
Entering edit mode

Interestingly, I can not really explain how this model 6885 was generated. preproc_df is based on the output of import2Dataset() named import_dfand that contains only one representative having missing clustername:

> import_df %>% filter(is.na(clustername))
# A tibble: 130 × 13
   representative     `Total Peptides` `Total Spectra…` clustername temperature experiment label RefCol  conc
   <chr>                         <dbl>            <dbl> <chr>             <dbl> <chr>      <chr> <chr>  <dbl>
 1 A6NIZ1|Q6ZSR9|Q86…               11               35 NA                   37 T1_2       126   128C    5   
 2 A6NIZ1|Q6ZSR9|Q86…                8                9 NA                   37 T1_2       126   128C    5   
 3 A6NIZ1|Q6ZSR9|Q86…               11               35 NA                   37 T1_2       127N  128C    1   
 4 A6NIZ1|Q6ZSR9|Q86…                8                9 NA                   37 T1_2       127N  128C    1   
 5 A6NIZ1|Q6ZSR9|Q86…               11               35 NA                   37 T1_2       127C  128C    0.1 
 6 A6NIZ1|Q6ZSR9|Q86…                8                9 NA                   37 T1_2       127C  128C    0.1 
 7 A6NIZ1|Q6ZSR9|Q86…               11               35 NA                   37 T1_2       128N  128C    0.01
 8 A6NIZ1|Q6ZSR9|Q86…                8                9 NA                   37 T1_2       128N  128C    0.01
 9 A6NIZ1|Q6ZSR9|Q86…               11               35 NA                   37 T1_2       128C  128C    0   
10 A6NIZ1|Q6ZSR9|Q86…                8                9 NA                   37 T1_2       128C  128C    0   
# … with 120 more rows, and 4 more variables: raw_value <dbl>, raw_rel_value <dbl>, log_conc <dbl>,
#   rel_value <dbl>
> unique(import_df[is.na(import_df$clustername),]$representative)
[1] "A6NIZ1|Q6ZSR9|Q86TA4"

All other cases are complete:

> summary(complete.cases(import_df))
   Mode   FALSE    TRUE 
logical     130  314227
ADD REPLY

Login before adding your answer.

Traffic: 797 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6