DiffBind dba.count error "Some read files could not be accessed" on Windows 11
Theo ▴ 10
@theodoregeorgomanolis-7993
Germany

I got the following error while running dba.count on my initial DBA object:

> tamoxifen <- dba.count(tamoxifen) 
Computing summits...
Error: Some read files could not be accessed. See warnings for details.
In addition: There were 12 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201074_S18_L000/A006200317_201074_S18_L000.bam not accessible
2: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201076_S19_L000/A006200317_201076_S19_L000.bam not accessible
3: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201078_S20_L000/A006200317_201078_S20_L000.bam not accessible
4: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201080_S21_L000/A006200317_201080_S21_L000.bam not accessible
5: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201082_S22_L000/A006200317_201082_S22_L000.bam not accessible
6: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201084_S23_L000/A006200317_201084_S23_L000.bam not accessible
7: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201086_S24_L000/A006200317_201086_S24_L000.bam not accessible
8: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201088_S25_L000/A006200317_201088_S25_L000.bam not accessible
9: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201090_S26_L000/A006200317_201090_S26_L000.bam not accessible
10: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201092_S27_L000/A006200317_201092_S27_L000.bam not accessible
11: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201094_S28_L000/A006200317_201094_S28_L000.bam not accessible
12: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201096_S29_L000/A006200317_201096_S29_L000.bam not accessible

The files are there, and file.exists() returns TRUE for all of them. Here is the complete code.

First I read the CSV file. I used sep = "\t" because the file is tab-delimited:

> samples <- read.csv("diffbind.csv", sep = "\t")
> print(samples)
   SampleID   Factor Condition Replicate
1    201074 H3K36me3   control         1
2    201076 H3K36me3   control         2
3    201078 H3K36me3   control         3
4    201080 H3K36me3       AA5         1
5    201082 H3K36me3       AA5         2
6    201084 H3K36me3       AA5         3
7    201086  H4K16ac   control         1
8    201088  H4K16ac   control         2
9    201090  H4K16ac   control         3
10   201092  H4K16ac       AA5         1
11   201094  H4K16ac       AA5         2
12   201096  H4K16ac       AA5         3
                                                                                                                 bamReads
1  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201074_S18_L000/A006200317_201074_S18_L000.bam
2  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201076_S19_L000/A006200317_201076_S19_L000.bam
3  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201078_S20_L000/A006200317_201078_S20_L000.bam
4  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201080_S21_L000/A006200317_201080_S21_L000.bam
5  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201082_S22_L000/A006200317_201082_S22_L000.bam
6  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201084_S23_L000/A006200317_201084_S23_L000.bam
7  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201086_S24_L000/A006200317_201086_S24_L000.bam
8  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201088_S25_L000/A006200317_201088_S25_L000.bam
9  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201090_S26_L000/A006200317_201090_S26_L000.bam
10 Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201092_S27_L000/A006200317_201092_S27_L000.bam
11 Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201094_S28_L000/A006200317_201094_S28_L000.bam
12 Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201096_S29_L000/A006200317_201096_S29_L000.bam
                                                                                                                                                           Peaks
1  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201074_S18_L000/peakcalling/seacr/A006200317_201074_S18_L000_treat.stringent.sort.bed
2  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201076_S19_L000/peakcalling/seacr/A006200317_201076_S19_L000_treat.stringent.sort.bed
3  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201078_S20_L000/peakcalling/seacr/A006200317_201078_S20_L000_treat.stringent.sort.bed
4  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201080_S21_L000/peakcalling/seacr/A006200317_201080_S21_L000_treat.stringent.sort.bed
5  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201082_S22_L000/peakcalling/seacr/A006200317_201082_S22_L000_treat.stringent.sort.bed
6  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201084_S23_L000/peakcalling/seacr/A006200317_201084_S23_L000_treat.stringent.sort.bed
7  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201086_S24_L000/peakcalling/seacr/A006200317_201086_S24_L000_treat.stringent.sort.bed
8  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201088_S25_L000/peakcalling/seacr/A006200317_201088_S25_L000_treat.stringent.sort.bed
9  Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201090_S26_L000/peakcalling/seacr/A006200317_201090_S26_L000_treat.stringent.sort.bed
10 Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201092_S27_L000/peakcalling/seacr/A006200317_201092_S27_L000_treat.stringent.sort.bed
11 Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201094_S28_L000/peakcalling/seacr/A006200317_201094_S28_L000_treat.stringent.sort.bed
12 Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201096_S29_L000/peakcalling/seacr/A006200317_201096_S29_L000_treat.stringent.sort.bed
   PeakCaller scorecol
1         bed        3
2         bed        3
3         bed        3
4         bed        3
5         bed        3
6         bed        3
7         bed        3
8         bed        3
9         bed        3
10        bed        3
11        bed        3
12        bed        3

So the sample sheet looks good. Let's create the DBA object now:

> tamoxifen <- dba(sampleSheet=samples)
201074  H3K36me3 control  1 bed
201076  H3K36me3 control  2 bed
201078  H3K36me3 control  3 bed
201080  H3K36me3 AA5  1 bed
201082  H3K36me3 AA5  2 bed
201084  H3K36me3 AA5  3 bed
201086  H4K16ac control  1 bed
201088  H4K16ac control  2 bed
201090  H4K16ac control  3 bed
201092  H4K16ac AA5  1 bed
201094  H4K16ac AA5  2 bed
201096  H4K16ac AA5  3 bed
> tamoxifen
12 Samples, 3504 sites in matrix (5003 total):
       ID   Factor Condition Replicate Intervals
1  201074 H3K36me3   control         1      3117
2  201076 H3K36me3   control         2      3042
3  201078 H3K36me3   control         3      2888
4  201080 H3K36me3       AA5         1      3245
5  201082 H3K36me3       AA5         2      3014
6  201084 H3K36me3       AA5         3      3278
7  201086  H4K16ac   control         1       114
8  201088  H4K16ac   control         2       117
9  201090  H4K16ac   control         3       171
10 201092  H4K16ac       AA5         1       340
11 201094  H4K16ac       AA5         2       282
12 201096  H4K16ac       AA5         3       249

This also looks good. Next, I checked whether R on Windows 11 can actually find the files:

> file.exists(path = samples$bamReads)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> file.exists(path = samples$Peaks)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

So everything looks ready for the count:

> tamoxifen <- dba.count(tamoxifen) 
Computing summits...
Error: Some read files could not be accessed. See warnings for details.
In addition: There were 12 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201074_S18_L000/A006200317_201074_S18_L000.bam not accessible
2: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201076_S19_L000/A006200317_201076_S19_L000.bam not accessible
3: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201078_S20_L000/A006200317_201078_S20_L000.bam not accessible
4: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201080_S21_L000/A006200317_201080_S21_L000.bam not accessible
5: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201082_S22_L000/A006200317_201082_S22_L000.bam not accessible
6: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201084_S23_L000/A006200317_201084_S23_L000.bam not accessible
7: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201086_S24_L000/A006200317_201086_S24_L000.bam not accessible
8: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201088_S25_L000/A006200317_201088_S25_L000.bam not accessible
9: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201090_S26_L000/A006200317_201090_S26_L000.bam not accessible
10: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201092_S27_L000/A006200317_201092_S27_L000.bam not accessible
11: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201094_S28_L000/A006200317_201094_S28_L000.bam not accessible
12: Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201096_S29_L000/A006200317_201096_S29_L000.bam not accessible

I have no idea how to solve this. Any clues? Session info:

> sessionInfo()
R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.utf8  LC_CTYPE=English_United Kingdom.utf8    LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.utf8    

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] DiffBind_3.8.4              SummarizedExperiment_1.28.0 Biobase_2.58.0              MatrixGenerics_1.10.0      
 [5] matrixStats_0.63.0          GenomicRanges_1.50.2        GenomeInfoDb_1.34.9         IRanges_2.32.0             
 [9] S4Vectors_0.36.2            BiocGenerics_0.44.0        

loaded via a namespace (and not attached):
 [1] bitops_1.0-7             RColorBrewer_1.1-3       numDeriv_2016.8-1.1      tools_4.2.1              utf8_1.2.3              
 [6] R6_2.5.1                 irlba_2.3.5.1            KernSmooth_2.23-20       DBI_1.1.3                colorspace_2.1-0        
[11] apeglm_1.20.0            tidyselect_1.2.0         compiler_4.2.1           cli_3.6.0                DelayedArray_0.24.0     
[16] rtracklayer_1.58.0       caTools_1.18.2           scales_1.2.1             SQUAREM_2021.1           mvtnorm_1.1-3           
[21] mixsqp_0.3-48            stringr_1.5.0            digest_0.6.31            Rsamtools_2.14.0         XVector_0.38.0          
[26] jpeg_0.1-10              pkgconfig_2.0.3          htmltools_0.5.4          fastmap_1.1.1            invgamma_1.1            
[31] bbmle_1.0.25             limma_3.54.2             BSgenome_1.66.3          htmlwidgets_1.6.2        rlang_1.1.0             
[36] rstudioapi_0.14          BiocIO_1.8.0             generics_0.1.3           hwriter_1.3.2.1          BiocParallel_1.32.6     
[41] gtools_3.9.4             dplyr_1.1.2              RCurl_1.98-1.12          magrittr_2.0.3           GenomeInfoDbData_1.2.9  
[46] interp_1.1-4             Matrix_1.5-3             Rcpp_1.0.10              munsell_0.5.0            fansi_1.0.4             
[51] lifecycle_1.0.3          stringi_1.7.12           yaml_2.3.7               MASS_7.3-58.3            zlibbioc_1.44.0         
[56] gplots_3.1.3             plyr_1.8.8               grid_4.2.1               parallel_4.2.1           ggrepel_0.9.3           
[61] bdsmatrix_1.3-6          crayon_1.5.2             deldir_1.0-9             lattice_0.21-8           Biostrings_2.66.0       
[66] locfit_1.5-9.7           pillar_1.9.0             rjson_0.2.21             systemPipeR_2.4.0        codetools_0.2-18        
[71] XML_3.99-0.14            glue_1.6.2               ShortRead_1.56.1         GreyListChIP_1.30.0      latticeExtra_0.6-30     
[76] BiocManager_1.30.20      png_0.1-8                vctrs_0.6.0              gtable_0.3.3             amap_0.8-19             
[81] ashr_2.2-54              ggplot2_3.4.2            emdbook_1.3.12           restfulr_0.0.15          coda_0.19-4             
[86] truncnorm_1.0-9          tibble_3.2.1             GenomicAlignments_1.34.1
>
@james-w-macdonald-5106
United States

What do you get when you do

file.access(samples$bamReads, 4)

If they are all -1, then you don't have permission to read the files.
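
To see existence and read permission side by side for every file, a minimal sketch along these lines may help. `check_read_access` is a hypothetical helper (not part of DiffBind), and the demo uses a temporary placeholder file rather than the real BAMs:

```r
# Compare "does the path exist" with "can R read it" for each path.
# file.access() returns 0 on success and -1 on failure; mode = 4 tests read access.
check_read_access <- function(paths) {
  data.frame(
    path     = paths,
    exists   = file.exists(paths),
    readable = unname(file.access(paths, mode = 4) == 0),
    stringsAsFactors = FALSE
  )
}

# Demo: one temporary file that exists, one path that does not.
tmp <- tempfile(fileext = ".bam")
writeLines("placeholder", tmp)
res <- check_read_access(c(tmp, file.path(tempdir(), "missing.bam")))
print(res)
```

On the real sample sheet this would be called as `check_read_access(samples$bamReads)`. Rows where `exists` is TRUE but `readable` is FALSE are the ones to look at.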


You hit the spot. But this is a weird issue. In the Windows file manager, my username is shown as the owner. In WSL2, the owner is root and the group is also root, although the permissions are read and write for all. Interestingly, the peak files in the same directory can be accessed without issues, but file.access() gives me -1 for the BAM files:

```R
file.access(samples$bamReads, 4)
Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201074_S18_L000/A006200317_201074_S18_L000.bam
-1
```

```R
file.access(samples$Peaks, 4)
Z:/Shared folder/Shared data/DNA/NGS_AP01_cfrezza_A006200317/A006200317_201074_S18_L000/peakcalling/seacr/A006200317_201074_S18_L000_treat.stringent.sort.bed
```

The peak files were read without problems:

> tamoxifen <- dba(sampleSheet=samples)
201074  H3K36me3 control  1 bed
201076  H3K36me3 control  2 bed
201078  H3K36me3 control  3 bed
201080  H3K36me3 AA5  1 bed
201082  H3K36me3 AA5  2 bed
201084  H3K36me3 AA5  3 bed
201086  H4K16ac control  1 bed
201088  H4K16ac control  2 bed
201090  H4K16ac control  3 bed
201092  H4K16ac AA5  1 bed
201094  H4K16ac AA5  2 bed
201096  H4K16ac AA5  3 bed
> class(tamoxifen$peaks)
[1] "list"
> summary(tamoxifen$peaks[[1]])
     Chr                Start                End                Score          
 Length:3117        Min.   :    21811   Min.   :    23550   Min.   :0.0004991  
 Class :character   1st Qu.: 34277443   1st Qu.: 34282284   1st Qu.:0.0007487  
 Mode  :character   Median : 75446496   Median : 75451966   Median :0.0007487  
                    Mean   : 75166513   Mean   : 75175177   Mean   :0.0013513  
                    3rd Qu.:110799411   3rd Qu.:110802399   3rd Qu.:0.0009983  
                    Max.   :189858415   Max.   :189866932   Max.   :1.0000000

That is weird. No idea why the bed file would be accessible but not the bam. I assume Z: is a Samba share? Over the years I have seen people having problems reading files off a Samba share on Windows, so maybe it's just something having to do with that. As an example, sometimes people try to put their R library dir on a shared drive and that usually ends in tears. The normal prescription is to tell people not to do that, but moving a bunch of bam files onto your Windows box might not be something you can do.

Sorry I can't be more helpful.


No worries. I moved the files to a local drive and the error disappeared. It must be something to do with Samba servers. Anyway, I now have another problem and will open a new question. Thank you, James!
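
For anyone hitting the same thing, the workaround of moving the files locally could be scripted roughly as follows. This is only a sketch: `copy_to_local` and the destination directory are hypothetical, the demo uses a temporary placeholder file instead of a real BAM, and it assumes the file names are unique so nothing is overwritten:

```r
# Copy files to a local directory and return the new paths.
# `local_dir` is a hypothetical destination; it needs enough free space for the BAMs.
copy_to_local <- function(paths, local_dir) {
  dir.create(local_dir, recursive = TRUE, showWarnings = FALSE)
  dest <- file.path(local_dir, basename(paths))
  ok <- file.copy(paths, dest, overwrite = TRUE)
  if (!all(ok)) {
    stop("Could not copy: ", paste(paths[!ok], collapse = ", "))
  }
  dest
}

# Demo: a temporary file stands in for a BAM on the network share.
src <- tempfile(fileext = ".bam")
writeLines("placeholder", src)
local_paths <- copy_to_local(src, file.path(tempdir(), "local_bams"))

# Confirm the local copy is readable (0 means read access is granted).
print(file.access(local_paths, mode = 4))
```

With the real data, something like `samples$bamReads <- copy_to_local(samples$bamReads, "C:/data/bams")` (the path is made up) would update the sample sheet. The DBA object then needs to be rebuilt with `dba(sampleSheet=samples)` before calling `dba.count` again.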
