Question

issue with DiffBind dba.count

1

Entering edit mode

mtsompana ▴ 10

@mtsompana-8724

Last seen 3.6 years ago

United States

Enter the body of text here

Code should be placed in three backticks as shown below

getwd()
setwd("/Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/3.DiffBind_csv_files")

#Read csv file
samples = read.csv("Screening.csv", header=TRUE, sep=",")

#read peaksets using the "dba" DiffBind function
screening = dba(sampleSheet = "Screening.csv")
screening



# include your problematic code here with any corresponding output 
#calculate binding matrix with scores based on read counts (affinity scores) for every sample using dba.count and default settings
screening=dba.count(screening)
screening

# please also include the results of running the following in an R session 
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/008a_screening_S16.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/002a_screening_S20.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/008a_screening_S17.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/060a_screening_S37.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/060a_screening_S38.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/002a_screening_S21.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/002a_screening_S21.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/008a_screening_S16.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/002a_screening_S20.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/060a_screening_S38.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/008a_screening_S17.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/060a_screening_S37.sorted.bam.bai
[W::hts_idx_load2] The index file is older than the data file: /Users/tsompana/Box/6. PROJECTS/8. Justin.Fasting.Enzalutamide.ATAC-seq/3.DiffBind/1.FinalBam/060a_screening_S38.sorted.bam.bai
sessionInfo( )

bam bai DiffBind dba.count Diffbind • 1.7k views

ADD COMMENT • link updated 3.7 years ago by Rory Stark ★ 5.2k • written 3.7 years ago by mtsompana ▴ 10

score 0 · Answer 1 · 2021-08-12

0

Entering edit mode

Rory Stark ★ 5.2k

@rory-stark-5741

Last seen 3 months ago

Cambridge, UK

Have you checked the dates of the .bai files compared to their corresponding .bam files? The .bai files should be newer if they are accurate.

ADD COMMENT • link 3.7 years ago Rory Stark ★ 5.2k

0

Entering edit mode

It ended up being a memory issue I believe.

I have a related questions please:

I am now running dba.count with the following settings. Is this the best way to analyze ATAC-seq data?

C2D1=dba.count(C2D1, minOverlap=2, score=DBA_SCORE_NORMALIZED, fragmentSize=0, summits=0, filter=1, bRemoveDuplicates=FALSE, bUseSummarizeOverlaps =FALSE, bParallel=FALSE)

Thank you very much.

ADD REPLY • link 3.7 years ago mtsompana ▴ 10

0

Entering edit mode

There really isn't one "best" set of parameter values, it depends on what you know about your data and what you are looking for. Some observations:

Generally, ATAC-seq benefits from using the summits parameter at a relatively low value (50 or 100) to focus on clear regions of open chromatin with less background signal. Note that often there are two "peaks" in ATAC fragment lengths.

The fragmentSize parameter is only used if you have single-end data (otherwise the paired-end insert size is used). In that case, it is a good idea to specify a value for this parameter to "extend" the single-end reads by the mean fragment size.

bUseSummarizeOverlaps=TRUE is generally the preferred option. If it is set to FALSE, it will use some internal code that, among other things, does not match paired-end reads, resulting in counts that may be up to double (each end counted separately). That option also gives you more fine-grained control over how the counting is done via $config parameters.

ADD REPLY • link 3.7 years ago Rory Stark ★ 5.2k