Why does greyListChIP return different grey lists every time I newly run the script?
1
0
Entering edit mode
Giulia • 0
@06fc245c
Last seen 15 days ago
United Kingdom

Hello, I am trying to make a custom greyList using the greyListChIP package and I having some issue generating a consistent grey list. Every time I newly run the script below (even after closing R and start anew), different grey lists for the same input file are generate every time (see below code). Why is that? How do I fix the issue?

library(BiocParallel)
library(GreyListChIP)

register(SerialParam())
#create a Greylist for each INPUT
fn <- file.path('pathToChromosomeLenghtFile/chrNameLength.txt")
gl <- new("GreyList",karyoFile=fn)
gl <- countReads(gl,"pathToInputBamFile/Sample1_dup_sorted.filter.bam")
gl <- calcThreshold(gl,reps=10,sampleSize=1000,p=0.99,cores=1)
gl <- makeGreyList(gl,maxGap=10000)
gl


sessionInfo( )
R version 4.3.3 (2024-02-29)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.5

1st run: gl coverage: 4566073 bp (1.06%) 2nd run after restarting R: gl coverage: 4700729 bp (1.09%) 3rd run after restarting and without running register(SerialParam()): gl coverage: 3622458 bp (0.84%)

GreyListChIP • 838 views
ADD COMMENT
1
Entering edit mode
@83af6ba7
Last seen 6 weeks ago
United Kingdom

Hi Giulia,

You will get slightly different results each time you generate a grey list due to the random sampling of binned counts used in fitting a negative binomial distribution and calculating the threshold for abnormally high signal.

In your code, you've opted to generate 10 samples, each of 1000 bins, and a mean of the size and mu parameters for each fitted negative binomial distribution are used to calculate the threshold. If you re-run the code, a different set of 10 samples of 1000 bins will be used and the threshold may well differ, but likely not by much.

If you need the results to be consistent each time you generate a grey list, you can set the random number generator seed to some number of your choosing using set.seed() before calling calcThreshold(), e.g.,

set.seed(1)
gl <- calcThreshold(gl,reps=10,sampleSize=1000,p=0.99,cores=1)

Hope that helps, Matt

ADD COMMENT

Login before adding your answer.

Traffic: 439 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6