ChIPQC Error in .Call2 NCList object is too deep
6
1
Entering edit mode
@victor-goitea-14304
Last seen 7.2 years ago
United_States/Oklahoma_City/OMRF

Dear all,

I need help to solve this error, using the latest version of ChIPQC1.15. I am not bioinformatician. I would like to give the bam file to you can reproduce the problem but I don't know how to do with such a big file.

 

> samples = read.csv("samples_Ours_test0.csv")

SampleID      Tissue Factor Treatment Replicate                            bamReads Peaks
1 File_ours D  A T         1 File.bam    NA

> Test0 = ChIPQC(samples,annotation="hg19")

Compiling annotation...
Using default blacklist for hg19...
Computing metrics for 1 samples...
list
Bam file has 84 contigs
Calculating coverage histogram for 1

Calculating SSD for 1

Calculating unique positions per strand for 1

Calculating shift for 1

 300 / 300
Counting reads in features for 1

Error in .Call2("new_NCListAsINTSXP_from_NCList", nclist_xp, PACKAGE = "IRanges") : 
  compute_NCListAsINTSXP_length: NCList object is too deep (has more than
  100000 levels of nested ranges)

In addition: Warning messages:
1: In ChIPQC(samples, annotation = "hg19") :
  No chromosomes specified in peaks, using all.
2: In if is.na(blacklist)) blacklist = NULL :
  the condition has length > 1 and only the first element will be used

If I run like this

> Test0 = ChIPQCsample(bamFile,annotation="hg19",peaks=NULL,chromosomes="chr11")

The report change to

Error in checkSlotAssignment(object, name, value) : 
  assignment of an object of class “IRanges” is not valid for slot ‘ranges’ in an object of class “ChIPQCsample”; is(value, "IRanges_OR_IPos") is not TRUE

 

I have some test samples where CHIPQC run OK but with my samples I got these errors. So I check for differences in the bam files and I found: success CHIPQC success bam files (align with bowtie, chromosome list "chr1"- "chr22", "chrX", "chrY", "chrMT" ), CHIPQC fail bam files (align with bwa, chromosome list 1-22", X, Y, MT, and several GL##). So the first thing I tried was to edit and filter my bam files, so they list the chromosome name as chr#, I also tried to get rid of the GL## by different methods, but I only could eliminate the reads information in that location( I tried a lot of way and all give me the same result). The header remain listing the 84 contig, I couldn't find a successful method on this and I don't know if it is the solution or the error is unrelated to this. I also tried to follow the custom genome annotation A: Unable to perform analysis on all the chromosomes with the latest version of ChI but it didn't work on my hands (maybe I'm doing a mistake listing all the chr and their ranges)

Here is the list of my original contigs obtained by:  

awk '{print $1 "\t0\t" $2}' human_hg19.fa.fai > human_hg19.bed

 

1 0 249250621

2 0 243199373

3 0 198022430

4 0 191154276

5 0 180915260

6 0 171115067

7 0 159138663

8 0 146364022

9 0 141213431

10 0 135534747

11 0 135006516

12 0 133851895

13 0 115169878

14 0 107349540

15 0 102531392

16 0 90354753

17 0 81195210

18 0 78077248

19 0 59128983

20 0 63025520

21 0 48129895

22 0 51304566

X 0 155270560

Y 0 59373566

MT 0 16569

GL000207.1 0 4262

GL000226.1 0 15008

GL000229.1 0 19913

GL000231.1 0 27386

GL000210.1 0 27682

GL000239.1 0 33824

GL000235.1 0 34474

GL000201.1 0 36148

GL000247.1 0 36422

GL000245.1 0 36651

GL000197.1 0 37175

GL000203.1 0 37498

GL000246.1 0 38154

GL000249.1 0 38502

GL000196.1 0 38914

GL000248.1 0 39786

GL000244.1 0 39929

GL000238.1 0 39939

GL000202.1 0 40103

GL000234.1 0 40531

GL000232.1 0 40652

GL000206.1 0 41001

GL000240.1 0 41933

GL000236.1 0 41934

GL000241.1 0 42152

GL000243.1 0 43341

GL000242.1 0 43523

GL000230.1 0 43691

GL000237.1 0 45867

GL000233.1 0 45941

GL000204.1 0 81310

GL000198.1 0 90085

GL000208.1 0 92689

GL000191.1 0 106433

GL000227.1 0 128374

GL000228.1 0 129120

GL000214.1 0 137718

GL000221.1 0 155397

GL000209.1 0 159169

GL000218.1 0 161147

GL000220.1 0 161802

GL000213.1 0 164239

GL000211.1 0 166566

GL000199.1 0 169874

GL000217.1 0 172149

GL000216.1 0 172294

GL000215.1 0 172545

GL000205.1 0 174588

GL000219.1 0 179198

GL000224.1 0 179693

GL000223.1 0 180455

GL000195.1 0 182896

GL000212.1 0 186858

GL000222.1 0 186861

GL000200.1 0 187035

GL000193.1 0 189789

GL000194.1 0 191469

GL000225.1 0 211173

GL000192.1 0 547496

 

I also tried this:

 

> chromosomesToTest <- c(paste0("chr",c(1:22,"X","Y")))

> chromosomesToTest

Test0 = ChIPQC(samples,annotation="hg19",chromosomes=chromosomesToTest)

And still the error remains the same.

 

> traceback()
15: stop(e)
14: value[[3L]](cond)
13: tryCatchOne(expr, names, parentenv, handlers[[1L]])
12: tryCatchList(expr, classes, parentenv, handlers)
11: tryCatch({
        FUN(...)
    }, error = handle_error)
10: withCallingHandlers({
        tryCatch({
            FUN(...)
        }, error = handle_error)
    }, warning = handle_warning)
9: FUN(X[[i]], ...)
8: lapply(X, FUN, ...)
7: bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = param)
6: bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = param)
5: bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = BPPARAM)
4: bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = BPPARAM)
3: bplapply(samplelist, doChIPQCsample, experiment, chromosomes, 
       annotation, mapQCth, blacklist, profileWin, fragmentLength, 
       shifts)
2: bplapply(samplelist, doChIPQCsample, experiment, chromosomes, 
       annotation, mapQCth, blacklist, profileWin, fragmentLength, 
       shifts)
1: ChIPQC(samples, annotation = "hg19")
> Test0 = ChIPQC(samples,annotation="hg

ChIPQC • 2.5k views
ADD COMMENT
0
Entering edit mode

Please add the output of sessionInfo(), and be sure to use up-to-date versions of R (3.4.2) and Bioconductor (3.6) with no package conflicts (BiocInstaller::biocValid() returns TRUE).

ADD REPLY
0
Entering edit mode

 sessionInfo()

R version 3.4.2 (2017-09-28)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ChIPQC_1.15.0              DiffBind_2.6.0             SummarizedExperiment_1.8.0
 [4] DelayedArray_0.4.0         matrixStats_0.52.2         Biobase_2.38.0            
 [7] GenomicRanges_1.30.0       GenomeInfoDb_1.14.0        IRanges_2.12.0            
[10] S4Vectors_0.16.0           BiocGenerics_0.24.0        ggplot2_2.2.1

* Packages too new for Bioconductor version '3.6'

       Version  LibPath                                                         
ChIPQC "1.15.0" "/Library/Frameworks/R.framework/Versions/3.4/Resources/library"

downgrade with biocLite("ChIPQC")

Error: 1 package(s) too new

 

After downgrade biocLite("ChIPQC")

> BiocInstaller::biocValid()
[1] TRUE

The error still remain.

I found out that the program run well if I specify to run just in one chromosome. Also, it doesn't matter how is the name of the chromosomes as much it match in the specification of the command or values definition. Also I could get rid of the other contigs and only keep chr 1-22,X,Y. But the error remains being the same

 

> chromosomesToTest1 <- c(paste0(c(1)))

> exampleExp1 = ChIPQC(samples2,annotation="hg19",chromosomes=chromosomesToTest1)

Compiling annotation...
Using default blacklist for hg19...
Computing metrics for 1 samples...
list
Bam file has 24 contigs
Calculating coverage histogram for 1

Calculating SSD for 1

Calculating unique positions per strand for 1

Calculating shift for 1

 300 / 300
Counting reads in features for 1

[1] 1
Warning message:
In if is.na(blacklist)) blacklist = NULL :
  the condition has length > 1 and only the first element will be used

 

But, here is with two chromosomes and starts to fail:

> chromosomesToTest2 <- c(paste0(c(1:2)))

> exampleExp2 = ChIPQC(samples2,annotation="hg19",chromosomes=chromosomesToTest2)

Compiling annotation...
Using default blacklist for hg19...
Computing metrics for 1 samples...
list
Bam file has 24 contigs
Calculating coverage histogram for 1

Calculating SSD for 1

Calculating unique positions per strand for 1

Calculating shift for 1

 300 / 300
Counting reads in features for 1

Error in .Call2("new_NCListAsINTSXP_from_NCList", nclist_xp, PACKAGE = "IRanges") : 
  compute_NCListAsINTSXP_length: NCList object is too deep (has more than
  100000 levels of nested ranges)
In addition: Warning message:
In if is.na(blacklist)) blacklist = NULL :
  the condition has length > 1 and only the first element will be used

ADD REPLY
2
Entering edit mode
@herve-pages-1542
Last seen 17 days ago
Seattle, WA, United States

Hi Thomas,

The current implementation of NCList and GNCList objects imposes a hard-coded limit of 100000 nested ranges. Note that you hit this limit only when you build an NCList or GNCList object by explicitly calling the NCList() or GNCList() constructor. Most of the time this is not necessary though, and is actually not recommended (see ?NCList and ?GNCList for some details about this).

So in ChIPQC/R/sampleQC.R, replacing line 111:

Sample_GIT <- GNCList(GRanges(seqnames=seqnames(temp),ranges=ranges(temp),strand=strand(temp),elementMetadata(temp)))

with:

Sample_GIT <- GRanges(seqnames=seqnames(temp),ranges=ranges(temp),strand=strand(temp),elementMetadata(temp))

should make the problem go away.

Cheers,

H.

 

ADD COMMENT
2
Entering edit mode
@thomas-carroll-7019
Last seen 2.2 years ago
United States/New York/The Rockefeller …

hi Victor, hi Herve,

Yes, it looks like on chromosome 2 you have an area over 2 millions reads deep (overlapping reads) which means ChIPQC breaches the limit of 100000 set in the way ChIPQC creates the GNCList object. 

I will adopt the change suggested by Herve tonight and it should be in the release version of ChIPQC soon.

best,

tom

 

ADD COMMENT
0
Entering edit mode

Hi Tom, hi Herve,

Thank you very much for solve this problem and for the explanation.

Best,

Victor

 

 

 

ADD REPLY
0
Entering edit mode
@thomas-carroll-7019
Last seen 2.2 years ago
United States/New York/The Rockefeller …

hi,

I can see a few problems here.

>* Packages too new for Bioconductor version '3.6'

>      Version  LibPath                                                         

> ChIPQC "1.15.0" "/Library/Frameworks/R.framework/Versions/3.4/Resources/library"

> downgrade with biocLite("ChIPQC")

The output from sessionInfo appears to be telling us that ChIPQC version is out of line with your version of Bioconductor. Mixing up packages from different Bioconductor versions may cause some problems here so I recommend using an appropriate ChIPQC version. It tells you as much in sessionInfo() output.

 

Your command to output chromosome names seems to be show your chromosomes are numbers 1-22 and X,Y but the R command you use will look for chr1-chr22 and chrX,chrY. This may explain different performance with Bowtie aligned working and BWA aligned not, although the troubles looks to do with the different genome build and its contig names rather than the choice of aligner.

You could try below on bwa files.

> chromosomesToTest <- c(1:22,"X","Y")

Also to get a similar output of chromosome names and lengths from fasta in R, and avoid using awk, you could try

> seqinfo(FaFile("hg19.fasta"))
Seqinfo object with 25 sequences from an unspecified genome:
  seqnames seqlengths isCircular genome
  chr1      249250621       <NA>   <NA>
  chr2      243199373       <NA>   <NA>
  chr3      198022430       <NA>   <NA>
  chr4      191154276       <NA>   <NA>
  chr5      180915260       <NA>   <NA>
  ...             ...        ...    ...
  chr21      48129895       <NA>   <NA>
  chr22      51304566       <NA>   <NA>
  chrX      155270560       <NA>   <NA>
  chrY       59373566       <NA>   <NA>
  chrM          16571       <NA>   <NA>

If you are still seeing problems after fixing package version issues and chromosome names I can look again.

tom

ADD COMMENT
0
Entering edit mode
@victor-goitea-14304
Last seen 7.2 years ago
United_States/Oklahoma_City/OMRF

Hi Thomas,

I downgrade ChIPQC with biocLite("ChIPQC"). See the sessionInfo() and BiocInstaller::biocValid() . However, it doesn't change the behavior respect to the error.

 

> BiocInstaller::biocValid()
[1] TRUE

> sessionInfo()

R version 3.4.2 (2017-09-28)

Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] Rsamtools_1.30.0           Biostrings_2.46.0          XVector_0.18.0            
 [4] BiocInstaller_1.28.0       ChIPQC_1.14.0              DiffBind_2.6.0            
 [7] SummarizedExperiment_1.8.0 DelayedArray_0.4.0         matrixStats_0.52.2        
[10] Biobase_2.38.0             GenomicRanges_1.30.0       GenomeInfoDb_1.14.0       
[13] IRanges_2.12.0             S4Vectors_0.16.0           BiocGenerics_0.24.0       
[16] ggplot2_2.2.1             

ADD COMMENT
0
Entering edit mode

I show here how the program still fail over ALL chromosomes (1:22,"X","Y") but it works running in only one chromosome. When I try two chromosomes, it depends on the chromosomes that I choose. For example, with chromosomes 21 and 22 it works but using chromosomes 1 and 2 it fails with the same error message posted at the beginning. 

> seqinfo(FaFile("human_g1k_v37.fasta"))
Seqinfo object with 84 sequences from an unspecified genome:
  seqnames   seqlengths isCircular genome
  1           249250621       <NA>   <NA>
  2           243199373       <NA>   <NA>
  3           198022430       <NA>   <NA>
  4           191154276       <NA>   <NA>
  5           180915260       <NA>   <NA>
  ...               ...        ...    ...
  GL000200.1     187035       <NA>   <NA>
  GL000193.1     189789       <NA>   <NA>
  GL000194.1     191469       <NA>   <NA>
  GL000225.1     211173       <NA>   <NA>
  GL000192.1     547496       <NA>   <NA>

>samples = read.csv("samples_Ours_test0.csv")
> samples
    SampleID      Tissue Factor Treatment Replicate       bamReads    Peaks
1    A2_D              D         A2       T              1                    File.bam        NA
> chromosomesToTest <- c(1:22,"X","Y")

> Test1 = ChIPQC(samples,annotation="hg19",chromosomes=chromosomesToTest)

Compiling annotation...
Using default blacklist for hg19...
Computing metrics for 1 samples...
list
Bam file has 84 contigs
Calculating coverage histogram for 1

Calculating SSD for 1

Calculating unique positions per strand for 1

Calculating shift for 1

 300 / 300
Counting reads in features for 1

Error in .Call2("new_NCListAsINTSXP_from_NCList", nclist_xp, PACKAGE = "IRanges") : 
  compute_NCListAsINTSXP_length: NCList object is too deep (has more than
  100000 levels of nested ranges)

In addition: Warning message:
In if is.na(blacklist)) blacklist = NULL :
  the condition has length > 1 and only the first element will be used

Limiting the analysis to chr1

> Test2 = ChIPQC(samples,annotation="hg19",chromosomes="1")

Compiling annotation...
Using default blacklist for hg19...
Computing metrics for 1 samples...
list
Bam file has 84 contigs
Calculating coverage histogram for 1

Calculating SSD for 1

Calculating unique positions per strand for 1

Calculating shift for 1

 300 / 300
Counting reads in features for 1

[1] 1
Warning message:
In if is.na(blacklist)) blacklist = NULL :
  the condition has length > 1 and only the first element will be used

ADD REPLY
0
Entering edit mode

Two chromosomes "21"and "22"

> Test2
Samples: 1 : A2_D
                Tissue Factor Treatment Replicate
                     D         A2       T              1
             Reads Map% Filt% Dup% ReadL FragL RelCC  SSD RiP% RiBL%
   A2_D 8856458  100  10.3    0    47    95  1.12 24.9   NA     0

 

> chromosomesToTest2 <- c("21","22")
> Test3 = ChIPQC(samples,annotation="hg19",chromosomes=chromosomesToTest2)

G6-1_Arid2 DIVA_Brg1KO Arid2  Tamoxifen 1 bed
Compiling annotation...
Using default blacklist for hg19...
Computing metrics for 1 samples...
list
Bam file has 84 contigs
Calculating coverage histogram for 21

Calculating SSD for 21

Calculating unique positions per strand for 21

Calculating shift for 21

 300 / 300
Counting reads in features for 21

Calculating coverage histogram for 22

Calculating SSD for 22

Calculating unique positions per strand for 22

Calculating shift for 22

 300 / 300
Counting reads in features for 22

[1] 1
[1] 1
Warning message:
In if is.na(blacklist)) blacklist = NULL :
  the condition has length > 1 and only the first element will be used

Two chromosomes "1"and "2"

> chromosomesToTest2 <- c("1","2")

> Test4 = ChIPQC(samples,annotation="hg19",chromosomes=chromosomesToTest2)

....

Error in .Call2("new_NCListAsINTSXP_from_NCList", nclist_xp, PACKAGE = "IRanges") : 
  compute_NCListAsINTSXP_length: NCList object is too deep (has more than
  100000 levels of nested ranges)

In addition: Warning message:
In if is.na(blacklist)) blacklist = NULL :
  the condition has length > 1 and only the first element will be used

> traceback()
15: stop(e)
14: value[[3L]](cond)
13: tryCatchOne(expr, names, parentenv, handlers[[1L]])
12: tryCatchList(expr, classes, parentenv, handlers)
11: tryCatch({
        FUN(...)
    }, error = handle_error)
10: withCallingHandlers({
        tryCatch({
            FUN(...)
        }, error = handle_error)
    }, warning = handle_warning)
9: FUN(X[[i]], ...)
8: lapply(X, FUN, ...)
7: bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = param)
6: bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = param)
5: bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = BPPARAM)
4: bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = BPPARAM)
3: bplapply(samplelist, doChIPQCsample, experiment, chromosomes, 
       annotation, mapQCth, blacklist, profileWin, fragmentLength, 
       shifts)
2: bplapply(samplelist, doChIPQCsample, experiment, chromosomes, 
       annotation, mapQCth, blacklist, profileWin, fragmentLength, 
       shifts)
1: ChIPQC(samples, annotation = "hg19", chromosomes = chromosomesToTest2)

ADD REPLY
0
Entering edit mode
@thomas-carroll-7019
Last seen 2.2 years ago
United States/New York/The Rockefeller …

Okay,  I think I will need to take a look at Bam file to debug.

Could you share the Bam file with me by Dropbox or something similar? You can send me link at tc.infomatics@gmail.com and we can post solution here if it is useful to others.

Vb,

tom

 

ADD COMMENT
0
Entering edit mode
@victor-goitea-14304
Last seen 7.2 years ago
United_States/Oklahoma_City/OMRF

I shared the files by dropbox. Did you get the link? Let me know if you need something else or if there is some problem with the Dropbox. Thank you for take  a look.

Best,

Victor.

ADD COMMENT

Login before adding your answer.

Traffic: 501 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6