BitSeq getExpression crash
@maayan-kreitzman-5853
Hi Peter,

I'm still having some difficulty with the first step of BitSeq. I made myself a mini dataset with just 5 million reads so that I could run everything in my R console without writing an R script and submitting it to the cluster (just until I know what I'm doing), but I got this error:

Error in getMeanVariance(c(outFile), meanFile, log = log, pretend = pretend) :
  Conditions: file /tmp/RtmpRHbr1N/A08485_gr4.sam_mini-BS-27c66ebf4a21.rpkm failed to open.

Since it seemed to be an issue with the temp directory, I tried to change my TMPDIR in the shell to somewhere with plenty of space:

[mkreitzman@xhost09 ~]$ echo $TMPDIR
/projects/wtss_scratch/maayan

but this did not make a difference to where the temp files were created. I copied the whole session below.

thanks,
Maayan

> res1 <- getExpression("/projects/mkreitzman_prj/expression_quantification_testing/testing/BitSeq/A08485_gr4.sam_mini.sam", "/projects/mkreitzman_prj/expression_quantification_testing/testing/test_data/strand_specific/transcriptome/Homo_sapiens.GRCh37.69.cdna.all.fa",
+ log = TRUE, seed=47)
## Computing alignment probabilities.
[time: +1.283333 m]
[time: +0.333333 m]
[time: +0.000000 m]
[time: +1.200000 m]
[time: +0.500000 m]
[time: +0.000000 m]
## Estimating transcript expression levels.
Mappings: 1606295
Ntotal: 2408007
10000 [time: +0.000000 s]
100000 [time: +0.000000 s]
1000000 [time: +2.000000 s]
Finished Reading!
Total hits = 3212590
Isoforms: 183986
Burn in: 1000 DONE. [time: +6.633333 m]

Sampling DONE. [time: +9.600000 m]
rHat (for 1000 samples)
rHat (rHat from subset | tid | mean theta)
1.0040 ( 1.0040 | 36651 | 0.0000)
1.0040 ( 1.0072 | 178210 | 0.0000)
1.0036 ( 1.0088 | 148680 | 0.0000)
Mean rHat of worst 10 transcripts: 1.003527
Mean C0: (50 50 50 50 ). Nunmap: 801712

Producing 649 final samples.

Sampling DONE. [time: +6.833333 m]
rHat (for 649 samples)
rHat (rHat from subset | tid | mean theta)
1.0061 ( 1.0051 | 157831 | 0.0000)
1.0059 ( 1.0048 | 104363 | 0.0000)
1.0058 ( 1.0068 | 71659 | 0.0000)
Mean rHat of worst 10 transcripts: 1.005543
Mean C0: (50 50 50 50 ). Nunmap: 801712

Total samples: 6596
## Computing means.
Error in getMeanVariance(c(outFile), meanFile, log = log, pretend = pretend) :
  Conditions: file /tmp/RtmpRHbr1N/A08485_gr4.sam_mini-BS-27c66ebf4a21.rpkm failed to open.

________________________________________
From: Peter Glaus [glaus@cs.man.ac.uk]
Sent: Friday, March 22, 2013 5:32 AM
To: Maayan Kreitzman
Subject: Re: BitSeq getExpression crash

Hi Maayan,

I believe the error is caused by the process running out of memory. I am not 100% sure, but when I saw this kind of error before, it was caused by lack of memory. The estimation can be quite CPU and memory intensive, so I advise running it on a computing cluster instead of a regular desktop/notebook machine.

Regarding your function call: when running an actual analysis (not just testing/trying out), please use higher values for MCMC_burnIn, MCMC_samplesN and MCMC_samplesSave (the default when leaving these blank is 1000 and is usually "good enough"). The computation will take longer, but the estimates will be much more accurate as well. (The values 200, 200, 50 are used in the vignette because the example data is very small, and the vignette has to run within a time limit.)

Also, for future reference, when you have questions regarding Bioconductor packages, please post to the Bioconductor user mailing list (and CC the package author), as you might sometimes get replies from other users, and your post might help other users if they encounter a similar problem in the future.

Best regards,
Peter.

On 21/03/13 21:46, Maayan Kreitzman wrote:
> Dear Peter,
> I'm trying to run BitSeq, and am running into a problem after several hours of the getExpression function running.
> This same thing happened twice, on different servers. What's weird is that not only does the function crash, it actually exits R.
> This is the error message:
>
> terminate called after throwing an instance of 'std::bad_alloc'
> what(): St9bad_alloc
> Aborted
>
> I have no experience whatsoever with R, so this may be a novice mistake, but your help would be greatly appreciated.
> I've copied the whole session below.
>
> thanks in advance,
> Maayan
>
> > library("BitSeq")
> Loading required package: Rsamtools
> Loading required package: IRanges
> Loading required package: BiocGenerics
>
> Attaching package: 'BiocGenerics'
>
> The following object(s) are masked from 'package:stats':
>
>     xtabs
>
> The following object(s) are masked from 'package:base':
>
>     anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
>     get, intersect, lapply, Map, mapply, mget, order, paste, pmax,
>     pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,
>     rownames, sapply, setdiff, table, tapply, union, unique
>
> Loading required package: GenomicRanges
> Loading required package: Biostrings
> Loading required package: zlibbioc
> > res1 <- getExpression("/projects/mkreitzman_prj/expression_quantification_testing/testing/test_data/strand_specific/transcriptome/bowtie2transcriptome/A08473_gr4.sam",
> + "/projects/mkreitzman_prj/expression_quantification_testing/testing/test_data/strand_specific/transcriptome/Homo_sapiens.GRCh37.69.cdna.all.fa",
> + log = TRUE, MCMC_burnIn=200,MCMC_samplesN=200,MCMC_samplesSave=50,seed=47)
> ## Computing alignment probabilities.
> [time: +1.866667 m]
> [time: +36.400000 m]
> [time: +0.000000 m]
> [time: +117.050000 m]
> [time: +0.500000 m]
> [time: +0.000000 m]
> ## Estimating transcript expression levels.
> Mappings: 71092830
> Ntotal: 123098679
> 10000 [time: +1.000000 s]
> 100000 [time: +0.000000 s]
> 1000000 [time: +3.000000 s]
> 10000000 [time: +25.000000 s]
> Read only 14186178 reads.
> Finished Reading!
> Total hits = 28372355
> Isoforms: 183985
> Burn in: 200 DONE. [time: +12.016667 m]
>
> Sampling DONE. [time: +12.850000 m]
> rHat (for 200 samples)
> rHat (rHat from subset | tid | mean theta)
> 1.0252 ( 1.1173 | 89080 | 0.0000)
> 1.0216 ( 1.1351 | 126802 | 0.0000)
> 1.0183 ( 1.0151 | 183201 | 0.0000)
> Mean rHat of worst 10 transcripts: 1.018596
> Mean C0: (3516 3520 3529 3518 ). Nunmap: 52005849
>
> Producing 33 final samples.
>
> Sampling DONE. [time: +2.166667 m]
> rHat (for 33 samples)
> rHat (rHat from subset | tid | mean theta)
> 1.1193 ( 1.1332 | 117458 | 0.0000)
> 1.1181 ( 1.1229 | 158878 | 0.0000)
> 1.1074 ( 1.1004 | 43840 | 0.0000)
> Mean rHat of worst 10 transcripts: 1.108279
> Mean C0: (3528 3512 3523 3520 ). Nunmap: 52005849
>
> Total samples: 932
> terminate called after throwing an instance of 'std::bad_alloc'
> what(): St9bad_alloc
> Aborted
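A note on the TMPDIR observation in the message above: R chooses its per-session temporary directory once, at startup, by checking the environment variables TMPDIR, TMP and TEMP in turn and taking the first that points to a writable directory. A value that was set after R was launched, that was not exported to the R process, or that points somewhere R cannot write would therefore not change where files such as the .rpkm above are created. Which of these applied here is not known from the thread. The base-R sketch below (it does not call BitSeq, and the probe file name is purely illustrative) shows one way to check what the running session actually sees:

```r
# Where will this R session put its temporary files?
session_tmp <- tempdir()                      # fixed when R started
env_tmp <- Sys.getenv("TMPDIR", unset = NA)   # NA if R never saw TMPDIR

cat("Session temp dir:", session_tmp, "\n")
cat("TMPDIR seen by R:", env_tmp, "\n")

# Is the session temp dir writable? file.access() returns 0 on success.
tmp_writable <- file.access(session_tmp, mode = 2) == 0
cat("Temp dir writable:", tmp_writable, "\n")

# Can R create and reopen a file there? (illustrative probe file only)
probe <- tempfile(pattern = "probe-", fileext = ".rpkm")
writeLines("test", probe)
probe_ok <- file.exists(probe)
unlink(probe)
cat("Probe file could be created:", probe_ok, "\n")
```

If the first two lines of output disagree with what `echo $TMPDIR` prints in the shell, exporting the variable and starting a fresh R session from that same shell is the usual thing to try.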
Alignment PROcess BitSeq • 1.2k views
Peter Glaus ▴ 70
@peter-glaus-5589
Dear Maayan,

I'm not sure what the problem is, as I was not able to replicate this error. Could you please try:

1) running with the verbose=TRUE option, so there will be a bit more output;
2) using the outPrefix option. With this option, BitSeq does not use temporary directories.

Example: assuming your working directory in R is

/projects/mkreitzman_prj/expression_quantification_testing/testing/BitSeq

you could run:

getExpression("A08485_gr4.sam_mini.sam", "../test_data/strand_specific/transcriptome/Homo_sapiens.GRCh37.69.cdna.all.fa", outPrefix="A08485_gr4.sam_mini.run1", log=TRUE, seed=47, verbose=TRUE);

This should create 3 files in the directory: A08485_gr4.sam_mini.run1.rpkm, A08485_gr4.sam_mini.run1.thetaMeans and A08485_gr4.sam_mini.run1.mean. (You can also use relative or absolute paths in outPrefix; just make sure the directories exist.)

If the run fails again, you can have a look in the directory and see what kind of files were created.

Regards,
Peter.
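Since the reply asks that the outPrefix directories exist and suggests inspecting the directory after a failed run, both checks can be scripted. The sketch below is only an illustration in base R: `prepare_out_prefix` and `list_run_files` are hypothetical helper names, not BitSeq functions, and the demo directory under `tempdir()` is an assumption made so the snippet runs anywhere. The getExpression call itself is left commented out because it needs BitSeq and the data files.

```r
# Hypothetical helper (not part of BitSeq): make sure the directory part of
# an outPrefix exists and is writable before starting a long run.
prepare_out_prefix <- function(out_prefix) {
  out_dir <- dirname(out_prefix)
  if (!dir.exists(out_dir)) {
    dir.create(out_dir, recursive = TRUE)
  }
  if (file.access(out_dir, mode = 2) != 0) {
    stop("Cannot write to directory: ", out_dir)
  }
  invisible(out_prefix)
}

# Hypothetical helper: list the files already written for a given prefix.
list_run_files <- function(out_prefix) {
  all_files <- list.files(dirname(out_prefix))
  all_files[startsWith(all_files, basename(out_prefix))]
}

# Demo location only; in practice this would be a project directory.
out_prefix <- file.path(tempdir(), "bitseq_demo", "A08485_gr4.sam_mini.run1")
prepare_out_prefix(out_prefix)

# The call from the reply would go here (requires BitSeq and the input files):
# res1 <- getExpression("A08485_gr4.sam_mini.sam",
#   "../test_data/strand_specific/transcriptome/Homo_sapiens.GRCh37.69.cdna.all.fa",
#   outPrefix = out_prefix, log = TRUE, seed = 47, verbose = TRUE)

# After a run (successful or not), see which output files exist so far.
print(list_run_files(out_prefix))
```

According to the reply, a completed run should leave files ending in .rpkm, .thetaMeans and .mean for the prefix; a shorter list after a crash would hint at the step that failed.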