Question

dexseq fitDispersionFunction(ecs) error

0

Entering edit mode

Rao,Xiayu ▴ 550

@raoxiayu-6003

Last seen 9.8 years ago

United States

Hello, Alejandro Thank you very much for your previous guidance. I did what you suggested, merging the count data for all samples to generate a matrix and used this following code to get ecs. > ecs <- newExonCountSet(countData=countdata, design=sampleTable, geneIDs=geneid, exonIDs=exonid) The estimateDispersions(ecs) step was "Done", but the problem came as the following. Could you please give me some advice on dealing with this error ?? ............................................. Done > ecs <- fitDispersionFunction(ecs) Error in fitDispersionFunction(ecs) : no CR dispersion estimations found, please first call estimateDispersions function I found a way from the forum to check. It seems that the dispersion estimates are all NAs, which could also be seen from fData(ecs). > sum(!is.na(fData(ecs)$dispBeforeSharing)) / nrow(fData(ecs)) [1] 0 > head(countdata) SRR791052 SRR791054 SRR791056 SRR791057 SRR791058 ENSG00000000003:0001 0 0 0 0 0 ENSG00000000003:0002 75 120 67 108 93 ENSG00000000003:0003 0 0 0 0 0 ENSG00000000003:0004 78 126 59 58 74 ENSG00000000003:0005 0 0 0 0 0 ENSG00000000003:0006 67 114 37 47 54 SRR791059 SRR791061 SRR791062 SRR791066 SRR791067 ENSG00000000003:0001 0 0 0 0 2 ENSG00000000003:0002 126 85 108 43 41 ENSG00000000003:0003 0 1 0 0 0 ENSG00000000003:0004 131 67 98 37 14 ENSG00000000003:0005 0 0 0 0 0 ENSG00000000003:0006 94 56 80 37 26 SRR791068 SRR791069 SRR791070 SRR791071 SRR791072 ENSG00000000003:0001 0 0 0 0 0 ENSG00000000003:0002 34 64 55 62 64 ENSG00000000003:0003 1 0 0 0 0 ENSG00000000003:0004 22 26 32 54 47 ENSG00000000003:0005 0 0 0 0 0 ENSG00000000003:0006 33 36 42 60 40 SRR791073 ENSG00000000003:0001 0 ENSG00000000003:0002 50 ENSG00000000003:0003 0 ENSG00000000003:0004 18 ENSG00000000003:0005 1 ENSG00000000003:0006 29 > head(geneid) [1] "ENSG00000000003" "ENSG00000000003" "ENSG00000000003" "ENSG00000000003" [5] "ENSG00000000003" "ENSG00000000003" > head(exonid) [1] "0001" "0002" "0003" "0004" "0005" "0006" > sampleTable condition SRR791052 HER2+ SRR791054 HER2+ SRR791056 HER2+ SRR791057 HER2+ SRR791058 HER2+ SRR791059 HER2+ SRR791061 HER2+ SRR791062 HER2+ SRR791066 Benign SRR791067 Benign SRR791068 Benign SRR791069 Benign SRR791070 Benign SRR791071 Benign SRR791072 Benign SRR791073 Benign Note: I am using R 3.0.3 and DEXSeq 1.8.0. Thanks, Xiayu -----Original Message----- From: Rao,Xiayu Sent: Wednesday, April 16, 2014 9:18 AM To: 'Alejandro Reyes'; bioconductor at r-project.org Subject: RE: dexseq read.HTSeqCounts error Thank you, Alejandro! That sounds to be a good way to go. I will do what you suggested. Thanks, Xiayu -----Original Message----- From: Alejandro Reyes [mailto:alejandro.reyes@embl.de] Sent: Wednesday, April 16, 2014 2:36 AM To: Rao,Xiayu; bioconductor at r-project.org Subject: Re: dexseq read.HTSeqCounts error Hi Xiayu, Please keep your answers in the mailing list, so it can be useful for users encountering similar issues. Yes, that is the problem, the function is expecting the same number of lines for each file (it is expecting the users to use the same annotation file or the same features as counting bins for all their samples). What you could do is read independently your files within R, generate a matrix of the union of all possible counting bins (the first column of your files) times samples (columns), and then fill them in with the information from your files, leaving a 0 on the corresponding missing values. Then use the function newExonCountSet to generate your ecs file. Best regards, Alejandro > Hi, Alejandro > > Thank you for your prompt response. Below is the output. I think my files are OK, but they have different number of lines. So you think that is the issue?? I am not counting reads that locate in exonic regions, so I do not have a .gtf file. I am counting the reads within some splice sites. Do you have any suggestions for me in my case? > >> all( file.exists(as.character(sampleTable$countfiles)) ) > [1] TRUE > > -sh-4.1$ wc -l *count.txt > 222604 SRR791043_count.txt > 215908 SRR791044_count.txt > 232519 SRR791045_count.txt > 227275 SRR791046_count.txt > 241367 SRR791047_count.txt > 253112 SRR791048_count.txt > 234680 SRR791049_count.txt > 199632 SRR791050_count.txt > 208880 SRR791051_count.txt > 195265 SRR791052_count.txt > ......... > > Thanks, > Xiayu > > > -----Original Message----- > From: Alejandro Reyes [mailto:alejandro.reyes at embl.de] > Sent: Tuesday, April 15, 2014 11:44 AM > To: Rao,Xiayu; 'bioconductor at r-project.org' > Subject: Re: dexseq read.HTSeqCounts error > > Dear Xiayu Rao, > > What is the output of: > > all( file.exists(as.character(sampleTable$countfiles)) ) > > ? > > Maybe one of your files is corrupted? Do they have all the same number of lines? > > Best regards, > Alejandro > > >> Hi, Alejandro and others >> >> Thank you for your information on the forum, which is very helpful. I now have a problem in running dexseq. I need your advice to solve the problem. Any input would be very appreciated. >> >>> ecs <- >>> read.HTSeqCounts(countfiles=as.character(sampleTable$countFile),desi >>> g >>> n=sampleTable, flattenedfile=NULL) >> Error in `rownames<-`(`*tmp*`, value = c("ENSG00000000003:001", "ENSG00000000003:002", : >> attempt to set 'rownames' on an object with no dimensions >> >> I then checked: >>> dim(lf) >> NULL >>> dim(dcounts) >> NULL >> >> I generated count files by myself instead of using dexseq_count.py. I tried to follow the standard count file format: >> -sh-4.1$ more SRR791043_count.txt >> ENSG00000000003:001 284 >> ENSG00000000003:002 179 >> ENSG00000000003:003 275 >> ENSG00000000003:004 156 >> ENSG00000000003:005 177 >> ENSG00000000003:006 157 >> ENSG00000000003:007 9 >> ENSG00000000003:008 45 >> ENSG00000000003:009 4 >> ENSG00000000003+ENSG00000102362:001 3 >> ............ >> Intergene1:001 1 >> ......... >> intergene2097:001 11 >> intergene2098:001 99 >> intergene2099:001 54 >> intergene2100:001 3 >> intergene2101:001 3 >> intergene2102:001 20 >> intergene2103:001 1 >> intergene2104:001 2 >> intergene2105:001 1 >> intergene2106:001 10 >> _ambiguous 0 >> _empty 0 >> _lowaqual 0 >> _notaligned 0 >> >> >>> sampleTable >> countFile condition >> SRR791052 SRR791052_count.txt HER2+ >> SRR791054 SRR791054_count.txt HER2+ >> SRR791056 SRR791056_count.txt HER2+ >> SRR791057 SRR791057_count.txt HER2+ >> SRR791058 SRR791058_count.txt HER2+ >> SRR791059 SRR791059_count.txt HER2+ >> SRR791061 SRR791061_count.txt HER2+ >> SRR791062 SRR791062_count.txt HER2+ >> SRR791066 SRR791066_count.txt Benign >> SRR791067 SRR791067_count.txt Benign >> SRR791068 SRR791068_count.txt Benign >> SRR791069 SRR791069_count.txt Benign >> SRR791070 SRR791070_count.txt Benign >> SRR791071 SRR791071_count.txt Benign >> SRR791072 SRR791072_count.txt Benign >> SRR791073 SRR791073_count.txt Benign >> >> Note: I am using R 3.0.3 and DEXSeq 1.8.0 >> >> >> Thanks, >> Xiayu >>

GO DEXSeq GO DEXSeq • 1.6k views

ADD COMMENT • link updated 11.0 years ago by Alejandro Reyes ★ 1.9k • written 11.0 years ago by Rao,Xiayu ▴ 550

score 0 · Answer 1 · 2014-04-22

Dear Xiayu, Thanks for your report, we are looking into that right now, it has to do with versions of dependencies outside bioconductor. As a temporary solution, you can install an old version of statmod compatible with your version of DEXSeq by doing in your R session: library(devtools);install_url("http://cran.r-project.org/src/contrib/A rchive/statmod/statmod_1.4.18.tar.gz") Your code below should work then! Best regards, Alejandro > Hello, Alejandro > > Thank you very much for your previous guidance. I did what you suggested, merging the count data for all samples to generate a matrix and used this following code to get ecs. >> ecs <- newExonCountSet(countData=countdata, design=sampleTable, geneIDs=geneid, exonIDs=exonid) > The estimateDispersions(ecs) step was "Done", but the problem came as the following. Could you please give me some advice on dealing with this error ?? > ............................................. > Done >> ecs <- fitDispersionFunction(ecs) > Error in fitDispersionFunction(ecs) : > no CR dispersion estimations found, please first call estimateDispersions function > > I found a way from the forum to check. It seems that the dispersion estimates are all NAs, which could also be seen from fData(ecs). >> sum(!is.na(fData(ecs)$dispBeforeSharing)) / nrow(fData(ecs)) > [1] 0 > > >> head(countdata) > SRR791052 SRR791054 SRR791056 SRR791057 SRR791058 > ENSG00000000003:0001 0 0 0 0 0 > ENSG00000000003:0002 75 120 67 108 93 > ENSG00000000003:0003 0 0 0 0 0 > ENSG00000000003:0004 78 126 59 58 74 > ENSG00000000003:0005 0 0 0 0 0 > ENSG00000000003:0006 67 114 37 47 54 > SRR791059 SRR791061 SRR791062 SRR791066 SRR791067 > ENSG00000000003:0001 0 0 0 0 2 > ENSG00000000003:0002 126 85 108 43 41 > ENSG00000000003:0003 0 1 0 0 0 > ENSG00000000003:0004 131 67 98 37 14 > ENSG00000000003:0005 0 0 0 0 0 > ENSG00000000003:0006 94 56 80 37 26 > SRR791068 SRR791069 SRR791070 SRR791071 SRR791072 > ENSG00000000003:0001 0 0 0 0 0 > ENSG00000000003:0002 34 64 55 62 64 > ENSG00000000003:0003 1 0 0 0 0 > ENSG00000000003:0004 22 26 32 54 47 > ENSG00000000003:0005 0 0 0 0 0 > ENSG00000000003:0006 33 36 42 60 40 > SRR791073 > ENSG00000000003:0001 0 > ENSG00000000003:0002 50 > ENSG00000000003:0003 0 > ENSG00000000003:0004 18 > ENSG00000000003:0005 1 > ENSG00000000003:0006 29 > >> head(geneid) > [1] "ENSG00000000003" "ENSG00000000003" "ENSG00000000003" "ENSG00000000003" > [5] "ENSG00000000003" "ENSG00000000003" > >> head(exonid) > [1] "0001" "0002" "0003" "0004" "0005" "0006" > >> sampleTable > condition > SRR791052 HER2+ > SRR791054 HER2+ > SRR791056 HER2+ > SRR791057 HER2+ > SRR791058 HER2+ > SRR791059 HER2+ > SRR791061 HER2+ > SRR791062 HER2+ > SRR791066 Benign > SRR791067 Benign > SRR791068 Benign > SRR791069 Benign > SRR791070 Benign > SRR791071 Benign > SRR791072 Benign > SRR791073 Benign > > Note: I am using R 3.0.3 and DEXSeq 1.8.0. > > Thanks, > Xiayu > > > > -----Original Message----- > From: Rao,Xiayu > Sent: Wednesday, April 16, 2014 9:18 AM > To: 'Alejandro Reyes'; bioconductor at r-project.org > Subject: RE: dexseq read.HTSeqCounts error > > Thank you, Alejandro! That sounds to be a good way to go. I will do what you suggested. > > Thanks, > Xiayu > > -----Original Message----- > From: Alejandro Reyes [mailto:alejandro.reyes at embl.de] > Sent: Wednesday, April 16, 2014 2:36 AM > To: Rao,Xiayu; bioconductor at r-project.org > Subject: Re: dexseq read.HTSeqCounts error > > Hi Xiayu, > > Please keep your answers in the mailing list, so it can be useful for users encountering similar issues. > > Yes, that is the problem, the function is expecting the same number of lines for each file (it is expecting the users to use the same annotation file or the same features as counting bins for all their samples). What you could do is read independently your files within R, generate a matrix of the union of all possible counting bins (the first column of your files) times samples (columns), and then fill them in with the information from your files, leaving a 0 on the corresponding missing values. Then use the function newExonCountSet to generate your ecs file. > > Best regards, > Alejandro > > > > > >> Hi, Alejandro >> >> Thank you for your prompt response. Below is the output. I think my files are OK, but they have different number of lines. So you think that is the issue?? I am not counting reads that locate in exonic regions, so I do not have a .gtf file. I am counting the reads within some splice sites. Do you have any suggestions for me in my case? >> >>> all( file.exists(as.character(sampleTable$countfiles)) ) >> [1] TRUE >> >> -sh-4.1$ wc -l *count.txt >> 222604 SRR791043_count.txt >> 215908 SRR791044_count.txt >> 232519 SRR791045_count.txt >> 227275 SRR791046_count.txt >> 241367 SRR791047_count.txt >> 253112 SRR791048_count.txt >> 234680 SRR791049_count.txt >> 199632 SRR791050_count.txt >> 208880 SRR791051_count.txt >> 195265 SRR791052_count.txt >> ......... >> >> Thanks, >> Xiayu >> >> >> -----Original Message----- >> From: Alejandro Reyes [mailto:alejandro.reyes at embl.de] >> Sent: Tuesday, April 15, 2014 11:44 AM >> To: Rao,Xiayu; 'bioconductor at r-project.org' >> Subject: Re: dexseq read.HTSeqCounts error >> >> Dear Xiayu Rao, >> >> What is the output of: >> >> all( file.exists(as.character(sampleTable$countfiles)) ) >> >> ? >> >> Maybe one of your files is corrupted? Do they have all the same number of lines? >> >> Best regards, >> Alejandro >> >> >>> Hi, Alejandro and others >>> >>> Thank you for your information on the forum, which is very helpful. I now have a problem in running dexseq. I need your advice to solve the problem. Any input would be very appreciated. >>> >>>> ecs <- >>>> read.HTSeqCounts(countfiles=as.character(sampleTable$countFile),desi >>>> g >>>> n=sampleTable, flattenedfile=NULL) >>> Error in `rownames<-`(`*tmp*`, value = c("ENSG00000000003:001", "ENSG00000000003:002", : >>> attempt to set 'rownames' on an object with no dimensions >>> >>> I then checked: >>>> dim(lf) >>> NULL >>>> dim(dcounts) >>> NULL >>> >>> I generated count files by myself instead of using dexseq_count.py. I tried to follow the standard count file format: >>> -sh-4.1$ more SRR791043_count.txt >>> ENSG00000000003:001 284 >>> ENSG00000000003:002 179 >>> ENSG00000000003:003 275 >>> ENSG00000000003:004 156 >>> ENSG00000000003:005 177 >>> ENSG00000000003:006 157 >>> ENSG00000000003:007 9 >>> ENSG00000000003:008 45 >>> ENSG00000000003:009 4 >>> ENSG00000000003+ENSG00000102362:001 3 >>> ............ >>> Intergene1:001 1 >>> ......... >>> intergene2097:001 11 >>> intergene2098:001 99 >>> intergene2099:001 54 >>> intergene2100:001 3 >>> intergene2101:001 3 >>> intergene2102:001 20 >>> intergene2103:001 1 >>> intergene2104:001 2 >>> intergene2105:001 1 >>> intergene2106:001 10 >>> _ambiguous 0 >>> _empty 0 >>> _lowaqual 0 >>> _notaligned 0 >>> >>> >>>> sampleTable >>> countFile condition >>> SRR791052 SRR791052_count.txt HER2+ >>> SRR791054 SRR791054_count.txt HER2+ >>> SRR791056 SRR791056_count.txt HER2+ >>> SRR791057 SRR791057_count.txt HER2+ >>> SRR791058 SRR791058_count.txt HER2+ >>> SRR791059 SRR791059_count.txt HER2+ >>> SRR791061 SRR791061_count.txt HER2+ >>> SRR791062 SRR791062_count.txt HER2+ >>> SRR791066 SRR791066_count.txt Benign >>> SRR791067 SRR791067_count.txt Benign >>> SRR791068 SRR791068_count.txt Benign >>> SRR791069 SRR791069_count.txt Benign >>> SRR791070 SRR791070_count.txt Benign >>> SRR791071 SRR791071_count.txt Benign >>> SRR791072 SRR791072_count.txt Benign >>> SRR791073 SRR791073_count.txt Benign >>> >>> Note: I am using R 3.0.3 and DEXSeq 1.8.0 >>> >>> >>> Thanks, >>> Xiayu >>>