Entering edit mode
Hello, Alejandro
Thank you very much for your previous guidance. I did what you
suggested, merging the count data for all samples to generate a matrix
and used this following code to get ecs.
> ecs <- newExonCountSet(countData=countdata, design=sampleTable,
geneIDs=geneid, exonIDs=exonid)
The estimateDispersions(ecs) step was "Done", but the problem came as
the following. Could you please give me some advice on dealing with
this error ??
.............................................
Done
> ecs <- fitDispersionFunction(ecs)
Error in fitDispersionFunction(ecs) :
no CR dispersion estimations found, please first call
estimateDispersions function
I found a way from the forum to check. It seems that the dispersion
estimates are all NAs, which could also be seen from fData(ecs).
> sum(!is.na(fData(ecs)$dispBeforeSharing)) / nrow(fData(ecs))
[1] 0
> head(countdata)
SRR791052 SRR791054 SRR791056 SRR791057 SRR791058
ENSG00000000003:0001 0 0 0 0 0
ENSG00000000003:0002 75 120 67 108 93
ENSG00000000003:0003 0 0 0 0 0
ENSG00000000003:0004 78 126 59 58 74
ENSG00000000003:0005 0 0 0 0 0
ENSG00000000003:0006 67 114 37 47 54
SRR791059 SRR791061 SRR791062 SRR791066 SRR791067
ENSG00000000003:0001 0 0 0 0 2
ENSG00000000003:0002 126 85 108 43 41
ENSG00000000003:0003 0 1 0 0 0
ENSG00000000003:0004 131 67 98 37 14
ENSG00000000003:0005 0 0 0 0 0
ENSG00000000003:0006 94 56 80 37 26
SRR791068 SRR791069 SRR791070 SRR791071 SRR791072
ENSG00000000003:0001 0 0 0 0 0
ENSG00000000003:0002 34 64 55 62 64
ENSG00000000003:0003 1 0 0 0 0
ENSG00000000003:0004 22 26 32 54 47
ENSG00000000003:0005 0 0 0 0 0
ENSG00000000003:0006 33 36 42 60 40
SRR791073
ENSG00000000003:0001 0
ENSG00000000003:0002 50
ENSG00000000003:0003 0
ENSG00000000003:0004 18
ENSG00000000003:0005 1
ENSG00000000003:0006 29
> head(geneid)
[1] "ENSG00000000003" "ENSG00000000003" "ENSG00000000003"
"ENSG00000000003"
[5] "ENSG00000000003" "ENSG00000000003"
> head(exonid)
[1] "0001" "0002" "0003" "0004" "0005" "0006"
> sampleTable
condition
SRR791052 HER2+
SRR791054 HER2+
SRR791056 HER2+
SRR791057 HER2+
SRR791058 HER2+
SRR791059 HER2+
SRR791061 HER2+
SRR791062 HER2+
SRR791066 Benign
SRR791067 Benign
SRR791068 Benign
SRR791069 Benign
SRR791070 Benign
SRR791071 Benign
SRR791072 Benign
SRR791073 Benign
Note: I am using R 3.0.3 and DEXSeq 1.8.0.
Thanks,
Xiayu
-----Original Message-----
From: Rao,Xiayu
Sent: Wednesday, April 16, 2014 9:18 AM
To: 'Alejandro Reyes'; bioconductor at r-project.org
Subject: RE: dexseq read.HTSeqCounts error
Thank you, Alejandro! That sounds to be a good way to go. I will do
what you suggested.
Thanks,
Xiayu
-----Original Message-----
From: Alejandro Reyes [mailto:alejandro.reyes@embl.de]
Sent: Wednesday, April 16, 2014 2:36 AM
To: Rao,Xiayu; bioconductor at r-project.org
Subject: Re: dexseq read.HTSeqCounts error
Hi Xiayu,
Please keep your answers in the mailing list, so it can be useful for
users encountering similar issues.
Yes, that is the problem, the function is expecting the same number of
lines for each file (it is expecting the users to use the same
annotation file or the same features as counting bins for all their
samples). What you could do is read independently your files within
R, generate a matrix of the union of all possible counting bins (the
first column of your files) times samples (columns), and then fill
them in with the information from your files, leaving a 0 on the
corresponding missing values. Then use the function newExonCountSet to
generate your ecs file.
Best regards,
Alejandro
> Hi, Alejandro
>
> Thank you for your prompt response. Below is the output. I think my
files are OK, but they have different number of lines. So you think
that is the issue?? I am not counting reads that locate in exonic
regions, so I do not have a .gtf file. I am counting the reads within
some splice sites. Do you have any suggestions for me in my case?
>
>> all( file.exists(as.character(sampleTable$countfiles)) )
> [1] TRUE
>
> -sh-4.1$ wc -l *count.txt
> 222604 SRR791043_count.txt
> 215908 SRR791044_count.txt
> 232519 SRR791045_count.txt
> 227275 SRR791046_count.txt
> 241367 SRR791047_count.txt
> 253112 SRR791048_count.txt
> 234680 SRR791049_count.txt
> 199632 SRR791050_count.txt
> 208880 SRR791051_count.txt
> 195265 SRR791052_count.txt
> .........
>
> Thanks,
> Xiayu
>
>
> -----Original Message-----
> From: Alejandro Reyes [mailto:alejandro.reyes at embl.de]
> Sent: Tuesday, April 15, 2014 11:44 AM
> To: Rao,Xiayu; 'bioconductor at r-project.org'
> Subject: Re: dexseq read.HTSeqCounts error
>
> Dear Xiayu Rao,
>
> What is the output of:
>
> all( file.exists(as.character(sampleTable$countfiles)) )
>
> ?
>
> Maybe one of your files is corrupted? Do they have all the same
number of lines?
>
> Best regards,
> Alejandro
>
>
>> Hi, Alejandro and others
>>
>> Thank you for your information on the forum, which is very helpful.
I now have a problem in running dexseq. I need your advice to solve
the problem. Any input would be very appreciated.
>>
>>> ecs <-
>>>
read.HTSeqCounts(countfiles=as.character(sampleTable$countFile),desi
>>> g
>>> n=sampleTable, flattenedfile=NULL)
>> Error in `rownames<-`(`*tmp*`, value = c("ENSG00000000003:001",
"ENSG00000000003:002", :
>> attempt to set 'rownames' on an object with no dimensions
>>
>> I then checked:
>>> dim(lf)
>> NULL
>>> dim(dcounts)
>> NULL
>>
>> I generated count files by myself instead of using dexseq_count.py.
I tried to follow the standard count file format:
>> -sh-4.1$ more SRR791043_count.txt
>> ENSG00000000003:001 284
>> ENSG00000000003:002 179
>> ENSG00000000003:003 275
>> ENSG00000000003:004 156
>> ENSG00000000003:005 177
>> ENSG00000000003:006 157
>> ENSG00000000003:007 9
>> ENSG00000000003:008 45
>> ENSG00000000003:009 4
>> ENSG00000000003+ENSG00000102362:001 3
>> ............
>> Intergene1:001 1
>> .........
>> intergene2097:001 11
>> intergene2098:001 99
>> intergene2099:001 54
>> intergene2100:001 3
>> intergene2101:001 3
>> intergene2102:001 20
>> intergene2103:001 1
>> intergene2104:001 2
>> intergene2105:001 1
>> intergene2106:001 10
>> _ambiguous 0
>> _empty 0
>> _lowaqual 0
>> _notaligned 0
>>
>>
>>> sampleTable
>> countFile condition
>> SRR791052 SRR791052_count.txt HER2+
>> SRR791054 SRR791054_count.txt HER2+
>> SRR791056 SRR791056_count.txt HER2+
>> SRR791057 SRR791057_count.txt HER2+
>> SRR791058 SRR791058_count.txt HER2+
>> SRR791059 SRR791059_count.txt HER2+
>> SRR791061 SRR791061_count.txt HER2+
>> SRR791062 SRR791062_count.txt HER2+
>> SRR791066 SRR791066_count.txt Benign
>> SRR791067 SRR791067_count.txt Benign
>> SRR791068 SRR791068_count.txt Benign
>> SRR791069 SRR791069_count.txt Benign
>> SRR791070 SRR791070_count.txt Benign
>> SRR791071 SRR791071_count.txt Benign
>> SRR791072 SRR791072_count.txt Benign
>> SRR791073 SRR791073_count.txt Benign
>>
>> Note: I am using R 3.0.3 and DEXSeq 1.8.0
>>
>>
>> Thanks,
>> Xiayu
>>