Dear Eleanor,
Well, a couple of comments.
First, edgeR does not have a limitation on the number of genes it can
run on.
I suggest that you upgrade to the most recent version of edgeR, which I
suspect you do not have, and run
y <- estimateDisp(y,design)
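
For illustration, here is a minimal sketch of how that call might fit into the rest of the pipeline. It assumes a current edgeR installation and uses simulated counts in place of your file (the count matrix and the piRNA names are made up; only the Family/Treatment design is taken from your message):

```r
library(edgeR)

# Simulated stand-in for the piRNA count table (hypothetical data,
# 162 rows to mimic the smaller data set, 10 samples as in the design)
set.seed(1)
counts <- matrix(rnbinom(162 * 10, mu = 50, size = 5), nrow = 162)
colnames(counts) <- c("6C", "6H", "9C", "9H", "11C", "11H",
                      "26C", "26H", "28C", "28H")
rownames(counts) <- paste0("piRNA", seq_len(162))

# Paired design from the original message: family as block, treatment as effect
Family <- factor(c(6, 6, 9, 9, 11, 11, 26, 26, 28, 28))
Treatment <- factor(c("C", "H", "C", "H", "C", "H", "C", "H", "C", "H"))
design <- model.matrix(~Family + Treatment)
rownames(design) <- colnames(counts)

y <- DGEList(counts = counts)
y <- calcNormFactors(y)

# One call estimates the common, trended and tagwise dispersions
y <- estimateDisp(y, design)

# Test the treatment effect (the last column of the design matrix)
fit <- glmFit(y, design)
lrt <- glmLRT(fit, coef = ncol(design))
topTags(lrt)
```

With real data you would replace the simulated matrix with the counts read from your file; the rest should carry over unchanged.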
Second, given that you have already analyzed the full set of piRNAs
successfully, why in the world would you need to rerun the analysis on
just half of them? This does seem like a self-inflicted problem.
Gordon
> Date: Wed, 2 Apr 2014 09:58:23 -0700
> From: Eleanor Su <eleanorjinsu at gmail.com>
> To: Steve Lianoglou <lianoglou.steve at gene.com>
> Cc: "bioconductor at stat.math.ethz.ch" <bioconductor at stat.math.ethz.ch>
> Subject: Re: [BioC] Limitations in edgeR?
>
> Hi Steve,
>
> I'm running the same analysis on both datasets (the larger and the
> smaller). When I rerun the analysis on the smaller dataset (which actually
> IS half of the identities from the larger data set), I come across an error
> message when estimating glm trended dispersion. Here are the commands I'm
> using:
>
>> rawdata<-read.delim("piRNAtotalcount>10.txt", check.names=FALSE,
> stringsAsFactors=FALSE)
>> y <- DGEList(counts=rawdata[,2:11], genes=rawdata[,1])
>> Family<-factor(c(6,6,9,9,11,11,26,26,28,28))
>> Treatment<-factor(c("C","H","C","H","C","H","C","H","C","H"))
>> data.frame(Sample=colnames(y),Family,Treatment)
>    Sample Family Treatment
> 1      6C      6         C
> 2      6H      6         H
> 3      9C      9         C
> 4      9H      9         H
> 5     11C     11         C
> 6     11H     11         H
> 7     26C     26         C
> 8     26H     26         H
> 9     28C     28         C
> 10    28H     28         H
>> design<-model.matrix(~Family+Treatment)
>> rownames(design)<-colnames(y)
>> y<-estimateGLMTrendedDisp(y,design)
> Error in optim(par0, fun, y = y.nonzero[i, ], design = design, offset =
> offset.nonzero[i, :
>   function cannot be evaluated at initial parameters
>
> I only encounter this error when running the smaller dataset.
>
> Best,
> Eleanor
>
>
>
> On Wed, Apr 2, 2014 at 9:49 AM, Steve Lianoglou <lianoglou.steve at gene.com> wrote:
>
>> Hi Eleanor,
>>
>> On Tue, Apr 1, 2014 at 11:09 AM, Eleanor Su <eleanorjinsu at gmail.com>
>> wrote:
>>> Hi All,
>>>
>>> I'm currently trying to analyze differential expression of piRNAs in some
>>> small data sets but am coming across issues that I didn't before when I
>>> analyzed with a larger data set. The larger data set contained 324 piRNA
>>> identities while the smaller data set contained half as many piRNA
>>> identities. Is there a minimum number of gene identities required in order
>>> to analyze differential expression in edgeR?
>>
>> It's hard to help without knowing what the issues are that you are
>> running into, so ... what's going wrong?
>>
>> One way you could explore this question yourself is to use the larger
>> (324 piRNA) dataset that "went well" and simply take half of the data
>> from it and rerun the same analysis on the smaller set. Do you get
>> different results?
>>
>> While you're playing with that idea, please provide a follow up email
>> with more specific details about what the issues are that you are
>> running into with your new (smaller) dataset.
>>
>> HTH,
>> -steve
>>
>> --
>> Steve Lianoglou
>> Computational Biologist
>> Genentech