problem with siggenes

Edoardo Saccenti

I would like to run an FDR analysis via SAM as implemented in the siggenes package.

First I read two .CEL files into an AffyBatch object called "mydata". Then I used the rma routine to correct my data, obtaining an exprSet object called "myeset".

According to the guide, I need to pass to sam the data (myeset in this case) and a vector cl. This is a one-class case, so cl must be a vector of ones whose length equals the number of samples. As there are 2 samples (2 CEL files):

    cl <- c(1,1)

Typing at the R prompt:

    out <- sam.dstat(myeset, cl, rand=123)

I get the following:

    We're doing 4 complete permutations
    Error in rowSums(x, prod(dn), p, na.rm) : invalid value of n
    In addition: Warning message:
    There are 147 genes with zero variance. These genes are removed,
    and their d-values are set to NA.

I'm sure I'm making some stupid mistake because I'm new to R and BioC; nevertheless, can anybody help me?

Thanks,
Edoardo

Dr. Edoardo Saccenti
FiorGen Pharmacogenomics Foundation
CERM Nuclear Magnetic Resonance Research Center
Scientific Pole - University of Florence

Pharmacogenomics siggenes • 1.3k views

Updated 19.9 years ago by Holger Schwender • written 19.9 years ago by Edoardo Saccenti

michael watson IAH-C

As far as I know, if you only have two arrays, one from each "treatment" in your experiment, there is no way that you can do any kind of statistics at all....
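To illustrate why two arrays leave almost nothing to work with in the one-class setting described in the question, here is a minimal base-R sketch. It assumes that one-class SAM builds its null distribution by flipping the sign of each sample, which is consistent with the message "We're doing 4 complete permutations" (2^2 = 4). The toy matrix `x`, the helper `one_class_stat`, and the omission of the fudge factor s0 are illustrative assumptions, not siggenes internals.

```r
# Hypothetical toy data: 5 genes (rows) measured on 2 arrays (columns).
set.seed(123)
x <- matrix(rnorm(10), nrow = 5, ncol = 2)

# All sign patterns for 2 samples: 2^2 = 4 "complete permutations".
signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), ncol(x))))
nrow(signs)

# One-sample t-like statistic per gene (fudge factor s0 left out for simplicity).
one_class_stat <- function(m) {
  n <- ncol(m)
  rowMeans(m) / (apply(m, 1, sd) / sqrt(n))
}

# Statistic under each sign pattern: rows are genes, columns are patterns.
perm_stats <- apply(signs, 1, function(s) one_class_stat(sweep(x, 2, s, `*`)))

# Number of distinct absolute values each gene's statistic can take.
n_distinct <- apply(perm_stats, 1, function(v) length(unique(round(abs(v), 10))))
n_distinct
```

Under these assumptions each gene's statistic can take at most two distinct absolute values across all sign patterns, so the permutation distribution is far too coarse to support a meaningful FDR estimate.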

Written 19.9 years ago by michael watson IAH-C

Fangxin Hong

First, please ignore the emails I sent out yesterday; I was using an old version of siggenes.

However, I do find problems with siggenes. There is an Excel version of the SAM method, which can be downloaded from the Stanford website. For the same data set, I got very different results from siggenes and from Excel SAM:

    out=sam(data,cl,delta=seq(0.1,7,0.1),rand=123)
    FDR=summary(out)[,5]
    Delta=summary(out)[,1]
    d.min=min(Delta[FDR<0.05])
    gene.list1=summary(out,d.min,ll=FALSE)$row.sig.genes
    gene.list2=list.siggenes(out,d.min)

siggenes identifies far fewer genes than Excel SAM does. In addition, if only one gene is identified at a given delta value, summary()$row.sig.genes (gene.list1 above) will not list that gene, since there is an error in the function. list.siggenes only prints the identified genes; it does not return the gene list/names, so they cannot be assigned to another object (gene.list2 is empty in the example).

Does anyone know what is going on here, or what mistakes I might have made?

Thanks.
Fangxin

Fangxin Hong, Ph.D.
Plant Biology Laboratory
The Salk Institute
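The delta-selection step in the post (`min(Delta[FDR<0.05])`) returns `Inf` with a warning when no delta reaches the target FDR. The base-R sketch below shows one way to guard against that. The data frame `tab` and the helper `pick_delta` are hypothetical stand-ins for the Delta and FDR columns taken from `summary(out)`; they are not part of siggenes.

```r
# Hypothetical table standing in for the Delta and FDR columns of a SAM summary.
tab <- data.frame(Delta = seq(0.5, 3, by = 0.5),
                  FDR   = c(0.40, 0.22, 0.09, 0.04, 0.01, 0.00))

# Smallest delta whose estimated FDR is below the target; NA if none qualifies.
pick_delta <- function(delta, fdr, target = 0.05) {
  ok <- !is.na(fdr) & fdr < target
  if (!any(ok)) return(NA_real_)
  min(delta[ok])
}

d.min  <- pick_delta(tab$Delta, tab$FDR)              # a delta qualifies
d.none <- pick_delta(tab$Delta, tab$FDR, target = 0)  # nothing qualifies -> NA
```

Returning `NA` instead of `Inf` makes the "no delta qualifies" case easy to test for before any downstream call.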

Written 19.9 years ago by Fangxin Hong

Holger Schwender

Hi Fangxin,

there are some differences between the default options of the R version and the Excel version.

First, R SAM computes Welch's t-statistic by default (i.e. a t-statistic that does not assume equal group variances), while Excel SAM computes the "usual" t-statistic. Set var.equal=TRUE to obtain the usual t-statistic.

Second, R SAM computes the mean number of falsely called genes by default, whereas Excel SAM computes the median number of falsely called genes. Set med=TRUE to obtain the median number.

There are some other differences, but the two above should be the main reasons for the different number of genes you obtain. I have summarized all the differences more than once at the Excel SAM forum sam-software@yahoo.com, and I will add a function RvsExcelSAM in the next version of siggenes.

You are correct: there is a bug in the summary function for SAM. You will get an error when there is no, or only one, differentially expressed gene. This will be fixed in the next version of siggenes. I had planned to publish this version in the middle of January (i.e. sometime around today), but because of lots of other work it will take a little longer.

The only purpose of the list.siggenes function is to list the significant genes, not to give you any of the statistics (summary is the function that does this). It was originally intended to write these names to a file, which is why it currently has no output. It will have one in the next version of siggenes.

Best,
Holger
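The two default differences described above can be illustrated with base R alone. This is a sketch on made-up numbers, not a reproduction of either SAM implementation: `t.test` from the stats package stands in for the per-gene statistic, and `false_calls` is a hypothetical vector of falsely called gene counts across permutations.

```r
# Hypothetical expression values for one gene in two groups of unequal size
# (with equal group sizes the two t-statistics coincide and only the df differ).
set.seed(1)
g1 <- rnorm(4, mean = 0, sd = 1)
g2 <- rnorm(8, mean = 1, sd = 3)

# Welch's t-statistic (no equal-variance assumption) vs. the pooled "usual" one.
t_welch  <- unname(t.test(g1, g2, var.equal = FALSE)$statistic)
t_pooled <- unname(t.test(g1, g2, var.equal = TRUE)$statistic)
c(welch = t_welch, pooled = t_pooled)

# Hypothetical numbers of falsely called genes in 5 permutations.
false_calls <- c(0, 1, 1, 2, 30)
mean(false_calls)    # pulled upwards by the single large count
median(false_calls)  # robust to that count
```

When the permutation counts are right-skewed, as in this toy vector, the mean exceeds the median. A mean-based FDR estimate is then larger, which may explain part of why fewer genes pass a fixed FDR threshold under the R defaults.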

Written 19.9 years ago by Holger Schwender
