SAM output explanation and an FDR question

Ettinger, Nicholas ▴ 110

@ettinger-nicholas-1549

Hello all!

(A) For a not so statistically gifted grad student, can someone either tell me or point me to a place where I can understand what exactly all the columns in my "summary(sam.out)" output mean? I understand Delta, cutlow, cutup and FDR, but I am not exactly sure about the others (s0, False, Called, j2, j1). I looked through the siggenes vignette, but this question is not addressed specifically.

    > summary(sam.out)

    SAM Analysis for the One-Class Case

     s0 = 0.1046  (The 30 % quantile of the s values.)

     Number of permutations: 16 (complete permutation)

     MEAN number of falsely called genes is computed.

      Delta    p0     False Called   FDR cutlow cutup    j2    j1
    1   0.1 0.833 24511.25   27472 0.743 -0.130 1.459 26547 53751
    2   0.2 0.833  2863.875   5314 0.449 -1.066   Inf  5314 54676
    3   0.3 0.833   449.625   1085 0.345 -1.615   Inf  1085 54676
    4   0.4 0.833    53        140 0.315 -2.307   Inf   140 54676

(B) A related question: is there any kind of consensus on "how much" FDR is "too much" FDR? In other words, it is pretty well accepted that for small numbers of hypotheses, we want p < 0.05 to guide us as to whether a change is significant or not. Has any kind of consensus evolved in a similar manner for FDR? Any primary literature addressing this?

Thank you!!
---Nick
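As a quick arithmetic check on the table in the question, the FDR column is numerically consistent with p0 * False / Called, where False is the mean number of falsely called genes over the permutations (as the header line of the output says) and Called is taken to be the number of genes called significant at that Delta. The Python sketch below only re-derives the printed numbers; it is not the siggenes code, the function name `estimated_fdr` is made up for illustration, and the reading of p0 as the estimated proportion of genes that are not differentially expressed is an assumption.

```python
# Rows copied from the summary(sam.out) table in the question:
# (Delta, p0, False, Called, reported FDR)
rows = [
    (0.1, 0.833, 24511.25, 27472, 0.743),
    (0.2, 0.833, 2863.875, 5314, 0.449),
    (0.3, 0.833, 449.625, 1085, 0.345),
    (0.4, 0.833, 53.0, 140, 0.315),
]


def estimated_fdr(p0, n_false, n_called):
    """Assumed relation: FDR = p0 * (mean falsely called) / (number called)."""
    return p0 * n_false / n_called


# Recompute the FDR for each Delta and compare with the printed value.
recomputed = [estimated_fdr(p0, f, c) for _, p0, f, c, _ in rows]
reported = [fdr for *_, fdr in rows]

for (delta, *_), new, old in zip(rows, recomputed, reported):
    print(f"Delta={delta}: recomputed {new:.3f}, reported {old:.3f}")
```

Because p0 is printed to only three decimals, the comparison is meaningful only up to rounding.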

siggenes • 1.7k views

updated 19.3 years ago by Naomi Altman ★ 6.0k • written 19.3 years ago by Ettinger, Nicholas ▴ 110

Naomi Altman ★ 6.0k

@naomi-altman-380

United States

I can answer a couple of your questions.

a) Tests of differential expression for a particular gene use the ratio of the difference in log expression to an estimate of how variable this difference should be in a random sample. When the number of arrays is small, the estimate of variability is poor; it can be improved by adding a constant that is computed using the data from all the genes. That constant is s0.

b) There is no consensus about appropriate values for FDR, because what is appropriate depends both on the goal of the study (find a few interesting genes, or find all the important genes) and on pi0, the proportion of genes that are not differentially expressed. We should also worry about the false negative rate (FNR). When pi0 is large (say over 90%), FNR is negligible for the FDR values typically chosen. When pi0 is small (say less than 20%), FDR is negligible, but FNR may be high.

--Naomi

At 05:32 PM 1/10/2006, Ettinger, Nicholas wrote:
>Hello all!
>[...]

Naomi S. Altman
Associate Professor, Dept. of Statistics
Penn State University, University Park, PA 16802-2111
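To make point (a) concrete, here is a minimal Python sketch of a one-class statistic with and without the added constant. It is an illustration under stated assumptions, not the siggenes implementation: the helper names `quantile` and `one_class_stats` and the toy data are invented, s is assumed to be the standard error of each gene's mean log ratio, and s0 is simply set to the 30% quantile of the s values to mirror the line printed by summary(sam.out). The interpolation rule in `quantile` may differ from the one siggenes uses.

```python
import math
import statistics


def quantile(values, q):
    """Quantile by linear interpolation between order statistics."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def one_class_stats(log_ratios, s0_quantile=0.30):
    """Return (plain t, moderated d, s0) for rows of per-gene log ratios."""
    means = [statistics.mean(row) for row in log_ratios]
    # s: standard error of the mean for each gene (assumed definition).
    ses = [statistics.stdev(row) / math.sqrt(len(row)) for row in log_ratios]
    # s0: one constant shared by all genes, taken from the s distribution.
    s0 = quantile(ses, s0_quantile)
    plain = [m / s for m, s in zip(means, ses)]
    moderated = [m / (s + s0) for m, s in zip(means, ses)]
    return plain, moderated, s0


# Hypothetical toy data: 4 genes measured on 4 arrays.
toy = [
    [0.10, 0.11, 0.10, 0.11],      # tiny effect, almost no variance
    [1.20, 0.80, 1.50, 1.00],      # large effect, moderate variance
    [0.30, -0.40, 0.50, -0.20],    # noise around zero
    [-0.90, -1.30, -0.70, -1.10],  # large negative effect
]

plain, moderated, s0 = one_class_stats(toy)
for i, (t, d) in enumerate(zip(plain, moderated)):
    print(f"gene {i}: plain t = {t:8.3f}, moderated d = {d:7.3f}")
```

In this toy data the first gene has by far the largest plain t only because its variance estimate happens to be tiny; once s0 is added to the denominator it drops well below the two genes with large effects, which is the kind of stabilization the constant is meant to provide.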

19.3 years ago Naomi Altman ★ 6.0k
