Wilcoxon test [was: logged data or not logged prior to using normalize.quantile]

Matthew Hannah ▴ 940

@matthew-hannah-621

Last seen 10.1 years ago

> Not forgetting that the two-sample t-test performs fine under the same
> circumstances (large balanced samples), even for non-normal distributions
> and unequal variances.
>
> Regards
> Gordon

Does anyone by any chance have a few references for this point, particularly for non-normal distributions? I've seen references to Monte Carlo simulation studies that look at assumption violations, but being at a biological institute it's difficult to get access to good statistics texts. All internet searches just mention 'large' and 'balanced' samples. I would be especially interested in 'what if' situations like the ones you gave for the Wilcoxon test.

I have group sizes between 0-30, generally unbalanced to some degree (mean min/max = 15/25). I know these are not that large (if large at all), but I'm looking to 'quantify' what problems I may get comparing sample sizes of, say, 6, 15, 21, 25, 29. There are also non-normal distributions, skew and outliers to take into account in some cases.

I'm wondering, if I have unbalanced group sizes (x larger than y), whether it would reduce the problems of unequal variance to draw

    x1 <- sample(x, length(y))

then test (x1, y) for a number (10?) of repeats and take the maximum p.value. I guess anything with n < 10 would have to be discarded first.

Looking at the data case by case is not possible with >500 compounds and ~20 groups.

Cheers for any info,
Matt
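The subsampling idea above could be sketched in R roughly as follows. This is only an illustration under assumptions: `subsample_max_p` is a hypothetical helper name, `x` and `y` are taken to be numeric vectors with `x` the longer one, and the simulated data are made up. `t.test` defaults to the Welch (unequal-variance) version; `wilcox.test` could be passed as `test` instead.

```r
# Hypothetical helper (not from the original post): repeatedly down-sample
# the larger group to the size of the smaller one, test, keep the largest p.
subsample_max_p <- function(x, y, n_rep = 10, test = t.test) {
  stopifnot(length(x) > length(y))
  p <- replicate(n_rep, {
    x1 <- sample(x, length(y))  # length(y) values drawn from x, no replacement
    test(x1, y)$p.value
  })
  max(p)
}

# Made-up example data: unbalanced sizes and unequal variances.
set.seed(1)
x <- rnorm(25, mean = 0, sd = 2)
y <- rnorm(15, mean = 0, sd = 1)
p_max <- subsample_max_p(x, y)
p_max
```

Taking the maximum over repeats is conservative by construction, and every repeat discards part of `x`, so this sketch says nothing about how much power is lost compared with testing the full groups.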

updated 19.5 years ago by A.J. Rossini ▴ 210 • written 19.5 years ago by Matthew Hannah ▴ 940

A.J. Rossini ▴ 210

@aj-rossini-973

Last seen 10.1 years ago

One readable citation in this area is:

Lumley T, Diehr P, Emerson S, Chen L. The importance of the normality assumption in large public health data sets. Annu Rev Public Health. 2002;23:151-69. Epub 2001 Oct 25.

Referenced at: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=11910059&dopt=Abstract

On 4/11/05, Matthew Hannah wrote:
> Does anyone by any chance have a few references for this point,
> particularly for non-normal distributions.

best,
-tony

"Commit early, commit often, and commit in a repository from which we can easily roll-back your mistakes" (AJR, 4Jan05).

A.J. Rossini
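For the 'what if' part of the question, a small Monte Carlo simulation can estimate how often each test rejects when there is truly no difference between groups. A rough sketch, assuming both groups are drawn from the same skewed distribution (exponential here, chosen only for illustration); `sim_type1` is a hypothetical name, and the group sizes 6 and 29 are taken from the question.

```r
# Hypothetical helper: estimate the type I error rate of three tests when
# both groups come from the same distribution `rdist` (so the null is true).
sim_type1 <- function(n1, n2, rdist = rnorm, n_sim = 2000, alpha = 0.05) {
  rejections <- replicate(n_sim, {
    a <- rdist(n1)
    b <- rdist(n2)
    c(welch    = t.test(a, b)$p.value,
      student  = t.test(a, b, var.equal = TRUE)$p.value,
      wilcoxon = wilcox.test(a, b)$p.value) < alpha
  })
  rowMeans(rejections)  # fraction of simulations in which each test rejected
}

set.seed(42)
rates <- sim_type1(6, 29, rdist = rexp)
rates
```

Rates close to `alpha` suggest the test holds its nominal level for that combination of sizes and distribution. Unequal variances are not covered by this sketch; they would need a second generating function for one of the groups.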

19.5 years ago A.J. Rossini ▴ 210
