wiki:StatisticalAnalysis

Version 2 (modified by gkronber, 7 years ago)

A page for collecting and discussing statistical analysis of metaheuristic optimization experiments.

In metaheuristic optimization experiments we measure the outcome of a stochastic process. From these measurements we hope to estimate the mean of the outcome's distribution. The outcome is usually the quality reached, or the time or number of iterations required to reach a given quality. These outcomes are random variables with unknown distributions.
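Because the distribution of the outcome is unknown, one distribution-free way to report the estimated mean together with its uncertainty is a percentile bootstrap confidence interval. The following is only a minimal sketch: the function name `bootstrap_mean_ci`, its parameters, and the `qualities` values are hypothetical and chosen for illustration, not taken from any real experiment.

```python
import random
import statistics


def bootstrap_mean_ci(samples, n_resamples=2000, confidence=0.95, seed=0):
    """Return (sample mean, lower bound, upper bound) of a percentile bootstrap CI."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(samples)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(samples, k=n)) for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    lo_idx = round((alpha / 2) * n_resamples)
    hi_idx = min(round((1 - alpha / 2) * n_resamples) - 1, n_resamples - 1)
    return statistics.fmean(samples), means[lo_idx], means[hi_idx]


# Hypothetical best qualities from 10 independent runs of one algorithm.
qualities = [0.91, 0.88, 0.93, 0.90, 0.87, 0.95, 0.89, 0.92, 0.90, 0.94]
mean, low, high = bootstrap_mean_ci(qualities)
print(f"mean={mean:.3f}, 95% CI=[{low:.3f}, {high:.3f}]")
```

With only a handful of runs the interval is rough, so it is better read as an indication of spread than as an exact bound.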

The goal of such experiments is to show that one process achieves a better outcome than another. There are several ways to show this:

  • Boxplot charts
  • Overlaid histograms (have not yet seen anyone doing it, but could also be worth a try)
  • Statistical hypothesis tests for a difference between two means
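As an illustration of the last item, a permutation test is one hypothesis test for a difference in means that makes no assumption about the shape of the outcome distribution. This is a minimal sketch under that choice, not a recommendation over other tests; the function name `permutation_test_mean_diff` and the two result lists are hypothetical.

```python
import random
import statistics


def permutation_test_mean_diff(a, b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for a difference in means between a and b."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are exchangeable,
        # so we shuffle the pooled results and split them again.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical final qualities of two algorithms, 8 independent runs each.
alg_a = [0.91, 0.88, 0.93, 0.90, 0.87, 0.95, 0.89, 0.92]
alg_b = [0.80, 0.78, 0.83, 0.81, 0.79, 0.82, 0.77, 0.84]
p_value = permutation_test_mean_diff(alg_a, alg_b)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value only says that the observed difference would be unusual if both algorithms had the same outcome distribution. It says nothing about how large or practically relevant the difference is.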

Statistical Analysis Methods

Possible Workflow / Methodology

  1. Test the data for distributional assumptions (e.g. normality) to decide whether to apply parametric or non-parametric tests
  2. When more than two processes are compared, perform an omnibus test such as ANOVA or the Friedman test
  3. If the omnibus test is significant, follow up with pairwise comparisons, adjusting the p-values with a post hoc procedure
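The adjustment in the last step can be sketched with the Holm step-down procedure, which is one common choice among several post hoc procedures. The sketch assumes the raw p-values have already been obtained from pairwise tests. The function name `holm_adjust`, the comparison labels, and the p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values for a dict {comparison: raw p}."""
    # Sort comparisons from smallest to largest raw p-value.
    ordered = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ordered)
    adjusted = {}
    running_max = 0.0
    for rank, (name, p) in enumerate(ordered):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p)
        # Enforce monotonicity: an adjusted p-value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[name] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons.
raw = {"A vs B": 0.010, "A vs C": 0.040, "B vs C": 0.030}
adjusted = holm_adjust(raw)
significant = [name for name, p in adjusted.items() if p < 0.05]
print(adjusted)
print(significant)
```

In this made-up example only "A vs B" stays below 0.05 after adjustment, although all three raw p-values were below that threshold.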

Critique

  1. Steven Goodman. 2008. A Dirty Dozen: Twelve P-Value Misconceptions. Seminars in Hematology Volume 45, Issue 3, July 2008, Pages 135–140
  2. Jacob Cohen. 1994. The Earth is Round (p < 0.05). American Psychologist. http://ist-socrates.berkeley.edu/~maccoun/PP279_Cohen1.pdf

References

  • García, Fernández, Luengo, Herrera. 2010. Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Information Sciences 180, pp. 2044–2064. (http://sci2s.ugr.es/sicidm/pdf/2010-Garcia-INS.pdf)
