Opened 12 years ago

Closed 10 years ago

Last modified 10 years ago

#573 closed enhancement (invalid)

Implement statistical analysis

Reported by: abeham
Owned by: abeham
Priority: medium
Milestone: HeuristicLab 3.3.0
Component: ZZZ OBSOLETE: StatisticalAnalysis
Version: 3.2
Keywords:
Cc:

Description

Several statistical analysis methods would be nice to have in HeuristicLab 3.

Among these are parametric and non-parametric statistical tests such as ANOVA, the t-test, and the Mann-Whitney U test, as well as classes that provide functions for calculating the mean, median, quantiles, standard deviation, and variance.

Change History (8)

comment:1 Changed 12 years ago by abeham

  • Status changed from new to assigned

comment:2 Changed 12 years ago by abeham

A Mann-Whitney U test was implemented in r1517. It still needs to cope with equal ranks (ties) within a population, but seems to work quite well otherwise.

The structure of the interfaces and base classes, as well as the HL3 operators, still needs to be defined.

comment:3 Changed 12 years ago by abeham

Fixed a bug with unsorted arrays in r1548; the arrays are now sorted in all cases. Quick tests against an implementation found on the net showed very close results. A small difference remains in the calculated U value and needs to be looked into.

comment:4 Changed 12 years ago by abeham

Fixed a small bug in r1549: integer division was performed where double division should have been used. This explains the small differences noted in the previous comment. Regarding comment 2, equal ranks within a variable are no problem, since they are summed up at the end anyway.

A few tests have now been run against the implementation at http://elegans.swmed.edu/~leon/stats/utest.html; the calculated U values and the approximated p-values are identical.
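
For illustration, the U calculation discussed in the comments above could be sketched as follows. This is not the code from r1517/r1549; it is a minimal Python sketch, and the function names `rank_sums` and `mann_whitney_u` are made up for this example. It assigns midranks to equal values (one common way to handle ties, assumed here) and uses floating-point division throughout, which is the class of error fixed in r1549.

```python
def rank_sums(x, y):
    """Rank sums of x and y within the pooled sample (hypothetical helper).

    Equal values receive the average of the ranks they occupy (midranks).
    """
    # Tag each value with its sample of origin, then sort the pooled data.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y],
                    key=lambda t: t[0])
    sums = [0.0, 0.0]
    i = 0
    while i < len(pooled):
        # Find the end of the run of values equal to pooled[i].
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # The run occupies the 1-based ranks i+1 .. j; use their average.
        midrank = (i + 1 + j) / 2.0
        for k in range(i, j):
            sums[pooled[k][1]] += midrank
        i = j
    return sums[0], sums[1]


def mann_whitney_u(x, y):
    """Return (U1, U2) for the samples x and y."""
    n1, n2 = len(x), len(y)
    r1, r2 = rank_sums(x, y)
    # Floating-point division on purpose; integer division here would be
    # the kind of bug described in comment 4.
    u1 = r1 - n1 * (n1 + 1) / 2.0
    u2 = r2 - n2 * (n2 + 1) / 2.0
    return u1, u2
```

For two fully separated samples, `mann_whitney_u([1, 2, 3], [4, 5, 6])` returns `(0.0, 9.0)`, and in every case U1 + U2 equals n1 * n2, which is a handy sanity check.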

Still to do: implement a one-tailed test (very similar to the two-tailed test) and define an interface for non-parametric statistical tests.

For practical purposes, a two-tailed test of the hypothesis that both independently sampled variables stem from the same distribution should be performed. When the test rejects this hypothesis, the calculated sample mean or median indicates which variable is lower or higher (better or worse).
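
The procedure described above might look roughly like this in Python. It is only a sketch under stated assumptions: the p-value uses the plain normal approximation of U without tie or continuity correction (the ticket does not say which approximation the implementation uses), the names `two_tailed_p` and `lower_sample` are hypothetical, and the significance level of 0.05 is just a common default. The U value is passed in rather than computed here.

```python
import math
from statistics import median


def two_tailed_p(u, n1, n2):
    """Two-tailed p-value for U via the normal approximation (assumed)."""
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))


def lower_sample(x, y, u, alpha=0.05):
    """Decide which sample is lower, but only if the test rejects H0."""
    p = two_tailed_p(u, len(x), len(y))
    if p >= alpha:
        # H0 (same distribution) is not rejected: make no claim.
        return "no significant difference"
    mx, my = median(x), median(y)
    if mx < my:
        return "x is lower"
    if my < mx:
        return "y is lower"
    return "different, but medians are equal"
```

Whether "lower" means better or worse depends on whether the quality values are minimized or maximized, so the caller has to interpret the result accordingly.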

comment:5 Changed 12 years ago by abeham

Added a simple calculator for some descriptive statistics in r1745.
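
As an illustration of what such a calculator covers, here is a minimal Python sketch of the measures named in the description (mean, median, quantile, variance, standard deviation). It does not reflect the code in r1745. All function names are hypothetical, the quantile uses linear interpolation between order statistics, and the variance is the sample variance with an n - 1 denominator; both are assumptions, since the ticket does not specify the definitions used.

```python
import math


def mean(values):
    return sum(values) / len(values)


def quantile(values, q):
    """q-quantile (0 <= q <= 1) by linear interpolation on sorted data."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def median(values):
    return quantile(values, 0.5)


def variance(values):
    """Sample variance (n - 1 denominator); needs at least two values."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def std_dev(values):
    return math.sqrt(variance(values))
```

For `[2, 4, 4, 4, 5, 5, 7, 9]` this gives a mean of 5.0, a median of 4.5 and a sample variance of 32/7.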

comment:6 Changed 12 years ago by abeham

Simplified the code of the simple calculator in r1783.

comment:7 Changed 10 years ago by abeham

  • Resolution set to invalid
  • Status changed from assigned to closed

nobody needs this anymore

comment:8 Changed 10 years ago by swagner

  • Milestone changed from Current to HeuristicLab 3.3.0

Milestone Current deleted
