Opened 8 years ago

Closed 7 years ago

Last modified 7 years ago

#1669 closed feature request (obsolete)

Create demo and benchmark problems for regression tasks

Reported by: mkommend
Owned by: sforsten
Priority: medium
Milestone:
Component: Problems.DataAnalysis
Version: branch
Keywords:
Cc:

Description (last modified by sforsten)

Benchmark modeling problems for regression and classification shall be available within HeuristicLab (HL). The data for a selected problem shall be generated automatically, and the problem's parameters shall be set as well.

If the data for a specific problem cannot be generated, it shall be loaded from a CSV file, and the problem-specific parameters shall likewise be set.
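
The two requirements above amount to a generate-or-load fallback. The following is a minimal sketch in Python, for illustration only (HeuristicLab itself is written in C#). `BenchmarkProblem`, `make_problem_data`, all field names, and the sampling range [-1, 1] are hypothetical and not taken from HeuristicLab. To keep the sketch self-contained, the CSV content is passed as text instead of being read from a file.

```python
import csv
import io
import random
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class BenchmarkProblem:
    """Hypothetical description of one benchmark problem."""
    name: str
    input_names: Sequence[str]
    target_name: str
    # Set for problems whose data can be generated from a known function.
    target_function: Optional[Callable[..., float]] = None
    # Set for problems whose data must be loaded (CSV content as text here).
    csv_text: Optional[str] = None


def make_problem_data(problem, n_samples=100, seed=0):
    """Return (rows, parameters): generated if possible, otherwise loaded."""
    columns = list(problem.input_names) + [problem.target_name]
    if problem.target_function is not None:
        rng = random.Random(seed)
        rows = []
        for _ in range(n_samples):
            # Assumption: inputs are sampled uniformly from [-1, 1].
            xs = [rng.uniform(-1.0, 1.0) for _ in problem.input_names]
            rows.append(xs + [problem.target_function(*xs)])
    elif problem.csv_text is not None:
        reader = csv.DictReader(io.StringIO(problem.csv_text))
        rows = [[float(record[c]) for c in columns] for record in reader]
    else:
        raise ValueError("problem has neither a generator nor CSV data")
    # The problem-specific parameters are set in both cases.
    parameters = {
        "inputs": list(problem.input_names),
        "target": problem.target_name,
        "rows": len(rows),
    }
    return rows, parameters
```

In this sketch the parameters are derived after the rows are obtained, so both branches end up with the same set of problem parameters regardless of where the data came from.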

Change History (32)

comment:1 Changed 8 years ago by sforsten

  • Owner changed from mkommend to sforsten
  • Status changed from new to accepted

comment:2 Changed 8 years ago by sforsten

  • Description modified

comment:3 Changed 8 years ago by sforsten

  • Description modified

comment:4 Changed 8 years ago by gkronber

For each changeset you commit to the repository, please add a comment to this ticket.

  • r6965: created branch for data analysis benchmark problems
  • r6968: first version which can automatically generate data for some problems from http://www.vanillamodeling.com/
  • r6969: missing .resx file added and svn ignore list extended
  • r6973: different kinds of distributions have been implemented for the data generation. They haven't been tested yet.

comment:5 Changed 8 years ago by gkronber

r6973: please use the classes in HeuristicLab.Random to sample from uniform and normal distributions.

Also, I think the class StepDistribution is dubious. The class is used to generate sequences of double values with a fixed interval, which is completely different from sampling from a random distribution. Samples drawn from a random distribution are independent and non-deterministic, whereas each value taken from the StepDistribution depends on the previous value and is deterministic. Please remove the class StepDistribution and implement the functionality in a different way. One idea would be to implement something like Enumerable.Range() or Enumerable.Repeat().

Last edited 8 years ago by gkronber
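
The distinction drawn in this comment can be sketched as follows, in Python for illustration only (the ticket's code base is C#). `step_sequence` and `uniform_samples` are hypothetical names; whether the end point of the sequence is included is an assumption made here. The first function is a deterministic range-style generator in the spirit of Enumerable.Range(); the second draws independent samples from a random number generator.

```python
import math
import random
from typing import Iterator, List


def step_sequence(start: float, stop: float, step: float) -> Iterator[float]:
    """Yield start, start + step, ... up to stop (assumed inclusive).

    Deterministic: the i-th value depends only on start, step, and i.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    # Small tolerance so that stop is included despite rounding errors.
    count = math.floor((stop - start) / step + 1e-9) + 1
    for i in range(count):
        # Multiply instead of accumulating to avoid drifting rounding errors.
        yield start + i * step


def uniform_samples(rng: random.Random, low: float, high: float, n: int) -> List[float]:
    """Draw n independent samples from the uniform distribution on [low, high]."""
    return [rng.uniform(low, high) for _ in range(n)]
```

For example, `list(step_sequence(-1.0, 1.0, 0.5))` always yields the same five grid points, while `uniform_samples` returns different values for different generator states.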

comment:6 Changed 8 years ago by gkronber

  • Cc mkommend added

comment:7 Changed 8 years ago by gkronber

  • Cc mkommend removed

comment:8 Changed 8 years ago by sforsten

r6991: A first version with a simpler design, as discussed with Michael Kommenda, has been implemented and will be tested soon. Currently only KotanchekFunction.cs has been changed accordingly. The other benchmarks will follow soon.

The classes for the different distributions are no longer needed. Static methods in RegressionBenchmark replace them.

comment:9 Changed 8 years ago by sforsten

r7025: benchmark problems of Nguyen, Korns and Keijzer from http://groups.csail.mit.edu/EVO-DesignOpt/GPBenchmarks/ have been added. The benchmark problems from http://www.vanillamodeling.com/ have been adapted to the ones from Vladislavleva.

Not all benchmarks are working correctly so far, but they will be tested soon.

comment:10 Changed 8 years ago by sforsten

r7031: A few mistakes have been corrected. All benchmark problems now generate problem data without throwing an error.

comment:11 Changed 8 years ago by sforsten

r7044: real world problems have been added

comment:12 Changed 8 years ago by sforsten

r7081: typos have been corrected
r7085: branch has been merged with the trunk in revision 7081 and methods in RegressionBenchmark have been renamed.

Last edited 8 years ago by sforsten

comment:13 Changed 8 years ago by sforsten

r7095: Input variables of the Korns functions have been adjusted according to the description.

comment:14 Changed 8 years ago by sforsten

r7096: Poly-10 benchmark has been added, along with two small bug fixes.

comment:15 Changed 8 years ago by mkommend

Comments regarding the functionality:

  • The training and test partitions exclude the sample at their specified end index (start <= index < end). The training end should therefore be equal to the test start, so that no sample is left out.
  • While the algorithm is running, the combo box as well as the load button should be disabled. This can be achieved by overriding the SetEnabledStateOfControl method in the views.
  • The spatial coevolution and the poly 10 problem are not implemented.
  • Classification benchmark problems are still missing.
Last edited 8 years ago by mkommend
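
The first point above describes half-open index ranges. A minimal Python sketch (illustrative only, with the hypothetical helper name `split_partitions`) shows why the training end has to coincide with the test start:

```python
from typing import Tuple


def split_partitions(n_rows: int, training_end: int) -> Tuple[range, range]:
    """Split row indices into half-open training and test partitions.

    Training covers start <= index < training_end, and the test partition
    starts exactly at training_end, so every row belongs to one partition.
    """
    if not 0 <= training_end <= n_rows:
        raise ValueError("training_end must lie within [0, n_rows]")
    training = range(0, training_end)    # excludes index training_end
    test = range(training_end, n_rows)   # starts at index training_end
    return training, test
```

If the test partition started at `training_end + 1` instead, the row at index `training_end` would belong to neither partition, which is the left-out sample the comment refers to.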

comment:16 Changed 8 years ago by sforsten

r7127:

  • Spatial co-evolution benchmark has been added
  • Benchmarks of Trent McConaghy have been added
  • 2 Classification benchmarks have been added (Mammography and Iris dataset)
  • Training and test sets now include all samples from the dataset
  • Load button and combo box are now disabled when the algorithm is running

comment:17 Changed 8 years ago by sforsten

r7138:

  • Iris benchmark has been corrected and the data set will be ordered randomly
  • Benchmarks of Trent McConaghy have been corrected
  • Descriptions have been added (Mammography and Iris)
  • Bug fix in ClassificationRealWorldBenchmark
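
The changeset does not state why the Iris data is ordered randomly. One plausible reason, offered here as an assumption, is that the Iris data set is commonly distributed sorted by class, so a contiguous training/test split on the original order would leave whole classes out of one partition. A minimal Python sketch of a reproducible shuffle (illustrative only; `shuffled_rows` is a hypothetical helper, not HeuristicLab code):

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shuffled_rows(rows: Sequence[T], seed: int) -> List[T]:
    """Return a randomly ordered copy of rows; the input is left unchanged.

    A fixed seed makes the ordering reproducible between runs.
    """
    rng = random.Random(seed)
    result = list(rows)
    rng.shuffle(result)
    return result
```

Shuffling a copy keeps the original data intact, and seeding the generator means that a benchmark instance produces the same partitions every time it is loaded.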

comment:18 Changed 8 years ago by gkronber

r7290: merged r7209:7283 from trunk into regression benchmark branch

comment:19 Changed 8 years ago by sforsten

r7298:

  • merged r7284:7296 from trunk into regression benchmark branch
  • removed all unchanged projects and adjusted the output path for remaining projects to "..\..\..\..\trunk\sources\bin\"

comment:20 Changed 8 years ago by sforsten

r7307: adjusted some benchmark problems from Keijzer

comment:21 Changed 8 years ago by sforsten

r7308: added a dialog to select benchmark problems

comment:22 Changed 8 years ago by sforsten

r7328: corrected mistake while generating the dataset

comment:23 Changed 8 years ago by sforsten

r7336:

  • bug fixed in RegressionBenchmark
  • adapted training partitions of some benchmark problems

comment:24 Changed 8 years ago by sforsten

r7396:

  • deleted unnecessary code

comment:25 Changed 8 years ago by sforsten

r7405:

  • changed TestPartition and TrainingPartition of SpatialCoevolution
  • moved a method from RegressionBenchmark to BreimanOne

comment:26 Changed 8 years ago by sforsten

  • Owner changed from sforsten to mkommend
  • Status changed from accepted to reviewing

comment:27 Changed 8 years ago by sforsten

r7428:

  • change in test partition of some Keijzer benchmark problems
  • change in test and training partition in spatial co-evolution benchmark problem

comment:28 Changed 8 years ago by mkommend

  • Owner changed from mkommend to sforsten
  • Status changed from reviewing to assigned

A generic way of providing benchmark problems is currently being developed, and this functionality should be adapted to fit the overall design. As that work is not yet finished, development should be put on hold until the new problem providers are implemented.

comment:29 Changed 8 years ago by sforsten

r7499:

  • .resx files have been deleted

comment:30 Changed 7 years ago by sforsten

  • Status changed from assigned to accepted

comment:31 Changed 7 years ago by sforsten

  • Resolution set to obsolete
  • Status changed from accepted to closed

The ticket is obsolete because the benchmark problems have been integrated as problem instances in ticket #1784.

r7962: deleted obsolete branch

Last edited 7 years ago by sforsten

comment:32 Changed 7 years ago by mkommend

  • Milestone HeuristicLab 3.3.x Backlog deleted