Opened 6 years ago

Closed 5 years ago

#1784 closed feature request (done)

Add Regression and Classification problem instances

Reported by: sforsten
Owned by: gkronber
Priority: medium
Milestone: HeuristicLab 3.3.7
Component: Problems.Instances
Version: 3.3.7
Keywords:
Cc:

Description

Regression and classification benchmark problems shall be available within HeuristicLab (HL), as was already done in ticket #1669, but with an improved design. The classes and interfaces provided by Problems.Instances must be used.

Generating benchmark problems, importing from file, and exporting shall be supported.

Change History (61)

comment:1 Changed 6 years ago by sforsten

r7567: add folder to branch project
r7568: delete folder to branch project
r7569: branch project Problems.Instances
r7570: branch project Problems.DataAnalysis
r7571: branch project Problems.DataAnalysis.Views

comment:2 Changed 6 years ago by sforsten

r7572: branch creation done

comment:3 Changed 6 years ago by sforsten

r7603:

  • first implementation of regression problem instances with one instance to test

comment:4 Changed 6 years ago by sforsten

r7605: branch project for debugging purposes

comment:5 Changed 6 years ago by sforsten

r7606: delete project which was branched by mistake
r7607: branch project for debugging purposes
r7610:

  • added export button and corrected the LoadData in RegressionInstanceProvider
  • added RegressionProblemView

comment:6 Changed 6 years ago by sforsten

r7664:

  • added Keijzer, Korns, Vladislavleva and Nguyen regression problem instances
  • changes have been made in the ProblemView. Some parts have been replaced with views from Problems.Instances.Views

comment:7 Changed 6 years ago by sforsten

r7665:

  • minor bug fixes in several views

comment:8 Changed 6 years ago by sforsten

r7666:

  • deleted obsolete project Problems.Instances.Regression.Views
  • added TrentMcConaghy and Various problem instances (zip file from TrentMcConaghy is rather big)

comment:9 Changed 6 years ago by sforsten

r7667:

  • updated the Plugin.cs.frame files
  • added other real world problem instances
  • put some methods from TrentMcConaghyInstanceProvider to the super class ResourceRegressionInstanceProvider

comment:10 Changed 6 years ago by sforsten

r7682:

  • added Problems.Instances.Classification project
  • added classification problem instances
  • added a class Transformer to Problems.Instances

r7683:

  • added abstract ProblemInstanceProviderView
  • changed the IProblemInstanceConsumer and IProblemInstanceExporter interfaces
  • deleted unnecessary files

comment:11 Changed 6 years ago by sforsten

r7684:

  • deleted RegressionProblemView
  • deleted the import button in the DataAnalysisProblemView
  • small changes in the views of the Problems.Instances.Views project

comment:12 Changed 6 years ago by sforsten

r7685:

  • changed namespace of files in Problems.Instances.Classification
  • added description to iris and mammography problem instances

comment:13 Changed 6 years ago by sforsten

r7687:

  • all references have been set to CopyLocal "false"

comment:14 Changed 6 years ago by sforsten

r7698:

  • ProblemInstanceProviders are now sorted
  • the return values of ValueGenerator have been changed to IEnumerable
  • changes have been applied to classes which are using the ValueGenerator
  • changed the cast in ProblemInstanceProviderViewGeneric; importButton.Enabled is now set in SetEnabledStateOfControls

comment:15 Changed 6 years ago by sforsten

  • Status changed from new to accepted

comment:16 Changed 6 years ago by sforsten

  • Owner changed from sforsten to mkommend
  • Status changed from accepted to reviewing

comment:17 Changed 5 years ago by sforsten

r7748: branch Problems.DataAnalysis.Symbolic.Regression to test problem instances with symbolic regression

r7749: branch Problems.DataAnalysis.Symbolic.Classification to test problem instances with symbolic classification

r7750:

  • merged Problems.DataAnalysis r7273:7748 from trunk
  • prepared SymbolicClassificationSingleObjectiveProblem and SymbolicRegressionSingleObjectiveProblem to load and export problem instances

comment:18 Changed 5 years ago by sforsten

r7751: branch HeuristicLab.Problems.TravelingSalesman.Views
r7752: branch HeuristicLab.Problems.QuadraticAssignment.Views
r7753: branch HeuristicLab.Problems.Instances.TSPLIB.Views

r7754:

  • deleted unused interface and view
  • changed the superclass of TSPLIBTSPInstanceProviderView to ProblemInstanceProviderViewGeneric
  • deleted the Location change in the designer files of TravelingSalesmanProblemView and QuadraticAssignmentProblemView
  • set all references to CopyLocal false

r7755:

  • actually deleted the unused interface and view

comment:19 Changed 5 years ago by sforsten

r7758:

  • deleted the unneeded Consumer in ProblemInstanceProvider and IProblemInstanceProvider
  • changed protection level of exporter and consumer in ProblemInstanceProviderViewGeneric
  • renamed property FileExtension to FileName in ResourceClassificationInstanceProvider and ResourceRegressionInstanceProvider
  • deleted the ImportProblemDataFromFile method from IDataAnalysisProblem and from all classes and interfaces that implement it
  • removed unnecessary yield return in GetDoubleValues in the Dataset. Now it's a normal return statement

comment:20 Changed 5 years ago by sforsten

r7759:

  • deleted ClassificationData and RegressionData. RegressionProblemData and ClassificationProblemData are used instead
  • deleted the unneeded Transformer
  • ValueGenerator is now a static class and yield return is used to return IEnumerable

comment:21 Changed 5 years ago by sforsten

r7770:

  • added some regions for readability
  • added import and export methods in DataAnalysisProblem and SymbolicDataAnalysisProblem to reduce code duplication
  • added a recursive and an iterative approach, without many LINQ expressions, to generate all combinations of list elements in ValueGenerator
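
A minimal sketch of the iterative idea mentioned in r7770, for illustration only. It assumes that "all combinations of list elements" means the Cartesian product of several lists, built with plain loops instead of LINQ. The names `CombinationSketch` and `AllCombinations` are hypothetical and are not taken from the ValueGenerator code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not the HeuristicLab ValueGenerator implementation.
public static class CombinationSketch {
  // Returns every combination that takes exactly one element from each input list.
  public static List<List<double>> AllCombinations(IList<IList<double>> lists) {
    // Start with a single empty combination and extend it list by list.
    var result = new List<List<double>> { new List<double>() };
    foreach (var list in lists) {
      var extended = new List<List<double>>();
      foreach (var partial in result) {
        foreach (var value in list) {
          // Copy the partial combination and append the next value.
          var next = new List<double>(partial) { value };
          extended.Add(next);
        }
      }
      result = extended;
    }
    return result;
  }
}
```

The number of combinations is the product of the list lengths, so two lists with 2 and 3 elements give 6 combinations. If any input list is empty, the result is empty.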

comment:22 Changed 5 years ago by sforsten

r7771: merged everything from trunk revision 7770 to branch ProblemInstancesRegressionAndClassification

Last edited 5 years ago by sforsten

comment:23 Changed 5 years ago by sforsten

r7772:

  • corrected build path of Problems.Instances
  • simplified the GenerateAllCombinationsOfValuesInLists method in ValueGenerator

comment:24 Changed 5 years ago by sforsten

r7773: branch to add reference

comment:25 Changed 5 years ago by sforsten

r7774: added Problems.Instances reference to Algorithms.DataAnalysis

comment:26 Changed 5 years ago by mkommend

  • Owner changed from mkommend to sforsten
  • Status changed from reviewing to assigned

Review Comments:

  • ProblemView Line 51: check with consumerView.Providers.Any()
  • ProblemInstanceConsumerView: Add property / method to access all and the currently selected provider. Do not use the combo box for this.
  • Adapt also multi-objective problems to use Problems.Instances.
  • Why are concrete classes for problem data used instead of the corresponding interfaces?
  • Refactor loading of real world problems in a similar way as the TSPLibInstanceProvider performs it

comment:27 Changed 5 years ago by mkommend

r7794: Changed check for ProblemInstanceConsumer in ProblemView.

comment:28 Changed 5 years ago by sforsten

  • Owner changed from sforsten to mkommend
  • Status changed from assigned to reviewing

r7805: changes have been applied, according to the review comments of mkommend

comment:29 Changed 5 years ago by sforsten

r7823: merge branch ProblemInstancesRegressionAndClassification into trunk

comment:30 Changed 5 years ago by sforsten

r7825: changes in the references and output directories

comment:31 Changed 5 years ago by sforsten

r7826: changed the class name in the Plugin.cs.frame files.

r7827: The branch has been prepared so that only the Trent McConaghy problem instances remain in it. These problem instances are in an additional plugin and can be added to HeuristicLab at any time.

comment:32 Changed 5 years ago by sforsten

r7830: adjusted the version numbers of some projects

r7831: adjusted the version numbers in branch

comment:33 Changed 5 years ago by sforsten

r7834: added missing plugin dependency

comment:34 Changed 5 years ago by mkommend

r7835: Added missing reference to HL.problems.instances.views in HL.tests.

comment:35 Changed 5 years ago by gkronber

Review comments:

  • buttons are too small for the icons.
  • the tooltip 'Import a IRegressionProblem problem from file.' is not very helpful. In particular it is not clear that I can import CSV files through this button.
  • The option to import CSV files should be more prominent. Personally, I would prefer the load and save buttons to be located at the top left of the view, with the option to import benchmark problems only in third place.
  • The actions for selecting and loading benchmark instances should be separated from the load and save actions, because they are contextually separated. (separate import/export and load benchmark visually in the GUI).

comment:36 Changed 5 years ago by mkommend

Review comments:

  • Unify regression and classification problem instances in one plugin HeuristicLab.Problems.Instances.DataAnalysis
  • Add the TableFileParser to this newly created plugin.
  • Improve the TableFileParser (no static methods, parse methods which automatically detect the file format)
  • The Poly10 problem instance has a typo.

comment:37 Changed 5 years ago by sforsten

r7849:

  • added project HeuristicLab.Problems.Instances.DataAnalysis and deleted HeuristicLab.Problems.Instances.Classification and HeuristicLab.Problems.Instances.Regression
  • buttons are now big enough for the icons

comment:38 Changed 5 years ago by sforsten

r7851: changed the TableFileParser so that the file format no longer has to be specified by the caller. Comments have been added for the different Parse methods.

comment:39 Changed 5 years ago by sforsten

r7860:

  • added additional Keijzer problem instances
  • capitalized the names of real world problem instances
  • added Friedman I and II
  • added link to VariousInstanceProvider
  • changed symbol of info button for ProblemInstanceProvider in ProblemInstanceConsumerView
  • added CSVProvider for classification and regression problems
  • ProblemInstanceProviderViewGeneric only shows controls to load problem instances, if the selected ProblemInstanceProvider contains IDataDescriptor

comment:40 Changed 5 years ago by sforsten

r7863: changed name of the DataDescriptor in the SamplesTest for regression and classification

comment:41 Changed 5 years ago by sforsten

r7870:

  • added unit test for regression and classification ProblemInstanceProvider
  • changed namespace for the TableFileParserTest

comment:42 Changed 5 years ago by gkronber

CSV files that contain columns with non-numeric data (for instance DateTimes) cannot be imported anymore.

The constructor for DataAnalysisProblemData throws an exception:

 protected DataAnalysisProblemData(Dataset dataset, IEnumerable<string> allowedInputVariables) {
   if (dataset == null) throw new ArgumentNullException("The dataset must not be null.");
   if (allowedInputVariables == null) throw new ArgumentNullException("The allowedInputVariables must not be null.");

   if (allowedInputVariables.Except(dataset.DoubleVariables).Any())
     throw new ArgumentException("All allowed input variables must be present in the dataset and of type double.");
   // ...
 }

comment:43 Changed 5 years ago by mkommend

  • Owner changed from mkommend to sforsten
  • Status changed from reviewing to assigned

comment:44 Changed 5 years ago by sforsten

r7963: prepare remaining branch to work with Problems.Instances.DataAnalysis

comment:45 Changed 5 years ago by sforsten

  • Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.7
  • Owner changed from sforsten to mkommend
  • Status changed from assigned to reviewing
  • Version changed from branch to 3.3.7

r7965: CSV files that contain columns with non-numeric data can be imported again.

comment:46 Changed 5 years ago by gkronber

The ranges for the training partition and test partition are not set correctly when switching problems. For instance, load the Poly-10 problem and afterwards load the 'Spatial co-evolution problem' or the 'Vladislavleva Kotanchek problem'.

comment:47 Changed 5 years ago by gkronber

Why are only the first 1000 points shuffled in the Spatial co-evolution problem instance with 1675 points?

comment:48 Changed 5 years ago by sforsten

r7988:

  • the training and test partitions for the Spatial co-evolution problem instance have been corrected

Spatial co-evolution: The first 1000 points are for training and are therefore randomly distributed. The last 676 points are for testing (range: -5 <= x <= 5 and -5 <= y <= 5, with test points spaced 0.4 apart, which gives a total of 676 data points; see also http://groups.csail.mit.edu/EVO-DesignOpt/GPBenchmarks/).

Vladislavleva Kotanchek: The training and test partitions are set correctly (see the link above). The additional unused points can be used for training if 100 points are not enough, so no extra points have to be created manually.
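
The count of 676 test points follows from the grid spacing: stepping from -5 to 5 in increments of 0.4 gives 10 / 0.4 + 1 = 26 values per axis, and 26 * 26 = 676. A small sketch of this arithmetic, assuming a regular grid over both variables. The names `GridSketch` and `TestGrid` are illustrative only and are not taken from the HeuristicLab code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not the HeuristicLab instance provider.
public static class GridSketch {
  // Enumerates the (x, y) points of a regular grid on [min, max] x [min, max].
  public static IEnumerable<Tuple<double, double>> TestGrid(double min, double max, double step) {
    // Derive the number of values per axis once instead of adding the step
    // repeatedly, which would accumulate floating-point error.
    int n = (int)Math.Round((max - min) / step) + 1;
    for (int i = 0; i < n; i++)
      for (int j = 0; j < n; j++)
        yield return Tuple.Create(min + i * step, min + j * step);
  }
}
```

Enumerating `GridSketch.TestGrid(-5, 5, 0.4)` produces 26 values per axis and therefore 676 points in total.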

comment:49 Changed 5 years ago by abeham

r8084: Added CSV problem provider for ClusteringProblemData

Last edited 5 years ago by abeham

comment:50 Changed 5 years ago by mkommend

r8199: Renamed CSV instance providers and corrected tooltips.

Last edited 5 years ago by mkommend

comment:51 Changed 5 years ago by mkommend

  • Status changed from reviewing to readytorelease

comment:52 Changed 5 years ago by gkronber

  • Owner changed from mkommend to gkronber
  • Status changed from readytorelease to reviewing

The implementation should be reviewed to match the gp-benchmarks paper of GECCO2012 http://gpbenchmarks.org/wp-content/uploads/2012/06/gpbenchmarks-GECCO2012.pdf

comment:53 Changed 5 years ago by gkronber

r8224:

  • fixed uniform sampling.
  • adapted Nguyen instances.

comment:54 Changed 5 years ago by gkronber

r8225: adapted Korns instances.

comment:55 Changed 5 years ago by gkronber

r8226: renumbered files for Keijzer instances

comment:56 Changed 5 years ago by gkronber

r8238: adapted Keijzer instances

comment:57 Changed 5 years ago by gkronber

r8240: adapted Vladislavleva instances

comment:58 Changed 5 years ago by gkronber

r8241: adapted notes about function set for Vladislavleva instances

comment:59 Changed 5 years ago by gkronber

r8245: adapted value ranges for Korns benchmark instances to prevent generating NaN target values

comment:60 Changed 5 years ago by gkronber

  • Status changed from reviewing to readytorelease

comment:61 Changed 5 years ago by gkronber

  • Resolution set to done
  • Status changed from readytorelease to closed