Opened 13 years ago
Closed 12 years ago
#1784 closed feature request (done)
Add Regression and Classification problem instances
Reported by: | sforsten | Owned by: | gkronber |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.7 |
Component: | Problems.Instances | Version: | 3.3.7 |
Keywords: | Cc: |
Description
Regression and Classification benchmark problems shall be available within HL as it has already been done in ticket #1669, but with an enhanced design. The classes and interfaces provided by Problems.Instances have to be used.
Generating benchmark problems, import from file and export shall be supported.
Change History (61)
comment:1 Changed 13 years ago by sforsten
comment:2 Changed 13 years ago by sforsten
r7572: branch creation done
comment:3 Changed 13 years ago by sforsten
- first implementation of regression problem instances with one instance to test
comment:4 Changed 13 years ago by sforsten
r7605: branch project for debugging purposes
comment:5 Changed 13 years ago by sforsten
comment:6 Changed 13 years ago by sforsten
- added Keijzer, Korns, Vladislavleva and Nguyen regression problem instances
- changes have been made in the ProblemView. Some parts have been replaced with views from Problems.Instances.Views
comment:7 Changed 13 years ago by sforsten
- minor bug fixes in several views
comment:8 Changed 13 years ago by sforsten
- deleted obsolete project Problems.Instances.Regression.Views
- added TrentMcConaghy and Various problem instances (zip file from TrentMcConaghy is rather big)
comment:9 Changed 13 years ago by sforsten
- updated the Plugin.cs.frame files
- added other real world problem instances
- put some methods from TrentMcConaghyInstanceProvider to the super class ResourceRegressionInstanceProvider
comment:10 Changed 13 years ago by sforsten
comment:11 Changed 13 years ago by sforsten
- deleted RegressionProblemView
- deleted the import button in the DataAnalysisProblemView
- small changes in the views of the Problems.Instances.Views project
comment:12 Changed 13 years ago by sforsten
- changed namespace of files in Problem.Instances.Classification
- added description to iris and mammography problem instances
comment:13 Changed 13 years ago by sforsten
- all references have been set to CopyLocal "false"
comment:14 Changed 13 years ago by sforsten
- ProblemInstanceProviders are sorted now
- the return values of ValueGenerator have been changed to IEnumerable
- changes have been applied to classes which are using the ValueGenerator
- change of the cast in ProblemInstanceProviderViewGeneric and importButton.Enable is set now in SetEnabledStateOfControls
comment:15 Changed 13 years ago by sforsten
- Status changed from new to accepted
comment:16 Changed 13 years ago by sforsten
- Owner changed from sforsten to mkommend
- Status changed from accepted to reviewing
comment:17 Changed 13 years ago by sforsten
r7748: branch Problems.DataAnalysis.Symbolic.Regression to test problem instances with symbolic regression
r7749: branch Problems.DataAnalysis.Symbolic.Classification to test problem instances with symbolic classification
- merged Problems.DataAnalysis r7273:7748 from trunk
- prepared SymbolicClassificationSingleObjectiveProblem and SymbolicRegressionSingleObjectiveProblem to load and export problem instances
comment:18 Changed 13 years ago by sforsten
r7751: branch HeuristicLab.Problems.TravelingSalesman.Views
r7752: branch HeuristicLab.Problems.QuadraticAssignment.Views
r7753: branch HeuristicLab.Problems.Instances.TSPLIB.Views
- deleted unused interface and view
- changed the superclass of TSPLIBTSPInstanceProviderView to ProblemInstanceProviderViewGeneric
- deleted the Location change in the designer files of TravelingSalesmanProblemView and QuadraticAssignmentProblemView
- set all references to CopyLocal false
- really deleted the unused interface and view
comment:19 Changed 13 years ago by sforsten
- deleted the unneeded Consumer in ProblemInstanceProvider and IProblemInstanceProvider
- changed protection level of exporter and consumer in ProblemInstanceProviderViewGeneric
- renamed property FileExtension to FileName in ResourceClassificationInstanceProvider and ResourceRegressionInstanceProvider
- deleted ImportProblemDataFromFile method from IDataAnalysisProblem and all classes and interfaces, which implement this method
- removed unnecessary yield return in GetDoubleValues in the Dataset. Now it's a normal return statement
comment:20 Changed 13 years ago by sforsten
- deleted ClassificationData and RegressionData. RegressionProblemData and ClassificationProblemData are used instead
- deleted the unneeded Transformer
- ValueGenerator is now a static class and yield return is used to return IEnumerable
comment:21 Changed 13 years ago by sforsten
- added some regions for readability
- added import and export methods in DataAnalysisProblem and SymbolicDataAnalysisProblem to reduce code duplication
- added a recursive and an iterative approach, without many LINQ expressions, to generate all combinations of list elements in ValueGenerator
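The iterative approach mentioned above could look roughly like the following. This is only a minimal sketch, not the actual ValueGenerator code: the class name `CombinationSketch` and the method name `AllCombinations` are hypothetical, and the "odometer" index technique is an assumption about how such a method might be written without LINQ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration; not taken from HeuristicLab.
public static class CombinationSketch {
  // Lazily yields every combination that takes one element from each list
  // (the Cartesian product), advancing the last list fastest.
  public static IEnumerable<IList<double>> AllCombinations(IList<IList<double>> lists) {
    // This sketch yields nothing if there are no lists or any list is empty.
    if (lists.Count == 0) yield break;
    foreach (var list in lists)
      if (list.Count == 0) yield break;

    var indices = new int[lists.Count]; // current position in each list
    while (true) {
      var combination = new double[lists.Count];
      for (int i = 0; i < lists.Count; i++)
        combination[i] = lists[i][indices[i]];
      yield return combination;

      // Advance the indices like an odometer, carrying to the left on overflow.
      int pos = lists.Count - 1;
      while (pos >= 0 && ++indices[pos] == lists[pos].Count) {
        indices[pos] = 0;
        pos--;
      }
      if (pos < 0) yield break; // every index wrapped around: done
    }
  }
}
```

Because the method is an iterator, combinations are produced on demand, which keeps memory use low when the lists are long.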
comment:22 Changed 13 years ago by sforsten
r7771: merged everything from trunk revision 7770 to branch ProblemInstancesRegressionAndClassification
- r7661:7770: Optimization.Views
- r7661:7770: Problems.DataAnalysis.Views
- r7661:7770: Problems.Instances
- r7647:7770: Problems.QuadraticAssignment.Views
comment:23 Changed 13 years ago by sforsten
- corrected build path of Problems.Instances
- simplified the GenerateAllCombinationsOfValuesInLists method in ValueGenerator
comment:24 Changed 13 years ago by sforsten
r7773: branch to add reference
comment:25 Changed 13 years ago by sforsten
r7774: added Problems.Instances reference to Algorithms.DataAnalysis
comment:26 Changed 13 years ago by mkommend
- Owner changed from mkommend to sforsten
- Status changed from reviewing to assigned
Review Comments:
- ProblemView Line 51: check with consumerView.Providers.Any()
- ProblemInstanceConsumerView: Add property / method to access all and the currently selected provider. Do not use the combo box for this.
- Adapt also multi-objective problems to use Problems.Instances.
- Why are concrete classes for problem data used instead of the according interfaces?
- Refactor loading of real world problems in a similar way as the TSPLibInstanceProvider performs it
comment:27 Changed 13 years ago by mkommend
r7794: Changed check for ProblemInstanceConsumer in ProblemView.
comment:28 Changed 13 years ago by sforsten
- Owner changed from sforsten to mkommend
- Status changed from assigned to reviewing
r7805: changes have been applied, according to the review comments of mkommend
comment:29 Changed 13 years ago by sforsten
r7823: merge branch ProblemInstancesRegressionAndClassification into trunk
comment:30 Changed 13 years ago by sforsten
r7825: changes in the references and output directories
comment:31 Changed 13 years ago by sforsten
comment:32 Changed 13 years ago by sforsten
comment:33 Changed 13 years ago by sforsten
r7834: added missing plugin dependency
comment:34 Changed 13 years ago by mkommend
r7835: Added missing reference to HL.problems.instances.views in HL.tests.
comment:35 Changed 13 years ago by gkronber
Review comments:
- buttons are too small for the icons.
- the tooltip 'Import a IRegressionProblem problem from file.' is not very helpful. In particular it is not clear that I can import CSV files through this button.
- The option to import CSV files should be more prominent. Personally I'd like it better if the load and save buttons were located at the top left of the view and the option to import benchmark problems came only in third place.
- The actions for selecting and loading benchmark instances should be separated from the load and save actions, because they are contextually separated. (separate import/export and load benchmark visually in the GUI).
comment:36 Changed 13 years ago by mkommend
Review comments:
- Unify regression and classification problem instances in one plugin HeuristicLab.Problem.Instances.DataAnalysis
- Add the TableFileParser to this newly created plugin.
- Improve the TableFileParser (no static methods, parse methods which automatically detect the file format)
- The Poly10 problem instance has a typo.
comment:37 Changed 13 years ago by sforsten
- added project HeuristicLab.Problem.Instances.DataAnalysis and deleted HeuristicLab.Problem.Instances.Classification and HeuristicLab.Problem.Instances.Regression
- buttons are now big enough for the icons
comment:38 Changed 13 years ago by sforsten
r7851: changed the TableFileParser, so that you don't have to determine the file format by yourself. Comments have been added for the different Parse methods.
comment:39 Changed 13 years ago by sforsten
- added additional Keijzer problem instances
- capitalized names of real world problem instances
- added Friedman I and II
- added link to VariousInstanceProvider
- changed symbol of info button for ProblemInstanceProvider in ProblemInstanceConsumerView
- added CSVProvider for classification and regression problems
- ProblemInstanceProviderViewGeneric only shows controls to load problem instances, if the selected ProblemInstanceProvider contains IDataDescriptor
comment:40 Changed 13 years ago by sforsten
r7863: changed name of the DataDescriptor in the SamplesTest for regression and classification
comment:41 Changed 13 years ago by sforsten
- added unit test for regression and classification ProblemInstanceProvider
- changed namespace for the TableFileParserTest
comment:42 Changed 13 years ago by gkronber
CSV files that contain columns with non-numeric data (for instance DateTimes) cannot be imported anymore.
The constructor for DataAnalysisProblemData throws an exception:
    protected DataAnalysisProblemData(Dataset dataset, IEnumerable<string> allowedInputVariables) {
      if (dataset == null) throw new ArgumentNullException("The dataset must not be null.");
      if (allowedInputVariables == null) throw new ArgumentNullException("The allowedInputVariables must not be null.");
      if (allowedInputVariables.Except(dataset.DoubleVariables).Any())
        throw new ArgumentException("All allowed input variables must be present in the dataset and of type double.");
      // ...
    }
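The third check can be reproduced in isolation. The sketch below is an assumption-laden illustration, not HeuristicLab code: the class `InputCheckSketch`, the method `Offending`, and the column names are hypothetical. It shows that `Except` returns exactly the allowed input variables that are not among the double-typed columns, so any non-numeric column passed as an allowed input would trigger the exception.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration; not taken from HeuristicLab.
public static class InputCheckSketch {
  // Returns the allowed input variables that are not double-typed columns.
  // A non-empty result corresponds to the ArgumentException case above.
  public static List<string> Offending(IEnumerable<string> allowedInputVariables,
                                       IEnumerable<string> doubleVariables) {
    return allowedInputVariables.Except(doubleVariables).ToList();
  }
}
```

For example, with allowed inputs `x`, `y`, `timestamp` and double columns `x`, `y`, the only offending variable is `timestamp`.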
comment:43 Changed 12 years ago by mkommend
- Owner changed from mkommend to sforsten
- Status changed from reviewing to assigned
comment:44 Changed 12 years ago by sforsten
r7963: prepare remaining branch to work with Problems.Instances.DataAnalysis
comment:45 Changed 12 years ago by sforsten
- Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.7
- Owner changed from sforsten to mkommend
- Status changed from assigned to reviewing
- Version changed from branch to 3.3.7
r7965: CSV files that contain columns with non-numeric data can be imported again.
comment:46 Changed 12 years ago by gkronber
The ranges for the training partition and test partition are not set correctly when switching problems. For instance, load the Poly-10 problem and afterwards load the 'Spatial co-evolution problem' or the 'Vladislavleva Kotanchek problem'.
comment:47 Changed 12 years ago by gkronber
Why are only the first 1000 points shuffled in the Spatial co-evolution problem instance with 1675 points?
comment:48 Changed 12 years ago by sforsten
- the training and test partitions for the Spatial co-evolution problem instance have been corrected
Spatial co-evolution: The first 1000 points are for training and are therefore randomly distributed. The last 676 points are for testing (Range: -5 <= x <= 5 and -5 <= y <= 5, normally with test data 0.4 apart, which gives a total of 676 data points. Also see http://groups.csail.mit.edu/EVO-DesignOpt/GPBenchmarks/).
Vladislavleva Kotanchek: The training and test partitions are set correctly (see the link above). The additional unused points can be used for training if 100 points are not enough, so no extra points have to be created manually.
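The count of 676 test points follows from the grid spacing: stepping from -5 to 5 in increments of 0.4 gives 26 values per axis, and 26 x 26 = 676. A minimal sketch of such a grid, where the class `GridSketch` and the method `Grid` are hypothetical names and not part of HeuristicLab:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration; not taken from HeuristicLab.
public static class GridSketch {
  // Builds all (x, y) points on a regular grid over [min, max] x [min, max].
  public static List<double[]> Grid(double min, double max, double step) {
    // Number of grid values per axis; rounding guards against floating-point error.
    int n = (int)Math.Round((max - min) / step) + 1;
    var points = new List<double[]>(n * n);
    for (int i = 0; i < n; i++) {
      for (int j = 0; j < n; j++) {
        // Compute coordinates from the index to avoid accumulating rounding error.
        points.Add(new[] { min + i * step, min + j * step });
      }
    }
    return points;
  }
}
```

Calling `GridSketch.Grid(-5, 5, 0.4)` yields 676 points.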
comment:49 Changed 12 years ago by abeham
r8084: Added CSV problem provider for ClusteringProblemData
comment:50 Changed 12 years ago by mkommend
r8199: Renamed CSV instance providers and corrected tooltips.
comment:51 Changed 12 years ago by mkommend
- Status changed from reviewing to readytorelease
comment:52 Changed 12 years ago by gkronber
- Owner changed from mkommend to gkronber
- Status changed from readytorelease to reviewing
The implementation should be reviewed to match the gp-benchmarks paper of GECCO2012 http://gpbenchmarks.org/wp-content/uploads/2012/06/gpbenchmarks-GECCO2012.pdf
comment:53 Changed 12 years ago by gkronber
- fixed uniform sampling.
- adapted Nguyen instances.
comment:54 Changed 12 years ago by gkronber
r8225: adapted Korns instances.
comment:55 Changed 12 years ago by gkronber
r8226: renumbered files for Keijzer instances
comment:56 Changed 12 years ago by gkronber
r8238: adapted Keijzer instances
comment:57 Changed 12 years ago by gkronber
r8240: adapted Vladislavleva instances
comment:58 Changed 12 years ago by gkronber
r8241: adapted notes about function set for Vladislavleva instances
comment:59 Changed 12 years ago by gkronber
r8245: adapted value ranges for Korns benchmark instances to prevent generating NaN target values
comment:60 Changed 12 years ago by gkronber
- Status changed from reviewing to readytorelease
comment:61 Changed 12 years ago by gkronber
- Resolution set to done
- Status changed from readytorelease to closed
r7567: add folder to branch project
r7568: delete folder to branch project
r7569: branch project Problems.Instances
r7570: branch project Problems.DataAnalysis
r7571: branch project Problems.DataAnalysis.Views