Opened 12 years ago
Closed 9 years ago
#1998 closed feature request (done)
Model Comparison for Classification
Reported by: | sforsten | Owned by: | gkronber |
---|---|---|---|
Priority: | high | Milestone: | HeuristicLab 3.3.13 |
Component: | Algorithms.DataAnalysis | Version: | 3.3.12 |
Keywords: | Cc: |
Description
As for regression, it shall be possible to easily create simple baseline models for classification, which a classification model has to outperform to be accepted as a good model.
0R (ZeroR), 1R (OneR) and LDA (linear discriminant analysis) classifiers will be created as baselines for comparison with the hopefully better classification model.
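The baseline idea can be illustrated with a minimal sketch in Python. This is not the HeuristicLab code; the helper names `zero_r` and `accuracy`, the toy data and the threshold-based candidate model are assumptions made only for illustration. A 0R model always predicts the most frequent training class, and a candidate model is only worth keeping if it beats that score.

```python
from collections import Counter

def zero_r(train_labels):
    """0R baseline: always predict the most frequent training class."""
    majority_class = Counter(train_labels).most_common(1)[0][0]
    return lambda row: majority_class

def accuracy(predict, rows, labels):
    """Fraction of rows for which the prediction matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(labels)

# Hypothetical toy data: one numeric input, two classes.
rows = [[0.1], [0.4], [0.35], [0.8], [0.9]]
labels = ["a", "a", "a", "b", "b"]

baseline = zero_r(labels)
candidate = lambda row: "a" if row[0] < 0.5 else "b"  # stand-in for a "real" model

baseline_accuracy = accuracy(baseline, rows, labels)
candidate_accuracy = accuracy(candidate, rows, labels)
accepted = candidate_accuracy > baseline_accuracy
```

Here the baseline is right on three of five rows, so any candidate has to exceed that before it is considered a good model.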
Change History (47)
comment:1 Changed 12 years ago by sforsten
- Status changed from new to accepted
comment:2 Changed 12 years ago by sforsten
comment:3 Changed 12 years ago by sforsten
r9116: branch project Problems.DataAnalysis.Views
r9117: branch project Algorithms.DataAnalysis.Views
- added OneRClassificationModelView
- added ClassificationSolutionComparisonView
- added several calculators (ConfusionMatrixCalculator, FOneScoreCalculator, MatthewsCorrelationCoefficientCalculator)
- fixed bug in OneR
- added StorableClass and Item attribute to several classes
comment:4 Changed 12 years ago by sforsten
- Owner changed from sforsten to mkommend
- Status changed from accepted to reviewing
- OneR now handles missing values separately
- adapted OneRClassificationModelView to show the class of missing values
- with a double-click on the row header in ClassificationSolutionComparisonView the selected solution opens in a new view
- put a try-catch block around the linear discriminant analysis solution (it is only shown if it doesn't throw an exception)
ZeroR may not be a good model in the ClassificationSolutionComparisonView, because MatthewsCorrelationCoefficientCalculator will always calculate NaN and F1 score will also be NaN if the negative class is chosen by ZeroR.
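A short sketch of why both metrics degenerate for a constant classifier. It assumes that F1 is computed from precision and recall and that division follows IEEE 754 floating-point rules (0/0 gives NaN); the function names are hypothetical and not the HeuristicLab calculators.

```python
import math

def ieee_div(num, den):
    # Python raises on division by zero, so mimic IEEE 754 floating-point
    # division here: 0/0 is NaN, x/0 is +/-infinity.
    if den == 0:
        return math.nan if num == 0 else math.copysign(math.inf, num)
    return num / den

def confusion_counts(actual, predicted, positive):
    """Return (tp, fp, tn, fn) for the given positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def f1_score(tp, fp, tn, fn):
    precision = ieee_div(tp, tp + fp)  # 0/0 when the positive class is never predicted
    recall = ieee_div(tp, tp + fn)
    return ieee_div(2 * precision * recall, precision + recall)

def matthews_correlation(tp, fp, tn, fn):
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return ieee_div(num, den)  # den is 0 whenever only one class is predicted

actual = ["pos", "neg", "neg", "neg", "pos"]
zero_r_prediction = ["neg"] * len(actual)  # ZeroR predicts the majority class

# ZeroR chose the negative class: both metrics are NaN.
counts_neg = confusion_counts(actual, zero_r_prediction, positive="pos")
f1_neg = f1_score(*counts_neg)
mcc_neg = matthews_correlation(*counts_neg)

# Same predictions, but the majority class is treated as the positive class:
# F1 is defined, the Matthews correlation is still NaN.
counts_pos = confusion_counts(actual, zero_r_prediction, positive="neg")
f1_pos = f1_score(*counts_pos)
mcc_pos = matthews_correlation(*counts_pos)
```

In both cases one row or column of the confusion matrix is empty, so the denominator of the Matthews correlation is zero regardless of which class ZeroR picks.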
comment:5 Changed 11 years ago by gkronber
- Owner changed from mkommend to gkronber
comment:6 Changed 11 years ago by gkronber
- Owner changed from gkronber to architects
- Status changed from reviewing to assigned
Can we queue this for the next release version?
comment:7 Changed 11 years ago by gkronber
- Owner changed from architects to mkommend
Discussed in architects meeting, should be integrated into the trunk.
comment:8 Changed 11 years ago by mkommend
- Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.10
comment:9 Changed 11 years ago by mkommend
r10553: Updated classification model comparison branch with trunk changes.
comment:10 Changed 11 years ago by mkommend
r10556: Updated classification model comparison branch with trunk changes (remaining changes).
comment:11 Changed 11 years ago by mkommend
r10560: Fixed bugs in classification solution comparison view.
comment:12 Changed 11 years ago by mkommend
r10568: Code cleanup in ZeroR classification algorithm.
comment:13 Changed 11 years ago by mkommend
r10569: Reimplemented OneR classification algorithm.
comment:14 Changed 11 years ago by mkommend
r10570: Added missing value handling in new implementation of OneR.
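For illustration, a simplified sketch of a 1R classifier that treats missing values as a separate group with its own class. The actual implementation is not shown in this ticket; the sketch assumes categorical inputs, missing values encoded as `None`, and hypothetical function names `fit_one_r` and `predict_one_r`.

```python
from collections import Counter, defaultdict

MISSING = None  # assumption: missing values are encoded as None

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_one_r(rows, labels):
    """Pick the single input variable whose one-level rule makes the fewest
    training errors. Returns (variable_index, rule, missing_class, default_class)."""
    default_class = majority(labels)
    best = None
    for j in range(len(rows[0])):
        groups = defaultdict(list)
        for row, y in zip(rows, labels):
            groups[row[j]].append(y)
        # Missing values are handled separately and get their own class.
        missing_labels = groups.pop(MISSING, [])
        missing_class = majority(missing_labels) if missing_labels else default_class
        rule = {value: majority(ys) for value, ys in groups.items()}
        errors = sum(
            y != (missing_class if row[j] is MISSING else rule[row[j]])
            for row, y in zip(rows, labels)
        )
        if best is None or errors < best[0]:
            best = (errors, j, rule, missing_class)
    _, j, rule, missing_class = best
    return j, rule, missing_class, default_class

def predict_one_r(model, row):
    j, rule, missing_class, default_class = model
    value = row[j]
    if value is MISSING:
        return missing_class
    return rule.get(value, default_class)  # unseen values fall back to the majority class

# Hypothetical toy data with missing values in the first variable.
rows = [("sunny", "high"), ("sunny", "low"), ("rain", "high"),
        ("rain", "low"), (None, "high"), (None, "low")]
labels = ["yes", "yes", "yes", "yes", "no", "no"]

model = fit_one_r(rows, labels)
prediction_for_missing = predict_one_r(model, (None, "high"))
```

Keeping the class for missing values as a separate part of the model is also what makes it possible to display it on its own, as the OneRClassificationModelView does.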
comment:15 Changed 11 years ago by mkommend
- Owner changed from mkommend to gkronber
- Status changed from assigned to reviewing
Please review r10560, r10568:10570. All other changesets have already been reviewed.
For comparison reasons I left the original version of the OneR algorithm untouched.
comment:16 Changed 11 years ago by gkronber
- Status changed from reviewing to assigned
comment:17 Changed 11 years ago by gkronber
- Status changed from assigned to accepted
comment:18 Changed 11 years ago by gkronber
Reviewed r10560; the changes are OK.
Made a more detailed review of the source code. Notes from the overall source code review:
- Adapt to work with current trunk version (export button)
- Exception occurs when running a linear discriminant analysis model (works for the trunk)
- Move OneR and ZeroR out of the Algorithms.DataAnalysis.Linear folder.
- OneR and ZeroR should not be available in the "New Item" dialog.
- ExecutionTime for OneR should not be produced as a result.
The file OneRTest.cs can be deleted?
No, the OneRTest.cs file is the new implementation by mkommend. The file OneR.cs should be deleted.
comment:19 Changed 10 years ago by gkronber
- Milestone changed from HeuristicLab 3.3.10 to HeuristicLab 3.3.11
Not yet ready for release
comment:20 Changed 10 years ago by gkronber
- Milestone changed from HeuristicLab 3.3.11 to HeuristicLab 3.3.x Backlog
comment:21 Changed 10 years ago by mkommend
- Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.12
comment:22 Changed 10 years ago by mkommend
- Owner changed from gkronber to mkommend
- Status changed from accepted to readytorelease
comment:23 Changed 10 years ago by mkommend
- Owner changed from mkommend to gkronber
- Status changed from readytorelease to assigned
comment:24 Changed 10 years ago by gkronber
- Priority changed from medium to high
- Status changed from assigned to accepted
comment:25 Changed 9 years ago by gkronber
- Milestone changed from HeuristicLab 3.3.12 to HeuristicLab 3.3.13
Sorry
comment:26 Changed 9 years ago by gkronber
r13081: changed framework version to v4.5
comment:27 Changed 9 years ago by gkronber
r13082: merged changesets r10553:13081 (only on HeuristicLab.Problems.DataAnalysis.Views) from trunk to branch
r13083: merged changesets r10551:13082 (only on HeuristicLab.Problems.DataAnalysis) from trunk to branch
r13084: merged changesets r10551:13083 (only on HeuristicLab.Algorithms.DataAnalysis.Views) from trunk to branch
r13085: merged changesets r10551:13084 (only on HeuristicLab.Algorithms.DataAnalysis) from trunk to branch
comment:28 Changed 9 years ago by gkronber
r13086: made compatibility changes necessary because of trunk developments (compile fail)
comment:29 Changed 9 years ago by gkronber
- deleted obsolete version of the OneR algorithm (it also performs worse than mkommend's implementation in my tests)
- reused the ConstantRegressionModel as ConstantClassificationModel (OK?)
- fixed a few strings here and there
comment:30 Changed 9 years ago by gkronber
r13090: moved OneR and ZeroR out of the folder for linear models, renamed OneRTest -> OneR
comment:31 Changed 9 years ago by gkronber
r13091: minor changes while reviewing
Reviewed changes in
- HeuristicLab.Problems.DataAnalysis (online calculators of F1-score and Matthews correlation have been added)
- HeuristicLab.Algorithms.DataAnalysis (ZeroR and OneR algorithms have been added)
- HeuristicLab.Problems.DataAnalysis.Views (only ClassificationSolutionComparisonView has been added)
- HeuristicLab.Algorithms.DataAnalysis.Views (only the OneRClassificationModelView has been added)
TODO:
- don't calculate F1 score for multi-class problems
- terminate branch
- more detailed review of online calculators and algorithms
- add F1 score and Matthews correlation to classification results collection
- mark ConstantRegressionModel obsolete and create new class: ConstantModel
comment:32 Changed 9 years ago by gkronber
r13092: removed creatable attribute from OneR and ZeroR
comment:33 Changed 9 years ago by gkronber
r13097: reverse merge of 13089 (reused the ConstantRegressionModel as ConstantClassificationModel (OK?))
comment:34 Changed 9 years ago by gkronber
- introduced new class ConstantModel (to merge ConstantRegressionModel, ConstantClassificationModel and ConstantTimeSeriesModel)
- fixed copyright statements
- tried to unify naming
- added F1 score and Matthews correlation to classification results collection
comment:35 Changed 9 years ago by gkronber
r13099: fixed strings (the person is called Matthews)
comment:36 Changed 9 years ago by gkronber
r13100: merged changes from the branch to trunk (btw. this branch was difficult to merge back to trunk because of its specific structure)
comment:37 Changed 9 years ago by gkronber
r13101: bug fixes (typo, duplicate result item)
comment:38 Changed 9 years ago by gkronber
- Owner changed from gkronber to mkommend
- Status changed from accepted to reviewing
- Version changed from branch to 3.3.12
- changed namespace and name of view
- calculate f1 score only for solutions for binary classification problems
comment:39 Changed 9 years ago by gkronber
r13103: code simplification of ConfusionMatrixCalculator
comment:40 Changed 9 years ago by gkronber
r13104: fixed a problem in Classification/ClassificationSolutionComparisonView
comment:41 Changed 9 years ago by gkronber
Relevant new changes:
comment:42 Changed 9 years ago by mkommend
- Owner changed from mkommend to gkronber
- Status changed from reviewing to assigned
Reviewed and briefly tested all relevant changes (mostly the model comparison view, ZeroR and OneR algorithm and the new classification performance metrics). Thank you for reviewing and finishing the implementation.
The only review comment I have is to put the ConstantModel outside of the Regression folder, because it is not limited to regression anymore.
comment:43 Changed 9 years ago by gkronber
r13154: moved ConstantModel class out of the Regression folder
comment:44 Changed 9 years ago by gkronber
- Status changed from assigned to reviewing
comment:45 Changed 9 years ago by gkronber
r13155: terminated the old feature branch for model comparison
comment:46 Changed 9 years ago by gkronber
- Status changed from reviewing to readytorelease
comment:47 Changed 9 years ago by gkronber
- Resolution set to done
- Status changed from readytorelease to closed
r13156: merged r13100:13104 and r13154 from trunk to stable
r9069: initial commit of new branch
r9070: branch project HeuristicLab.Algorithms.DataAnalysis
r9071: finished preparing branch