Opened 5 years ago

Closed 2 years ago

#1998 closed feature request (done)

Model Comparison for Classification

Reported by: sforsten
Owned by: gkronber
Priority: high
Milestone: HeuristicLab 3.3.13
Component: Algorithms.DataAnalysis
Version: 3.3.12
Keywords:
Cc:

Description

As with regression, it should be possible to easily create simple baseline models for classification, which a classification model has to outperform to be accepted as a good model.

0R (ZeroR), 1R (OneR) and LDA (linear discriminant analysis) classifiers will be created as baselines to compare against the hopefully better classification model.
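To illustrate the baseline idea, here is a minimal Python sketch of a 0R classifier and of the comparison it enables. It is not HeuristicLab's C# implementation; the function names `zero_r` and `accuracy` and the toy data are made up for this example.

```python
from collections import Counter

def zero_r(train_labels):
    """0R: ignore all inputs and always predict the most frequent training class."""
    return Counter(train_labels).most_common(1)[0][0]

def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Toy data (hypothetical): class "a" is the majority class in training.
train = ["a", "a", "b", "a", "b"]
test = ["a", "b", "a", "a"]

baseline_class = zero_r(train)
baseline_acc = accuracy([baseline_class] * len(test), test)
print(baseline_class, baseline_acc)
```

A candidate classification model would then only be considered good if its test accuracy clearly exceeds `baseline_acc`.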

Change History (47)

comment:1 Changed 5 years ago by sforsten

  • Status changed from new to accepted

r9069: initial commit of new branch

r9070: branch project HeuristicLab.Algorithms.DataAnalysis

r9071: finished preparing branch

comment:2 Changed 5 years ago by sforsten

r9073: branch project Problems.DataAnalysis

r9074:

  • added ZeroR and OneR classifiers
  • added ConstantClassificationModel/-Solution and OneRClassificationModel/-Solution

A view to display the OneRClassificationModel has to be added.

comment:3 Changed 5 years ago by sforsten

r9116: branch project Problems.DataAnalysis.Views

r9117: branch project Algorithms.DataAnalysis.Views

r9119:

  • added OneRClassificationModelView
  • added ClassificationSolutionComparisonView
  • added several calculators (ConfusionMatrixCalculator, FOneScoreCalculator, MatthewsCorrelationCoefficientCalculator)
  • fixed bug in OneR
  • added StorableClass and Item attribute to several classes

comment:4 Changed 5 years ago by sforsten

  • Owner changed from sforsten to mkommend
  • Status changed from accepted to reviewing

r9135:

  • OneR now handles missing values separately
  • adapted OneRClassificationModelView to show the class of missing values
  • with a double-click on the row header in ClassificationSolutionComparisonView the selected solution opens in a new view
  • put a try-catch block around the linear discriminant analysis solution (it is only shown if no exception is thrown)

ZeroR may not be a good model to show in the ClassificationSolutionComparisonView, because MatthewsCorrelationCoefficientCalculator will always yield NaN for it, and the F1 score will also be NaN if ZeroR chooses the negative class.
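The NaN behaviour can be reproduced with a small Python sketch. It assumes that F1 is computed from precision and recall, and that a division by zero yields NaN (as floating-point 0/0 does in C#). The helper names are hypothetical and this is not the code of the HeuristicLab calculators.

```python
import math

def safe_div(num, den):
    """Mimic floating-point 0/0 = NaN instead of raising ZeroDivisionError."""
    return num / den if den != 0 else math.nan

def confusion(predicted, actual, positive):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def f1_score(tp, fp, tn, fn):
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return safe_div(2 * precision * recall, precision + recall)

def mcc(tp, fp, tn, fn):
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return safe_div(tp * tn - fp * fn, den)

# Case 1: a constant model that always predicts the negative class (0).
actual_neg = [0, 0, 0, 1, 1]
cm_neg = confusion([0] * 5, actual_neg, positive=1)
f1_neg, mcc_neg = f1_score(*cm_neg), mcc(*cm_neg)

# Case 2: a constant model that always predicts the positive class (1).
actual_pos = [1, 1, 1, 0, 0]
cm_pos = confusion([1] * 5, actual_pos, positive=1)
f1_pos, mcc_pos = f1_score(*cm_pos), mcc(*cm_pos)

print(f1_neg, mcc_neg, f1_pos, mcc_pos)
```

In both cases one row or column of the confusion matrix is empty, so the MCC denominator is zero. F1 is only undefined in the first case, where no positive predictions exist and precision is 0/0.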

comment:5 Changed 4 years ago by gkronber

  • Owner changed from mkommend to gkronber

comment:6 Changed 4 years ago by gkronber

  • Owner changed from gkronber to architects
  • Status changed from reviewing to assigned

Can we queue this for the next release version?

comment:7 Changed 4 years ago by gkronber

  • Owner changed from architects to mkommend

Discussed in architects meeting, should be integrated into the trunk.

comment:8 Changed 4 years ago by mkommend

  • Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.10

comment:9 Changed 4 years ago by mkommend

r10553: Updated classification model comparison branch with trunk changes.

comment:10 Changed 4 years ago by mkommend

r10556: Updated classification model comparison branch with trunk changes (remaining changes).

comment:11 Changed 4 years ago by mkommend

r10560: Fixed bugs in classification solution comparison view.

comment:12 Changed 4 years ago by mkommend

r10568: Code cleanup in ZeroR classification algorithm.

comment:13 Changed 4 years ago by mkommend

r10569: Reimplemented OneR classification algorithm.

comment:14 Changed 4 years ago by mkommend

r10570: Added missing value handling in new implementation of OneR.
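For readers unfamiliar with OneR, the following Python sketch shows the general technique with missing values treated as a separate bucket. It is a simplification for categorical features only and does not attempt to reproduce how the HeuristicLab implementation builds its rule (for example, any binning of numeric values). `MISSING`, `one_r` and `predict` are hypothetical names.

```python
from collections import Counter, defaultdict

MISSING = None  # hypothetical marker for a missing value

def one_r(rows, labels):
    """Pick the single feature whose value -> class rule classifies the
    training data best. Returns (feature_index, rule, default_class)."""
    default = Counter(labels).most_common(1)[0][0]
    best = None
    for j in range(len(rows[0])):
        counts = defaultdict(Counter)
        for row, y in zip(rows, labels):
            counts[row[j]][y] += 1  # MISSING simply gets its own bucket
        # For each observed value, predict the majority class of its bucket.
        rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        correct = sum(c[rule[v]] for v, c in counts.items())
        if best is None or correct > best[0]:
            best = (correct, j, rule)
    _, j, rule = best
    return j, rule, default

def predict(model, row):
    """Apply the rule; fall back to the majority class for unseen values."""
    j, rule, default = model
    return rule.get(row[j], default)

rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "hot"),
        ("rain", "mild"), (MISSING, "hot"), (MISSING, "mild")]
labels = ["no", "no", "yes", "yes", "yes", "yes"]

model = one_r(rows, labels)
print(model[0], model[1])
```

Keeping the missing-value bucket separate means the rule can assign it its own class, which is also what a model view would need in order to display the class of missing values.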

comment:15 Changed 4 years ago by mkommend

  • Owner changed from mkommend to gkronber
  • Status changed from assigned to reviewing

Please review r10560, r10568:10570. All other changesets have already been reviewed.

For comparison purposes, I left the original version of the OneR algorithm untouched.

comment:16 Changed 3 years ago by gkronber

  • Status changed from reviewing to assigned

comment:17 Changed 3 years ago by gkronber

  • Status changed from assigned to accepted

comment:18 Changed 3 years ago by gkronber

Reviewed r10560; the changes are OK.

Made a more detailed review of the source code. Notes from the overall source code review:

  • Adapt to work with current trunk version (export button).
  • Exception occurs when running a linear discriminant analysis model (works for the trunk). Cannot reproduce.
  • Move OneR and ZeroR out of the Algorithms.DataAnalysis.Linear folder. done
  • OneR and ZeroR should not be available in the "New Item" dialog. done
  • ExecutionTime for OneR should not be produced as a result. I guess this is ok.
  • The file OneRTest.cs can be deleted? No, the OneRTest.cs file is the new implementation by mkommend. The file OneR.cs should be deleted.
Last edited 2 years ago by gkronber

comment:19 Changed 3 years ago by gkronber

  • Milestone changed from HeuristicLab 3.3.10 to HeuristicLab 3.3.11

Not yet ready for release

comment:20 Changed 3 years ago by gkronber

  • Milestone changed from HeuristicLab 3.3.11 to HeuristicLab 3.3.x Backlog

comment:21 Changed 3 years ago by mkommend

  • Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.12

comment:22 Changed 3 years ago by mkommend

  • Owner changed from gkronber to mkommend
  • Status changed from accepted to readytorelease

comment:23 Changed 3 years ago by mkommend

  • Owner changed from mkommend to gkronber
  • Status changed from readytorelease to assigned

comment:24 Changed 2 years ago by gkronber

  • Priority changed from medium to high
  • Status changed from assigned to accepted

comment:25 Changed 2 years ago by gkronber

  • Milestone changed from HeuristicLab 3.3.12 to HeuristicLab 3.3.13

Sorry

comment:26 Changed 2 years ago by gkronber

r13081: changed framework version to v4.5

comment:27 Changed 2 years ago by gkronber

r13082: merged changesets r10553:13081 (only on HeuristicLab.Problems.DataAnalysis.Views) from trunk to branch

r13083: merged changesets r10551:13082 (only on HeuristicLab.Problems.DataAnalysis) from trunk to branch

r13084: merged changesets r10551:13083 (only on HeuristicLab.Algorithms.DataAnalysis.Views) from trunk to branch

r13085: merged changesets r10551:13084 (only on HeuristicLab.Algorithms.DataAnalysis) from trunk to branch

Last edited 2 years ago by gkronber

comment:28 Changed 2 years ago by gkronber

r13086: made compatibility changes necessary because of trunk developments (compile fail)

comment:29 Changed 2 years ago by gkronber

r13089:

  • deleted obsolete version of the OneR algorithm (it also performs worse than mkommend's implementation in my tests)
  • reused the ConstantRegressionModel as ConstantClassificationModel (OK?)
  • fixed a few strings here and there

comment:30 Changed 2 years ago by gkronber

r13090: moved OneR and ZeroR out of the folder for linear models, renamed OneRTest -> OneR

comment:31 Changed 2 years ago by gkronber

r13091: minor changes while reviewing

Reviewed changes in

  • HeuristicLab.Problems.DataAnalysis (online calculators of F1-score and Matthews correlation have been added)
  • HeuristicLab.Algorithms.DataAnalysis (ZeroR and OneR algorithms have been added)
  • HeuristicLab.Problems.DataAnalysis.Views (only ClassificationSolutionComparisonView has been added)
  • HeuristicLab.Algorithms.DataAnalysis.Views (only the OneRClassificationModelView has been added)

TODO:

  • don't calculate F1 score for multi-class problems
  • terminate branch
  • more detailed review of online calculators and algorithms
  • add F1 score and Matthews correlation to classification results collection
  • mark ConstantRegressionModel obsolete and create new class: ConstantModel
Last edited 2 years ago by gkronber

comment:32 Changed 2 years ago by gkronber

r13092: removed creatable attribute from OneR and ZeroR

comment:33 Changed 2 years ago by gkronber

r13097: reverse merge of r13089 (reused the ConstantRegressionModel as ConstantClassificationModel (OK?))

comment:34 Changed 2 years ago by gkronber

r13098:

  • introduced new class ConstantModel (to merge ConstantRegressionModel, ConstantClassificationModel and ConstantTimeSeriesModel)
  • fixed copyright statements
  • tried to unify naming
  • added F1 score and Matthews correlation to classification results collection
Last edited 2 years ago by gkronber

comment:35 Changed 2 years ago by gkronber

r13099: fixed strings (the person is called Matthews)

comment:36 Changed 2 years ago by gkronber

r13100: merged changes from the branch to trunk (btw. this branch was difficult to merge back to trunk because of its specific structure)

comment:37 Changed 2 years ago by gkronber

r13101: bug fixes (typo, duplicate result item)

comment:38 Changed 2 years ago by gkronber

  • Owner changed from gkronber to mkommend
  • Status changed from accepted to reviewing
  • Version changed from branch to 3.3.12

r13102:

  • changed namespace and name of view
  • calculate F1 score only for solutions of binary classification problems

comment:39 Changed 2 years ago by gkronber

r13103: code simplification of ConfusionMatrixCalculator

comment:40 Changed 2 years ago by gkronber

r13104: fixed a problem in Classification/ClassificationSolutionComparisonView

comment:41 Changed 2 years ago by gkronber

Relevant new changes:

comment:42 Changed 2 years ago by mkommend

  • Owner changed from mkommend to gkronber
  • Status changed from reviewing to assigned

Reviewed and briefly tested all relevant changes (mostly the model comparison view, ZeroR and OneR algorithm and the new classification performance metrics). Thank you for reviewing and finishing the implementation.

The only review comment I have is to put the ConstantModel outside of the Regression folder, because it is not limited to regression anymore.

comment:43 Changed 2 years ago by gkronber

r13154: moved ConstantModel class out of the Regression folder

comment:44 Changed 2 years ago by gkronber

  • Status changed from assigned to reviewing

comment:45 Changed 2 years ago by gkronber

r13155: terminated the old feature branch for model comparison

comment:46 Changed 2 years ago by gkronber

  • Status changed from reviewing to readytorelease

comment:47 Changed 2 years ago by gkronber

  • Resolution set to done
  • Status changed from readytorelease to closed

r13156: merged r13100:13104 and r13154 from trunk to stable
