Opened 3 years ago

Closed 15 months ago

Last modified 13 months ago

#2847 closed feature request (done)

Implement M5'-(Meta-)-Regression

Reported by: bwerth
Owned by: gkronber
Priority: medium
Milestone: HeuristicLab 3.3.16
Component: Algorithms.DataAnalysis
Version: trunk
Keywords:
Cc:

Description

M5' regression can provide tree-based and rule-based regression models without relying on a holdout set for pruning. An interesting property of M5' is that rule/tree nodes contain individual linear models, which could be replaced with arbitrarily complex regression models (e.g. GP, KernelRidgeRegression, ...), allowing for the calculation of confidence values.
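To illustrate the idea of a tree whose leaves hold linear models, here is a minimal sketch of a one-split model tree on a single feature. It picks the split by standard deviation reduction (the criterion commonly described for M5) and fits an ordinary least-squares line in each leaf. All function names (`fit_line`, `best_split`, `fit_stump`, `predict`) and the toy data are assumptions made for this illustration; the sketch omits pruning and smoothing and does not reflect the HeuristicLab implementation.

```python
from statistics import pstdev


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:  # constant x: fall back to predicting the mean
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def best_split(xs, ys, min_leaf=2):
    """Return the threshold with the largest standard deviation reduction, or None."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total_sd = pstdev(ys)
    best_t, best_sdr = None, 0.0
    for i in range(min_leaf, n - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        sdr = total_sd - len(left) / n * pstdev(left) - len(right) / n * pstdev(right)
        if sdr > best_sdr:
            best_t, best_sdr = (pairs[i - 1][0] + pairs[i][0]) / 2, sdr
    return best_t


def fit_stump(xs, ys):
    """Fit a single split with one linear model per leaf (hypothetical helper)."""
    t = best_split(xs, ys)
    if t is None:  # no useful split: a single linear model for all data
        return None, fit_line(xs, ys), None
    left = [(x, y) for x, y in zip(xs, ys) if x <= t]
    right = [(x, y) for x, y in zip(xs, ys) if x > t]
    return t, fit_line(*zip(*left)), fit_line(*zip(*right))


def predict(model, x):
    """Route x to its leaf and evaluate that leaf's linear model."""
    t, left, right = model
    a, b = left if (t is None or x <= t) else right
    return a * x + b


# Toy data: y = x for x < 5 and y = 100 - 2x for x >= 5.
xs = list(range(10))
ys = [0, 1, 2, 3, 4, 90, 88, 86, 84, 82]
model = fit_stump(xs, ys)
print(model[0], predict(model, 2), predict(model, 7))
```

Replacing `fit_line` with another regression method is where the "arbitrarily complex" leaf models mentioned above would plug in.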

Change History (36)

comment:1 Changed 3 years ago by bwerth

r15428 created initial branch

comment:2 Changed 3 years ago by bwerth

  • Status changed from new to accepted

comment:3 Changed 3 years ago by bwerth

r15430 first implementation of M5'-regression

comment:4 Changed 3 years ago by bwerth

r15470 worked on M5Regression

  • merged PCA into M5Regression as a helper class
  • reworked ImpurityTypes to more general SplitTypes
  • some minor cleanups
  • added the IDataAnalysisAlgorithm interface to FixedDataAnalysisAlgorithm (required for ComplexLeaf)

comment:5 Changed 3 years ago by bwerth

  • Owner changed from bwerth to gkronber
  • Status changed from accepted to reviewing

comment:6 Changed 3 years ago by gkronber

Review ongoing, my notes for the code review:

  • Why are different casts (throwing and non-throwing) used for Parameters?
  • PrincipleComponentTransformation contains a lot of code, and it is not clear why all of it is necessary (Reverse).
  • The description of the OrderSplitType class and its parameter do not match.
  • The folder is called 'Spliting' instead of 'Splitting'.
  • It seems overly complicated to extract the different variants of splitting into classes. Is there an easier solution?
  • I changed field names that were called e.g. Random1 because a property already used the name Random. Fields should start with a lower-case letter, properties with an upper-case letter.
  • It would be great to have correctly pre-configured algorithm variants (e.g. M5') where the LeafType, MetaModel, PruningType and SplitType are set correctly.
  • I'm not quite sure how feasible it really is to combine LeafType, MetaModel, PruningType and SplitType freely. Does this even work?
  • The interface ISplitType has a method Split(data, size, splitAttr, splitValue). Can we say that such objects represent a type? The same applies to the interfaces LeafType and PruningType; the members of these interfaces do not really relate to members of types.

comment:7 Changed 3 years ago by gkronber

r15549:15550: made some changes while reviewing

comment:8 Changed 3 years ago by jkarder

  • Version set to branch

comment:9 Changed 3 years ago by bwerth

r15614 made changes to M5 according to review comments

comment:10 Changed 3 years ago by bwerth

r15830:

  • changed Splitter and Pruner to actually do splitting and pruning
  • made algorithm pausable
  • added new splitting approach
  • reworked pruning to be more easily controlled
  • added regression tree view
  • ...

comment:11 Changed 3 years ago by bwerth

r15833 improved handling of empty or underdetermined datasets

comment:12 Changed 2 years ago by bwerth

r15967 added logistic dampening and some minor changes

comment:13 Changed 2 years ago by bwerth

r16069 fixed serialization bug

comment:14 Changed 22 months ago by bwerth

r16538

  • renamed branch to include ticket number;
  • merged current trunk version into branch;
  • updated Build.cmd and Build.ps1

comment:15 Changed 18 months ago by gkronber

r16842: merged r16565:16796 from trunk/HeuristicLab.Algorithms.DataAnalysis to branch

comment:16 Changed 18 months ago by gkronber

r16847: made some minor changes while reviewing

comment:17 Changed 18 months ago by gkronber

r16848: renamed folder Spliting -> Splitting

comment:18 Changed 18 months ago by gkronber

r16849: renamed ImpurityCalculator

comment:19 Changed 18 months ago by gkronber

r16850: deleted all files which are not referenced/used in the project

comment:20 Changed 18 months ago by gkronber

r16852: fixed some issues that produced errors when testing

comment:21 Changed 18 months ago by gkronber

r16853: merged back M5 branch to trunk (using old style merge because of issues with automatic merge).

comment:22 Changed 18 months ago by gkronber

r16855: moved M5 regression into a separate plugin as it depends on HL.DataAnalysis.Algorithms.Glmnet plugin

comment:23 Changed 18 months ago by gkronber

r16856: svn:ignore

comment:24 Changed 18 months ago by gkronber

r16858: refactored LinearModelToTreeConverter to make it work with M5 regression

comment:25 follow-up: Changed 18 months ago by gkronber

Pause, Save, Load, Continue does not work yet.

To reproduce:

  • Load SARCOS problem instance
  • Start and Pause after approx. 3 seconds
  • Save and Load the file
  • Press play --> NullException

comment:26 Changed 18 months ago by gkronber

  • Owner changed from gkronber to bwerth
  • Status changed from reviewing to assigned

@bwerth: please check and try to fix the problem mentioned above.

I will remove mentions of M5' as suggested.

comment:27 Changed 18 months ago by gkronber

  • Version changed from branch to trunk

comment:28 Changed 16 months ago by gkronber

  • r17078: renamed HeuristicLab.Algorithms.M5 -> HeuristicLab.Algorithms.DecisionTrees
  • r17079:17083: more changes for renaming the M5 plugin
  • r17084: use newer version of alglib
Last edited 16 months ago by gkronber

comment:29 in reply to: ↑ 25 Changed 16 months ago by gkronber

Replying to gkronber:

Pause, Save, Load, Continue does not work yet.

To reproduce:

  • Load SARCOS problem instance
  • Start and Pause after approx. 3 seconds
  • Save and Load the file
  • Press play --> NullException

I partially fixed pause and resume with r17085.

There is still a problem because of a bug in the deserialization of Queue in HEAL.Attic. See https://github.com/HeuristicLab/HEAL.Attic/issues/18

comment:30 Changed 16 months ago by gkronber

  • Owner changed from bwerth to gkronber

comment:31 Changed 16 months ago by gkronber

r17139: added Storable-properties to map Queues to Arrays (and vice versa) to work around problem with serialization of Queues in HEAL.Attic
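The workaround pattern described here (persist an array-valued surrogate property instead of the queue itself) can be sketched as follows. This is only an analogy in Python, not the HEAL.Attic or HeuristicLab code: the class `TreeBuildState`, the property `pending_array`, and the `save`/`load` helpers are hypothetical names, and JSON stands in for the real serializer because the standard `json` module likewise cannot encode a `deque` directly.

```python
import json
from collections import deque


class TreeBuildState:
    """Hypothetical algorithm state with a work queue of pending nodes."""

    def __init__(self, pending=()):
        self.pending = deque(pending)  # used by the algorithm at runtime

    @property
    def pending_array(self):
        # Surrogate used only for persistence: queue -> array.
        return list(self.pending)

    @pending_array.setter
    def pending_array(self, items):
        # Rebuild the runtime queue from the stored array: array -> queue.
        self.pending = deque(items)

    def save(self):
        # Only the array-valued surrogate is written out.
        return json.dumps({"pending_array": self.pending_array})

    @classmethod
    def load(cls, text):
        state = cls()
        state.pending_array = json.loads(text)["pending_array"]
        return state


state = TreeBuildState(["root", "left", "right"])
restored = TreeBuildState.load(state.save())
print(type(restored.pending).__name__, list(restored.pending))
```

The element order is preserved through the round trip, so a paused run can continue with the same next node after loading.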

comment:32 Changed 16 months ago by gkronber

  • Status changed from assigned to reviewing

After r17139, save, load and run work correctly. As I have reviewed all changes, I think we can merge and release the changes in this ticket. However, since r17139 there has been a problem with unit tests on our CI server. The root cause of the problem needs to be determined first.

comment:33 Changed 15 months ago by gkronber

  • Status changed from reviewing to readytorelease

The most recent build on our CI server was successful.

comment:34 Changed 15 months ago by gkronber

r17159: merged r16853, r16855, r16856, r16858, r17078, r17079:17085, r17139 from trunk to stable

comment:35 Changed 15 months ago by jkarder

  • Resolution set to done
  • Status changed from readytorelease to closed

comment:36 Changed 13 months ago by gkronber

r17286: delete branch for ticket
