Opened 18 months ago

Last modified 2 days ago

#2847 reviewing feature request

Implement M5'-(Meta-)-Regression

Reported by: bwerth
Owned by: gkronber
Priority: medium
Milestone: HeuristicLab 3.3.16
Component: Algorithms.DataAnalysis
Version: branch
Keywords:
Cc:

Description

M5' regression can provide tree-based and rule-based regression models without relying on a holdout set for pruning. An interesting property of M5' is that rule/tree nodes contain individual linear models, which could be replaced with arbitrarily complex regression models (e.g. GP, KernelRidgeRegression, ...), allowing for the calculation of confidence values.
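
To illustrate the idea of a tree whose leaves hold linear models, here is a minimal Python sketch. It is not the HeuristicLab implementation: it handles a single input variable, chooses splits by standard deviation reduction (the criterion usually associated with M5), and omits pruning and smoothing entirely. All names (fit_line, best_split, build, predict) are hypothetical and chosen for this example only.

```python
from statistics import pstdev


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; falls back to a constant if x does not vary."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return mean_y, 0.0
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - b * mean_x, b


def best_split(xs, ys, min_leaf=2):
    """Return the threshold with the largest standard deviation reduction, or None."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total_sd = pstdev(ys)
    best_gain, best_thr = 0.0, None
    for i in range(min_leaf, n - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a threshold between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        weighted_sd = (len(left) * pstdev(left) + len(right) * pstdev(right)) / n
        gain = total_sd - weighted_sd
        if gain > best_gain:
            best_gain = gain
            best_thr = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_thr


def build(xs, ys, depth=0, max_depth=3, min_leaf=2):
    """Grow the tree recursively; every leaf stores its own linear model (a, b)."""
    thr = best_split(xs, ys, min_leaf) if depth < max_depth else None
    if thr is None:
        return {"model": fit_line(xs, ys)}
    left = [(x, y) for x, y in zip(xs, ys) if x <= thr]
    right = [(x, y) for x, y in zip(xs, ys) if x > thr]
    return {
        "threshold": thr,
        "left": build(*zip(*left), depth + 1, max_depth, min_leaf),
        "right": build(*zip(*right), depth + 1, max_depth, min_leaf),
    }


def predict(node, x):
    """Descend to the leaf responsible for x and evaluate its linear model."""
    while "model" not in node:
        node = node["left"] if x <= node["threshold"] else node["right"]
    a, b = node["model"]
    return a + b * x


# Piecewise-linear toy data: y = 2x for x < 5, y = 30 - x otherwise.
xs = list(range(10))
ys = [2 * x if x < 5 else 30 - x for x in xs]
tree = build(xs, ys, max_depth=1)
```

In this sketch the leaf model is produced by fit_line; substituting another regressor for that one function is the kind of replacement the description refers to.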

Change History (25)

comment:1 Changed 18 months ago by bwerth

r15428 created initial branch

comment:2 Changed 18 months ago by bwerth

  • Status changed from new to accepted

comment:3 Changed 18 months ago by bwerth

r15430 first implementation of M5'-regression

comment:4 Changed 18 months ago by bwerth

r15470 worked on M5Regression

  • merged PCA into M5Regression as Helperclass
  • reworked ImpurityTypes to more general SplitTypes
  • some minor cleanups
  • added IDataAnalysisAlgorithm- interface to FixedDataAnalysisAlgorithm (required for ComplexLeaf)

comment:5 Changed 18 months ago by bwerth

  • Owner changed from bwerth to gkronber
  • Status changed from accepted to reviewing

comment:6 Changed 16 months ago by gkronber

Review ongoing, my notes for the code review:

  • Why are different casts (throwing and non-throwing) used for Parameters?
  • PrincipleComponentTransformation contains a lot of code, and it is not clear why all of it is necessary (Reverse)?
  • The description of the OrderSplitType class and its parameter do not match.
  • Folder is called 'Spliting' instead of 'Splitting'
  • It seems overly complicated to extract the different variants of splitting into classes. Is there an easier solution?
  • Made a change where field names were called e.g. Random1 because the property already used the name Random. Fields should start with a lower-case letter, properties with an upper-case letter.
  • It would be great to have correctly pre-configured algorithm variants (e.g. M5') where the LeafType, MetaModel, PruningType and SplitType are set correctly.
  • I'm not quite sure how feasible it really is to combine LeafType, MetaModel, PruningType and SplitType freely. Does this even work?
  • The interface ISplitType has a method Split(data, size, splitAttr, splitValue). Can we really say that such objects represent a type? The same applies to the interfaces LeafType and PruningType; the members of these interfaces do not really relate to members of types.

comment:7 Changed 16 months ago by gkronber

r15549:15550: made some changes while reviewing

comment:8 Changed 16 months ago by jkarder

  • Version set to branch

comment:9 Changed 15 months ago by bwerth

r15614 made changes to M5 according to review comments

comment:10 Changed 14 months ago by bwerth

r15830:

  • changed Splitter and Pruner to actually do splitting and pruning
  • made algorithm pausable
  • added new splitting approach
  • reworked pruning to be more easily controlled
  • added regression tree view
  • ...

comment:11 Changed 14 months ago by bwerth

r15833 improved handling of empty or underdetermined datasets

comment:12 Changed 10 months ago by bwerth

r15967 added logistic dampening and some minor changes

comment:13 Changed 9 months ago by bwerth

r16069 fixed serialization bug

comment:14 Changed 3 months ago by bwerth

r16538

  • renamed branch to include ticket number;
  • merged current trunk version into branch;
  • updated Build.cmd and Build.ps1

comment:15 Changed 2 days ago by gkronber

r16842: merged r16565:16796 from trunk/HeuristicLab.Algorithms.DataAnalysis to branch

comment:16 Changed 2 days ago by gkronber

r16847: made some minor changes while reviewing

comment:17 Changed 2 days ago by gkronber

r16848: renamed folder Spliting -> Splitting

comment:18 Changed 2 days ago by gkronber

r16849: renamed ImpurityCalculator

comment:19 Changed 2 days ago by gkronber

r16850: deleted all files which are not referenced/used in the project

comment:20 Changed 2 days ago by gkronber

r16852: fixed some issues that produced errors when testing

comment:21 Changed 2 days ago by gkronber

r16853: merged back M5 branch to trunk (using old style merge because of issues with automatic merge).

comment:22 Changed 2 days ago by gkronber

r16855: moved M5 regression into a separate plugin as it depends on HL.DataAnalysis.Algorithms.Glmnet plugin

comment:23 Changed 2 days ago by gkronber

r16856: svn:ignore

comment:24 Changed 2 days ago by gkronber

r16858: refactored LinearModelToTreeConverter to make it work with M5 regression

comment:25 Changed 2 days ago by gkronber

Pause, Save, Load, Continue does not work yet.

To reproduce:

  • Load SARCOS problem instance
  • Start and Pause after approx. 3 seconds
  • Save and Load the file
  • Press play --> NullException