Opened 14 months ago

Last modified 4 months ago

#2847 reviewing feature request

Implement M5'-(Meta-)-Regression

Reported by: bwerth
Owned by: gkronber
Priority: medium
Milestone: HeuristicLab 3.3.16
Component: Algorithms.DataAnalysis
Version: branch
Keywords:
Cc:

Description

M5' regression can provide tree-based and rule-based regression models without relying on a holdout set for pruning. An interesting property of M5' is that rule/tree nodes contain individual linear models, which could be replaced with arbitrarily complex regression models (e.g. GP, KernelRidgeRegression, ...), allowing for the calculation of confidence values.
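For readers unfamiliar with model trees, the idea of "a tree whose leaves hold linear models" can be sketched roughly as follows. This is only an illustration and not the code in the branch: it is written in Python rather than C#, handles a single input variable, picks splits by standard deviation reduction (the criterion described for the original M5), and leaves out pruning, smoothing and rule extraction. All function names and the `min_leaf`/`max_depth` parameters are hypothetical.

```python
from statistics import pstdev


def fit_line(points):
    """Least-squares fit of y = a + b*x; constant model if x has no spread."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return mean_y, 0.0
    b = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return mean_y - b * mean_x, b


def best_split(points, min_leaf):
    """Threshold with the largest standard deviation reduction, or None."""
    pts = sorted(points)
    ys = [y for _, y in pts]
    total_sd = pstdev(ys)
    best_gain, best_threshold = 0.0, None
    for i in range(min_leaf, len(pts) - min_leaf + 1):
        if pts[i - 1][0] == pts[i][0]:
            continue  # cannot split between identical x values
        left, right = ys[:i], ys[i:]
        weighted = (len(left) * pstdev(left) + len(right) * pstdev(right)) / len(ys)
        gain = total_sd - weighted
        if gain > best_gain:
            best_gain = gain
            best_threshold = (pts[i - 1][0] + pts[i][0]) / 2
    return best_threshold


def build_tree(points, min_leaf=3, max_depth=3):
    """Grow the tree recursively; every leaf stores its own linear model."""
    threshold = None
    if max_depth > 0 and len(points) >= 2 * min_leaf:
        threshold = best_split(points, min_leaf)
    if threshold is None:
        return {"leaf": fit_line(points)}
    left = [p for p in points if p[0] <= threshold]
    right = [p for p in points if p[0] > threshold]
    return {
        "threshold": threshold,
        "left": build_tree(left, min_leaf, max_depth - 1),
        "right": build_tree(right, min_leaf, max_depth - 1),
    }


def predict(tree, x):
    """Walk down to a leaf and evaluate that leaf's linear model."""
    while "leaf" not in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    a, b = tree["leaf"]
    return a + b * x


# Usage: two linear regimes that a single global line cannot capture.
data = [(x, float(x)) for x in range(5)] + [(x, 100.0 - 2 * x) for x in range(5, 10)]
tree = build_tree(data)
```

In this sketch `fit_line` is the only place that knows what a leaf model is, so swapping in a different regression method per leaf (the point made in the description) would only touch that function and the last two lines of `predict`.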

Change History (13)

comment:1 Changed 14 months ago by bwerth

r15428 created initial branch

comment:2 Changed 14 months ago by bwerth

  • Status changed from new to accepted

comment:3 Changed 14 months ago by bwerth

r15430 first implementation of M5'-regression

comment:4 Changed 13 months ago by bwerth

r15470 worked on M5Regression

  • merged PCA into M5Regression as a helper class
  • reworked ImpurityTypes to more general SplitTypes
  • some minor cleanups
  • added IDataAnalysisAlgorithm interface to FixedDataAnalysisAlgorithm (required for ComplexLeaf)

comment:5 Changed 13 months ago by bwerth

  • Owner changed from bwerth to gkronber
  • Status changed from accepted to reviewing

comment:6 Changed 12 months ago by gkronber

Review ongoing. My notes from the code review:

  • Why are different casts (throwing and non-throwing) used for Parameters?
  • PrincipleComponentTransformation contains a lot of code, and it is not clear why all of it is necessary (Reverse?).
  • Description of the OrderSplitType class and its parameter do not match.
  • Folder is called 'Spliting' instead of 'Splitting'
  • It seems overly complicated to extract the different variants of splitting into classes. Is there an easier solution?
  • Made a change where field names were called e.g. Random1 because the property already used the name Random. Fields should start with a lower-case letter, properties with an upper-case letter.
  • It would be great to have correctly pre-configured algorithm variants (e.g. M5') where the LeafType, MetaModel, PruningType and SplitType are set correctly.
  • I'm not quite sure how feasible it really is to combine LeafType, MetaModel, PruningType and SplitType freely. Does this even work?
  • The interface ISplitType has a method Split(data, size, splitAttr, splitValue). Can we really say that such objects represent a type? The same applies to the interfaces LeafType and PruningType; the members of these interfaces do not really relate to members of types.

comment:7 Changed 12 months ago by gkronber

r15549:15550: made some changes while reviewing

comment:8 Changed 12 months ago by jkarder

  • Version set to branch

comment:9 Changed 11 months ago by bwerth

r15614 made changes to M5 according to review comments

comment:10 Changed 9 months ago by bwerth

r15830:

  • changed Splitter and Pruner to actually do splitting and pruning
  • made algorithm pausable
  • added new splitting approach
  • reworked pruning to be more easily controlled
  • added regression tree view
  • ...

comment:11 Changed 9 months ago by bwerth

r15833 improved handling of empty or underdetermined datasets

comment:12 Changed 6 months ago by bwerth

r15967 added logistic dampening and some minor changes

comment:13 Changed 4 months ago by bwerth

r16069 fixed serialization bug
