Opened 7 months ago

Last modified 8 days ago

#2697 readytorelease enhancement

Refactor certain parts for data-based modeling

Reported by: gkronber Owned by: gkronber
Priority: medium Milestone: HeuristicLab 3.3.15
Component: Problems.DataAnalysis Version: 3.3.14
Keywords: Cc:

Description

  • Move translation to AutoDiff out of evaluator
  • Create ISymbolicExpressionTree for linear model from vector of coefficients
  • Create two-dimensional array from ProblemData

Change History (25)

comment:1 Changed 7 months ago by gkronber

  • Owner set to gkronber
  • Status changed from new to accepted

comment:2 Changed 7 months ago by gkronber

r14378:

  • created a folder for all classes related to transformation from and to trees
  • created a transformator which takes a tree and uses AutoDiff to produce a function and gradient function for the tree.
  • moved code from SymbolicRegressionConstantOptimizationEvaluator to TreeToAutoDiffTermTransformator to make AutoDiff for trees more accessible

comment:3 Changed 6 months ago by gkronber

r14390:

  • renaming of folder "Transformation" to "Converters" to distinguish between transformations for variables (from data preprocessing) and classes for transformation of trees.
  • renamed SymbolicDataAnalysisExpressionTreeSimplifier -> TreeSimplifier
  • Implemented a converter to create a linar model as a symbolic expression tree

comment:4 Changed 6 months ago by gkronber

r14391:

  • extended converter for linear models to support lagged variables and changed AR(k) to use this method

comment:5 Changed 6 months ago by gkronber

Todo: remove class scaling (needs persistence backwards compat in GPR)

comment:6 Changed 6 months ago by gkronber

r14393:

  • removed AlglibUtil.cs and added an extension method .ToArray(names, rows) to IDataset instead.
  • refactored transformation so that it is possible to apply an transformation without resetting the parameters
  • Used transformations instead of Scaling as far as possible.
  • Moved TakeEvery extension method to HL.Common

comment:7 Changed 6 months ago by gkronber

r14394: renamed TreeSimplifier in unit test

comment:8 Changed 6 months ago by gkronber

r14396: added methods to get training and test input matrices from ProblemData

comment:9 Changed 6 months ago by gkronber

r14400: reverse merge of r14378, r14390, r14391, r14393, r14394, r14396 because the changes cannot be merged over to the #2650 branch

comment:10 Changed 6 months ago by gkronber

This is needed by #745.

However, #2650 should be merged to the trunk first before we apply the changes reverted in r14400 because there are many conflicting changes.

comment:11 Changed 5 months ago by mkommend

r14507: Added caching of parameters and removed resetting of training and test partition in AdjustProblemDataProperties (used for changing the problem data of a solution) in DataAnalysisProblemData.

comment:12 Changed 7 weeks ago by gkronber

r14840: applied changes from r14378 again

comment:13 Changed 7 weeks ago by gkronber

r14843: applied r14390, r14391, r14393, r14394, r14396 again (resolving conflicts)

comment:14 Changed 7 weeks ago by gkronber

r14845: fixed build fail of test solution

Last edited 7 weeks ago by gkronber (previous) (diff)

comment:15 Changed 7 weeks ago by gkronber

  • Owner changed from gkronber to mkommend
  • Status changed from accepted to reviewing

r14851: removed unnecessary variables

comment:16 Changed 7 weeks ago by gkronber

r14854: introduced original code for scaling from AlglibUtil into GaussianProcessModel to guarantee backwards compatibility regarding unit test results

comment:17 Changed 3 weeks ago by mkommend

#2697: Made symbol fields readonly in TreeSimplifier.

comment:18 follow-ups: Changed 3 weeks ago by mkommend

Review comments

Enumerable Extensions

  • Why has the ext. method TakeEvery been added, although it is never used?
    gkronber: Good question. I traced this back to r7154. Seems we never needed this. Can be removed.
    mkommend: Addressed with comment:19.

HL.Problems.DataAnalysis.Symbolic

  • Delete empty and unused folder Transformation
    mkommend: Addressed with comment:20.

HeuristicLab.Problems.DataAnalysis Transformations

  • Why do we need transformations?!?!
  • Can't we delete all of them?
  • If not, why don't we move them to the DataPreprocessing and forget about them?

gkronber: They are used in DataPreprocessing. After the changes in this ticket they are also used for scaling of input variables (instead of the Scaling class). Any symbolic regression model can have transformations. If we want to keep support for this at least the interface is necessary in the symbolic namespace.

Edit: After discussion we decided to move all transformations to the DataPreprocessing namespace and remove support for transformations from symbolic models.

Converters

  • I don't really get the benefit of the Convert class, besides listing somehow related conversion operations.
    gkronber: addressed with comment:22
  • AFAIK the TreeSimplifier is stateless? If that's the case why don't we provide a static method for simplification. e.g. var simplifiedTree = TreeSimplifier.Simplify(originalTree);.
    gkronber: addressed with comment:23
  • TreeToAutoDiffTermConverter encapsulates the functionality nicely. This new refactoring would also enable to use exception for control flow handling in TryConvertToAutoDiff(node,term) instead of always checking the transformation result (true or false).
    gkronber: addressed with comment:24

Finished reviewing the changes in this tickets.

Last edited 3 weeks ago by gkronber (previous) (diff)

comment:19 in reply to: ↑ 18 Changed 3 weeks ago by mkommend

Replying to mkommend:

Enumerable Extensions

  • Why has the ext. method TakeEvery been added, although it is never used?

gkronber: Good question. I traced this back to r7154. Seems we never needed this. Can be removed.

r14944: Removed TakeEvery methods from EnumerableExtensions.cs

comment:20 in reply to: ↑ 18 Changed 3 weeks ago by mkommend

Replying to mkommend:

HL.Problems.DataAnalysis.Symbolic

  • Delete empty and unused folder Transformation

r14945: Removed empty folder HL.Problems.DataAnalysis.Symbolic\Transformation.

comment:21 Changed 3 weeks ago by mkommend

  • Owner changed from mkommend to gkronber

comment:22 Changed 3 weeks ago by gkronber

r14948: removed static Convert class

comment:23 Changed 3 weeks ago by gkronber

r14949: made TreeSimplifier static

comment:24 Changed 3 weeks ago by gkronber

r14950: code improvement in TreeToAutoDiffTermConverter

comment:25 Changed 8 days ago by gkronber

  • Status changed from reviewing to readytorelease
Note: See TracTickets for help on using tickets.