Opened 8 years ago
Closed 7 years ago
#2697 closed enhancement (done)
Refactor certain parts for data-based modeling
Reported by: | gkronber | Owned by: | gkronber |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.15 |
Component: | Problems.DataAnalysis | Version: | 3.3.14 |
Keywords: | Cc: |
Description
- Move translation to AutoDiff out of evaluator
- Create ISymbolicExpressionTree for linear model from vector of coefficients
- Create two-dimensional array from ProblemData
Change History (34)
comment:1 Changed 8 years ago by gkronber
- Owner set to gkronber
- Status changed from new to accepted
comment:2 Changed 8 years ago by gkronber
comment:3 Changed 8 years ago by gkronber
- renaming of folder "Transformation" to "Converters" to distinguish between transformations for variables (from data preprocessing) and classes for transformation of trees.
- renamed SymbolicDataAnalysisExpressionTreeSimplifier -> TreeSimplifier
- Implemented a converter to create a linar model as a symbolic expression tree
comment:4 Changed 8 years ago by gkronber
- extended converter for linear models to support lagged variables and changed AR(k) to use this method
comment:5 Changed 8 years ago by gkronber
Todo: remove class scaling (needs persistence backwards compat in GPR)
comment:6 Changed 8 years ago by gkronber
- removed AlglibUtil.cs and added an extension method .ToArray(names, rows) to IDataset instead.
- refactored transformation so that it is possible to apply an transformation without resetting the parameters
- Used transformations instead of Scaling as far as possible.
- Moved TakeEvery extension method to HL.Common
comment:7 Changed 8 years ago by gkronber
r14394: renamed TreeSimplifier in unit test
comment:8 Changed 8 years ago by gkronber
r14396: added methods to get training and test input matrices from ProblemData
comment:9 Changed 8 years ago by gkronber
comment:10 Changed 8 years ago by gkronber
comment:11 Changed 8 years ago by mkommend
r14507: Added caching of parameters and removed resetting of training and test partition in AdjustProblemDataProperties (used for changing the problem data of a solution) in DataAnalysisProblemData.
comment:12 Changed 7 years ago by gkronber
comment:13 Changed 7 years ago by gkronber
comment:14 Changed 7 years ago by gkronber
r14845: fixed build fail of test solution
comment:15 Changed 7 years ago by gkronber
- Owner changed from gkronber to mkommend
- Status changed from accepted to reviewing
r14851: removed unnecessary variables
comment:16 Changed 7 years ago by gkronber
r14854: introduced original code for scaling from AlglibUtil into GaussianProcessModel to guarantee backwards compatibility regarding unit test results
comment:17 Changed 7 years ago by mkommend
#2697: Made symbol fields readonly in TreeSimplifier.
comment:18 follow-ups: ↓ 19 ↓ 20 Changed 7 years ago by mkommend
Review comments
Enumerable Extensions
Why has the ext. method TakeEvery been added, although it is never used?
gkronber: Good question. I traced this back to r7154. Seems we never needed this. Can be removed.
mkommend: Addressed with comment:19.
HL.Problems.DataAnalysis.Symbolic
Delete empty and unused folder Transformation
mkommend: Addressed with comment:20.
HeuristicLab.Problems.DataAnalysis Transformations
- Why do we need transformations?!?!
- Can't we delete all of them?
- If not, why don't we move them to the DataPreprocessing and forget about them?
gkronber: They are used in DataPreprocessing. After the changes in this ticket they are also used for scaling of input variables (instead of the Scaling class). Any symbolic regression model can have transformations. If we want to keep support for this at least the interface is necessary in the symbolic namespace.
Edit: After discussion we decided to move all transformations to the DataPreprocessing namespace and remove support for transformations from symbolic models.
Converters
I don't really get the benefit of the Convert class, besides listing somehow related conversion operations.
gkronber: addressed with comment:22AFAIK the TreeSimplifier is stateless? If that's the case why don't we provide a static method for simplification. e.g. var simplifiedTree = TreeSimplifier.Simplify(originalTree);.
gkronber: addressed with comment:23TreeToAutoDiffTermConverter encapsulates the functionality nicely. This new refactoring would also enable to use exception for control flow handling in TryConvertToAutoDiff(node,term) instead of always checking the transformation result (true or false).
gkronber: addressed with comment:24
Finished reviewing the changes in this tickets.
comment:19 in reply to: ↑ 18 Changed 7 years ago by mkommend
comment:20 in reply to: ↑ 18 Changed 7 years ago by mkommend
comment:21 Changed 7 years ago by mkommend
- Owner changed from mkommend to gkronber
comment:22 Changed 7 years ago by gkronber
r14948: removed static Convert class
comment:23 Changed 7 years ago by gkronber
r14949: made TreeSimplifier static
comment:24 Changed 7 years ago by gkronber
r14950: code improvement in TreeToAutoDiffTermConverter
comment:25 Changed 7 years ago by gkronber
- Status changed from reviewing to readytorelease
comment:26 Changed 7 years ago by gkronber
Since MCTS has been removed (already merged to stable #2581) there are conflicts when merging this ticket to stable.
comment:27 Changed 7 years ago by gkronber
comment:28 Changed 7 years ago by gkronber
comment:29 Changed 7 years ago by gkronber
comment:30 Changed 7 years ago by gkronber
comment:31 Changed 7 years ago by gkronber
comment:32 Changed 7 years ago by gkronber
r14938 also belongs to this ticket
comment:33 Changed 7 years ago by gkronber
comment:34 Changed 7 years ago by gkronber
- Resolution set to done
- Status changed from readytorelease to closed
r14378: