Opened 3 years ago
Closed 2 years ago
#3136 closed feature request (done)
Structure templates for symbolic regression
| Reported by: | dpiringe | Owned by: | gkronber |
|---|---|---|---|
| Priority: | medium | Milestone: | HeuristicLab 3.3.17 |
| Component: | Problems.DataAnalysis.Symbolic | Version: | trunk |
| Keywords: | | Cc: | |
Description (last modified by gkronber)
Implementing a new problem type with a structure template parameter for symbolic regression. The following components are necessary:
- MultiEncoding
- Views
- Parser for structure: to parse an infix expression
- Evaluator: based on IABoundEstimator
- Interpreter: builds a full expression, based on structure template parameter and (multiple) evolved sub-expressions
Change History (83)
comment:1 Changed 3 years ago by dpiringe
- Status changed from new to accepted
comment:2 Changed 3 years ago by dpiringe
- Summary changed from GP ProblemType with Structural Parameters to Structural GP
- Version changed from 3.3.16 to branch
comment:3 Changed 3 years ago by dpiringe
- added a new type of problem called StructuredSymbolicRegressionSingleObjectiveProblem, represents a first construct for future implementations
comment:4 Changed 3 years ago by dpiringe
- added a new Symbol SubFunctionSymbol for sub functions
- modified InfixExpressionParser to support SubFunctionSymbol (parsing of variableNames still in work)
- modified StructuredSymbolicRegressionSingleObjectiveProblem to extract sub functions and add them to MultiEncoding
comment:5 Changed 3 years ago by dpiringe
- added view components and classes for sub functions
comment:6 Changed 3 years ago by dpiringe
- modified InfixExpressionParser to fully support SubFunctionSymbol
- created a SubFunctionTreeNode to store the function arguments
- modified StructureTemplateView to regenerate the content state
- first implementation for the main tree build up logic
comment:7 Changed 3 years ago by dpiringe
- added a simple way of evaluation (using r2 evaluator)
- added a simple analyzing logic for "Best Tree"
- added a connection to SubFunction in SubFunctionTreeNode
comment:8 Changed 3 years ago by dpiringe
- changed the StructureTemplateView -> nodes of type SubFunctionTreeNode are now clickable
- still need to overhaul the UI elements
- added a way to parse SubFunctionTreeNode with a unique name in InfixExpressionParser
comment:9 Changed 3 years ago by dpiringe
- modified the StructureTemplateView to enable colorful tree nodes of type SubFunctionTreeNode
- refactored SubFunctionTreeNode, SubFunction and StructureTemplate
comment:10 Changed 3 years ago by dpiringe
- added linear scaling support for structure template parameter
comment:11 Changed 3 years ago by dpiringe
- added linear scaling logic in Evaluate and (for UI reasons) Analyze
- added logic for SubFunctionSymbol (modified OpCodes) -> the SubFunctionTreeNode is displayed in the tree but has no effect on evaluation (works like a flag)
- works now with SymbolicDataAnalysisExpressionTreeInterpreter
- default grammar for SubFunction is now ArithmeticExpressionGrammar instead of LinearScalingGrammar
comment:12 Changed 3 years ago by chaider
- Added info text in StructureTemplateView
- Fixed cloning constructors
- Added check if linear scaling nodes are set
comment:13 Changed 3 years ago by dpiringe
- fixed a bug: parsing an expression now resets the viewHost content, which prevents old content of non-existing sub-functions from being shown
comment:14 Changed 3 years ago by chaider
- Fixed cloning of StructureTemplate
comment:15 Changed 3 years ago by dpiringe
- added a hidden interpreter parameter for StructuredSymbolicRegressionSingleObjectiveProblem
- fixed a bug which crashed the application when ProblemData was changed to data with different variables
- fixed a bug which crashed the application when running the problem with an empty StructureTemplate
- improved the output of exceptions of type AggregateException
- added a resize event handler to repaint nodes of type SubFunctionTreeNode
- code cleanup
comment:16 Changed 3 years ago by gkronber
- Description modified (diff)
- Milestone changed from HeuristicLab 4.0 to HeuristicLab 3.3.17
- Summary changed from Structural GP to Structure templates for symbolic regression
comment:17 Changed 3 years ago by dpiringe
- added new parameters
- added the built tree to the scope, which allows operators to use the final tree
- added new operators
comment:18 Changed 3 years ago by dpiringe
- changed the visibility of the following parameters: EstimationLimitsParameter, EvaluatorParameter and BestTrainingSolutionParameter
- added first steps to set an evaluator as parameter
- added a new parameter TreeEvaluatorParameter
- added a temporary logic to static evaluator method Calculate
- tried to change a lot of necessary parameters to use the method Evaluate, this caused a lot of problems -> reverted all changes
comment:19 Changed 3 years ago by dpiringe
- added a new problem data provider AsadzadehProvider and the corresponding instance Asadzadeh1
- implements the test setup of the paper "Symbolic regression based hybrid semiparametric modelling of processes: An example case of a bending process"
- used the Asadzadeh1 instance in StructuredSymbolicRegressionSingleObjectiveProblem for default setup
- added the SubFunctionSymbol in DerivativeCalculator and IntervalArithBoundsEstimator
comment:20 Changed 3 years ago by dpiringe
comment:21 Changed 3 years ago by dpiringe
- set the default template to f(_) when loading a new problem data
- fixed a bug which caused the drawing of uncolored SubFunctionTreeNodes after using the window splitter
- implemented a method to paint nodes of SubFunctionTreeNode as colored nodes for ISymbolicDataAnalysisModel
comment:22 Changed 3 years ago by dpiringe
- recreated problem instance Asadzadeh1 as SheetBendingProcess
- SheetBendingProcess is located in Physics and provided by Physics/PhysicsInstanceProvider
comment:23 Changed 3 years ago by dpiringe
- refactored the evaluation logic of NMSESingleObjectiveConstraintsEvaluator
- refactored the new method Evaluate for PearsonRSquaredAverageSimilarityEvaluator
- changed the parameter order of some evaluate/calculate methods
comment:24 Changed 3 years ago by dpiringe
- overrode the method GetActualValue in ValueLookupParameter to get the default value when the execution context is null
- reverted the linear scaling logic for NMSESingleObjectiveConstraintsEvaluator
- in SymbolicRegressionConstantOptimizationEvaluator: removed the usage of GenerateRowsToEvaluate because it uses lookup parameters
- set the value of RelativeNumberOfEvaluatedSamplesParameter for SymbolicRegressionConstantOptimizationEvaluator in StructuredSymbolicRegressionSingleObjectiveProblem if Maximization = true and the SymbolicRegressionConstantOptimizationEvaluator is configured as evaluator
- added the SubFunctionSymbol in TreeToAutoDiffTermConverter
comment:25 Changed 3 years ago by dpiringe
- updated interpreters of type ISymbolicDataAnalysisExpressionTreeInterpreter to support symbols of type SubFunctionSymbol
comment:26 Changed 3 years ago by dpiringe
- added a new information box for StructureTemplate in StructureTemplateView with an extended description about structure templates
comment:27 Changed 3 years ago by dpiringe
- changed the item name of SubFunctionSymbol from SubFunctionSymbol to SubFunction
comment:28 Changed 3 years ago by mkommend
r18146: Merged trunk changes into branch.
comment:29 Changed 3 years ago by mkommend
r18149: Merged trunk changes into branch.
comment:30 Changed 3 years ago by mkommend
r18150: Merged trunk changes into branch.
comment:31 Changed 3 years ago by dpiringe
- fixed event handler re-registration after deserialization/cloning
- added a test case for StructuredSymbolicRegressionSingleObjectiveProblem
- changed the usage of a Dictionary to List
comment:32 Changed 3 years ago by dpiringe
- removed the calculation of EstimationLimits and set the interval [-inf, inf] as default
- this parameter was never adjusted after problem construction -> caused bugs with the change of problem data
- created two new methods to set up/create the MultiEncoding and SymbolicExpressionTreeEncoding
- configured the default template f(_) for a structure template
comment:33 Changed 3 years ago by dpiringe
- overrode the method SetEnabledStateOfControls for StructureTemplateView
- fixed the wrong usage of infoLabel in StructureTemplateView -> added a new label errorLabel for textual output
- deleted the resource file for StructureTemplateView
comment:34 Changed 3 years ago by dpiringe
- merged trunk into branch
comment:35 Changed 3 years ago by dpiringe
- adapted the unit test RunStructureTemplateRegressionSampleTest to match the results
- added the sample to the optimizer start page
comment:36 Changed 3 years ago by dpiringe
- adapted formatters to support SubFunctionSymbol
comment:37 Changed 3 years ago by gkronber
r18158: fixed "essential" unit tests
comment:38 follow-up: ↓ 41 Changed 3 years ago by gkronber
@dpiringe, could you please also add support for "Number" to the native interpreter? I do not have the necessary dev environment installed.
comment:39 follow-up: ↓ 42 Changed 3 years ago by gkronber
Please use BatchInterpreter instead of TreeInterpreter because it is more efficient.
comment:40 Changed 3 years ago by dpiringe
- merged trunk into branch
comment:41 in reply to: ↑ 38 Changed 3 years ago by dpiringe
comment:42 in reply to: ↑ 39 Changed 3 years ago by dpiringe
comment:43 follow-up: ↓ 45 Changed 3 years ago by gkronber
Bugs / suggestions for improvements:
- values for num are not optimized / changed (tested with paramopt and batchInterpreter) -> #3140
- num values cannot be initialized to negative values (<num=-1.5>). Only via workaround: -<num=1.5>. -> #3140
- Configuration for sub-functions (grammar, size limits) should not be lost when reparsing (especially when only a minor detail in the structure expression is changed).
- power-symbol: maybe it would be better to throw in the interpreter only when problematic arguments are actually detected (negative base with real-valued power). This would simplify entry of structure-templates. (we should create a new ticket for this.)
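The power-symbol suggestion can be illustrated with a minimal sketch. It assumes a hypothetical helper `PowerCheck.Evaluate` (not part of HeuristicLab) that defers the check to evaluation time and throws only for the combination named above: a negative base with a non-integer exponent.

```csharp
using System;

public static class PowerCheck {
  // Hypothetical helper, not HeuristicLab code: evaluates b^e and throws
  // only when the base is negative and the exponent is not an integer.
  public static double Evaluate(double b, double e) {
    bool isIntegerExponent = Math.Floor(e) == e;
    if (b < 0.0 && !isIntegerExponent)
      throw new ArgumentException(
        "Power with negative base " + b + " and real-valued exponent " + e + " is undefined.");
    return Math.Pow(b, e);
  }
}
```

With such a check, a template containing a power symbol could be entered freely, and an error would only surface if the data or the evolved parameters actually produce a problematic argument pair.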
comment:44 Changed 3 years ago by dpiringe
- fixed a bug in StructureTemplateView -> only nodes of type SubFunctionTreeNode are selectable
- added a way to keep old sub functions after parsing a new expression
- overrode some basic object methods for SubFunction to keep it simple
- only old sub functions, which match the name and signature of the new ones, are saved; examples:
- old: f(x), new: f(x) -> keep old
- old: f(x1), new: f(x1, x2) -> use new
- old: f1(x), new: f2(x) -> use new
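The matching rule above can be sketched as follows. This is an illustration only: `SubFunctionSketch` and `SubFunctionMerge` are hypothetical stand-ins, and the `Configuration` string is a placeholder for the grammar and size limits of a real sub-function. Equality is assumed to be defined by name plus argument list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the SubFunction item; only name and arguments matter here.
public sealed class SubFunctionSketch {
  public string Name { get; }
  public IReadOnlyList<string> Arguments { get; }
  public string Configuration { get; set; } // placeholder for grammar, size limits, ...

  public SubFunctionSketch(string name, IEnumerable<string> arguments, string configuration = "default") {
    Name = name;
    Arguments = arguments.ToList();
    Configuration = configuration;
  }

  // Two sub-functions match if their name and argument list (the signature) are identical.
  public override bool Equals(object obj) {
    var other = obj as SubFunctionSketch;
    return other != null && Name == other.Name && Arguments.SequenceEqual(other.Arguments);
  }

  public override int GetHashCode() {
    return Arguments.Aggregate(Name.GetHashCode(), (hash, arg) => unchecked(hash * 31 + arg.GetHashCode()));
  }
}

public static class SubFunctionMerge {
  // For each newly parsed sub-function, reuse the old instance if one matches; otherwise keep the new one.
  public static List<SubFunctionSketch> KeepMatching(
      IEnumerable<SubFunctionSketch> oldFunctions, IEnumerable<SubFunctionSketch> newFunctions) {
    var oldList = oldFunctions.ToList();
    return newFunctions
      .Select(f => oldList.FirstOrDefault(o => o.Equals(f)) ?? f)
      .ToList();
  }
}
```

Overriding `Equals` and `GetHashCode` keeps the lookup to a single comparison per candidate, which is presumably what "to keep it simple" refers to.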
comment:45 in reply to: ↑ 43 Changed 3 years ago by dpiringe
Replying to gkronber:
Bugs / suggestions for improvements:
- values for num are not optimized / changed (tested with paramopt and batchInterpreter) -> #3140
- num values cannot be initialized to negative values (<num=-1.5>). Only via workaround: -<num=1.5>. -> #3140
- Configuration for sub-functions (grammar, size limits) should not be lost when reparsing (especially when only a minor detail in the structure expression is changed).
- power-symbol: maybe it would be better to throw in the interpreter only when problematic arguments are actually detected (negative base with real-valued power). This would simplify entry of structure-templates. (we should create a new ticket for this.)
added a way to keep old sub functions, see r18164
comment:46 Changed 3 years ago by gkronber
r18176: merged r18165:18174 from trunk to branch (resolving conflicts in the parser)
comment:47 Changed 3 years ago by gkronber
r18177: removed a special case from the Evaluate method because it can never be true. Maximization is fixed to false in this problem and therefore the ParamOptEvaluator is not available as an evaluator.
comment:48 Changed 3 years ago by gkronber
r18178: copied code with minor modifications from ParameterOptimizationEvaluator into the NMSEConstraintsEvaluator because the code in ParameterOptimizationEvaluator uses R² internally and is incompatible with the NMSEEvaluator.
comment:49 Changed 3 years ago by gkronber
r18179: improved parameter optimization for NMSEConstraintsEvaluator. Use LM directly instead of lsfit to improve efficiency by using vectorized callbacks.
comment:50 Changed 3 years ago by gkronber
Open issue: parameters are not written back to the trees stored in the individuals after optimization. The reason is that trees from the individual are cloned and combined with the (cloned) template to construct the full tree. We should think about a way to update optimized parameters in the trees (including parameters which occur in the template).
    private ISymbolicExpressionTree BuildTree(Individual individual) {
      if (StructureTemplate.Tree == null)
        throw new ArgumentException("No structure template defined!");

      var clonedTemplate = (ISymbolicExpressionTree)StructureTemplate.Tree.Clone();

      // build main tree
      foreach (var subFunctionTreeNode in clonedTemplate.IterateNodesPrefix().OfType<SubFunctionTreeNode>()) {
        var subFunctionTree = individual.SymbolicExpressionTree(subFunctionTreeNode.Name);

        // add new tree
        var subTree = subFunctionTree.Root.GetSubtree(0)  // Start
                                          .GetSubtree(0); // Offset
        subTree = (ISymbolicExpressionTreeNode)subTree.Clone();
        subFunctionTreeNode.AddSubtree(subTree);
      }
      return clonedTemplate;
    }
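The effect described in this open issue can be reproduced in isolation. The sketch below uses a hypothetical `NodeSketch` class rather than the HeuristicLab tree types, and `WriteBack` shows one possible way to copy values back by walking both trees in prefix order. It is not necessarily the approach taken in the later fix.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal tree node; not the HeuristicLab ISymbolicExpressionTreeNode.
public sealed class NodeSketch {
  public double Value { get; set; }
  public List<NodeSketch> Children { get; } = new List<NodeSketch>();

  public NodeSketch(double value) { Value = value; }

  // Deep copy, analogous in spirit to the Clone() calls in BuildTree.
  public NodeSketch Clone() {
    var copy = new NodeSketch(Value);
    copy.Children.AddRange(Children.Select(c => c.Clone()));
    return copy;
  }

  public IEnumerable<NodeSketch> IteratePrefix() {
    yield return this;
    foreach (var child in Children)
      foreach (var node in child.IteratePrefix())
        yield return node;
  }
}

public static class WriteBackSketch {
  // Copies values from an optimized clone back to the original tree.
  // Assumes both trees have the same shape and therefore the same prefix order.
  public static void WriteBack(NodeSketch original, NodeSketch optimizedClone) {
    var targets = original.IteratePrefix().ToList();
    var sources = optimizedClone.IteratePrefix().ToList();
    if (targets.Count != sources.Count)
      throw new ArgumentException("Trees differ in shape.");
    for (int i = 0; i < targets.Count; i++)
      targets[i].Value = sources[i].Value;
  }
}
```

Changing `Value` on a clone leaves the original untouched until `WriteBack` is called, which mirrors why optimized parameters never reach the trees stored in the individual.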
comment:51 Changed 3 years ago by dpiringe
- fixed missing/wrong event registration for SubFunction and StructuredSymbolicRegressionSingleObjectiveProblem
comment:52 Changed 3 years ago by mkommend
r18183: Fixed bug in parameter optimization code of NMSE evaluator (tree has never been updated).
comment:53 Changed 3 years ago by mkommend
r18184: Refactored structured GP problem.
comment:54 Changed 3 years ago by mkommend
r18185: Removed unused view SubFunctionListView.
comment:55 Changed 3 years ago by mkommend
r18187: Refactored saved trees in structure template.
comment:56 Changed 3 years ago by mkommend
r18188: Fixed backwards compatibility of StructureTemplate
comment:57 Changed 3 years ago by mkommend
r18189: Fixed name of MultiEncodingCreator.
comment:58 Changed 3 years ago by mkommend
- Added parameters for parameter optimization / linear scaling in StructuredSymRegProblem.
- Added license headers to StructureTemplate and StructuredSymRegProblem.
- Fixed type in NMSEConstraint evaluator.
comment:59 Changed 3 years ago by mkommend
r18191: Extracted linear scaling functionality in a dedicated helper class.
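For readers unfamiliar with linear scaling, the following sketch shows the standard least-squares computation such a helper typically encapsulates. The class and method names are hypothetical and do not reflect the actual helper class added in r18191. The handling of a constant model output (falling back to the target mean) is likewise an assumption.

```csharp
using System;
using System.Linq;

public static class LinearScalingSketch {
  // Returns (alpha, beta) minimizing the squared error between
  // alpha + beta * estimated[i] and target[i].
  public static Tuple<double, double> Calculate(double[] estimated, double[] target) {
    if (estimated.Length != target.Length || estimated.Length == 0)
      throw new ArgumentException("Inputs must be non-empty and of equal length.");
    double meanE = estimated.Average();
    double meanT = target.Average();
    double cov = 0.0, varE = 0.0;
    for (int i = 0; i < estimated.Length; i++) {
      cov += (estimated[i] - meanE) * (target[i] - meanT);
      varE += (estimated[i] - meanE) * (estimated[i] - meanE);
    }
    // Assumption: a constant model output cannot be scaled, so beta is set to 0
    // and alpha becomes the target mean.
    double beta = varE == 0.0 ? 0.0 : cov / varE;
    double alpha = meanT - beta * meanE;
    return Tuple.Create(alpha, beta);
  }
}
```

For example, model outputs {1, 2, 3} against targets {3, 5, 7} yield beta = 2 and alpha = 1, so the scaled model reproduces the targets exactly.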
comment:60 Changed 3 years ago by mkommend
comment:61 Changed 3 years ago by mkommend
- Added handling of numeric parameters in structure GP problem by using a real vector encoding.
- Configured grammar in sub function.
- Added property for numeric parameters in structure template.
comment:62 Changed 3 years ago by mkommend
r18195: Refactored creation of subfunctions in StructureTemplate.
comment:63 Changed 3 years ago by mkommend
r18196: Fixed bug in adjustment of linear scaling terms.
comment:64 Changed 3 years ago by mkommend
r18197: Omitted parameter optimization of variable weights in the template part of the tree.
comment:65 Changed 3 years ago by mkommend
- Owner changed from dpiringe to gkronber
- Status changed from accepted to reviewing
comment:66 follow-ups: ↓ 68 ↓ 70 Changed 3 years ago by gkronber
Review comments:
- RealVector crossover raises an exception when the template contains only one numeric parameter.
- Entering an expression which uses a non-existent input variable (e.g. "f(xyz)") shows an error when parse is pressed the first time, but succeeds if it is pressed again without any change to the structure.
comment:67 Changed 3 years ago by mkommend
r18198: Fixed error in structure GP if only one numeric parameter is present in the template by providing a new copy crossover for real vectors.
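A copy crossover of this kind can be sketched as below. The details are assumed, since the ticket does not describe the operator: here it simply returns a copy of one randomly chosen parent. The class name is hypothetical, and `System.Random` stands in for the framework's random number generator.

```csharp
using System;

public static class CopyCrossoverSketch {
  // Returns a copy of one randomly chosen parent. No genes are recombined,
  // so the operation is well-defined for vectors of any length, including 1.
  public static double[] Apply(Random random, double[] parent1, double[] parent2) {
    var chosen = random.Next(2) == 0 ? parent1 : parent2;
    return (double[])chosen.Clone();
  }
}
```

Because no cut point or blending range is needed, an operator like this avoids the length restriction that presumably caused the exception reported in the review.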
comment:68 in reply to: ↑ 66 Changed 3 years ago by mkommend
comment:69 Changed 3 years ago by mkommend
r18199: Fixed parsing of variables in subfunctions.
comment:70 in reply to: ↑ 66 Changed 3 years ago by mkommend
Replying to gkronber:
- Entering an expression which uses a non-existent input variable (e.g. "f(xyz)") shows an error when parse is pressed the first time, but succeeds if it is pressed again without any change to the structure.
addressed in r18199
The presence of variables cannot be checked while parsing the template. This check has therefore been removed; if a non-existing variable is used in the template, a runtime error is raised.
comment:71 Changed 3 years ago by gkronber
The ItemName of the problem is ... Single Objective Problem (single-objective). --> simplify
comment:72 Changed 3 years ago by mkommend
r18200: Improved item name and description of structured symreg problem.
comment:73 Changed 3 years ago by gkronber
Todo gkronber: review and merge back to trunk. Todo David: rename class, change default grammar (and configuration), fix unit test.
Topics for future development:
- Performance tuning (evaluate).
- Instead of multi-encoding use a specific new encoding.
- Improve GUI: don't show sub-functions on the right side of the tree but in the problem instead.
- Shape-constraints for sub-functions.
- Partial dependence plots / visualization / impact-calculation for sub-functions
- Use (restricted) type-coherent grammar instead of arithmetic grammar as a default for sub-functions
- Configuration of number of sub-trees for grammar symbols (currently 1 - 3 arguments).
Not directly related:
- Exclusion of nodes in parameter optimization should be improved, remove code duplication for parameter optimization.
- Remove code duplication for linear scaling.
comment:74 Changed 3 years ago by dpiringe
- changed visibility of string constants in TypeCoherentExpressionGrammar from private to public
- changed default grammar for SubFunction
comment:75 Changed 3 years ago by dpiringe
- renamed StructuredSymbolicRegressionSingleObjectiveProblem to StructureTemplateSymbolicRegressionProblem
comment:76 Changed 3 years ago by dpiringe
- updated test cases for StructureTemplateSymbolicRegressionProblem
- updated optimizer template for StructureTemplateSymbolicRegressionProblem
- updated project file for HeuristicLab.Problems.DataAnalysis.Symbolic.Regression (forgot to include last commit)
comment:77 Changed 3 years ago by gkronber
r18216: merged r18203:18211 from trunk to branch. Merged changes, fixed compile problem ('is not')
comment:78 Changed 3 years ago by gkronber
r18220: reintegrated structure-template GP branch into trunk
comment:79 Changed 3 years ago by gkronber
- Version changed from branch to trunk
comment:80 Changed 3 years ago by gkronber
r18221: deleted branch which was reintegrated into trunk
comment:81 Changed 3 years ago by dpiringe
- set an empty enumerable for arguments to prevent a nullable enumerable
comment:82 Changed 2 years ago by gkronber
- Status changed from reviewing to readytorelease
Closed on github.
comment:83 Changed 2 years ago by gkronber
- Resolution set to done
- Status changed from readytorelease to closed
r18054