Opened 17 months ago
Closed 8 months ago
#3136 closed feature request (done)
Structure templates for symbolic regression
Reported by: | dpiringe | Owned by: | gkronber |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.17 |
Component: | Problems.DataAnalysis.Symbolic | Version: | trunk |
Keywords: | Cc: |
Description (last modified by gkronber)
Implementing a new problem type with a structure template parameter for symbolic regression. The following components are necessary:
- MultiEncoding
- Views
- Parser for structure: to parse an infix expression
- Evaluator: based on IABoundEstimator
- Interpreter: builds a full expression, based on structure template parameter and (multiple) evolved sub-expressions
Change History (83)
comment:1 Changed 17 months ago by dpiringe
- Status changed from new to accepted
comment:2 Changed 17 months ago by dpiringe
- Summary changed from GP ProblemType with Structural Parameters to Structural GP
- Version changed from 3.3.16 to branch
comment:3 Changed 17 months ago by dpiringe
- added a new problem type, StructuredSymbolicRegressionSingleObjectiveProblem, as a first skeleton for future implementations
comment:4 Changed 16 months ago by dpiringe
- added a new Symbol SubFunctionSymbol for sub functions
- modified InfixExpressionParser to support SubFunctionSymbol (parsing of variableNames is still in progress)
- modified StructuredSymbolicRegressionSingleObjectiveProblem to extract sub functions and add them to MultiEncoding
comment:5 Changed 16 months ago by dpiringe
- added view components and classes for sub functions
comment:6 Changed 16 months ago by dpiringe
- modified InfixExpressionParser to fully support SubFunctionSymbol
- created a SubFunctionTreeNode to store the function arguments
- modified StructureTemplateView to regenerate the content state
- first implementation for the main tree build up logic
comment:7 Changed 16 months ago by dpiringe
- added a simple way of evaluation (using r2 evaluator)
- added a simple analyzing logic for "Best Tree"
- added a connection to SubFunction in SubFunctionTreeNode
comment:8 Changed 16 months ago by dpiringe
- changed the StructureTemplateView -> nodes of type SubFunctionTreeNode are now clickable
- still need to overhaul the UI elements
- added a way to parse SubFunctionTreeNode with a unique name in InfixExpressionParser
comment:9 Changed 16 months ago by dpiringe
- modified the StructureTemplateView to enable colorful tree nodes of type SubFunctionTreeNode
- refactored SubFunctionTreeNode, SubFunction and StructureTemplate
comment:10 Changed 16 months ago by dpiringe
- added linear scaling support for structure template parameter
comment:11 Changed 16 months ago by dpiringe
- added linear scaling logic in Evaluate and (for UI reasons) Analyze
- added logic for SubFunctionSymbol (modified OpCodes) -> the SubFunctionTreeNode is displayed in the tree but has no effect on evaluation (works like a flag)
- works now with SymbolicDataAnalysisExpressionTreeInterpreter
- default grammar for SubFunction is now ArithmeticExpressionGrammar instead of LinearScalingGrammar
comment:12 Changed 16 months ago by chaider
- Added info text in StructureTemplateView
- Fixed cloning constructors
- Added check if linear scaling nodes are set
comment:13 Changed 16 months ago by dpiringe
- fixed a bug: parsing an expression now resets the viewHost content, which prevents stale content of sub-functions that no longer exist from being shown
comment:14 Changed 16 months ago by chaider
- Fixed cloning of StructureTemplate
comment:15 Changed 16 months ago by dpiringe
- added a hidden interpreter parameter for StructuredSymbolicRegressionSingleObjectiveProblem
- fixed a bug which crashed the application when changing ProblemData to data with different variables
- fixed a bug which crashed the application when running the problem with an empty StructureTemplate
- improved the output of exceptions of type AggregateException
- added a resize event handler to repaint nodes of type SubFunctionTreeNode
- code cleanup
comment:16 Changed 15 months ago by gkronber
- Description modified (diff)
- Milestone changed from HeuristicLab 4.0 to HeuristicLab 3.3.17
- Summary changed from Structural GP to Structure templates for symbolic regression
comment:17 Changed 15 months ago by dpiringe
- added new parameters
- added the built tree to the scope, which allows operators to use the final tree
- added new operators
comment:18 Changed 15 months ago by dpiringe
- changed the visibility of the following parameters: EstimationLimitsParameter, EvaluatorParameter and BestTrainingSolutionParameter
- added first steps to set an evaluator as parameter
- added a new parameter TreeEvaluatorParameter
- added a temporary logic to static evaluator method Calculate
- tried to change a lot of necessary parameters to use the method Evaluate, this caused a lot of problems -> reverted all changes
comment:19 Changed 15 months ago by dpiringe
- added a new problem data provider AsadzadehProvider and the corresponding instance Asadzadeh1
- implements the test setup of the paper "Symbolic regression based hybrid semiparametric modelling of processes: An example case of a bending process"
- used the Asadzadeh1 instance in StructuredSymbolicRegressionSingleObjectiveProblem for default setup
- added the SubFunctionSymbol in DerivativeCalculator and IntervalArithBoundsEstimator
comment:20 Changed 15 months ago by dpiringe
comment:21 Changed 14 months ago by dpiringe
- set the default template to f(_) when loading a new problem data
- fixed a bug which caused the drawing of uncolored SubFunctionTreeNodes after using the window splitter
- implemented a method to paint nodes of SubFunctionTreeNode as colored nodes for ISymbolicDataAnalysisModel
comment:22 Changed 14 months ago by dpiringe
- recreated problem instance Asadzadeh1 as SheetBendingProcess
- SheetBendingProcess is located in Physics and provided by Physics/PhysicsInstanceProvider
comment:23 Changed 14 months ago by dpiringe
- refactored the evaluation logic of NMSESingleObjectiveConstraintsEvaluator
- refactored the new method Evaluate for PearsonRSquaredAverageSimilarityEvaluator
- changed the parameter order of some evaluate/calculate methods
comment:24 Changed 14 months ago by dpiringe
- overrode the method GetActualValue in ValueLookupParameter to get the default value when the execution context is null
- reverted the linear scaling logic for NMSESingleObjectiveConstraintsEvaluator
- in SymbolicRegressionConstantOptimizationEvaluator: removed the usage of GenerateRowsToEvaluate because it uses lookup parameters
- set the value of RelativeNumberOfEvaluatedSamplesParameter for SymbolicRegressionConstantOptimizationEvaluator in StructuredSymbolicRegressionSingleObjectiveProblem if Maximization = true and the SymbolicRegressionConstantOptimizationEvaluator is configured as evaluator
- added the SubFunctionSymbol in TreeToAutoDiffTermConverter
comment:25 Changed 14 months ago by dpiringe
- updated interpreters of type ISymbolicDataAnalysisExpressionTreeInterpreter to support symbols of type SubFunctionSymbol
comment:26 Changed 14 months ago by dpiringe
- added a new information box for StructureTemplate in StructureTemplateView with an extended description about structure templates
comment:27 Changed 14 months ago by dpiringe
- changed the item name of SubFunctionSymbol from SubFunctionSymbol to SubFunction
comment:28 Changed 14 months ago by mkommend
r18146: Merged trunk changes into branch.
comment:29 Changed 14 months ago by mkommend
r18149: Merged trunk changes into branch.
comment:30 Changed 14 months ago by mkommend
r18150: Merged trunk changes into branch.
comment:31 Changed 14 months ago by dpiringe
- fixed re-registration of event handlers after deserialization/cloning
- added a test case for StructuredSymbolicRegressionSingleObjectiveProblem
- changed the usage of a Dictionary to List
comment:32 Changed 14 months ago by dpiringe
- removed the calculation of EstimationLimits and set the interval [-inf, inf] as default
- this parameter was never adjusted after problem construction -> caused bugs with the change of problem data
- created two new methods to set up/create the MultiEncoding and SymbolicExpressionTreeEncoding
- configured the default template f(_) for a structure template
comment:33 Changed 14 months ago by dpiringe
- overrode the method SetEnabledStateOfControls for StructureTemplateView
- fixed the wrong usage of infoLabel in StructureTemplateView -> added a new label errorLabel for textual output
- deleted the resource file for StructureTemplateView
comment:34 Changed 14 months ago by dpiringe
- merged trunk into branch
comment:35 Changed 14 months ago by dpiringe
- adapted the unit test RunStructureTemplateRegressionSampleTest to match the results
- added the sample to the optimizer start page
comment:36 Changed 14 months ago by dpiringe
- adapted formatters to support SubFunctionSymbol
comment:37 Changed 14 months ago by gkronber
r18158: fixed "essential" unit tests
comment:38 follow-up: ↓ 41 Changed 14 months ago by gkronber
@dpiringe, could you please also add support for "Number" to the native interpreter? I do not have the necessary dev environment installed.
comment:39 follow-up: ↓ 42 Changed 14 months ago by gkronber
Please use BatchInterpreter instead of TreeInterpreter because it is more efficient.
comment:40 Changed 14 months ago by dpiringe
- merged trunk into branch
comment:41 in reply to: ↑ 38 Changed 14 months ago by dpiringe
comment:42 in reply to: ↑ 39 Changed 14 months ago by dpiringe
comment:43 follow-up: ↓ 45 Changed 14 months ago by gkronber
Bugs / suggestions for improvements:
- values for num are not optimized / changed (tested with paramopt and batchInterpreter) -> #3140
- num values cannot be initialized to negative values (<num=-1.5>). Only via workaround: -<num=1.5>. -> #3140
- Configuration for sub-functions (grammar, size limits) should not be lost when reparsing (especially when only a minor detail in the structure expression is changed).
- power-symbol: maybe it would be better to throw in the interpreter only when problematic arguments are actually detected (negative base with real-valued power). This would simplify entry of structure templates. (We should create a new ticket for this.)
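The power-symbol suggestion above amounts to validating the actual argument values at evaluation time rather than rejecting the template up front. A minimal sketch of such a check follows; `PowerSketch` and its `Evaluate` method are hypothetical names for illustration and are not part of the HeuristicLab interpreters.

```csharp
using System;

// Hypothetical helper, not a HeuristicLab type.
public static class PowerSketch {
  // Evaluates b^exponent and throws only for the problematic combination:
  // a negative base together with a non-integer exponent.
  public static double Evaluate(double b, double exponent) {
    bool integerExponent = Math.Floor(exponent) == exponent;
    if (b < 0 && !integerExponent)
      throw new ArgumentException("Negative base with non-integer exponent.");
    return Math.Pow(b, exponent);
  }
}
```

With this check, a template such as `x ^ 3` would evaluate for negative `x`, and only a real-valued exponent applied to a negative base would raise an error.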
comment:44 Changed 14 months ago by dpiringe
- fixed a bug in StructureTemplateView -> only nodes of type SubFunctionTreeNode are selectable
- added a way to keep old sub functions after parsing a new expression
- overrode some basic object methods for SubFunction to keep it simple
- only old sub functions, which match the name and signature of the new ones, are saved; examples:
- old: f(x), new: f(x) -> keep old
- old: f(x1), new: f(x1, x2) -> use new
- old: f1(x), new f2(x) -> use new
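The retention rule listed above (keep an old sub-function only if both its name and its signature match) could be sketched as follows. `SubFunctionSketch`, `Settings` and `SubFunctionMerge` are hypothetical stand-ins rather than the classes from the branch, and comparing arguments in order is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a sub-function: a name plus its argument list.
public sealed class SubFunctionSketch {
  public string Name { get; }
  public IReadOnlyList<string> Arguments { get; }
  // Stands in for user configuration such as grammar and size limits.
  public string Settings { get; set; } = "default";

  public SubFunctionSketch(string name, params string[] arguments) {
    Name = name;
    Arguments = arguments;
  }

  // Assumption: the signature is the name plus the ordered argument names.
  public bool HasSameSignature(SubFunctionSketch other) =>
    Name == other.Name && Arguments.SequenceEqual(other.Arguments);
}

public static class SubFunctionMerge {
  // For every newly parsed sub-function, reuse the old instance if its
  // signature matches; otherwise use the freshly parsed one.
  public static List<SubFunctionSketch> Merge(
      IEnumerable<SubFunctionSketch> old, IEnumerable<SubFunctionSketch> parsed) {
    var oldList = old.ToList();
    return parsed
      .Select(p => oldList.FirstOrDefault(o => o.HasSameSignature(p)) ?? p)
      .ToList();
  }
}
```

In this sketch an old `f(x)` with customized `Settings` survives a reparse to `f(x)`, while a change to `f(x1, x2)` or a rename to `f2(x)` yields a fresh instance with default settings.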
comment:45 in reply to: ↑ 43 Changed 14 months ago by dpiringe
Replying to gkronber:
Bugs / suggestions for improvements:
- values for num are not optimized / changed (tested with paramopt and batchInterpreter) -> #3140
- num values cannot be initialized to negative values (<num=-1.5>). Only via workaround: -<num=1.5>. -> #3140
- Configuration for sub-functions (grammar, size limits) should not be lost when reparsing (especially when only a minor detail in the structure expression is changed).
- power-symbol: maybe it would be better to throw in the interpreter only when problematic arguments are actually detected (negative base with real-valued power). This would simplify entry of structure templates. (We should create a new ticket for this.)
added a way to keep old sub functions, see r18164
comment:46 Changed 13 months ago by gkronber
r18176: merged r18165:18174 from trunk to branch (resolving conflicts in the parser)
comment:47 Changed 13 months ago by gkronber
r18177: removed a special case from the Evaluate method because it can never be true. Maximization is fixed to false in this problem and therefore the ParamOptEvaluator is not available as an evaluator.
comment:48 Changed 13 months ago by gkronber
r18178: copied code with minor modifications from ParameterOptimizationEvaluator into the NMSEConstraintsEvaluator because the code in ParameterOptimizationEvaluator uses R² internally and is incompatible with the NMSEEvaluator.
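One plausible reading of this incompatibility (not confirmed in the ticket) is that the two measures react differently to scaling: the squared Pearson correlation is invariant under a linear transformation of the estimates, whereas NMSE is not. The sketch below uses the common textbook definitions (NMSE as mean squared error divided by the population variance of the target); `ErrorMeasureSketch` is a hypothetical name and not a HeuristicLab calculator.

```csharp
using System;
using System.Linq;

// Hypothetical helper using textbook definitions; not a HeuristicLab class.
public static class ErrorMeasureSketch {
  // Normalized mean squared error: MSE divided by the target's population variance.
  public static double Nmse(double[] estimated, double[] target) {
    double meanT = target.Average();
    double mse = estimated.Zip(target, (e, t) => (e - t) * (e - t)).Average();
    double varT = target.Select(t => (t - meanT) * (t - meanT)).Average();
    return mse / varT;
  }

  // Squared Pearson correlation coefficient between estimates and targets.
  public static double PearsonR2(double[] estimated, double[] target) {
    double meanE = estimated.Average(), meanT = target.Average();
    double cov = 0.0, varE = 0.0, varT = 0.0;
    for (int i = 0; i < estimated.Length; i++) {
      cov += (estimated[i] - meanE) * (target[i] - meanT);
      varE += (estimated[i] - meanE) * (estimated[i] - meanE);
      varT += (target[i] - meanT) * (target[i] - meanT);
    }
    return cov * cov / (varE * varT);
  }
}
```

For targets `{1, 2, 3}` and estimates `{2, 4, 6}`, the correlation is perfect although the estimates are off by a factor of two, so parameters tuned against R² need not minimize NMSE.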
comment:49 Changed 13 months ago by gkronber
r18179: improved parameter optimization for NMSEConstraintsEvaluator. Use LM directly instead of lsfit to improve efficiency by using vectorized callbacks.
comment:50 Changed 13 months ago by gkronber
Open issue: parameters are not written back to the trees stored in the individuals after optimization. The reason is that trees from the individual are cloned and combined with the (cloned) template to construct the full tree. We should think about a way to update optimized parameters in the trees (including parameters which occur in the template).
private ISymbolicExpressionTree BuildTree(Individual individual) {
  if (StructureTemplate.Tree == null)
    throw new ArgumentException("No structure template defined!");

  var clonedTemplate = (ISymbolicExpressionTree)StructureTemplate.Tree.Clone();

  // build main tree
  foreach (var subFunctionTreeNode in clonedTemplate.IterateNodesPrefix().OfType<SubFunctionTreeNode>()) {
    var subFunctionTree = individual.SymbolicExpressionTree(subFunctionTreeNode.Name);

    // add new tree
    var subTree = subFunctionTree.Root.GetSubtree(0)  // Start
                                      .GetSubtree(0); // Offset
    subTree = (ISymbolicExpressionTreeNode)subTree.Clone();
    subFunctionTreeNode.AddSubtree(subTree);
  }
  return clonedTemplate;
}
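The write-back problem can be reproduced with a stripped-down sketch. The `Node` class and `CloneSketch.BuildTree` below are hypothetical stand-ins that only mimic the clone-and-combine step; they are not HeuristicLab types, and `Value` stands in for an optimizable numeric parameter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal tree node; not a HeuristicLab type.
public sealed class Node {
  public string Name;
  public double Value; // stands in for a numeric parameter
  public List<Node> Children = new List<Node>();

  public Node(string name, double value = 0.0) { Name = name; Value = value; }

  // Deep copy of this node and all of its descendants.
  public Node Clone() {
    var copy = new Node(Name, Value);
    foreach (var c in Children) copy.Children.Add(c.Clone());
    return copy;
  }

  public IEnumerable<Node> IterateNodesPrefix() {
    yield return this;
    foreach (var c in Children)
      foreach (var n in c.IterateNodesPrefix()) yield return n;
  }
}

public static class CloneSketch {
  // Clones the template and attaches a clone of each evolved subtree to the
  // placeholder node of the same name.
  public static Node BuildTree(Node template, IDictionary<string, Node> evolved) {
    var full = template.Clone();
    var placeholders = full.IterateNodesPrefix()
                           .Where(n => evolved.ContainsKey(n.Name))
                           .ToList(); // snapshot before the tree is modified
    foreach (var placeholder in placeholders)
      placeholder.Children.Add(evolved[placeholder.Name].Clone());
    return full;
  }
}
```

Because both the template and the evolved subtrees are cloned, assigning an optimized value to a node of the returned tree leaves the tree stored in the individual unchanged. Any write-back would need an explicit mapping from nodes of the combined tree to their originals.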
comment:51 Changed 13 months ago by dpiringe
- fixed missing/wrong event registration for SubFunction and StructuredSymbolicRegressionSingleObjectiveProblem
comment:52 Changed 13 months ago by mkommend
r18183: Fixed bug in parameter optimization code of NMSE evaluator (tree has never been updated).
comment:53 Changed 13 months ago by mkommend
r18184: Refactored structured GP problem.
comment:54 Changed 13 months ago by mkommend
r18185: Removed unused view SubFunctionListView.
comment:55 Changed 13 months ago by mkommend
r18187: Refactored saved trees in structure template.
comment:56 Changed 13 months ago by mkommend
r18188: Fixed backwards compatibility of StructureTemplate
comment:57 Changed 13 months ago by mkommend
r18189: Fixed name of MultiEncodingCreator.
comment:58 Changed 13 months ago by mkommend
- Added parameters for parameter optimization / linear scaling in StructuredSymRegProblem.
- Added license headers to StructureTemplate and StructuredSymRegProblem.
- Fixed type in NMSEConstraint evaluator.
comment:59 Changed 13 months ago by mkommend
r18191: Extracted linear scaling functionality in a dedicated helper class.
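For readers unfamiliar with linear scaling, the underlying computation is an ordinary least-squares fit of an offset and a slope to the model output. The sketch below shows that calculation in isolation; `LinearScalingSketch` is a hypothetical name, and the handling of constant model output is an assumption rather than the behavior of the helper class from r18191.

```csharp
using System;
using System.Linq;

// Hypothetical helper illustrating linear scaling; not the class from r18191.
public static class LinearScalingSketch {
  // Computes alpha and beta that minimize
  // sum_i (alpha + beta * estimated[i] - target[i])^2.
  public static void Fit(double[] estimated, double[] target,
                         out double alpha, out double beta) {
    if (estimated.Length != target.Length || estimated.Length == 0)
      throw new ArgumentException("Inputs must be non-empty and of equal length.");

    double meanE = estimated.Average();
    double meanT = target.Average();
    double cov = 0.0, varE = 0.0;
    for (int i = 0; i < estimated.Length; i++) {
      cov += (estimated[i] - meanE) * (target[i] - meanT);
      varE += (estimated[i] - meanE) * (estimated[i] - meanE);
    }
    // Assumption: for constant model output keep the slope at 1.
    beta = varE == 0.0 ? 1.0 : cov / varE;
    alpha = meanT - beta * meanE;
  }
}
```

For estimates `{1, 2, 3}` and targets `{3, 5, 7}` the fit yields a slope of 2 and an offset of 1, so the scaled model reproduces the targets exactly.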
comment:60 Changed 13 months ago by mkommend
comment:61 Changed 13 months ago by mkommend
- Added handling of numeric parameters in the structure GP problem by using a real vector encoding.
- Configured grammar in sub function.
- Added property for numeric parameters in structure template.
comment:62 Changed 13 months ago by mkommend
r18195: Refactored creation of subfunctions in StructureTemplate.
comment:63 Changed 13 months ago by mkommend
r18196: Fixed bug in adjustment of linear scaling terms.
comment:64 Changed 13 months ago by mkommend
r18197: Omitted parameter optimization of variable weights in the template part of the tree.
comment:65 Changed 13 months ago by mkommend
- Owner changed from dpiringe to gkronber
- Status changed from accepted to reviewing
comment:66 follow-ups: ↓ 68 ↓ 70 Changed 13 months ago by gkronber
Review comments:
- RealVector crossover raises an exception when the template contains only one numeric parameter.
- Entering an expression which uses a non-existent input variable (e.g. "f(xyz)") shows an error when parse is pressed the first time, but succeeds if it is pressed again without any change to the structure.
comment:67 Changed 13 months ago by mkommend
r18198: Fixed error in structure GP if only one numeric parameter is present in the template by providing a new copy crossover for real vectors.
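A copy crossover avoids the problem because it does not recombine positions at all, so the vector length is irrelevant. The sketch below is a guess at the idea and not the operator added in r18198; `CopyCrossoverSketch` is a hypothetical name, and choosing the parent at random is an assumption.

```csharp
using System;

// Hypothetical operator sketch; not the crossover from r18198.
public static class CopyCrossoverSketch {
  // Returns a copy of one parent without recombination, so it also works
  // for vectors of length one.
  public static double[] Cross(Random random, double[] parent1, double[] parent2) {
    if (parent1.Length != parent2.Length)
      throw new ArgumentException("Parents must have the same length.");
    // Assumption: the parent to copy is chosen uniformly at random.
    var source = random.Next(2) == 0 ? parent1 : parent2;
    return (double[])source.Clone();
  }
}
```

The child is always a fresh array, so later mutation of the child does not alter either parent.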
comment:68 in reply to: ↑ 66 Changed 13 months ago by mkommend
comment:69 Changed 13 months ago by mkommend
r18199: Fixed parsing of variables in subfunctions.
comment:70 in reply to: ↑ 66 Changed 13 months ago by mkommend
Replying to gkronber:
- Entering an expression which uses a non-existent input variable (e.g. "f(xyz)") shows an error when parse is pressed the first time, but succeeds if it is pressed again without any change to the structure.
addressed in r18199
The presence of variables cannot be checked while the template is parsed. This check has therefore been removed; if a non-existent variable is used in the template, a runtime error is raised.
comment:71 Changed 13 months ago by gkronber
The ItemName of the problem is ... Single Objective Problem (single-objective). --> simplify
comment:72 Changed 13 months ago by mkommend
r18200: Improved item name and description of structured symreg problem.
comment:73 Changed 13 months ago by gkronber
Todo gkronber: review and merge back to trunk. Todo David: rename class, change default grammar (and configuration), fix unit test.
Topics for future development:
- Performance tuning (evaluate).
- Instead of multi-encoding use a specific new encoding.
- Improve GUI: don't show sub-functions on the right side of the tree but in the problem instead.
- Shape-constraints for sub-functions.
- Partial dependence plots / visualization / impact-calculation for sub-functions
- Use (restricted) type-coherent grammar instead of arithmetic grammar as a default for sub-functions
- Configuration of number of sub-trees for grammar symbols (currently 1 - 3 arguments).
Not directly related:
- Exclusion of nodes in parameter optimization should be improved, remove code duplication for parameter optimization.
- Remove code duplication for linear scaling.
comment:74 Changed 13 months ago by dpiringe
- changed visibility of string constants in TypeCoherentExpressionGrammar from private to public
- changed default grammar for SubFunction
comment:75 Changed 13 months ago by dpiringe
- renamed StructuredSymbolicRegressionSingleObjectiveProblem to StructureTemplateSymbolicRegressionProblem
comment:76 Changed 13 months ago by dpiringe
- updated test cases for StructureTemplateSymbolicRegressionProblem
- updated optimizer template for StructureTemplateSymbolicRegressionProblem
- updated project file for HeuristicLab.Problems.DataAnalysis.Symbolic.Regression (forgot to include last commit)
comment:77 Changed 12 months ago by gkronber
r18216: merged r18203:18211 from trunk to branch. Merged changes, fixed compile problem ('is not')
comment:78 Changed 12 months ago by gkronber
r18220: reintegrated structure-template GP branch into trunk
comment:79 Changed 12 months ago by gkronber
- Version changed from branch to trunk
comment:80 Changed 12 months ago by gkronber
r18221: deleted branch which was reintegrated into trunk
comment:81 Changed 11 months ago by dpiringe
- set an empty enumerable for arguments to prevent a nullable enumerable
comment:82 Changed 8 months ago by gkronber
- Status changed from reviewing to readytorelease
Closed on GitHub.
comment:83 Changed 8 months ago by gkronber
- Resolution set to done
- Status changed from readytorelease to closed
r18054