Opened 3 years ago

Closed 3 years ago

#3136 closed feature request (done)

Structure templates for symbolic regression

Reported by: dpiringe
Owned by: gkronber
Priority: medium
Milestone: HeuristicLab 3.3.17
Component: Problems.DataAnalysis.Symbolic
Version: trunk
Keywords:
Cc:

Description (last modified by gkronber)

Implementing a new problem type with a structure template parameter for symbolic regression. The following components are necessary:

  • MultiEncoding
  • Views
  • Parser for structure: to parse an infix expression
  • Evaluator: based on IABoundEstimator
  • Interpreter: builds a full expression, based on structure template parameter and (multiple) evolved sub-expressions

Change History (83)

comment:1 Changed 3 years ago by dpiringe

  • Status changed from new to accepted

comment:2 Changed 3 years ago by dpiringe

  • Summary changed from GP ProblemType with Structural Parameters to Structural GP
  • Version changed from 3.3.16 to branch

r18054

  • branched trunk

comment:3 Changed 3 years ago by dpiringe

r18061

  • added a new type of problem called StructuredSymbolicRegressionSingleObjectiveProblem, represents a first construct for future implementations

comment:4 Changed 3 years ago by dpiringe

r18062

  • added a new Symbol SubFunctionSymbol for sub functions
  • modified InfixExpressionParser to support SubFunctionSymbol (parsing of variableNames still in work)
  • modified StructuredSymbolicRegressionSingleObjectiveProblem to extract sub functions and add them to MultiEncoding

comment:5 Changed 3 years ago by dpiringe

r18063

  • added view components and classes for sub functions

comment:6 Changed 3 years ago by dpiringe

r18065

  • modified InfixExpressionParser to fully support SubFunctionSymbol
    • created a SubFunctionTreeNode to store the function arguments
  • modified StructureTemplateView to regenerate the content state
  • first implementation for the main tree build up logic

comment:7 Changed 3 years ago by dpiringe

r18066

  • added a simple way of evaluation (using r2 evaluator)
  • added a simple analyzing logic for "Best Tree"
  • added a connection to SubFunction in SubFunctionTreeNode

comment:8 Changed 3 years ago by dpiringe

r18067

  • changed the StructureTemplateView -> nodes of type SubFunctionTreeNode are now clickable
    • still need to overhaul the UI elements
  • added a way to parse SubFunctionTreeNode with a unique name in InfixExpressionParser

comment:9 Changed 3 years ago by dpiringe

r18068

  • modified the StructureTemplateView to enable colorful tree nodes of type SubFunctionTreeNode
  • refactored SubFunctionTreeNode, SubFunction and StructureTemplate

comment:10 Changed 3 years ago by dpiringe

r18069

  • added linear scaling support for structure template parameter

comment:11 Changed 3 years ago by dpiringe

r18071

  • added linear scaling logic in Evaluate and (for UI reasons) Analyze
  • added logic for SubFunctionSymbol (modified OpCodes) -> the SubFunctionTreeNode is displayed in the tree but has no effect on evaluation (works like a flag)
    • works now with SymbolicDataAnalysisExpressionTreeInterpreter
  • default grammar for SubFunction is now ArithmeticExpressionGrammar instead of LinearScalingGrammar

comment:12 Changed 3 years ago by chaider

r18072

  • Added info text in StructureTemplateView
  • Fixed cloning constructors
  • Added check if linear scaling nodes are set

comment:13 Changed 3 years ago by dpiringe

r18073

  • fixed a bug: parsing an expression now resets the viewHost content, which prevents old content of non-existing sub-functions from being shown

comment:14 Changed 3 years ago by chaider

r18074

  • Fixed cloning of StructureTemplate

comment:15 Changed 3 years ago by dpiringe

r18075

  • added a hidden interpreter parameter for StructuredSymbolicRegressionSingleObjectiveProblem
  • fixed a bug which crashed the application by changing ProblemData with different variables
  • fixed a bug which crashed the application by running the problem with an empty StructureTemplate
  • added a better output of exceptions of type AggregateException
  • added a resize event handler to repaint nodes of type SubFunctionTreeNode
  • code cleanup

comment:16 Changed 3 years ago by gkronber

  • Description modified
  • Milestone changed from HeuristicLab 4.0 to HeuristicLab 3.3.17
  • Summary changed from Structural GP to Structure templates for symbolic regression

comment:17 Changed 3 years ago by dpiringe

r18076

  • added new parameters
  • added the built tree to the scope, which allows operators to use the final tree
  • added new operators

comment:18 Changed 3 years ago by dpiringe

r18081

  • changed the visibility of the following parameters: EstimationLimitsParameter, EvaluatorParameter and BestTrainingSolutionParameter
  • added first steps to set an evaluator as parameter
    • added a new parameter TreeEvaluatorParameter
    • added a temporary logic to static evaluator method Calculate
    • tried to change a lot of necessary parameters to use the method Evaluate, this caused a lot of problems -> reverted all changes

comment:19 Changed 3 years ago by dpiringe

r18084

  • added a new problem data provider AsadzadehProvider and the corresponding instance Asadzadeh1
    • implements the test setup of paper Symbolic regression based hybrid semiparametric modelling of processes: An example case of a bending process
  • used the Asadzadeh1 instance in StructuredSymbolicRegressionSingleObjectiveProblem for default setup
  • added the SubFunctionSymbol in DerivativeCalculator and IntervalArithBoundsEstimator

comment:20 Changed 3 years ago by dpiringe

r18095

  • added an Evaluate method, which uses the static method Calculate and evaluates an ISymbolicExpressionTree without the need for an ExecutionContext
    • implemented this new method in all single objective SymReg evaluators

comment:21 Changed 3 years ago by dpiringe

r18099

  • set the default template to f(_) when loading a new problem data
  • fixed a bug which caused the drawing of uncolored SubFunctionTreeNodes after using the window splitter
  • implemented a method to paint nodes of SubFunctionTreeNode as colored nodes for ISymbolicDataAnalysisModel

comment:22 Changed 3 years ago by dpiringe

r18101

  • recreated problem instance Asadzadeh1 as SheetBendingProcess
  • SheetBendingProcess is located in Physics and provided by Physics/PhysicsInstanceProvider

comment:23 Changed 3 years ago by dpiringe

r18103

  • refactored the evaluation logic of NMSESingleObjectiveConstraintsEvaluator
  • refactored the new method Evaluate for PearsonRSquaredAverageSimilarityEvaluator
  • changed the parameter order of some evaluate/calculate methods

comment:24 Changed 3 years ago by dpiringe

r18104

  • overrode the method GetActualValue in ValueLookupParameter to get the default value when the execution context is null
  • reverted the linear scaling logic for NMSESingleObjectiveConstraintsEvaluator
  • in SymbolicRegressionConstantOptimizationEvaluator: removed the usage of GenerateRowsToEvaluate because it uses lookup parameters
  • set the value of RelativeNumberOfEvaluatedSamplesParameter for SymbolicRegressionConstantOptimizationEvaluator in StructuredSymbolicRegressionSingleObjectiveProblem if Maximization = true and the SymbolicRegressionConstantOptimizationEvaluator is configured as evaluator
  • added the SubFunctionSymbol in TreeToAutoDiffTermConverter

comment:25 Changed 3 years ago by dpiringe

r18133

  • updated interpreters of type ISymbolicDataAnalysisExpressionTreeInterpreter to support symbols of type SubFunctionSymbol

comment:26 Changed 3 years ago by dpiringe

r18134

  • added a new information box for StructureTemplate in StructureTemplateView with an extended description about structure templates

comment:27 Changed 3 years ago by dpiringe

r18139

  • changed the item name of SubFunctionSymbol from SubFunctionSymbol to SubFunction

comment:28 Changed 3 years ago by mkommend

r18146: Merged trunk changes into branch.

comment:29 Changed 3 years ago by mkommend

r18149: Merged trunk changes into branch.

comment:30 Changed 3 years ago by mkommend

r18150: Merged trunk changes into branch.

comment:31 Changed 3 years ago by dpiringe

r18151

  • fixed event handler re-registration after deserialization/cloning
  • added a test case for StructuredSymbolicRegressionSingleObjectiveProblem
  • changed the usage of a Dictionary to List

comment:32 Changed 3 years ago by dpiringe

r18152

  • removed the calculation of EstimationLimits and set the interval [-inf, inf] as default
    • this parameter was never adjusted after problem construction -> caused bugs with the change of problem data
  • created two new methods to set up/create the MultiEncoding and SymbolicExpressionTreeEncoding
  • configured the default template f(_) for a structure template

comment:33 Changed 3 years ago by dpiringe

r18154

  • overrode the method SetEnabledStateOfControls for StructureTemplateView
  • fixed the wrong usage of infoLabel in StructureTemplateView -> added a new label errorLabel for textual output
  • deleted the resource file for StructureTemplateView

comment:34 Changed 3 years ago by dpiringe

r18155

  • merged trunk into branch

comment:35 Changed 3 years ago by dpiringe

r18156

  • adapted the unit test RunStructureTemplateRegressionSampleTest to match the results
  • added the sample to the optimizer start page

comment:36 Changed 3 years ago by dpiringe

r18157

  • adapted formatters to support SubFunctionSymbol

comment:37 Changed 3 years ago by gkronber

r18158: fixed "essential" unit tests

comment:38 follow-up: ↓ 41 Changed 3 years ago by gkronber

@dpiringe, could you please also add support for "Number" to the native interpreter? I do not have the necessary dev environment installed.

Last edited 3 years ago by gkronber

comment:39 follow-up: ↓ 42 Changed 3 years ago by gkronber

Please use BatchInterpreter instead of TreeInterpreter because it is more efficient.

comment:40 Changed 3 years ago by dpiringe

r18161

  • merged trunk into branch

comment:41 in reply to: ↑ 38 Changed 3 years ago by dpiringe

Replying to gkronber:

@dpiringe, could you please also add support for "Number" to the native interpreter? I do not have the necessary dev environment installed.

implemented it in ticket #3140 and merged it back into this branch as well as trunk

comment:42 in reply to: ↑ 39 Changed 3 years ago by dpiringe

Replying to gkronber:

Please use BatchInterpreter instead of TreeInterpreter because it is more efficient.

r18162: changed the parameter Interpreter in StructuredSymbolicRegressionSingleObjectiveProblem to use SymbolicDataAnalysisExpressionTreeBatchInterpreter as default interpreter

comment:43 follow-up: ↓ 45 Changed 3 years ago by gkronber

Bugs / suggestions for improvements:

  • values for num are not optimized / changed (tested with paramopt and batchInterpreter) -> #3140
  • num values cannot be initialized to negative values (<num=-1.5>). Only via workaround: -<num=1.5>. -> #3140
  • Configuration for sub-functions (grammar, size limits) should not be lost when reparsing (especially when only a minor detail in the structure expression is changed).
  • power-symbol: maybe it would be better to throw in the interpreter only when problematic arguments are actually detected (negative base with real-valued power). This would simplify entry of structure-templates. (we should create a new ticket for this.)

Last edited 3 years ago by gkronber
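
The power-symbol suggestion in the last bullet could be sketched roughly as follows. This is only an illustration of the proposed behaviour, not the interpreter code: `PowerSketch` and `Evaluate` are hypothetical names, and the choice of `ArgumentException` is an assumption.

```csharp
using System;

// Hypothetical helper, not part of HeuristicLab.
public static class PowerSketch {
  // Throws only when the actual arguments are problematic:
  // a negative base combined with a non-integer exponent has no real-valued result.
  public static double Evaluate(double b, double exponent) {
    if (b < 0 && exponent != Math.Floor(exponent))
      throw new ArgumentException(
        $"Negative base {b} with non-integer exponent {exponent} is not supported.");
    return Math.Pow(b, exponent);
  }
}
```

With a check of this kind, a template such as x^2 or x^0.5 could be entered without restrictions, and an error would only surface if a negative base actually met a fractional exponent during evaluation.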

comment:44 Changed 3 years ago by dpiringe

r18164

  • fixed a bug in StructureTemplateView -> only nodes of type SubFunctionTreeNode are selectable
  • added a way to keep old sub functions after parsing a new expression
    • overrode some basic object methods for SubFunction to keep it simple
    • only old sub functions, which match the name and signature of the new ones, are saved; examples:
      • old: f(x), new: f(x) -> keep old
      • old: f(x1), new: f(x1, x2) -> use new
      • old: f1(x), new: f2(x) -> use new

Last edited 3 years ago by dpiringe
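
The keep-or-replace rule from comment:44 can be illustrated with a small self-contained sketch. `SubFunctionSketch`, `SubFunctionMerge` and the `Settings` property are hypothetical stand-ins (the real SubFunction class is not shown in this ticket); the sketch only assumes that equality is defined by name plus argument list, as the examples above suggest.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a sub function parsed from a structure template.
public sealed class SubFunctionSketch : IEquatable<SubFunctionSketch> {
  public string Name { get; }
  public IReadOnlyList<string> Arguments { get; }
  // Stands in for user configuration such as grammar and size limits.
  public string Settings { get; set; } = "default";

  public SubFunctionSketch(string name, params string[] arguments) {
    Name = name;
    Arguments = arguments.ToList();
  }

  // Two sub functions match if name and signature (ordered arguments) are identical.
  public bool Equals(SubFunctionSketch other) =>
    other != null && Name == other.Name && Arguments.SequenceEqual(other.Arguments);

  public override bool Equals(object obj) => Equals(obj as SubFunctionSketch);

  public override int GetHashCode() {
    var hash = Name.GetHashCode();
    foreach (var arg in Arguments) hash = unchecked(hash * 31 + arg.GetHashCode());
    return hash;
  }
}

public static class SubFunctionMerge {
  // For every newly parsed sub function, reuse the old instance if it matches;
  // otherwise use the new one with its default configuration.
  public static List<SubFunctionSketch> KeepMatching(
      IEnumerable<SubFunctionSketch> oldFunctions,
      IEnumerable<SubFunctionSketch> newFunctions) {
    var oldList = oldFunctions.ToList();
    return newFunctions
      .Select(n => oldList.FirstOrDefault(o => o.Equals(n)) ?? n)
      .ToList();
  }
}
```

In this sketch, reparsing a template that still contains f(x) returns the old instance with its configuration, while f(x1, x2) or f2(x) yield fresh instances.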

comment:45 in reply to: ↑ 43 Changed 3 years ago by dpiringe

Replying to gkronber:

Bugs / suggestions for improvements:

  • values for num are not optimized / changed (tested with paramopt and batchInterpreter) -> #3140
  • num values cannot be initialized to negative values (<num=-1.5>). Only via workaround: -<num=1.5>. -> #3140
  • Configuration for sub-functions (grammar, size limits) should not be lost when reparsing (especially when only a minor detail in the structure expression is changed).
  • power-symbol: maybe it would be better to throw in the interpreter only when problematic arguments are actually detected (negative base with real-valued power). This would simplify entry of structure-templates. (we should create a new ticket for this.)

added a way to keep old sub functions, see r18164

comment:46 Changed 3 years ago by gkronber

r18176: merged r18165:18174 from trunk to branch (resolving conflicts in the parser)

comment:47 Changed 3 years ago by gkronber

r18177: removed a special case from the Evaluate method because it can never be true. Maximization is fixed to false in this problem and therefore the ParamOptEvaluator is not available as an evaluator.

comment:48 Changed 3 years ago by gkronber

r18178: copied code with minor modifications from ParameterOptimizationEvaluator into the NMSEConstraintsEvaluator because the code in ParameterOptimizationEvaluator uses R² internally and is incompatible with the NMSEEvaluator.

comment:49 Changed 3 years ago by gkronber

r18179: improved parameter optimization for NMSEConstraintsEvaluator. Use Levenberg-Marquardt (LM) directly instead of lsfit to improve efficiency by using vectorized callbacks.

comment:50 Changed 3 years ago by gkronber

Open issue: parameters are not written back to the trees stored in the individuals after optimization. The reason is that trees from the individual are cloned and combined with the (cloned) template to construct the full tree. We should think about a way to update optimized parameters in the trees (including parameters which occur in the template).

    private ISymbolicExpressionTree BuildTree(Individual individual) {
      if (StructureTemplate.Tree == null)
        throw new ArgumentException("No structure template defined!");

      var clonedTemplate = (ISymbolicExpressionTree)StructureTemplate.Tree.Clone();

      // build main tree
      foreach (var subFunctionTreeNode in clonedTemplate.IterateNodesPrefix().OfType<SubFunctionTreeNode>()) {
        var subFunctionTree = individual.SymbolicExpressionTree(subFunctionTreeNode.Name);

        // add new tree
        var subTree = subFunctionTree.Root.GetSubtree(0)  // Start
                                          .GetSubtree(0); // Offset
        subTree = (ISymbolicExpressionTreeNode)subTree.Clone();
        subFunctionTreeNode.AddSubtree(subTree);

      }
      return clonedTemplate;
    }

comment:51 Changed 3 years ago by dpiringe

r18182

  • fixed missing/wrong event registration for SubFunction and StructuredSymbolicRegressionSingleObjectiveProblem

comment:52 Changed 3 years ago by mkommend

r18183: Fixed bug in parameter optimization code of NMSE evaluator (tree has never been updated).

Last edited 3 years ago by mkommend

comment:53 Changed 3 years ago by mkommend

r18184: Refactored structured GP problem.

comment:54 Changed 3 years ago by mkommend

r18185: Removed unused view SubFunctionListView.

comment:55 Changed 3 years ago by mkommend

r18187: Refactored saved trees in structure template.

comment:56 Changed 3 years ago by mkommend

r18188: Fixed backwards compatibility of StructureTemplate

comment:57 Changed 3 years ago by mkommend

r18189: Fixed name of MultiEncodingCreator.

comment:58 Changed 3 years ago by mkommend

r18190:

  • Added parameters for parameter optimization / linear scaling in StructuredSymRegProblem.
  • Added license headers to StructureTemplate and StructuredSymRegProblem.
  • Fixed type in NMSEConstraint evaluator.

comment:59 Changed 3 years ago by mkommend

r18191: Extracted linear scaling functionality in a dedicated helper class.
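
As a rough illustration of what such a linear scaling helper computes: the sketch below fits the offset alpha and slope beta that minimize the squared error between target and alpha + beta * estimated (ordinary least squares). `LinearScalingSketch` and `Fit` are hypothetical names, and the fallback to beta = 1 for constant model output is an assumption of this sketch, not taken from the HeuristicLab helper class.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not the class extracted in r18191.
public static class LinearScalingSketch {
  // Returns (alpha, beta) minimizing sum((target[i] - (alpha + beta * estimated[i]))^2).
  public static (double Alpha, double Beta) Fit(double[] estimated, double[] target) {
    if (estimated.Length == 0 || estimated.Length != target.Length)
      throw new ArgumentException("Both arrays must be non-empty and of equal length.");

    double meanEstimated = estimated.Average();
    double meanTarget = target.Average();

    double covariance = 0.0, variance = 0.0;
    for (int i = 0; i < estimated.Length; i++) {
      double de = estimated[i] - meanEstimated;
      covariance += de * (target[i] - meanTarget);
      variance += de * de;
    }

    // Assumption: for constant model output the slope is left at 1.
    double beta = variance == 0.0 ? 1.0 : covariance / variance;
    double alpha = meanTarget - beta * meanEstimated;
    return (alpha, beta);
  }
}
```

For example, model outputs 1, 2, 3 and targets 3, 5, 7 give alpha = 1 and beta = 2.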

comment:60 Changed 3 years ago by mkommend

r18192:

  • Extracted parameter optimization into dedicated helper utility.
  • Implemented evaluation in the structured SymReg problem directly.

comment:61 Changed 3 years ago by mkommend

r18194:

  • Added handling of numeric parameters in structured GP problem by using a real vector encoding.
  • Configured grammar in sub function.
  • Added property for numeric parameters in structure template.

comment:62 Changed 3 years ago by mkommend

r18195: Refactored creation of subfunctions in StructureTemplate.

comment:63 Changed 3 years ago by mkommend

r18196: Fixed bug in adjustment of linear scaling terms.

comment:64 Changed 3 years ago by mkommend

r18197: Omitted parameter optimization of variable weights in the template part of the tree.

comment:65 Changed 3 years ago by mkommend

  • Owner changed from dpiringe to gkronber
  • Status changed from accepted to reviewing

comment:66 follow-ups: ↓ 68 ↓ 70 Changed 3 years ago by gkronber

Review comments:

  • RealVector cross-over raises an exception when the template contains only one numeric parameter
  • Entering an expression which uses a non-existent input variable (e.g. "f(xyz)") shows an error when parse is pressed the first time, but succeeds if it is pressed again without change to the structure.

comment:67 Changed 3 years ago by mkommend

r18198: Fixed error in structure GP if only one numeric parameter is present in the template by providing a new copy crossover for real vectors.
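
The ticket does not describe how the new copy crossover works. A minimal sketch, assuming it simply returns a copy of one randomly chosen parent (which stays well-defined for vectors of length one), might look like this. `CopyCrossoverSketch` and `Apply` are hypothetical names and plain `double[]` arrays stand in for the real vector encoding.

```csharp
using System;

// Hypothetical sketch, not the operator added in r18198.
public static class CopyCrossoverSketch {
  // Returns an independent copy of one randomly chosen parent.
  // No crossover point is needed, so vectors of length one are handled.
  public static double[] Apply(Random random, double[] parent1, double[] parent2) {
    if (parent1.Length != parent2.Length)
      throw new ArgumentException("Parents must have the same length.");
    var chosen = random.Next(2) == 0 ? parent1 : parent2;
    return (double[])chosen.Clone();
  }
}
```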

comment:68 in reply to: ↑ 66 Changed 3 years ago by mkommend

Replying to gkronber:

Review comments:

  • RealVector cross-over raises an exception when the template contains only one numeric parameter

addressed in r18198

Last edited 3 years ago by mkommend

comment:69 Changed 3 years ago by mkommend

r18199: Fixed parsing of variables in subfunctions.

comment:70 in reply to: ↑ 66 Changed 3 years ago by mkommend

Replying to gkronber:

  • Entering an expression which uses a non-existent input variable (e.g. "f(xyz)") shows an error when parse is pressed the first time, but succeeds if it is pressed again without change to the structure.

addressed in r18199

The presence of variables cannot be checked during parsing of the template. Thus, this functionality has been removed; if a non-existing variable is used in the template, a runtime error is raised.

comment:71 Changed 3 years ago by gkronber

The ItemName of the problem is ... Single Objective Problem (single-objective). --> simplify

comment:72 Changed 3 years ago by mkommend

r18200: Improved item name and description of structured symreg problem.

comment:73 Changed 3 years ago by gkronber

Todo gkronber: review and merge back to trunk. Todo David: rename class, change default grammar (and configuration), fix unit test.

Topics for future development:

  • Performance tuning (evaluate).
  • Instead of multi-encoding use a specific new encoding.
  • Improve GUI: don't show sub-functions right side of the tree but in the problem instead.
  • Shape-constraints for sub-functions.
  • Partial dependence plots / visualization / impact-calculation for sub-functions
  • Use (restricted) type-coherent grammar instead of arithmetic grammar as a default for sub-functions
  • Configuration of number of sub-trees for grammar symbols (currently 1 - 3 arguments).

Not directly related:

  • Exclusion of nodes in parameter optimization should be improved, remove code duplication for parameter optimization.
  • Remove code duplication for linear scaling.

Last edited 3 years ago by gkronber

comment:74 Changed 3 years ago by dpiringe

r18205

  • changed visibility of string constants in TypeCoherentExpressionGrammar from private to public
  • changed default grammar for SubFunction

comment:75 Changed 3 years ago by dpiringe

r18206

  • renamed StructuredSymbolicRegressionSingleObjectiveProblem to StructureTemplateSymbolicRegressionProblem

comment:76 Changed 3 years ago by dpiringe

r18207

  • updated test cases for StructureTemplateSymbolicRegressionProblem
  • updated optimizer template for StructureTemplateSymbolicRegressionProblem
  • updated project file for HeuristicLab.Problems.DataAnalysis.Symbolic.Regression (forgot to include last commit)

comment:77 Changed 3 years ago by gkronber

r18216: merged r18203:18211 from trunk to branch. Merged changes, fixed compile problem ('is not')

comment:78 Changed 3 years ago by gkronber

r18220: reintegrated structure-template GP branch into trunk

comment:79 Changed 3 years ago by gkronber

  • Version changed from branch to trunk

comment:80 Changed 3 years ago by gkronber

r18221: deleted branch which was reintegrated into trunk

comment:81 Changed 3 years ago by dpiringe

r18224

  • set an empty enumerable for arguments to prevent a nullable enumerable

comment:82 Changed 3 years ago by gkronber

  • Status changed from reviewing to readytorelease

Closed on github.

comment:83 Changed 3 years ago by gkronber

  • Resolution set to done
  • Status changed from readytorelease to closed