Free cookie consent management tool by TermsFeed Policy Generator

Changes between Version 2 and Version 3 of Features/Genetic Programming


Ignore:
Timestamp:
12/24/11 14:34:02 (12 years ago)
Author:
gkronber
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Features/Genetic Programming

    v2 v3  
    1616 * Full-tree shaker (uniform mutation, only parameters)
    1717
     18= Genetic Programming Tasks =
     19
    1820The symbolic expression tree encoding contains the implementation of the problem independent data structure (a tree of symbols) and operators. Based on the encoding we implemented three well-known GP problems.
    1921
     
    2325 * Artificial Ant problem (`HeuristicLab.Problems.ArtificialAnt`)
    2426
    25 Samples for all three of these problems are available on the `Start page` in HeuristicLab. Additional samples are available here: [[http://dev.heuristiclab.com/trac/hl/core/wiki/UsersSamples#GeneticProgramming]]
     27Samples for all three of these problems are available on the `start page` in HeuristicLab. Additional samples are available here: [[http://dev.heuristiclab.com/trac/hl/core/wiki/UsersSamples#GeneticProgramming]]
     28
     29== Symbolic regression ==
     30Symbolic regression is a particular variant of data analysis. The goal is to find a functional expression of any form which estimates the values of a specified target variable (continuous) based on values of a set of input variables (discrete/continuous). It is thus similar to linear regression, however, in linear regression the structure of the model is predefined (linear combination of the input variables) and only the parameters for the model have to be estimated while in symbolic regression the structure of the model is unconstrained and the structure as well as the parameters have to be determined. This is a problem that can be solved by genetic programming.
     31
     32The symbols that can be used in symbolic regression models is problem dependent. So these symbols (mathematical operators: +,-,/,* ... and operands: constants, variables) are implemented in the problem specific plugin `HeuristicLab.Problems.DataAnalysis.Symbolic`. Evaluators for these problem-specific instances of symbolic expression trees are also implemented in this plugin.
     33
     34HeuristicLab provides many possibilities to analyse the resulting symbolic regression solutions. Some of them are shown in the screenshot.
     35
     36Symbolic regression features:
     37 * Tree visualization of models
     38 * Large range of accuracy metrics: MSE, MAE, relative error, R², NMSE
     39 * Visualization of model output: scatter plot, line chart, error characteristic curve
     40 * Automatic simplification of models by transformation into a canonical form
     41 * Visualization of the impact strength of branches
     42 * Manual simplification (pruning)
     43 * Model export for LaTeX publications
     44
     45[[Image(http://dev.heuristiclab.com/trac/hl/core/raw-attachment/wiki/UsersSamples/GP_TowerResponse-screenshot.png, width=500)]]
     46
     47== Symbolic classification ==
     48The goal in symbolic classification is to find a model that estimates the target class value (discrete) from the value of input variables (discrete, continuous). In our implementation the models for symbolic regression and symbolic classification can contain the same symbols (operators and operands). Thus we actually evolve a discriminating function for the classes and later any threshold value can be used to map the continuous output of the discriminating function to the actual class values.
     49
     50However, we provide a number of additional accuracy metrics and output visualizations that are useful for classification solutions. Some of them are shown in the screenshot on the right side.
     51
     52Symbolic classification features:
     53 * Classification specific accuracy metrics: accuracy, Gini coefficient, AUC,
     54 * Visualization of model output: classification output with threshold, receiver operating characteristic curve
     55 * Misclassification matrix
     56
     57Of course the following features are also available for symbolic classification solutions:
     58 * Tree visualization of models
     59 * Automatic simplification of models by transformation into a canonical form
     60 * Visualization of the impact strength of branches
     61 * Manual simplification (pruning)
     62 * Model export for LaTeX publications
     63
     64[[Image(http://dev.heuristiclab.com/trac/hl/core/raw-attachment/wiki/UsersSamples/GP_WisconsinDiagnostic-screenshot.png, width=500)]]
     65
     66
     67== Artificial Ant ==
     68The artificial ant problem is a very simple implementation of a genetic programming problem. Right now we only provide the Santa-Fe ant trail as problem instance. The implementation of the artificial ant problem can be used as a reference implementation when you want to implement your own plugin for genetic programming. It shows how to define the problem-specific symbol set and a grammar which defines valid tree structures. It also contains a problem specific evaluator and solution   visualization. On the right you can see the solution visualization for the artifical ant problem.
     69
     70[[Image(http://dev.heuristiclab.com/trac/hl/core/raw-attachment/wiki/UsersSamples/SantaFe Result.png, width=500)]]
     71
     72== Algorithms for Genetic Programming ==
     73Because of the strict separation of the implementations of algorithms, problems, and encodings in HeuristicLab. It is possible to solve symbolic regression of symbolic classification problems with all population based algorithms using either a crossover or manipulation operator to evolve solutions. The algorithm that comes closest to the original formulation of genetic programming by Koza is the genetic algorithm.
     74
     75Available algorithms for genetic programming:
     76* genetic algorithm
     77* evolution strategy
     78* island genetic algorithm
     79* offspring selection genetic algorithm (and even SASEGASA) 
     80
     81In the future we plan to implement move operators for the symbolic expression tree encoding. This would make it possible to solve genetic programming problems also with trajectory-based algorithms like tabu-search, particle swarm optimization, or variable neighborhood search.   
     82
     83
     84= Advanced Topics =
     85
     86== Automatically Defined Functions (ADF) ==
     87We also provide an implementation for automatically defined functions (ADFs) as described by Koza. Automatically defined functions allow to extract useful code fragments into a separate function which can be called multiple times either while executing the result producing branch or another ADF. ADFs are created and manipulated by architecture altering operators while evolving genetic programming solutions. This means ADFs underlie selection pressure in the same way as the main result producing branch and the evolutionary process determines if ADFs survive in the final population and which code fragments are moved to ADFs.   
     88
     89Supported architecture altering operators (3.3.6)
     90 * Subroutine creater
     91 * Subroutine deleter
     92 * Subrouting duplicater 
     93 * Argument creater
     94 * Argument deleter
     95 * Argument duplicater
     96
     97Supporting ADFs fully and correctly is not easy and lead to many non-trivial changes in the data types and operators for symbolic expression trees.