Posts for the month of June 2012

Symbolic Regression Benchmark Functions

At last year's GECCO in Dublin, a discussion revolved around the genetic programming community's need for a set of suitable benchmark problems. Many experiments presented in the GP literature are based on very simple toy problems, so the results are often unconvincing. The discussion is summarized at http://gpbenchmarks.org.

That site also lists benchmark problems for symbolic regression from a number of different papers. Thanks to our new developer Stefan Forstenlechner, most of these problems are now available in HeuristicLab and will be included in the next release. The benchmark problems can be loaded directly in the GUI through problem instance providers. Additionally, the new 'Create Experiment' dialog implemented by Andreas makes it simple to create an experiment that executes an algorithm on all instances (see the previous blog post).

I used these new features to quickly apply a random forest regression algorithm (R=0.7, Number of trees=50) to all regression benchmark problems and obtained the following results. Let's see how symbolic regression with GP will perform...

Problem instance | Avg. R² (test)
Keijzer 4 f(x) = 0.3 * x * sin(2 * PI * x) | 0.984
Keijzer 5 f(x) = x ^ 3 * exp(-x) * cos(x) * sin(x) * (sin(x) ^ 2 * cos(x) - 1) | 1.000
Keijzer 6 f(x) = (30 * x * z) / ((x - 10) * y^2) | 0.956
Keijzer 7 f(x) = Sum(1 / i) From 1 to X | 0.911
Keijzer 8 f(x) = log(x) | 1.000
Keijzer 9 f(x) = sqrt(x) | 1.000
Keijzer 11 f(x, y) = x ^ y | 0.957
Keijzer 12 f(x, y) = xy + sin((x - 1)(y - 1)) | 0.267
Keijzer 13 f(x, y) = x^4 - x^3 + y^2 / 2 - y | 0.610
Keijzer 14 f(x, y) = 6 * sin(x) * cos(y) | 0.321
Keijzer 15 f(x, y) = 8 / (2 + x^2 + y^2) | 0.484
Keijzer 16 f(x, y) = x^3 / 5 + y^3 / 2 - y - x | 0.599
Korns 1 y = 1.57 + (24.3 * X3) | 0.998
Korns 2 y = 0.23 + (14.2 * ((X3 + X1) / (3.0 * X4))) | 0.009
Korns 3 y = -5.41 + (4.9 * (((X3 - X0) + (X1 / X4)) / (3 * X4))) | 0.023
Korns 4 y = -2.3 + (0.13 * sin(X2)) | 0.384
Korns 5 y = 3.0 + (2.13 * log(X4)) | 0.977
Korns 6 y = 1.3 + (0.13 * sqrt(X0)) | 0.997
Korns 7 y = 213.80940889 - (213.80940889 * exp(-0.54723748542 * X0)) | 0.000
Korns 8 y = 6.87 + (11 * sqrt(7.23 * X0 * X3 * X4)) | 0.993
Korns 9 y = ((sqrt(X0) / log(X1)) * (exp(X2) / square(X3))) | 0.000
Korns 10 y = 0.81 + (24.3 * (((2.0 * X1) + (3.0 * square(X2))) / ((4.0 * cube(X3)) + (5.0 * quart(X4))))) | 0.003
Korns 11 y = 6.87 + (11 * cos(7.23 * X0 * X0 * X0)) | 0.000
Korns 12 y = 2.0 - (2.1 * (cos(9.8 * X0) * sin(1.3 * X4))) | 0.001
Korns 13 y = 32.0 - (3.0 * ((tan(X0) / tan(X1)) * (tan(X2) / tan(X3)))) | 0.000
Korns 14 y = 22.0 + (4.2 * ((cos(X0) - tan(X1)) * (tanh(X2) / sin(X3)))) | 0.000
Korns 15 y = 12.0 - (6.0 * ((tan(X0) / exp(X1)) * (log(X2) - tan(X3)))) | 0.000
Nguyen F1 = x^3 + x^2 + x | 0.944
Nguyen F2 = x^4 + x^3 + x^2 + x | 0.992
Nguyen F3 = x^5 + x^4 + x^3 + x^2 + x | 0.983
Nguyen F4 = x^6 + x^5 + x^4 + x^3 + x^2 + x | 0.960
Nguyen F5 = sin(x^2)cos(x) - 1 | 0.975
Nguyen F6 = sin(x) + sin(x + x^2) | 0.997
Nguyen F7 = log(x + 1) + log(x^2 + 1) | 0.977
Nguyen F8 = Sqrt(x) | 0.966
Nguyen F9 = sin(x) + sin(y^2) | 0.988
Nguyen F10 = 2sin(x)cos(y) | 0.986
Nguyen F11 = x^y | 0.961
Nguyen F12 = x^4 - x^3 + y^2/2 - y | 0.979
Spatial co-evolution F(x,y) = 1/(1+power(x,-4)) + 1/(1+pow(y,-4)) | 0.983
TowerData | 0.972
Vladislavleva Kotanchek | 0.854
Vladislavleva RatPol2D | 0.785
Vladislavleva RatPol3D | 0.795
Vladislavleva Ripple | 0.951
Vladislavleva Salutowicz | 0.996
Vladislavleva Salutowicz2D | 0.960
Vladislavleva UBall5D | 0.892
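
For readers who want to compute a score of this kind outside HeuristicLab, the following Python sketch samples one of the functions listed above and computes R² as the squared Pearson correlation between targets and predictions. It is only an illustration under assumptions: the sampling range, the number of points, the choice of squared Pearson correlation as the R² definition, and all helper names are made up here and are not taken from the post.

```python
import math
import random


def keijzer_4(x):
    # "Keijzer 4" as listed in the table: f(x) = 0.3 * x * sin(2 * PI * x)
    return 0.3 * x * math.sin(2 * math.pi * x)


def sample_instance(func, low, high, n, seed):
    # Draw n uniformly distributed inputs; range and n are assumptions.
    rng = random.Random(seed)
    xs = [rng.uniform(low, high) for _ in range(n)]
    return xs, [func(x) for x in xs]


def r_squared(targets, predictions):
    # Squared Pearson correlation; returns 0.0 when either series is constant.
    n = len(targets)
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    if var_t == 0 or var_p == 0:
        return 0.0
    return cov * cov / (var_t * var_p)


xs, ys = sample_instance(keijzer_4, -1.0, 1.0, 200, seed=42)
perfect = r_squared(ys, ys)                   # model that reproduces the targets
constant = r_squared(ys, [1.0] * len(ys))     # model that always predicts 1.0
```

A model that reproduces the targets scores 1 under this definition, while a constant prediction scores 0, which gives a feeling for the range of values in the table.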
  • Posted: 2012-06-22 11:32 (Updated: 2012-07-08 05:16)
  • Author: gkronber

Parameter Variation Experiments in Upcoming HeuristicLab Release

We've included a new feature in the upcoming release of HeuristicLab 3.3.7 that makes it easier to create parameter variation experiments.

Metaheuristics and data analysis methods often have a number of parameters that strongly influence their behavior and thus the quality of the results you obtain when applying them to a given problem. The best parameter values are usually not known a priori, so you can either use metaoptimization (available from the download page under "Additional packages") or create a set of experiments in which each parameter is varied over a range of values. In the upcoming release we've made this variation task a lot easier.

We have enhanced the "Create Experiments" dialog that is available through the Edit menu. To try out the new feature, obtain the latest daily build from the Download page and load one of the samples. The dialog lets you specify values for several parameters and create an experiment in which all resulting configurations are enumerated.
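
Conceptually, enumerating all configurations amounts to taking the Cartesian product of the value lists chosen for each parameter. The Python sketch below illustrates that idea only; it is not HeuristicLab code, and the parameter names and values are made-up examples.

```python
from itertools import product


def enumerate_configurations(variations):
    # variations maps a parameter name to the list of values to try.
    names = list(variations)
    return [dict(zip(names, values))
            for values in product(*(variations[name] for name in names))]


# Hypothetical parameters and values, chosen only for illustration.
variations = {
    "PopulationSize": [100, 500, 1000],
    "MutationProbability": [0.05, 0.15],
}
configurations = enumerate_configurations(variations)
print(len(configurations))  # 3 * 2 = 6 configurations
```

Each entry of `configurations` corresponds to one parameter setting that would be run in the experiment.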

We have also integrated the new problem instance infrastructure into this dialog, which lets you test configurations on a number of instances from several benchmark libraries.

Finally, here are a few points to be aware of to make effective use of this feature. You can treat them as a checklist to go through before creating and executing experiments:

  • Before creating an experiment, prepare the algorithm: set every parameter that you do not want to vary to its intended value. If the algorithm contains any runs, clear them first.
  • Review the selected analyzers carefully. You may want to exclude the quality chart and other analyzers that would produce too much data for a large experiment, or have the analyzers produce output only every nth iteration.
  • Include in the run only those problem and algorithm parameters that you need. Think twice before showing a parameter in the run that requires a lot of memory.
  • Make sure SetSeedRandomly (if available) is set to true if you intend to repeat each configuration.
  • When you create experiments with dependent parameters, you have to resolve the dependencies yourself by creating separate experiments. For example, when one parameter specifies a lower bound and another specifies an upper bound, create a separate experiment for each lower bound so that you don't obtain configurations in which the upper bound is lower than the lower bound.
  • Finally, while you vary the parameters, keep an eye on the number of variations. HeuristicLab doesn't prevent you from creating very large experiments, but if there are many variations you might want to split them into separate experiments.
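
The last two points of the checklist can be sketched in a few lines of Python. The example groups configurations by lower bound so that each group contains only valid upper bounds, and it counts the configurations before anything is created. The bound values, the `max_size` threshold, and the function names are assumptions for illustration and not part of HeuristicLab.

```python
def experiments_per_lower_bound(lower_bounds, upper_bounds):
    # One experiment per lower bound, keeping only upper bounds
    # that are not smaller than that lower bound.
    return {low: [(low, up) for up in upper_bounds if up >= low]
            for low in lower_bounds}


def count_variations(value_lists):
    # The number of configurations is the product of the number
    # of values chosen for each parameter.
    total = 1
    for values in value_lists:
        total *= len(values)
    return total


experiments = experiments_per_lower_bound([1, 5, 10], [5, 10, 20])
# experiments[10] holds (10, 10) and (10, 20); the invalid pair (10, 5)
# is never generated.

max_size = 1000  # hypothetical limit on configurations per experiment
total = count_variations([range(10), range(10), range(20)])
too_large = total > max_size
```

Counting first is cheap, and a result above the chosen limit is a hint to split the variations into several smaller experiments.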