HeuristicLab: Ticket Query
https://dev.heuristiclab.com/trac.fcgi/query?status=accepted&milestone=HeuristicLab+3.3.17&col=id&col=component&col=summary&col=owner&col=version&col=changetime&order=version
HeuristicLab - A Paradigm-Independent and Extensible Environment for Heuristic Optimizationen-USHeuristicLabhttps://dev.heuristiclab.com/trac.fcgi/chrome/site/HeuristicLabBanner.png
https://dev.heuristiclab.com/trac.fcgi/query?status=accepted&milestone=HeuristicLab+3.3.17&col=id&col=component&col=summary&col=owner&col=version&col=changetime&order=version
Trac 1.0.12
https://dev.heuristiclab.com/trac.fcgi/ticket/2389
https://dev.heuristiclab.com/trac.fcgi/ticket/2389#2389: Implement lexicase selection for GPThu, 21 May 2015 18:17:42 GMTgkronber<p>
The GECCO paper in which eps-Lexicase has been first described is a bit short on detail.
</p>
<p>
A much better (bug-fixed) description is available in <a class="ext-link" href="https://arxiv.org/pdf/1709.05394.pdf"><span class="icon"></span>https://arxiv.org/pdf/1709.05394.pdf</a>
</p>
<p>
In this paper different variants are described and compared. Semi-dynamic eps-Lexicase seems to perform well.
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2389#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/2635
https://dev.heuristiclab.com/trac.fcgi/ticket/2635#2635: Predict if a child should be rejected during OSGA offspring selectionThu, 14 Jul 2016 15:24:13 GMTbburlacu<p>
The idea is to try to predict if a child should be rejected by evaluating its fitness on a smaller partition of the training data. It would then make sense to save computation time when the child is already very weak compared to its parents and has no chance of being selected.
</p>
<p>
A new plugin should be implemented containing a specialized evaluator together with an analyzer.
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2635#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/2679
https://dev.heuristiclab.com/trac.fcgi/ticket/2679#2679: Trunk integration of GoalSeekingProblemTue, 04 Oct 2016 10:11:27 GMTbburlacu<p>
The <tt>GoalSeekingProblem</tt> (formerly known as the <tt>ProcessParameterOptimizationProblem</tt>) was initially developed for tuning the parameters of a physical process in collaboration with an industry partner. The methodology should be refactored towards the more general purpose of goal seeking and the design should be improved.
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2679#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/2704
https://dev.heuristiclab.com/trac.fcgi/ticket/2704#2704: Generate random regression benchmark instancesWed, 23 Nov 2016 13:50:21 GMTbburlacu<p>
The ability to randomly generate symbolic regression benchmarks according to user-defined rules would be useful for testing new algorithms and for the development of knowledge networks.
</p>
<p>
A previous effort exists (ticket <a class="accepted ticket" href="https://dev.heuristiclab.com/trac.fcgi/ticket/2083" title="feature request: HeuristicLab Benchmark Problem Generator (accepted)">#2083</a>) for generating data according to user specified formulas. This ticket has a different scope as the focus is shifted towards generation of random instances based on user-defined templates.
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2704#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/2719
https://dev.heuristiclab.com/trac.fcgi/ticket/2719#2719: Datastream AnalysisWed, 14 Dec 2016 16:18:12 GMTjzenisek<p>
Adding a new optimizer which enables simulating a sequential datastream and matching pre-built ensemble models against it. Outcome of this matching mechanism should be the online classification of incoming (unseen) data series in a predictive fashion.
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2719#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/2817
https://dev.heuristiclab.com/trac.fcgi/ticket/2817#2817: Improve speed of bin packingWed, 02 Aug 2017 14:50:14 GMTabeham<p>
The <tt>GenerateNewExtremePointsForNewItem</tt> method uses a rather primitive way to generate new points. Intersection between vector and plane should be used to calculate the extreme points.
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2817#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/2906
https://dev.heuristiclab.com/trac.fcgi/ticket/2906#2906: Variable-Transformations for Data AnalysisTue, 06 Mar 2018 10:32:44 GMTpfleck<p>
The current transformation feature (implemented during the first version of the data preprocessing) is neither practical nor functioning satisfactorily.
Originally, the transformation feature was intended to support data analysis by being able to specify transformations to the data to make the training process easier for the learning algorithms.
</p>
<p>
Possible usage scenarios are:
</p>
<ul><li>Scale variables to a given range (e.g. 0 - 1)
</li><li>Z-Normalize a variable
</li><li>log-Transform a variable
</li></ul><p>
After training with transformed variables, an intermediate step is required, that performs the data transformation on the original values before feeding them to the actual model.
This creates two options for calculating the model accuracy (R², MSE, ...), depending on whether the calculation is based on
</p>
<ul><li>the transformed variables or
</li><li>the original variables.
</li></ul><p>
While the first describes the model-accuracy in terms of the training algorithm, the later describes how the model actually performs in real use.
Currently, we are not sure which option is better; therefore, we want to support both options.
</p>
<h4 id="Additionalthoughts">Additional thoughts</h4>
<ul><li>Performing the intermediate step of transforming the original variables before feeding them to the actual model could be done with a "Transformation-Model" that wraps the original model.
</li></ul><ul><li>From the users' perspective, the transformations could be done "explicitly" or "hidden", i.e. actually showing the transformed variables in the Dataset and displaying them as additional input-or as target variable, or showing the original Dataset and performing the transformation hidden from the user. Currently, we want to make transformations explicitly visible to the user.
</li></ul><ul><li>Each transformation must also specify an inverse transformation that has to be applied in case a transformation is performed on the target variable. For instance, if the target variable is log-transformed, the intermediate model must use the exponential function to transform the target back to its original value range.
</li></ul><ul><li>For symbolic regression, the intermediate model can be also applied by directly changing the model tree.
</li></ul>Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2906#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/2922
https://dev.heuristiclab.com/trac.fcgi/ticket/2922#2922: DataCompletenessChart is still slowWed, 23 May 2018 05:06:39 GMTgkronber<p>
In <a class="closed ticket" href="https://dev.heuristiclab.com/trac.fcgi/ticket/2393" title="defect: DataCompleteness chart is slow (closed: done)">#2393</a> the performance of the data completeness chart has been improved.
</p>
<p>
However, when I used it with a dataset (30000 rows 100 variables) it still locked up. It should be faster.
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2922#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/2968
https://dev.heuristiclab.com/trac.fcgi/ticket/2968#2968: Add a new formatter for symbolic expressions, which produces an infix expression as well as a list of all coefficient valuesTue, 04 Dec 2018 12:25:16 GMTchaider<p>
Representation of coefficients of the mathematical representation for solution models. Display coefficients as strings to be able to copy/paste values into other programs for further analysis.
</p>
<p>
e.g.
</p>
<p>
Mathematical Representation:
</p>
<p>
Y = (c0 * x + c1 * y + c2 * z)
</p>
<p>
Coefficients:
</p>
<p>
c0 = -0.03129
</p>
<p>
c1 = 1.5293
</p>
<p>
c2 = -0.93221
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2968#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/2989
https://dev.heuristiclab.com/trac.fcgi/ticket/2989#2989: Moving Peaks BenchmarkWed, 13 Feb 2019 21:14:21 GMTswagner<p>
Implement the Moving Peaks Benchmark Problem, introduced by J. Branke.
</p>
<p>
Branke, J.: Memory Enhanced Evolutionary Algorithms for Changing Optimization Problems. In: Proceedings of the IEEE Congress on Evolutionary Computation (CEC 1999), pp. 1875-1882, 1999.
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2989#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/2990
https://dev.heuristiclab.com/trac.fcgi/ticket/2990#2990: Variable-Impact-based Feature SelectionSun, 17 Feb 2019 11:02:43 GMTpfleck<p>
Using the variable impacts for iterative feature selection can currently only be done manually.
Having an automatic way would be nice.
</p>
<p>
General workflow of this Variable-Impact-based Feature Selection:
</p>
<ol><li>Train Model
</li><li>Calculate Variable Impacts
</li><li>Remove Features with low impact
</li><li>Repeat until model accuracy drops too much
</li></ol>Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2990#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/3026
https://dev.heuristiclab.com/trac.fcgi/ticket/3026#3026: Integration into SymSpaceMon, 02 Sep 2019 14:11:09 GMTdpiringe<p>
STP4.4 Interfacing beyond Mechatronics / K24401 - SymSpace
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/3026#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/2938
https://dev.heuristiclab.com/trac.fcgi/ticket/2938#2938: The parser for symbolic expressions in infix form parses negative numbers and fractions stranglyThu, 16 Aug 2018 16:03:22 GMTgkronber<p>
Some examples:
</p>
<ul><li>Input: <tt>-1.0*x + 2.0</tt>
</li><li>Expected:
<pre class="wiki"> (+
(* -1.0 x)
2.0)
</pre></li><li>Output:
<pre class="wiki">(+
(- (* 1.0 x))
2.0)
</pre></li></ul><ul><li>Input: <tt>exp(-1.0*x - 2.0)</tt>
</li><li>Output:
<pre class="wiki">(exp (*
-1.0
(+
(* 1.0 x)
2.0)))
</pre></li><li>Expected:
<pre class="wiki">(exp (-
(* -1.0 x)
2.0))
</pre></li></ul><ul><li>Input: 3/3+2/2+1/1
</li><li>Output:
<pre class="wiki"> ((3 * 1/3) + (2 * 1/2) + (1 * 1/1))
</pre></li></ul><p>
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2938#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/3039
https://dev.heuristiclab.com/trac.fcgi/ticket/3039#3039: BTC (Balanced Tree Creator) for HeuristicLabFri, 25 Oct 2019 11:29:14 GMTbburlacu<p>
I would like to propose a new tree creation algorithm with two main advantages over <tt>PTC2</tt> (Probabilistic Tree Creator):
</p>
<ul><li>much simpler implementation (tree creation logic fits in 30 lines of code)
</li><li>produces trees of minimal depth (very well-balanced trees), no need to tune the maximum depth parameter
</li></ul><p>
Like the <tt>PTC2</tt>, the <tt>BTC</tt> is able to follow a user-specified distribution of tree lengths as well as desired symbol frequencies.
</p>
<p>
The idea behind the algorithm is to keep expanding a "horizon" of open slots (expansion points) in breadth-wise fashion while keeping track of the difference to the target length.
</p>
<p>
Sample implementation:
</p>
<div class="code"><pre><span class="k">public</span> <span class="k">static</span> ISymbolicExpressionTree <span class="nf">CreateExpressionTree</span><span class="p">(</span>IRandom random<span class="p">,</span> ISymbolicExpressionGrammar grammar<span class="p">,</span> <span class="kt">int</span> targetLength<span class="p">,</span> <span class="kt">int</span> maxDepth<span class="p">)</span> <span class="p">{</span>
<span class="kt">var</span> allowedSymbols <span class="p">=</span> grammar<span class="p">.</span>AllowedSymbols<span class="p">.</span>Where<span class="p">(</span>x <span class="p">=></span> x<span class="p">.</span>InitialFrequency <span class="p">></span> <span class="m">0.0</span><span class="p">).</span>ToList<span class="p">();</span>
<span class="kt">var</span> tree <span class="p">=</span> MakeStump<span class="p">(</span>random<span class="p">,</span> grammar<span class="p">);</span>
<span class="kt">var</span> tuples <span class="p">=</span> <span class="k">new</span> List<span class="p"><</span>NodeInfo<span class="p">>(</span>targetLength<span class="p">)</span> <span class="p">{</span>
<span class="k">new</span> NodeInfo <span class="p">{</span> Node <span class="p">=</span> tree<span class="p">.</span>Root<span class="p">,</span> Depth <span class="p">=</span> <span class="m">0</span><span class="p">,</span> Arity <span class="p">=</span> <span class="m">1</span> <span class="p">},</span>
<span class="k">new</span> NodeInfo <span class="p">{</span> Node <span class="p">=</span> tree<span class="p">.</span>Root<span class="p">.</span>GetSubtree<span class="p">(</span><span class="m">0</span><span class="p">),</span> Depth <span class="p">=</span> <span class="m">1</span><span class="p">,</span> Arity <span class="p">=</span> <span class="m">1</span> <span class="p">}</span>
<span class="p">};</span>
<span class="kt">int</span> remLength <span class="p">=</span> targetLength <span class="p">-</span> <span class="m">2</span><span class="p">;</span> <span class="c1">// remaining length; I already have a root and start node
</span> <span class="kt">int</span> openSlots <span class="p">=</span> <span class="m">1</span><span class="p">;</span> <span class="c1">// startNode has arity 1
</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> i <span class="p">=</span> <span class="m">1</span><span class="p">;</span> i <span class="p"><</span> tuples<span class="p">.</span>Count<span class="p">;</span> <span class="p">++</span>i<span class="p">)</span> <span class="p">{</span>
<span class="kt">var</span> t <span class="p">=</span> tuples<span class="p">[</span>i<span class="p">];</span>
<span class="kt">var</span> node <span class="p">=</span> t<span class="p">.</span>Node<span class="p">;</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> j <span class="p">=</span> <span class="m">0</span><span class="p">;</span> j <span class="p"><</span> t<span class="p">.</span>Arity<span class="p">;</span> <span class="p">++</span>j<span class="p">)</span> <span class="p">{</span>
<span class="c1">// min and max arity here refer to the required arity limits for the child node
</span> <span class="kt">int</span> maxChildArity <span class="p">=</span> t<span class="p">.</span>Depth <span class="p">==</span> maxDepth <span class="p">-</span> <span class="m">1</span> <span class="p">?</span> <span class="m">0</span> <span class="p">:</span> Math<span class="p">.</span>Min<span class="p">(</span>GetMaxChildArity<span class="p">(</span>grammar<span class="p">,</span> allowedSymbols<span class="p">,</span> node<span class="p">.</span>Symbol<span class="p">),</span> remLength <span class="p">-</span> openSlots<span class="p">);</span>
<span class="kt">int</span> minChildArity <span class="p">=</span> Math<span class="p">.</span>Min<span class="p">(</span><span class="m">1</span><span class="p">,</span> maxChildArity<span class="p">);</span>
<span class="kt">var</span> child <span class="p">=</span> SampleNode<span class="p">(</span>random<span class="p">,</span> grammar<span class="p">,</span> allowedSymbols<span class="p">,</span> node<span class="p">.</span>Symbol<span class="p">,</span> minChildArity<span class="p">,</span> maxChildArity<span class="p">);</span>
<span class="kt">var</span> childArity <span class="p">=</span> random<span class="p">.</span>Next<span class="p">(</span>grammar<span class="p">.</span>GetMinimumSubtreeCount<span class="p">(</span>child<span class="p">.</span>Symbol<span class="p">),</span> grammar<span class="p">.</span>GetMaximumSubtreeCount<span class="p">(</span>child<span class="p">.</span>Symbol<span class="p">)</span> <span class="p">+</span> <span class="m">1</span><span class="p">);</span>
<span class="kt">var</span> childDepth <span class="p">=</span> t<span class="p">.</span>Depth <span class="p">+</span> <span class="m">1</span><span class="p">;</span>
node<span class="p">.</span>AddSubtree<span class="p">(</span>child<span class="p">);</span>
tuples<span class="p">.</span>Add<span class="p">(</span><span class="k">new</span> NodeInfo <span class="p">{</span> Node <span class="p">=</span> child<span class="p">,</span> Depth <span class="p">=</span> childDepth<span class="p">,</span> Arity <span class="p">=</span> childArity <span class="p">});</span>
remLength<span class="p">--;</span>
openSlots <span class="p">+=</span> childArity <span class="p">-</span> <span class="m">1</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">return</span> tree<span class="p">;</span>
<span class="p">}</span>
</pre></div><p>
Preliminary test: <tt>BTC</tt> vs <tt>PTC2</tt> with 100K trees with uniform length from <tt>U(1,100)</tt> and max depth 100, and a grammar <tt>G={+,-,*,/,exp,log,constant}</tt>:
</p>
<ul><li><tt>BTC</tt>:
<pre class="wiki">Function distribution:
ProgramRootSymbol: 3.22%
StartSymbol: 3.22%
Logarithm: 16.23%
Division: 15.30%
Exponential: 16.22%
Subtraction: 15.29%
Addition: 15.27%
Multiplication: 15.25%
Distribution of function arities:
01:00 23.66%
02:00 37.19%
00:00 39.15%
Terminal distribution:
Constant: 100.00%
Average tree depth: 9
Average tree length: 51.02238
Total nodes created: 5102238
Average shape: 404.66003
</pre></li></ul><ul><li><tt>PTC2</tt>:
<pre class="wiki">Function distribution:
ProgramRootSymbol: 3.16%
StartSymbol: 3.16%
Exponential: 16.22%
Division: 15.32%
Logarithm: 16.25%
Addition: 15.28%
Multiplication: 15.29%
Subtraction: 15.34%
Distribution of function arities:
01:00 23.59%
02:00 37.24%
00:00 39.16%
Terminal distribution:
Constant: 100.00%
Average tree depth: 13
Average tree length: 52.05025
Total nodes created: 5205025
Average shape: 507.90366
</pre></li></ul><ul><li>The numbers above are nearly identical between <tt>BTC</tt> and <tt>PTC2</tt>
</li></ul><ul><li>The average shape calculated as the nested tree length averaged over all generated trees shows that <tt>BTC</tt> produces more balanced trees
</li></ul><ul><li>Performance wise, <tt>BTC</tt> with a rudimentary cache for grammar queries achieves a speed of <tt>~7200 trees/s</tt>, while <tt>PTC2</tt> achieves a speed of <tt>~1600 trees/s</tt>.
</li></ul><ul><li>Distribution of tree lengths:
</li></ul><p>
<a style="padding:0; border:none" href="https://dev.heuristiclab.com/trac.fcgi/attachment/ticket/3039/BTC.png"><img src="https://dev.heuristiclab.com/trac.fcgi/raw-attachment/ticket/3039/BTC.png" alt="BTC Tree Length Distribution" title="BTC Tree Length Distribution" /></a>
<a style="padding:0; border:none" href="https://dev.heuristiclab.com/trac.fcgi/attachment/ticket/3039/PTC2.png"><img src="https://dev.heuristiclab.com/trac.fcgi/raw-attachment/ticket/3039/PTC2.png" alt="PTC2 Tree Length Distribution" title="PTC2 Tree Length Distribution" /></a>
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/3039#changelog
https://dev.heuristiclab.com/trac.fcgi/ticket/2907
https://dev.heuristiclab.com/trac.fcgi/ticket/2907#2907: Refactoring of Datapreprocessing-ManipulationsWed, 14 Mar 2018 10:47:18 GMTfholzing<p>
The current implementation of the Datapreprocessing-Manipulations gives a hard time writing additional manipulations. No discovery of additional manipulations is possible and the ui and logic is highly interweaved. The refactoring should simplify the implementation of additional manipulations.
</p>
Resultshttps://dev.heuristiclab.com/trac.fcgi/ticket/2907#changelog