Free cookie consent management tool by TermsFeed Policy Generator

Opened 5 years ago

Last modified 3 years ago

#3039 closed feature request

BTC (Balanced Tree Creator) for HeuristicLab — at Initial Version

Reported by: bburlacu Owned by: bburlacu
Priority: medium Milestone: HeuristicLab 3.3.17
Component: Encodings.SymbolicExpressionTreeEncoding Version: trunk
Keywords: genetic programming, tree initialization Cc:

Description

I would like to propose a new tree creation algorithm with two main advantages over PTC2 (Probabilistic Tree Creator):

  • much simpler implementation (tree creation logic fits in 30 lines of code)
  • produces trees of minimal depth (very well-balanced trees), no need to tune the maximum depth parameter

Like the PTC2, the BTC is able to follow a user-specified distribution of tree lengths as well as desired symbol frequencies.

The idea behind the algorithm is to keep expanding a "horizon" of open slots (expansion points) in breadth-wise fashion while keeping track of the difference to the target length.

Sample implementation:

public static ISymbolicExpressionTree CreateExpressionTree(IRandom random, ISymbolicExpressionGrammar grammar, int targetLength, int maxDepth) {
  var allowedSymbols = grammar.AllowedSymbols.Where(x => x.InitialFrequency > 0.0).ToList();
  var tree = MakeStump(random, grammar);
  var tuples = new List<NodeInfo>(targetLength) {
    new NodeInfo { Node = tree.Root, Depth = 0, Arity = 1 },
    new NodeInfo { Node = tree.Root.GetSubtree(0), Depth = 1, Arity = 1 }
  };
  int remLength = targetLength - 2; // remaining length; I already have a root and start node
  int openSlots = 1; // startNode has arity 1

  for (int i = 1; i < tuples.Count; ++i) {
    var t = tuples[i];
    var node = t.Node;
    for (int j = 0; j < t.Arity; ++j) {
      // min and max arity here refer to the required arity limits for the child node
      int maxChildArity = t.Depth == maxDepth - 1 ? 0 : Math.Min(GetMaxChildArity(grammar, allowedSymbols, node.Symbol), remLength - openSlots);
      int minChildArity = Math.Min(1, maxChildArity);
      var child = SampleNode(random, grammar, allowedSymbols, node.Symbol, minChildArity, maxChildArity);
      var childArity = random.Next(grammar.GetMinimumSubtreeCount(child.Symbol), grammar.GetMaximumSubtreeCount(child.Symbol) + 1);
      var childDepth = t.Depth + 1;

      node.AddSubtree(child);
      tuples.Add(new NodeInfo { Node = child, Depth = childDepth, Arity = childArity });

      remLength--;
      openSlots += childArity - 1;
    }
  }
  return tree;
}  

Preliminary test: BTC vs PTC2 with 100K trees with uniform length from U(1,100) and max depth 100, and a grammar G={+,-,*,/,exp,log,constant}:

  • BTC:
    Function distribution:	
    ProgramRootSymbol:	3.22%
    StartSymbol:	3.22%
    Logarithm:	16.23%
    Division:	15.30%
    Exponential:	16.22%
    Subtraction:	15.29%
    Addition:	15.27%
    Multiplication:	15.25%
    
    Distribution of function arities:	
    01:00	23.66%
    02:00	37.19%
    00:00	39.15%
    
    Terminal distribution:
    Constant:	100.00%
    Average tree depth:	9
    Average tree length:	51.02238
    Total nodes created:	5102238
    Average shape:	404.66003
    
  • PTC2:
    Function distribution:	
    ProgramRootSymbol:	3.16%
    StartSymbol:	3.16%
    Exponential:	16.22%
    Division:	15.32%
    Logarithm:	16.25%
    Addition:	15.28%
    Multiplication:	15.29%
    Subtraction:	15.34%
    
    Distribution of function arities:	
    01:00	23.59%
    02:00	37.24%
    00:00	39.16%
    
    Terminal distribution:
    Constant:	100.00%
    Average tree depth:	13
    Average tree length:	52.05025
    Total nodes created:	5205025
    Average shape:	507.90366
    
  • The numbers above are nearly identical between BTC and PTC2
  • The average shape calculated as the nested tree length averaged over all generated trees shows that BTC produces more balanced trees
  • Performance wise, BTC with a rudimentary cache for grammar queries achieves a speed of ~7200 trees/s, while PTC2 achieves a speed of ~1600 trees/s.

Change History (2)

Changed 5 years ago by bburlacu

BTC Tree Length Distribution

Changed 5 years ago by bburlacu

PTC2 Tree Length Distribution

Note: See TracTickets for help on using tickets.