Version 7 (modified by bburlacu, 9 years ago) (diff) 

Crossover Operators
All the GP1 crossover operators implemented in HeuristicLab take two parents (parent0 and parent1) and perform subtree swapping in place, returning the first parent as the crossover result. The runtime complexity of each crossover operator is expressed in terms of the number of tree evaluations (for those crossover algorithms that implement evaluationbased heuristics), but in reality performance is also dependent on problem size and quality measurement formula used.
1 Sizefair crossover
The sizefair crossover operator takes two parents and produces a single offspring, in a number of steps:
 It selects a crossover point (i.e., a node for crossover) from parent0
 It selects a compatible subtree in parent1 (subject to size limits and grammar constraints)
 It performs the swap and returns the (now different) parent0
The selection (step 2) above is influenced by a configurable bias towards internal or leaf nodes, given by the InternalCrossoverPointProbability parameter.
2 Probabilistic functional crossover
This crossover is an implementation of the operator described by C. Bongard in [1] and denoted PFXO (Probabilistic Functional Crossover). Bongard argues that crossover should be controlled in such a way that the original information in the genome would not be so greatly altered. Another argument behind Bongard's approach is that crossover is more likely to produce better offspring if based on the behavioral rather than structural similarity between nodes in the two parents (the socalled “benefic exchange of genetic information”). In order to avoid neutral crosses (that would have zero behavioral effect on the new trees), the procedure is made nondeterministic:
 A crossover point, corresponding to a node i is randomly chosen from parent0
 The behavioral distances between node i and every node j from parent1 are computed using a formula that takes into account the minimum and maximum values computed by the two nodes during evaluation:
 Once all the behavioral distances between node i and every node j have been calculated, they are normalized:
and turned into selection probabilities (weights):
 A node from the second parent is probabilistically chosen and swapped with node i from the first parent
This crossover has to evaluate one subtree from the first parent and n compatible subtrees from the second parent, giving it a runtime complexity of O(n).
3 Semantic similarity crossover
This crossover also follows up on the idea of beneficial exchanges of genetic material, based on a measure of “semantic similarity” as described by Nguyen, et al. in [3, 4].
 The sampling semantics of a subtree is defined as the collection of its evaluated values over a sequence of points from the dataset.
 The sampling semantic distance (SSD) between two nodes (subtrees) is defined as the absolute mean between their corresponding sampling semantics:
where S1 and S2 are the two subtrees, and U, V are their respective sampling semantics.
 Two nodes are deemed similar if their SSD falls within a predefined interval [alfa,beta]:
The crossover procedure chooses a crossover point from parent0, then picks the first node in parent1 that satisfies the similarity condition (5). As far as performance is concerned, we assume that parent1 has n subtrees that could be swapped into parent0:
 In the best case scenario the first subtree in parent1 will match the similarity condition, so the algorithm will only perform two computations of the sampling semantics, so it will return in constant time, the runtime complexity being .
 In the worst case scenario, the last subtree will match (or none), so the total number of sampling semantics computations will be n, resulting in O(n) complexity.
Therefore, on average, the runtime complexity in terms of the number of times the sampling semantics need to be computed will be O(n/2) (linear time).
4 Contextaware crossover
Given a randomly selected crossover point from parent1, contextaware crossover will always find the best location to place the subtree into parent0. Because of the initial random step, the contextaware crossover does guarantee that the resulting offspring will be better than its parents. It makes the best choice given a specific piece of genetic material. The complexity depends on the number of nodes in the first parent, since the algorithm will perform a swap followed by a tree evaluation for every available position, so that it can find out the best insertion point. This means n evaluations for n available positions, giving an O(n) complexity. A discussion of this crossover can be found in [2].
5 Deterministicbest crossover
As opposed to the contextaware crossover, where the insertion point is random and the initial genetic material available for swapping is fixed (the subtree chosen from the second parent), in the case of the deterministicbest crossover, the insertion point in parent0 is fixed, while the algorithm will find the best match from the available nodes in parent1. This also implies n swaps followed by n tree evaluations, with n being the number of available nodes in the second parent, therefore the complexity is also O(n).
6 Depthconstrained crossover
The depthconstrained crossover only acts within a specific depth range, in accordance with the DepthRange parameter that can take three values (for a tree of depth d)
 HighLevel (acting on the upper tree levels between 0 and 0.25*d )
 Standard (acting on the midarea of the tree between depth levels 0.25*d and 0.75*d )
 LowLevel (acting on the lower levels between 0.75*d and d)
References
[1] Bongard, Josh C., "A probabilistic functional crossover operator for genetic programming", in Proceedings of the 12th annual conference on Genetic and evolutionary computation (New York, NY, USA: ACM, 2010), pp. 925932.
[2] Majeed, Hammad and Ryan, Conor, "Using contextaware crossover to improve the performance of GP", in Proceedings of the 8th annual conference on Genetic and evolutionary computation (New York, NY, USA: ACM, 2006), pp. 847854.
[3] Uy, Nguyen Quang and O'Neill, Michael and Hoai, Nguyen Xuan and Mckay, Bob and GalvánLópez, Edgar, "Semantic similarity based crossover in GP: the case for realvalued function regression", in Proceedings of the 9th international conference on Artificial evolution (Berlin, Heidelberg: SpringerVerlag, 2010), pp. 170181.
[4] Uy, Nguyen Quang and Hoai, Nguyen Xuan and O'Neill, Michael and McKay, Bob, "The role of syntactic and semantic locality of crossover in genetic programming", in Proceedings of the 11th international conference on Parallel problem solving from nature: Part II (Berlin, Heidelberg: SpringerVerlag, 2010), pp. 533542.
Attachments (10)

71lyxpreview1.png
(2.9 KB) 
added by bburlacu 9 years ago.
01behavioraldistance

72lyxpreview1.png
(2.5 KB) 
added by bburlacu 9 years ago.
02normalizedbehavioraldistance

73lyxpreview1.png
(2.9 KB) 
added by bburlacu 9 years ago.
03behavioralselectionweights

77lyxpreview1.png
(3.0 KB) 
added by bburlacu 9 years ago.
04semanticsimilaritydistance

83lyxpreview1.png
(4.2 KB) 
added by bburlacu 9 years ago.
05semanticsimilaritycondition
 01PFXbehavioraldistance.png (2.9 KB)  added by bburlacu 9 years ago.
 02PFXnormalizedbehavioraldistance.png (2.5 KB)  added by bburlacu 9 years ago.
 03PFXselectionprobabilities.png (2.9 KB)  added by bburlacu 9 years ago.
 04SSXsamplingsemanticdistance.png (3.0 KB)  added by bburlacu 9 years ago.
 05SSXsimilaritycondition.png (4.2 KB)  added by bburlacu 9 years ago.
Download all attachments as: .zip