Opened 2 years ago

Closed 2 years ago

#2319 closed feature request (done)

Add LinearLinkageEncoding (LLE)

Reported by: abeham Owned by: abeham
Priority: medium Milestone: HeuristicLab 3.3.12
Component: Encodings.LinearLinkageEncoding Version: 3.3.11
Keywords: Cc:

Description (last modified by abeham)

LLE is a better suited encoding for grouping problems than a simple IntegerVector (= group number encoding or GNE) or a more complex IntegerVector with items that may appear only once and others that function as separators (= permutation with repetition or PWR).

The idea of LLE is that the group is defined such that one item references another. LLE can be implemented in the form of a vector where the index denotes the item and the value denotes the item that is referenced, e.g.

Item# 1 2 3 4 5 6 7 8 9
Link 2 4 6 9 7 6 8 8 9

It is further important that an item must only link to an item with higher or equal index. If an item references itself it is called an ending item which is the last in the group. It is also not allowed that two items link to the same item which would create a tree-like structure. An item must only be linked from exactly one other item (or from itself). The consequence is that except for ending items all numbers appear at most once in the vector.

The above example identifies the following three groups:

  1. { 1, 2, 4, 9 }
  2. { 3, 6 }
  3. { 5, 7, 8 }

The advantage over GNE or PWR is that the group is an emerging concept of the structure and not directly encoded in the vector. Thus, the ambiguities that are created by GNE and the larger solution space that comes with PWR can be avoided.

There are different variants:

  • LLE is the variant described above
  • LLE-b is the same, but with backward links instead of forward links
  • LLE-e uses the number of the highest element in the group as group number

Even though LLE-e may be similar to GNE the difference is that the group number is not directly encoded and given by the largest element in the group. GNE or PWR would still make sense in the case groups have an identity and cannot be considered equal.

References:

  1. Du, J., Korkmaz, E.E., Alhajj, R., and Barker, K. 2004. Novel Clustering Approach that employs Genetic Algorithm with New Representation Scheme and Multiple Objectives. Data Warehousing and Knowledge Discovery, pp. 219-228. Springer Berlin Heidelberg. (Link).
  2. Ülker, Ö., Özcan, E., and Korkmaz, E.E. 2006. Linear Linkage Encoding in Grouping Problems: Applications on Graph Coloring and Timetabling. In Practice and Theory of Automated Timetabling VI, pp. 347-363. Springer Berlin Heidelberg. (PDF).
  3. Korkmaz, E.E. 2010. Multi-objective Genetic Algorithms for grouping problems. Applied Intelligence, 33, pp. 179-192. Springer Berlin Heidelberg. (Link).
  4. Yilmaz, B., and Korkmaz, E.E. 2010. Representation Issue In Graph Coloring. In Proceedings of the 10th International Conference on Intelligent Systems Design and Applications, pp. 1171-1176. IEEE. (Link).

Attachments (3)

Genetic Algorithm LLE Test.hl (47.4 KB) - added by abeham 2 years ago.
A genetic algorithm that solves a grouping problem
Grouping Benchmark Problem LLE.cs (2.0 KB) - added by abeham 2 years ago.
Code for a benchmark grouping problem using LLE
Grouping Benchmark Problem GNE.cs (1.9 KB) - added by abeham 2 years ago.
Code for a benchmark grouping problem using GNE (group number encoding) using an IntegerVector

Download all attachments as: .zip

Change History (20)

comment:1 Changed 2 years ago by abeham

  • Owner set to abeham
  • Status changed from new to accepted

r12285: Added branch for linear linkage encoding (LLE-e)

comment:2 Changed 2 years ago by abeham

  • Description modified (diff)

comment:3 Changed 2 years ago by abeham

  • Description modified (diff)
  • Summary changed from Add LinearLinkageEncoding to Add LinearLinkageEncoding (LLE)

comment:4 Changed 2 years ago by abeham

  • Description modified (diff)

comment:5 Changed 2 years ago by abeham

r12286:

  • Removed LargestGroupFirstCrossover (slow and not mentioned in the literature)
  • Added manipulators
  • Added multi-operators

comment:6 Changed 2 years ago by abeham

r12288:

  • Changed encoding to represent linkages in LLE (as opposed to LLE-e)
  • Added GraftManipulator
  • Added repair procedure
  • Improved performance of some crossovers

comment:7 Changed 2 years ago by abeham

r12396: Added SinglePointCrossover

comment:8 Changed 2 years ago by abeham

r12643:

  • Added two new solution creators
  • Added swap2 neighborhood
  • Added license headers
  • Pruned usings
  • Fixed namespaces

comment:9 Changed 2 years ago by abeham

  • Component changed from ### Undefined ### to Encodings.LinearLinkageEncoding

comment:10 Changed 2 years ago by abeham

r12648: Fixed single-point crosssover

Changed 2 years ago by abeham

A genetic algorithm that solves a grouping problem

Changed 2 years ago by abeham

Code for a benchmark grouping problem using LLE

Changed 2 years ago by abeham

Code for a benchmark grouping problem using GNE (group number encoding) using an IntegerVector

comment:11 Changed 2 years ago by abeham

  • Owner changed from abeham to mkommend
  • Status changed from accepted to reviewing

r12650: integrated into trunk r12651: removed branch

comment:12 Changed 2 years ago by mkommend

Review Comments

LinearLinkage

  • Lots of code duplication (e.g, GetGroup could reuse GetGroupForward)
  • GetGroupBackward is missing
  • Currently the linear linkage encoding only works for items starting at 0 and being sequentially numbered without missing values. It is not possible to represent items that do not adhere to this rule (e.g. 5,10,99,-3,"abc").
  • Validate method is obsolete. The functionality could be taken over by GetGroups. Again lots of code duplication

MultiLLECrossover

  • IMHO multi-operators should only discover operators defined in the same assembly to avoid the discovery of problem-dependent and not applicable operators.

The other code looks good and is really nicely documented.

Last edited 2 years ago by mkommend (previous) (diff)

comment:13 Changed 2 years ago by mkommend

  • Owner changed from mkommend to abeham
  • Status changed from reviewing to assigned

comment:14 Changed 2 years ago by abeham

  • Status changed from assigned to accepted

comment:15 follow-up: Changed 2 years ago by abeham

  • Owner changed from abeham to mkommend
  • Status changed from accepted to reviewing

r12701:

  • Some code cleanup in LinearLinkage and updates to the documentation
  • Fixed operator discovery in multi*operators
  • Fixed bug in MultiLLEManipulator with respect to successful offspring analysis

Thank you for the short-termed extensive review!

Regarding your third point: The consecutive zero-based numbering is by design (c.f. Permutation). But you raise a valid point with your critique, to be honest that would also be on my wish-list. Nevertheless, there are some obstacles that I would like to raise and which would require more discussion before we could go into such a direction:

  1. Solution objects would be generic types, the generic type class would likely only be a HeuristicLab type and not one of the user's domain (which however would make most sense)
  2. Solutions are often cloned and of which many instances exist during an optimization run. Storing semantic information together with the solution requires additional memory. I favor more compact data structures as solutions.
  3. Some operators require the consecutive zero-based structure for operation (e.g. ERX for Permutation or GroupCrossover for LLE). Having a List<T> instead of a Permutation and a AnonymousGrouping<T> instead of a LinearLinkage would complicate the implementation of these crossover or manipulation operators. I think that the semantic could rather be an additional object/layer.

Regarding your fourth remark, it is true that Validate() is not currently used - I would not say obsolete. I'd like to provide that as a public method. I looked into merging the code with GetGroups since they're both similar, but GetGroups allocates lists which is unnecessary for validation. While code duplication is worth worrying about when implementing this I considered it to be less harmful in this special case for two reasons:

  1. Solution encoding code is pretty stable and unlikely to change
  2. The duplicate code parts are still located very close together and not distributed over classes or assemblies.

Some cases are however unreasonable as you noticed and I did reduce them thanks to your suggestions.

If you have additional objections, please let me know, otherwise please forward this ticket to release.

comment:16 in reply to: ↑ 15 Changed 2 years ago by mkommend

  • Owner changed from mkommend to abeham
  • Status changed from reviewing to readytorelease

Replying to abeham: Reviewed r12701 and it addressed most of the issues raised in my review.

Regarding your third point: The consecutive zero-based numbering is by design (c.f. Permutation). But you raise a valid point with your critique, to be honest that would also be on my wish-list. Nevertheless, there are some obstacles that I would like to raise and which would require more discussion before we could go into such a direction:

  1. Solution objects would be generic types, the generic type class would likely only be a HeuristicLab type and not one of the user's domain (which however would make most sense)
  2. Solutions are often cloned and of which many instances exist during an optimization run. Storing semantic information together with the solution requires additional memory. I favor more compact data structures as solutions.
  3. Some operators require the consecutive zero-based structure for operation (e.g. ERX for Permutation or GroupCrossover for LLE). Having a List<T> instead of a Permutation and a AnonymousGrouping<T> instead of a LinearLinkage would complicate the implementation of these crossover or manipulation operators. I think that the semantic could rather be an additional object/layer.

I haven't meant to extend the individual class LL. Instead I thought about a ctor in the encoding itself that gets a collection of items passed and that the encoding could translate a concrete individual back to group of items (GetGroup => translate back). Therefore, solutions do not have to be a generic type (1.), no overhead in cloning would occur (2.) and all operators would still work (3.). However, this feature could be implemented after the release.

Regarding your fourth remark, it is true that Validate() is not currently used - I would not say obsolete. I'd like to provide that as a public method. I looked into merging the code with GetGroups since they're both similar, but GetGroups allocates lists which is unnecessary for validation. While code duplication is worth worrying about when implementing this I considered it to be less harmful in this special case for two reasons:

  1. Solution encoding code is pretty stable and unlikely to change
  2. The duplicate code parts are still located very close together and not distributed over classes or assemblies.

OK.

comment:17 Changed 2 years ago by abeham

  • Resolution set to done
  • Status changed from readytorelease to closed

r12737: merged 12650,12701 to stable

Note: See TracTickets for help on using tickets.