Solution caching to avoid costly re-evaluations
Reported by: abeham | Owned by: (none)
Priority: medium | Milestone: HeuristicLab 4.x Backlog
I implemented the u574 TSP instance as a programmable problem and added a simple caching dictionary and hit counter to the Evaluate method.
Using a GA with the following settings:
- Crossover: OrderCrossover2
- Elites: 1
- MaximumGenerations: 1000
- MutationProbability: 5 %
- Mutator: InversionManipulator
- PopulationSize: 500
- Selector: TournamentSelector (Group Size: 4)
- ReevaluateElites: False
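The caching dictionary and hit counter described above can be sketched as follows. HeuristicLab itself is written in C#; this Python sketch is only illustrative, and the names (`make_cached_evaluator`, `tour_length`) are assumptions, not HeuristicLab API.

```python
# Sketch: wrap a costly evaluation function with a dictionary keyed by the
# solution, and count how many calls are answered from the cache.

def make_cached_evaluator(evaluate):
    cache = {}
    stats = {"hits": 0, "evaluations": 0}

    def cached_evaluate(solution):
        key = tuple(solution)  # permutations must be hashable to serve as keys
        stats["evaluations"] += 1
        if key in cache:
            stats["hits"] += 1
            return cache[key]
        quality = evaluate(solution)
        cache[key] = quality
        return quality

    return cached_evaluate, stats

# Toy stand-in for a costly evaluation: a trivial "tour length" on a permutation
def tour_length(perm):
    return sum(abs(perm[i] - perm[(i + 1) % len(perm)]) for i in range(len(perm)))

evaluate, stats = make_cached_evaluator(tour_length)
evaluate([0, 1, 2, 3])
evaluate([0, 1, 2, 3])  # second call with the same permutation is a cache hit
```

The hit ratio reported below is simply `stats["hits"] / stats["evaluations"]` accumulated over the whole run.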
Out of the 499,500 evaluated solutions, 286,503 (about 57%) were cache hits. This of course depends heavily on the algorithm instance as well as on the problem instance. With an OSGA instance, cache hits were slightly below 4%, but in applications where evaluations are really costly, even 4% is a substantial saving. Still, keeping dictionaries with millions of entries is not really efficient. I think it would be feasible to forget about old solutions and use a fixed-size cache of reasonable capacity while still capturing a large share of the hits, since the most recent solutions are more likely to produce a hit than very old ones.
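The fixed-size "forgetting" cache suggested above could be realized as a least-recently-used (LRU) cache. A minimal sketch, assuming an LRU eviction policy and an illustrative class name (none of this is existing HeuristicLab code):

```python
from collections import OrderedDict

# Bounded cache that evicts the least recently used entry once the
# configured capacity is exceeded.

class LruEvaluationCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as recently used
        return self._entries[key]

    def put(self, key, quality):
        self._entries[key] = quality
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the oldest entry

cache = LruEvaluationCache(capacity=2)
cache.put((0, 1, 2), 10.0)
cache.put((1, 0, 2), 12.0)
cache.put((2, 1, 0), 11.0)  # capacity exceeded: (0, 1, 2) is evicted
```

In .NET a comparable structure could be built from a `Dictionary` plus a `LinkedList` tracking recency; the capacity would become a user-visible parameter of the algorithm or problem.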
Caches could be implemented on the level of the algorithm or on the level of the problem. However, a few remarks:
- The cache would need to be reset between runs
- We will need EqualityComparers
This is probably mostly an issue in discrete combinatorial optimization (binary vectors, integer vectors, permutations, linear linkage arrays) and simulation-based optimization using these encodings.
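Since all the encodings listed above (binary vectors, integer vectors, permutations, linear linkage arrays) are arrays of integers, a single structural comparer could cover them. A sketch in the spirit of .NET's `IEqualityComparer<T>`; the class name and the hashing scheme are assumptions for illustration:

```python
# Structural equality and hashing for integer-array encodings.

class IntVectorComparer:
    def equals(self, a, b):
        # element-wise comparison; different lengths are never equal
        return len(a) == len(b) and all(x == y for x, y in zip(a, b))

    def get_hash_code(self, a):
        # simple polynomial rolling hash over the elements, kept to 32 bits
        h = 17
        for x in a:
            h = (h * 31 + x) & 0xFFFFFFFF
        return h

cmp = IntVectorComparer()
cmp.equals([3, 1, 2], [3, 1, 2])  # structurally equal permutations
```

Equal vectors must produce equal hash codes so that cache lookups find them in the same bucket; beyond that, the concrete hash function is an implementation detail.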