Opened 11 years ago
Closed 11 years ago
#2386 closed defect (done)
Poor hash function affects performance in the Parameter-less Population Pyramid
| Reported by: | bburlacu | Owned by: | mkommend |
|---|---|---|---|
| Priority: | medium | Milestone: | HeuristicLab 3.3.12 |
| Component: | Algorithms.ParameterlessPopulationPyramid | Version: | 3.3.11 |
| Keywords: | Cc: | bgoldman |
Description
The C# implementation of PPP uses a HashSet<BinaryVector> to keep track of the set of solutions in the pyramid, which depends on the EnumerableBoolEqualityComparer that implements Equals and GetHashCode.
However, the GetHashCode implementation in EnumerableBoolEqualityComparer produces collisions, leading to uneven buckets in the hashset and to more calls to the Equals method which is obviously much slower than a hash check. The collisions unit test shows a collision rate of aprox. 3.5% for the original GetHashCode implementation.
Replacing the code with the following simple implementation reduces the collision rate to zero in the unit test and greatly improves runtime performance (6 seconds versus 16 seconds for the OneMax problem).
public int GetHashCode(IEnumerable<bool> obj) {
unchecked {
int hash = 17;
foreach (var bit in obj) {
hash = hash * 29 + (bit ? 1231 : 1237);
}
return hash;
}
}
Change History (4)
comment:1 Changed 11 years ago by bburlacu
- Status changed from new to accepted
comment:2 Changed 11 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from accepted to reviewing
comment:3 Changed 11 years ago by mkommend
- Status changed from reviewing to readytorelease
comment:4 Changed 11 years ago by mkommend
- Resolution set to done
- Status changed from readytorelease to closed



r12392: Updated GetHashCode method in the EnumerableBoolEqualityComparer.