Opened 10 years ago
Closed 9 years ago
#2386 closed defect (done)
Poor hash function affects performance in the Parameter-less Population Pyramid
Reported by: | bburlacu | Owned by: | mkommend |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.12 |
Component: | Algorithms.ParameterlessPopulationPyramid | Version: | 3.3.11 |
Keywords: | Cc: | bgoldman |
Description
The C# implementation of PPP uses a HashSet<BinaryVector> to keep track of the set of solutions in the pyramid, which depends on the EnumerableBoolEqualityComparer that implements Equals and GetHashCode.
However, the GetHashCode implementation in EnumerableBoolEqualityComparer produces collisions, leading to uneven buckets in the hashset and to more calls to the Equals method which is obviously much slower than a hash check. The collisions unit test shows a collision rate of aprox. 3.5% for the original GetHashCode implementation.
Replacing the code with the following simple implementation reduces the collision rate to zero in the unit test and greatly improves runtime performance (6 seconds versus 16 seconds for the OneMax problem).
public int GetHashCode(IEnumerable<bool> obj) { unchecked { int hash = 17; foreach (var bit in obj) { hash = hash * 29 + (bit ? 1231 : 1237); } return hash; } }
Change History (4)
comment:1 Changed 10 years ago by bburlacu
- Status changed from new to accepted
comment:2 Changed 10 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from accepted to reviewing
comment:3 Changed 10 years ago by mkommend
- Status changed from reviewing to readytorelease
comment:4 Changed 9 years ago by mkommend
- Resolution set to done
- Status changed from readytorelease to closed
r12392: Updated GetHashCode method in the EnumerableBoolEqualityComparer.