Last change on this file since 15861 was 15840, checked in by gkronber, 7 years ago
#2886 added utility console program for clustering of expressions
File size: 679 bytes

Goal: cluster 10 million functions with ~100 samples each.

Hierarchical (agglomerative) clustering seems useful.

Approximate hierarchical clustering methods:
- HappieClust: terminates with runtime exceptions.
- Twister Tries: need to implement a small Java program to test this.

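A rough estimate of why exact agglomerative clustering is hard at this scale, and why approximate methods are of interest. This is only a back-of-the-envelope sketch: the 8-byte sample values and 4-byte distances are assumptions, and `clustering_cost_estimate` is a hypothetical helper, not part of any tool mentioned here.

```python
def clustering_cost_estimate(n_functions, n_samples, value_bytes=8, distance_bytes=4):
    """Return (raw data bytes, number of pairwise distances, distance matrix bytes)."""
    # One vector of n_samples values per function.
    data_bytes = n_functions * n_samples * value_bytes
    # Exact agglomerative clustering starts from all pairwise distances.
    n_pairs = n_functions * (n_functions - 1) // 2
    matrix_bytes = n_pairs * distance_bytes
    return data_bytes, n_pairs, matrix_bytes


data_bytes, n_pairs, matrix_bytes = clustering_cost_estimate(10_000_000, 100)
print(f"raw data: {data_bytes / 1e9:.0f} GB")            # raw data: 8 GB
print(f"pairwise distances: {n_pairs:.2e}")              # pairwise distances: 5.00e+13
print(f"distance matrix: {matrix_bytes / 1e12:.0f} TB")  # distance matrix: 200 TB
```

Under these assumptions the raw data fits in memory on a large machine, but the full distance matrix does not, so any workable method has to avoid computing all pairs.
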
We could implement our own clustering if we have a fast method for finding nearest neighbours.
Approximate nearest neighbours:
- Benchmarks with many current techniques: https://github.com/erikbern/ann-benchmarks
- Annoy (Spotify): https://github.com/spotify/annoy
- FALCONN: https://github.com/FALCONN-LIB/FALCONN
- NMSLIB (fastest in the benchmarks): https://github.com/searchivarius/nmslib
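
One way such a home-grown clustering could look, as a minimal sketch only: build a k-nearest-neighbour graph, then merge clusters along its edges in order of increasing distance (single linkage restricted to the graph). All names here are hypothetical. `brute_force_knn` is an exact stand-in for the neighbour query; for the real problem size it would be replaced by an approximate index such as one of the libraries listed above.

```python
import math


def brute_force_knn(data, i, k):
    """Exact k nearest neighbours of data[i] as (distance, index) pairs.

    Placeholder: an approximate nearest-neighbour index would go here.
    """
    dists = [(math.dist(data[i], data[j]), j) for j in range(len(data)) if j != i]
    return sorted(dists)[:k]


def knn_single_linkage(data, n_clusters, k=3, knn=brute_force_knn):
    """Merge along k-NN graph edges, shortest first, until n_clusters remain."""
    parent = list(range(len(data)))  # union-find forest, one root per cluster

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Collect all k-NN edges and process them in order of increasing distance.
    edges = sorted((d, i, j) for i in range(len(data)) for d, j in knn(data, i, k))
    remaining = len(data)
    for _, i, j in edges:
        if remaining <= n_clusters:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            remaining -= 1
    return [find(i) for i in range(len(data))]


data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels = knn_single_linkage(data, n_clusters=2, k=2)
print(labels)  # [0, 0, 0, 3, 3, 3]
```

Because only k-NN edges are considered, the graph may be disconnected, and the result can contain more than `n_clusters` clusters. With an approximate index the merge order is also only approximately that of exact single linkage.
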