Last change on this file since 15861 was 15840, checked in by gkronber, 7 years ago: #2886 added utility console program for clustering of expressions

Goal is to cluster 10 million functions, each with ~100 samples.

Hierarchical (agglomerative) clustering seems useful.

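A rough back-of-the-envelope calculation suggests why exact agglomerative clustering is out of reach at this scale. The sketch assumes each sample and each distance is stored as an 8-byte double; that storage size is an assumption, not something stated in these notes.

```python
# Size estimate for clustering 10 million functions with ~100 samples each.
n_functions = 10_000_000
n_samples = 100
bytes_per_value = 8  # assumption: values stored as 8-byte doubles

# Raw data: one vector of samples per function.
data_bytes = n_functions * n_samples * bytes_per_value

# Exact agglomerative clustering needs all pairwise distances.
n_pairs = n_functions * (n_functions - 1) // 2
matrix_bytes = n_pairs * bytes_per_value

print(f"raw data:           {data_bytes / 1e9:.0f} GB")     # 8 GB
print(f"pairwise distances: {n_pairs:.2e}")                 # 5.00e+13
print(f"distance matrix:    {matrix_bytes / 1e12:.0f} TB")  # 400 TB
```

The raw data is manageable, but the full distance matrix is not, which is why only approximate methods are considered.
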
Approximate hierarchical clustering methods:
- Happieclust: terminates with runtime exceptions
- Twistertries: need to implement a small Java program to test this.


We could implement our own clustering if we had a fast method for finding nearest neighbours.
Approximate nearest neighbours:
- Benchmarks of many current techniques: https://github.com/erikbern/ann-benchmarks
- Annoy (Spotify): https://github.com/spotify/annoy
- FALCONN: https://github.com/FALCONN-LIB/FALCONN
- NMSLIB (fastest in the benchmarks): https://github.com/searchivarius/nmslib
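
As a minimal sketch of how a nearest-neighbour query could drive our own clustering, the following performs a single merge pass: each function is linked to its nearest neighbour if that neighbour is close enough, and linked functions end up in the same cluster. The brute-force search is only a stand-in for one of the approximate nearest-neighbour libraries listed above, and the names nearest_neighbour, cluster_by_nearest_neighbour and max_dist are hypothetical. Repeating such passes on cluster representatives might yield a hierarchy, but that is not shown here.

```python
import math


def nearest_neighbour(vectors, i):
    """Brute-force stand-in for an approximate nearest-neighbour query."""
    best_j, best_d = -1, math.inf
    for j, v in enumerate(vectors):
        if j == i:
            continue
        d = math.dist(vectors[i], v)  # Euclidean distance (Python 3.8+)
        if d < best_d:
            best_j, best_d = j, d
    return best_j, best_d


def cluster_by_nearest_neighbour(vectors, max_dist):
    """Link every vector to its nearest neighbour if within max_dist."""
    parent = list(range(len(vectors)))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(len(vectors)):
        j, d = nearest_neighbour(vectors, i)
        if j >= 0 and d <= max_dist:
            parent[find(i)] = find(j)  # merge the two clusters

    # The cluster label of each vector is the root of its tree.
    return [find(i) for i in range(len(vectors))]


# Toy data: each row holds the sampled outputs of one function.
samples = [
    [0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0],
    [5.0, 5.0, 5.0],
    [5.0, 5.1, 5.0],
    [9.0, 0.0, 9.0],
]
labels = cluster_by_nearest_neighbour(samples, max_dist=0.5)
print(labels)
```

With the toy data, the first two rows share a label, the next two share another, and the last row stays on its own. The brute-force query costs O(n) per function, so at 10 million functions it would have to be replaced by an approximate index.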