Opened 12 years ago
Closed 12 years ago
#1913 closed feature request (done)
Implement Neighborhood Components Analysis (NCA)
Reported by: | abeham | Owned by: | abeham |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.8 |
Component: | Algorithms | Version: | 3.3.8 |
Keywords: | Cc: |
Description
It is a potential improvement on k nearest neighbor proposed by Goldberger et al. in 2004.
Change History (31)
comment:2 Changed 12 years ago by abeham
- Status changed from new to accepted
comment:3 Changed 12 years ago by abeham
comment:4 Changed 12 years ago by abeham
- Worked on NCA
- Added scatter plot view for the model to show training data when it is reduced to two dimensions
It works, but I don't think it works correctly yet. I have randomized the initial matrix, because the starting point influences the achievable quality quite a bit.
comment:5 Changed 12 years ago by abeham
- Reduced memory footprint
- Fixed some bugs
- Added some tests
comment:6 Changed 12 years ago by abeham
r8424: corrected tests and fixed bugs
comment:7 Changed 12 years ago by abeham
r8425: Added several initialization methods (LDA, PCA, and Random)
comment:8 Changed 12 years ago by abeham
- Updated view
- Cloned ProblemData for solution
comment:9 Changed 12 years ago by abeham
- Improved speed of NCA
- Reorganized things
comment:10 Changed 12 years ago by abeham
r8441: added quality output
comment:11 Changed 12 years ago by abeham
- Refactored NCAModel and NeighborhoodComponentsAnalysis algorithm
- Model now includes NearestNeighborModel
- Algorithm has ability to be canceled (basically recreated the optimization loop of mincgoptimize)
- Scaling should work properly now
comment:12 Changed 12 years ago by abeham
r8465: Changed k-NN to move model representation (kdTree) into the model object
comment:13 Changed 12 years ago by abeham
- Refactored classes
- Added parameter for number of iterations
- Added parameter for neighborhood sampling (allows to speed up gradient calculation)
- Adapted to changed k-NN algorithm
comment:14 Changed 12 years ago by abeham
r8467: fixed name of model
comment:15 Changed 12 years ago by abeham
r8470: increased point size and converted K to a fixed value parameter
comment:16 Changed 12 years ago by abeham
r8471: integrated branch into trunk
comment:17 Changed 12 years ago by abeham
- Owner changed from abeham to gkronber
- Status changed from accepted to reviewing
r8472: removed branch
comment:18 follow-up: ↓ 21 Changed 12 years ago by abeham
Just a note, I still want to use the classification penalties in the objective function. I don't think this would be hard to do, but it would probably change the gradient, so I'm waiting on your input on the current implementation first which is true to the paper.
Also I wanted to allow cancellation of the algorithm and output the current result (would only support "Stop" if this is possible). But I didn't find a way to get the CancellationToken for the algorithm. Is this unavailable for FixedDataAnalysisAlgorithms?
comment:19 Changed 12 years ago by abeham
[REVERTED] r8523: added classification penalty into fitness function and gradient
comment:20 Changed 12 years ago by abeham
r8681: Added regularization term introduced by Yang and Laaksonen 2007.
I reverted r8523 and introduced a regularization term which subtracts the Frobenius norm of the transformation matrix from the fitness function (adds it in our case as we're minimizing). The parameter lambda (regularization) controls the strength of this penalization and may be set by the user. If it is set to 0 then the algorithm exhibits the original behavior described by Goldberger et al.
comment:21 in reply to: ↑ 18 Changed 12 years ago by gkronber
Replying to abeham:
Also I wanted to allow cancellation of the algorithm and output the current result (would only support "Stop" if this is possible). But I didn't find a way to get the CancellationToken for the algorithm. Is this unavailable for FixedDataAnalysisAlgorithms?
The best way to support cancellation would be by refactoring the algorithm into smaller operators and combining it with a gradient descent algorithm.
comment:22 Changed 12 years ago by gkronber
Review comments:
- The changes in NearestNeighbourModel are OK
- should we move the class Matrix into one of the base plugins (e.g., HeuristicLab.Common)?
- In NcaModel line 92: shouldn't we use a matrix-multiplication here?
- in NcaAlgorithm line 221: why do we not set a value when normalization <= 0.
- Why does the algorithm use sparse matrix? Are the majority of the values of probabilities = 0?
- I reviewed the Gradient() method (r8681) but have to admit that I'm not familiar enough with NCA to be sure that it is implemented correctly.
Overall, the implementation looks good. However, the algorithm is hard to understand even when reading the referenced paper. Therefore, it would be great if the implementation is tested using unit test cases where the results of our implementation (esp. of the Gradient() method) are compared with the results of a reference implementation (e.g., from Toolkit for Distance Metric Learning or from Dimensionality reduction toolbox)?
comment:23 Changed 12 years ago by gkronber
- Owner changed from gkronber to abeham
comment:24 follow-up: ↓ 27 Changed 12 years ago by gkronber
Further comments from reviewing the NcaInitializers:
- In LdaInitializer line 47: this is effectively equivalent to PrepareInputMatrix()... use the existing method instead (also in the other PcaInitializer).
* RandomInitializer line 76: matrix[i] = random.NextDouble() / range[i / dimensions]; isn't i / dimensions always zero since both are ints?
comment:25 Changed 12 years ago by gkronber
- Status changed from reviewing to assigned
comment:26 Changed 12 years ago by abeham
r9270: Changed NCA to use LM-BFGS optimization algorithm, added model/solution creators, added operator for gradient calculation
comment:27 in reply to: ↑ 24 Changed 12 years ago by abeham
- Owner changed from abeham to gkronber
- Status changed from assigned to reviewing
r9272: removed scaling of data
Replying to gkronber:
Further comments from reviewing the NcaInitializers:
- In LdaInitializer line 47: this is effectively equivalent to PrepareInputMatrix()... use the existing method instead (also in the other PcaInitializer).
I simplified the initializers.
Replying to gkronber:
Review comments:
- The changes in NearestNeighbourModel are OK
- should we move the class Matrix into one of the base plugins (e.g., HeuristicLab.Common)?
Maybe, but it should be checked more thoroughly. I would definitely favor something that allows to do vector/matrix operations and interfaces with double[,], double[], and maybe also DoubleMatrix/DoubleArray. But I don't know if this implementation is the way to go.
- In NcaModel line 92: shouldn't we use a matrix-multiplication here?
It is a matrix multiplication already ;-) The Matrix class unfortunately isn't flexible enough. The reduced dataset contains the projected dimensions plus the target (in the last column).
- in NcaAlgorithm line 221: why do we not set a value when normalization <= 0.
The projected point is too far away from any other point, which means that we don't effectively have any neighbor for classification. This is because influence decreases exponentially with distance.
- Why does the algorithm use sparse matrix? Are the majority of the values of probabilities = 0?
Because the matrix would be instances times instances which is potentially huge, and yes, most of the values are 0.
- I reviewed the Gradient() method (r8681) but have to admit that I'm not familiar enough with NCA to be sure that it is implemented correctly.
Overall, the implementation looks good. However, the algorithm is hard to understand even when reading the referenced paper. Therefore, it would be great if the implementation is tested using unit test cases where the results of our implementation (esp. of the Gradient() method) are compared with the results of a reference implementation (e.g., from Toolkit for Distance Metric Learning or from Dimensionality reduction toolbox)?
I guess I need to install MatLab after all...
comment:28 Changed 12 years ago by gkronber
- Status changed from reviewing to readytorelease
comment:29 Changed 12 years ago by abeham
- Owner changed from gkronber to abeham
- Status changed from readytorelease to assigned
There is a problem remaining
comment:30 Changed 12 years ago by abeham
- Status changed from assigned to reviewing
r9451: fixed handling of non-sequential natural numbered class values in LDA initializer
comment:31 Changed 12 years ago by abeham
- Status changed from reviewing to readytorelease
comment:32 Changed 12 years ago by swagner
- Resolution set to done
- Status changed from readytorelease to closed
- Version changed from 3.3.7 to 3.3.8
r8412: imported branch