Opened 6 months ago
Last modified 3 days ago
#2699 reviewing feature request
Radial Basis Function Regression
Reported by: | bwerth | Owned by: | bwerth |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.15 |
Component: | Algorithms.DataAnalysis | Version: | branch |
Keywords: | | Cc: | |
Description
Radial Basis Functions are yet another type of regression model often used in surrogate modelling. A main benefit is their extensibility to non-continuous domains.
Radial basis function regression, along with suitable kernel functions and distance metrics, shall be added to the existing DataAnalysis solution.
Attachments (1)
Change History (18)
comment:1 Changed 6 months ago by bwerth
- Owner set to bwerth
- Status changed from new to assigned
comment:2 Changed 6 months ago by bwerth
- Status changed from assigned to accepted
- Version changed from 3.3.14 to branch
comment:3 Changed 6 months ago by bwerth
comment:4 Changed 6 months ago by bwerth
r14386 moved RadialBasisFunctions from Problems.SurrogateProblem to Algorithms.DataAnalysis
comment:5 Changed 4 months ago by gkronber
r14500: fixed a small typo and added .sln file while reviewing
comment:6 Changed 4 months ago by gkronber
I tested with the 'Chemical-I' problem instance and the Euclidean norm with beta=2 (see attachment). The predictions look good on both the training and the test set, but the estimated variances and the approximated LOO CV error seem to be way off. Could you please check whether both calculations are correct?
comment:7 Changed 4 months ago by gkronber
- Status changed from accepted to assigned
comment:8 Changed 2 weeks ago by gkronber
r14869: merged changesets from trunk to branch
comment:9 Changed 2 weeks ago by gkronber
r14870: merged changesets from trunk to branch
comment:10 Changed 2 weeks ago by gkronber
Review comments:
- LINQ is used a lot in combination with matrix operations. This is often slow because of memory allocations required for enumerators. (DONE)
- There should be an option to scale the input variables (scaling should be active by default) (DONE)
- RBF regression does not support noise. If there are duplicate x vectors, model building fails. One option would be to add a diagonal matrix to the Gram matrix (leading to kernel ridge regression?) (DONE)
- I have not found a source for the calculation of variance and LOO error (DONE, removed LOO calculation)
- Don't know how to best unify covariance functions and kernel functions (there is some duplication) (DONE).
- The calculation of the covariance matrix takes a lot of time (10x longer than the same calculation with an equivalent covariance function). I suspect that the reason is the rather general implementation of the distance calculation. (DONE)
- Beta should be a parameter of the algorithm instead of the kernel to make it easier to run a grid test. (DONE)
- Several of the implemented kernels are only conditionally positive definite. See http://num.math.uni-goettingen.de/schaback/teaching/sc.pdf for a definition of the kernels and valid beta values. Additionally, the basis functions for these kernels need to be extended depending on the value of beta.
Made a number of changes in r14872, mainly refactoring the RBF model.
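The duplicate-input point in the review list above can be illustrated with a minimal Python sketch (this is not HeuristicLab code). It assumes a Gaussian kernel on scalar inputs with beta as a length scale; the names `gaussian_kernel`, `gram_matrix`, `add_ridge` and `determinant` are hypothetical. Two identical x values produce two identical rows in the Gram matrix, so the matrix is singular; adding a small constant to the diagonal makes it positive definite again.

```python
import math

def gaussian_kernel(a, b, beta):
    """Gaussian kernel on scalars; beta acts as a length scale (assumed form)."""
    d = (a - b) / beta
    return math.exp(-d * d)

def gram_matrix(xs, beta):
    """Pairwise kernel evaluations for all training inputs."""
    return [[gaussian_kernel(a, b, beta) for b in xs] for a in xs]

def add_ridge(K, lam):
    """Return K + lam * I without modifying K."""
    return [[v + (lam if i == j else 0.0) for j, v in enumerate(row)]
            for i, row in enumerate(K)]

def determinant(M):
    """Determinant via Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n = len(A)
    det = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if A[p][c] == 0.0:
            return 0.0          # no usable pivot: the matrix is singular
        if p != c:
            A[c], A[p] = A[p], A[c]
            det = -det          # a row swap flips the sign
        det *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return det

xs = [0.0, 1.0, 1.0]            # the last two inputs are duplicates
K = gram_matrix(xs, beta=1.0)
K_ridge = add_ridge(K, lam=1e-3)
```

Here `determinant(K)` is zero because rows 2 and 3 of `K` coincide, while `determinant(K_ridge)` is strictly positive, so the linear system for the weights can be solved. The value `1e-3` is an arbitrary choice for illustration.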
comment:11 Changed 5 days ago by bwerth
r14883 checked and reformulated gradient functions for kernels
comment:12 Changed 5 days ago by gkronber
r14884: renamed folder. RBF regression can be seen as a special case of kernelized ridge regression.
comment:13 Changed 5 days ago by gkronber
r14885: renamed and moved files
comment:14 Changed 5 days ago by gkronber
r14887: worked on kernel ridge regression.
- moved beta parameter to algorithm.
- reintroduced IKernel interface to restrict choice of kernel in kernel ridge regression.
- speed-up by Cholesky decomposition and optimization of the calculation of the covariance matrix.
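As a rough illustration of why a Cholesky decomposition helps here: the regularized Gram matrix is symmetric positive definite, so it can be factored once into a lower-triangular matrix and the weights obtained by two triangular solves. The following pure-Python sketch is not the code from r14887; `cholesky` and `cholesky_solve` are hypothetical helper names, and the small matrix `A` merely stands in for K + lambda * I.

```python
import math

def cholesky(A):
    """Lower-triangular L with A = L * L^T; A must be symmetric positive definite."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def cholesky_solve(L, b):
    """Solve (L * L^T) x = b by forward and then backward substitution."""
    n = len(L)
    z = [0.0] * n
    for i in range(n):                       # forward: L z = b
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):             # backward: L^T x = z
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

# Small symmetric positive definite system standing in for K + lambda * I.
A = [[4.0, 2.0, 0.6],
     [2.0, 3.0, 0.4],
     [0.6, 0.4, 2.0]]
y = [1.0, 2.0, 3.0]
alpha = cholesky_solve(cholesky(A), y)
```

The factorization raises a `ValueError` when a pivot is not positive, which is the situation in which a more general decomposition would be needed.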
comment:15 Changed 4 days ago by gkronber
- Status changed from assigned to reviewing
r14888: re-added calculation of the leave-one-out CV estimate
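The ticket does not say which formula r14888 uses. One well-known closed form for kernel ridge regression, shown here only as a sketch under that assumption, gets all leave-one-out residuals from a single fit: with A = K + lambda * I and alpha = A^-1 y, the residual for point i is alpha_i / (A^-1)_ii. The Python code below (hypothetical names, Gaussian kernel assumed) compares this with explicitly refitting the model n times.

```python
import math

def kernel(a, b, beta=1.0):
    """Gaussian kernel on scalars (assumed form, beta as length scale)."""
    d = (a - b) / beta
    return math.exp(-d * d)

def inverse(M):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    A = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(n):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [row[n:] for row in A]

def fit(xs, ys, lam):
    """Return (alpha, A_inv) for A = K + lam * I and alpha = A^-1 y."""
    A = [[kernel(a, b) + (lam if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    A_inv = inverse(A)
    alpha = [sum(A_inv[i][j] * ys[j] for j in range(len(ys)))
             for i in range(len(ys))]
    return alpha, A_inv

def loo_residuals_closed_form(xs, ys, lam):
    """LOO residuals y_i - f_{-i}(x_i) from one fit: alpha_i / (A^-1)_ii."""
    alpha, A_inv = fit(xs, ys, lam)
    return [alpha[i] / A_inv[i][i] for i in range(len(xs))]

def loo_residuals_refit(xs, ys, lam):
    """The same residuals obtained by refitting n times without point i."""
    res = []
    for i in range(len(xs)):
        xr, yr = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        alpha, _ = fit(xr, yr, lam)
        pred = sum(a * kernel(xs[i], xj) for a, xj in zip(alpha, xr))
        res.append(ys[i] - pred)
    return res

# Made-up toy data, for illustration only.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0.1, 0.6, 0.8, 1.1, 0.9]
fast = loo_residuals_closed_form(xs, ys, lam=0.1)
slow = loo_residuals_refit(xs, ys, lam=0.1)
loo_mse = sum(r * r for r in fast) / len(fast)
```

`fast` and `slow` agree up to rounding error, so the LOO error can be estimated without refitting. A production implementation would typically reuse an existing factorization of A instead of forming the explicit inverse.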
comment:16 Changed 3 days ago by bwerth
r14891 reworked kernel functions (beta is always a scaling factor now); added LU decomposition as a fallback if the Cholesky decomposition fails
comment:17 Changed 3 days ago by gkronber
r14892: made some adjustments after bwerth's changes (mainly formatting)
r14385 created branch for radial basis functions regression