Opened 11 years ago
Last modified 6 years ago
#2057 assigned feature request
Parameter-tuning for support vector machine
| Reported by: | gkronber | Owned by: | gkronber |
|---|---|---|---|
| Priority: | medium | Milestone: | HeuristicLab 4.x Backlog |
| Component: | Algorithms.DataAnalysis | Version: | 3.3.8 |
| Keywords: | | Cc: | |
Description
Tune SVM parameters with CMA-ES, or maybe BFGS using numerically estimated gradients.
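As a sketch of the idea (in Python rather than HeuristicLab's C#): search the SVM hyper-parameters in log space with a quasi-Newton method that estimates gradients by finite differences. Here SciPy's L-BFGS-B stands in for the alglib BFGS, and a hypothetical smooth surrogate replaces the cross-validation MSE.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical stand-in for the cross-validation MSE of an SVM trained
# with the given hyper-parameters; assume the optimum is at
# C = 1, gamma = 0.1, epsilon = 0.01.
def cv_mse(log_params):
    return ((log_params[0] - np.log(1.0)) ** 2
            + (log_params[1] - np.log(0.1)) ** 2
            + (log_params[2] - np.log(0.01)) ** 2)

# start from neutral defaults (C = gamma = epsilon = 1) in log space
x0 = np.log([1.0, 1.0, 1.0])

# jac=None => L-BFGS-B estimates the gradient by finite differences
res = minimize(cv_mse, x0, method="L-BFGS-B")
best = np.exp(res.x)  # back to (C, gamma, epsilon)
```

The log transform keeps all three parameters positive without explicit bound constraints, mirroring the `Math.Exp(state.x[i])` calls in the attached script.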
Attachments (3)
Change History (10)
Changed 10 years ago by bburlacu
comment:1 Changed 10 years ago by bburlacu
The BFGS does not seem to work.
Source:
```csharp
private double[] GradientSearch(int n, int iterations, int nf, IRegressionProblemData problemData) {
  var svmProblem = SupportVectorMachineUtil.CreateSvmProblem(problemData.Dataset,
    problemData.TargetVariable, problemData.AllowedInputVariables, problemData.TrainingIndices);
  RangeTransform rangeTransform = RangeTransform.Compute(svmProblem);
  svm_problem scaledProblem = rangeTransform.Scale(svmProblem);

  // function-value-only L-BFGS: gradients are estimated with step 1E-1
  var initialPoint = new double[n];
  var state = new alglib.minlbfgs.minlbfgsstate();
  alglib.minlbfgs.minlbfgscreatef(n, Math.Min(n, 10), initialPoint, 1E-1, state);
  alglib.minlbfgs.minlbfgssetcond(state, 0.0, 0, 0, iterations);
  alglib.minlbfgs.minlbfgssetxrep(state, true);
  alglib.minlbfgs.minlbfgssetgradientcheck(state, 0);

  var svmParameter = SupportVectorMachineUtil.DefaultParameters();
  svmParameter.kernel_type = svm_parameter.RBF;
  svmParameter.svm_type = svm_parameter.EPSILON_SVR;
  var pd = (IRegressionProblemData)problemData.Clone();

  while (alglib.minlbfgs.minlbfgsiteration(state)) {
    // parameters are searched in log space
    svmParameter.C = Math.Exp(state.x[0]);     vars["C"] = svmParameter.C;
    svmParameter.gamma = Math.Exp(state.x[1]); vars["g"] = svmParameter.gamma;
    svmParameter.eps = Math.Exp(state.x[2]);   vars["p"] = svmParameter.eps;

    //var svmModel = svm.svm_train(scaledProblem, svmParameter);
    //nSv = svmModel.SV.Length;
    //var model = new SupportVectorMachineModel(svmModel, rangeTransform,
    //  problemData.TargetVariable, problemData.AllowedInputVariables);
    //var solution = new SupportVectorRegressionSolution(model, pd);
    //vars["Training R2"] = solution.TrainingRSquared;
    //vars["Test R2"] = solution.TestRSquared;

    // minimize the cross-validation MSE
    state.f = SupportVectorMachineUtil.CrossValidate(problemData, svmParameter, nf, false);
    vars["F"] = state.f;
  }

  var x = new double[n];
  var rep = new alglib.minlbfgs.minlbfgsreport();
  alglib.minlbfgs.minlbfgsresults(state, ref x, rep);
  vars["Result"] = rep.terminationtype;
  return x.Select(Math.Exp).ToArray();
}
```
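One possible reason the run fails (an assumption, not verified against the script): the cross-validation MSE is a noisy objective, and a finite-difference gradient divides that noise by the step size, so the line search sees mostly noise. A minimal Python illustration with a smooth quadratic plus small evaluation noise:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy(x, sigma=0.01):
    # smooth quadratic plus small evaluation noise
    # (analogous to fold-to-fold variance in a CV estimate)
    return x ** 2 + sigma * rng.standard_normal()

x, h = 1.0, 1e-6
fd_grad = (noisy(x + h) - noisy(x)) / h  # forward-difference estimate
true_grad = 2 * x
# the noise contributes on the order of sigma / h = 1e4,
# swamping the true gradient of 2
```

A larger difference step (as the script's 1E-1 attempts) trades this noise amplification for truncation error, but neither helps if the CV surface itself is non-smooth in the parameters.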
comment:2 Changed 10 years ago by gkronber
We should check whether CMA-ES is a viable option. However, we'd first need to implement an API for our CMA-ES implementation that can be used easily from scripts.
comment:3 Changed 10 years ago by gkronber
I implemented a prototype for an SVM parameter tuning problem. See the attached patch.
Changed 10 years ago by gkronber
comment:4 Changed 10 years ago by gkronber
- Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.11
- Owner changed from gkronber to mkommend
- Status changed from new to assigned
comment:5 Changed 10 years ago by bburlacu
- Owner changed from mkommend to gkronber
This was inspired by this post: http://stats.stackexchange.com/questions/9253/nu-svm-parameter-selection
The Nelder-Mead search method, also known as the amoeba method (1), is a well-defined numerical method for problems where derivatives may not be available. It can work better than the alglib BFGS in some cases; for example, it can solve Poly-10 or some of the Keijzer problems given a reasonable starting point (random initialization of the simplex).
The attached script (2) has some pitfalls:
- It fails to map parameters correctly: SVM parameters like *nu* should be mapped from (0,1) to (-inf,+inf) using the logit function (the simplex search should use the logit/logistic pair for the direct/inverse mapping).
- Random simplex initialization is unreliable. Instead, one point should be a reasonable initial guess (maybe the SVM defaults) and the remaining points should be random deviations from that guess.
- It calls the CrossValidate method, which is slower than the internal methods because it regenerates the folds (and range-transforms them) on every call.
- It probably works best with smooth, unimodal objective functions.
(1) http://en.wikipedia.org/wiki/Nelder-Mead_simplex_method
(2) http://dev.heuristiclab.com/trac.fcgi/attachment/ticket/2057/simplex-svm-search-regression.hl
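The logit mapping from the first bullet combines naturally with a simplex search: the optimizer works on the unbounded logit scale and the objective maps back into (0,1) before training. A hedged Python sketch using SciPy's Nelder-Mead, with a hypothetical smooth stand-in for the cross-validated error (assume the best *nu* is 0.3):

```python
import numpy as np
from scipy.optimize import minimize

def logit(p):
    # direct mapping (0,1) -> (-inf,+inf), used to set up the search point
    return np.log(p / (1 - p))

def logistic(z):
    # inverse mapping back into (0,1), used inside the objective
    return 1.0 / (1.0 + np.exp(-z))

def cv_error(z):
    # hypothetical stand-in for the cross-validated error as a function
    # of nu; assume its minimum lies at nu = 0.3
    nu = logistic(z[0])
    return (nu - 0.3) ** 2

z0 = np.array([logit(0.5)])  # start from the libsvm default nu = 0.5
res = minimize(cv_error, z0, method="Nelder-Mead",
               options={"xatol": 1e-8, "fatol": 1e-12})
best_nu = logistic(res.x[0])
```

Because the simplex only ever sees logit-scale values, every candidate it proposes maps to a valid *nu* in (0,1), which addresses the first pitfall above without explicit bound handling.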
comment:6 Changed 9 years ago by gkronber
- Milestone changed from HeuristicLab 3.3.11 to HeuristicLab 3.3.x Backlog
comment:7 Changed 6 years ago by gkronber
- Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 4.x Backlog
BFGS SVM parameter search