Opened 10 years ago
Closed 10 years ago
#2234 closed feature request (done)
Implement grid search for LibSVM parameters
Reported by: | bburlacu | Owned by: | jkarder |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.11 |
Component: | Algorithms.DataAnalysis | Version: | 3.3.10 |
Keywords: | svm, crossvalidation, grid search | Cc: | |
Description
- The grid search should find the optimal set of parameters by minimizing the k-fold cross-validation error.
- A user-specified number of folds should be generated from the training data without performing unnecessary cloning.
- The method should accept any combination of SVM parameters together with their respective ranges.
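As a rough illustration of the first two requirements, here is a minimal sketch of a k-fold cross-validation error. It is not the HeuristicLab implementation: `CrossValidationSketch`, `CrossValidationMse`, and the `train` delegate are hypothetical names, and samples are simplified to (input, target) pairs. The training set for each fold is a lazy view over the other folds, so no rows are copied.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CrossValidationSketch {
  // Hypothetical helper: pools the squared test errors over all folds.
  // 'train' receives the training samples and returns a predictor.
  public static double CrossValidationMse(
      IList<List<Tuple<double, double>>> folds,
      Func<IEnumerable<Tuple<double, double>>, Func<double, double>> train) {
    double squaredErrorSum = 0.0;
    int count = 0;
    for (int i = 0; i < folds.Count; i++) {
      int testIndex = i; // copy for the closure below
      // Lazy view over all folds except the test fold; nothing is cloned.
      var training = folds
        .Where((fold, index) => index != testIndex)
        .SelectMany(fold => fold);
      var model = train(training);
      foreach (var sample in folds[testIndex]) {
        double error = model(sample.Item1) - sample.Item2;
        squaredErrorSum += error * error;
        count++;
      }
    }
    return squaredErrorSum / count;
  }
}
```

A grid search would call such a function once per parameter combination and keep the combination with the smallest value.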
Attachments (2)
Change History (32)
comment:1 Changed 10 years ago by bburlacu
- Status changed from new to accepted
comment:2 Changed 10 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from accepted to reviewing
comment:3 Changed 10 years ago by bburlacu
r11309: Added CartesianProduct extension method to HeuristicLab.Common/3.3/EnumerableExtensions.cs, used to generate all possible combinations of parameters for the grid search.
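For readers unfamiliar with the idea, a Cartesian product over parameter ranges can be sketched as follows. This is an assumption about the general technique, not the code committed in r11309; the class name `EnumerableSketch` and the exact signature are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableSketch {
  // Hypothetical stand-in for the CartesianProduct extension method;
  // the actual HeuristicLab signature may differ.
  public static IEnumerable<IEnumerable<T>> CartesianProduct<T>(
      this IEnumerable<IEnumerable<T>> sequences) {
    // Start with a single empty combination and extend it by one
    // value from each sequence in turn.
    IEnumerable<IEnumerable<T>> seed = new[] { Enumerable.Empty<T>() };
    return sequences.Aggregate(seed, (acc, seq) =>
      acc.SelectMany(prefix => seq,
                     (prefix, item) => prefix.Concat(new[] { item })));
  }
}
```

With two ranges of two and three values (for example, candidate values for cost and gamma), the method yields six combinations, each listing one value per parameter in the order the ranges were supplied.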
comment:4 Changed 10 years ago by mkommend
- Owner changed from mkommend to bburlacu
- Status changed from reviewing to assigned
comment:5 Changed 10 years ago by bburlacu
r11326: Refactored CrossValidate and GridSearch methods.
comment:6 Changed 10 years ago by bburlacu
r11337: Refactored SVM grid search, added support for symbolic classification.
comment:7 Changed 10 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from assigned to reviewing
comment:8 Changed 10 years ago by mkommend
r11340: Fixed bug in svm classification regarding the parameter extraction.
comment:9 Changed 10 years ago by mkommend
r11339: Minor code changes in SVMUtil to perform cross validation (code reorganization, naming).
comment:10 Changed 10 years ago by mkommend
- Owner changed from mkommend to bburlacu
- Status changed from reviewing to assigned
Review comments:
- use range transform to scale the problem before learning the SVM model
- remove RF parameters from the scripts
- include demo problem data in the script, so that it can be started directly
- the scripts should also work with problems and not only with problem data objects
- provide some kind of progress indication (e.g. printed dots, number of calculated combinations, ...)
- elapsed time has no unit displayed
- provide sensible default parameter ranges in the script (e.g., AFAIK nu_svc does not use the c parameter)
- attach the scripts to the start page
comment:11 Changed 10 years ago by mkommend
r11342: Corrected locking in SVM Util class.
comment:12 Changed 10 years ago by bburlacu
r11361: Added the option to shuffle the crossvalidation folds (this option is on by default since libsvm does it too). Implemented stratified fold generation for classification data (ensures similar label distribution in each fold). Fixed bug causing incorrect behavior in GenerateFolds<T> when the number of values is less than the requested number of folds.
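The fold generation described in r11361 could look roughly like the sketch below. It is only an illustration under stated assumptions: `FoldSketch`, `GenerateStratifiedFolds`, and the parameter lists are hypothetical, shuffling is enabled by passing a `Random` instance, and the stratified variant works on row indices and deals each class round-robin across the folds.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FoldSketch {
  // Splits the values into folds of near-equal size. If there are fewer
  // values than requested folds, only as many folds as values are returned.
  public static List<List<T>> GenerateFolds<T>(IList<T> values, int numberOfFolds, Random random = null) {
    if (numberOfFolds < 1) throw new ArgumentException("numberOfFolds must be positive.");
    var items = values.ToList();
    if (random != null) Shuffle(items, random);
    int k = Math.Min(numberOfFolds, items.Count);
    var folds = new List<List<T>>();
    int start = 0;
    for (int i = 0; i < k; i++) {
      // The first (Count % k) folds receive one extra element.
      int size = items.Count / k + (i < items.Count % k ? 1 : 0);
      folds.Add(items.GetRange(start, size));
      start += size;
    }
    return folds;
  }

  // Returns folds of row indices with a similar label distribution in each fold.
  public static List<List<int>> GenerateStratifiedFolds(IList<double> labels, int numberOfFolds, Random random = null) {
    if (numberOfFolds < 1) throw new ArgumentException("numberOfFolds must be positive.");
    var folds = Enumerable.Range(0, numberOfFolds).Select(_ => new List<int>()).ToList();
    int next = 0;
    foreach (var group in Enumerable.Range(0, labels.Count).GroupBy(i => labels[i])) {
      var indices = group.ToList();
      if (random != null) Shuffle(indices, random);
      // Deal the rows of this class round-robin, continuing where the previous class stopped.
      foreach (int index in indices) folds[next++ % numberOfFolds].Add(index);
    }
    return folds.Where(f => f.Count > 0).ToList();
  }

  // Fisher-Yates shuffle, in place.
  private static void Shuffle<T>(IList<T> list, Random random) {
    for (int i = list.Count - 1; i > 0; i--) {
      int j = random.Next(i + 1);
      T tmp = list[i]; list[i] = list[j]; list[j] = tmp;
    }
  }
}
```

Capping the fold count at the number of values is one way to handle the edge case mentioned in the commit message; the actual fix may behave differently.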
comment:13 Changed 10 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from assigned to reviewing
comment:14 Changed 10 years ago by gkronber
- Owner changed from mkommend to gkronber
comment:15 Changed 10 years ago by bburlacu
r11427: Fixed thread synchronisation bug.
comment:16 Changed 10 years ago by bburlacu
r11447: Added svm grid search scripts to the HeuristicLab start page.
comment:17 Changed 10 years ago by gkronber
- Owner changed from gkronber to bburlacu
- Status changed from reviewing to assigned
The data for the input variables should also be scaled in the SvmUtil.CrossValidate method (using RangeTransform).
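To illustrate what range scaling means here, the following sketch fits per-column minima and maxima on the training rows and maps every row linearly into a target interval. It does not use the RangeTransform API; `RangeScalerSketch` is a hypothetical class, and the default interval of [-1, 1] is an assumption.

```csharp
using System;

public sealed class RangeScalerSketch {
  private readonly double[] min, max;
  private readonly double lower, upper;

  // Learns the per-column value range from the training rows only.
  public RangeScalerSketch(double[][] trainingRows, double lower = -1.0, double upper = 1.0) {
    if (trainingRows.Length == 0) throw new ArgumentException("At least one training row is required.");
    this.lower = lower;
    this.upper = upper;
    int columns = trainingRows[0].Length;
    min = new double[columns];
    max = new double[columns];
    for (int c = 0; c < columns; c++) {
      min[c] = double.PositiveInfinity;
      max[c] = double.NegativeInfinity;
      foreach (var row in trainingRows) {
        min[c] = Math.Min(min[c], row[c]);
        max[c] = Math.Max(max[c], row[c]);
      }
    }
  }

  // Maps a row into [lower, upper] using the training range.
  // Values outside the training range fall outside the interval.
  public double[] Scale(double[] row) {
    var scaled = new double[row.Length];
    for (int c = 0; c < row.Length; c++) {
      double range = max[c] - min[c];
      // Constant columns carry no information; map them to the lower bound.
      scaled[c] = range == 0.0 ? lower : lower + (upper - lower) * (row[c] - min[c]) / range;
    }
    return scaled;
  }
}
```

In a cross-validation setting, the scaler for each fold would typically be fitted on that fold's training partition and then applied to both the training and the test partition.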
comment:18 Changed 10 years ago by gkronber
Using a static lock object for the following is not necessary. A lock object with method scope and lifetime is sufficient.
    Parallel.ForEach(crossProduct, new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism }, parameterCombination => {
      var parameters = DefaultParameters();
      var parameterValues = parameterCombination.ToList();
      for (int i = 0; i < parameterValues.Count; ++i)
        setters[i](parameters, parameterValues[i]);
      double testMse = CalculateCrossValidationPartitions(partitions, parameters);
      lock (locker) {
        if (testMse < mse.Value) {
          mse.Value = testMse;
          bestParam = (svm_parameter)parameters.Clone();
        }
      }
    });
    return bestParam;
comment:19 Changed 10 years ago by bburlacu
r11464: Moved lock object inside the GridSearch method. Added scaling for the svm partitions.
comment:20 Changed 10 years ago by bburlacu
r11542: Fixed bug in CrossValidate method where the OnlineCalculatorError was ignored. Updated GridSearch method to return the cross-validation MSE as an out parameter and to skip NaN values.
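A simplified sketch of how these changes might fit together is given below: the lock object is local to the method, NaN results are skipped, and the best error is returned through an out parameter. The names `GridSearchSketch` and `crossValidationMse` are hypothetical, and parameter combinations are reduced to plain `double[]` arrays instead of `svm_parameter` objects.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class GridSearchSketch {
  // Evaluates every combination in parallel and returns the one with the
  // lowest error, or null if every evaluation produced NaN.
  public static double[] GridSearch(
      IEnumerable<double[]> combinations,
      Func<double[], double> crossValidationMse,
      out double bestMse,
      int maxDegreeOfParallelism = -1) {
    var locker = new object(); // method scope; no static lock needed
    double best = double.MaxValue;
    double[] bestCombination = null;
    var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };
    Parallel.ForEach(combinations, options, combination => {
      double mse = crossValidationMse(combination);
      if (double.IsNaN(mse)) return; // skip failed evaluations
      lock (locker) {
        if (mse < best) {
          best = mse;
          bestCombination = (double[])combination.Clone();
        }
      }
    });
    // Out parameters cannot be captured by the lambda, so a local is copied here.
    bestMse = best;
    return bestCombination;
  }
}
```

Because the comparison and the update happen inside the same lock, concurrent workers cannot overwrite a better result with a worse one.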
comment:21 Changed 10 years ago by bburlacu
r11544: Updated sample scripts.
comment:22 Changed 10 years ago by bburlacu
r11545: Updated test resources.
comment:23 Changed 10 years ago by bburlacu
r11547: Updated svm grid search unit tests.
comment:24 Changed 10 years ago by bburlacu
r11548: Updated svm grid search scripts with the versions created by the unit tests.
comment:25 Changed 10 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from assigned to reviewing
comment:26 Changed 10 years ago by mkommend
- Status changed from reviewing to readytorelease
comment:27 Changed 10 years ago by mkommend
- Owner changed from mkommend to jkarder
- Status changed from readytorelease to assigned
comment:28 Changed 10 years ago by mkommend
- Status changed from assigned to reviewing
comment:29 Changed 10 years ago by mkommend
- Status changed from reviewing to readytorelease
r11308: SupportVectorMachineUtil.cs