Opened 10 years ago
Closed 10 years ago
#2237 closed feature request (done)
Implement grid search for random forest parameters
Reported by: | bburlacu | Owned by: | mkommend |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.11 |
Component: | Algorithms.DataAnalysis | Version: | 3.3.10 |
Keywords: | random forest, grid search | Cc: |
Description
This ticket mirrors #2234, but for random forests. Grid search should support searching over any combination of the parameters n, m and r, where:
- n is the number of trees in the forest
- m is the ratio of features that will be used in the construction of individual trees (0<m<=1)
- r is the ratio of the training set that will be used in the construction of individual trees (0<r<=1)
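To illustrate the search space described above, here is a minimal sketch in Python (not HeuristicLab code). The function name `parameter_grid` and the example value lists are assumptions made for illustration; only the ranges for n, m and r come from this ticket.

```python
from itertools import product


def parameter_grid(n_values, m_values, r_values):
    """Yield every (n, m, r) combination after checking the ranges from the ticket.

    Hypothetical helper for illustration only; not part of HeuristicLab.
    """
    for n in n_values:
        if n < 1:
            raise ValueError(f"n must be a positive number of trees, got {n}")
    for name, values in (("m", m_values), ("r", r_values)):
        for value in values:
            if not 0 < value <= 1:
                raise ValueError(f"{name} must satisfy 0 < {name} <= 1, got {value}")
    # Cartesian product: every n is paired with every m and every r.
    yield from product(n_values, m_values, r_values)


grid = list(parameter_grid([50, 100], [0.5, 1.0], [0.3, 0.6, 0.9]))
print(len(grid))  # 2 * 2 * 3 = 12 combinations
```

The number of combinations is the product of the three list lengths, so the cost of an exhaustive search grows quickly as more candidate values are added.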
Attachments (2)
Change History (20)
comment:1 Changed 10 years ago by bburlacu
- Status changed from new to accepted
comment:2 Changed 10 years ago by bburlacu
comment:3 Changed 10 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from accepted to reviewing
comment:4 Changed 10 years ago by bburlacu
r11317: Forgot to commit changes to project file.
comment:5 Changed 10 years ago by mkommend
- Owner changed from mkommend to bburlacu
- Status changed from reviewing to assigned
comment:6 Changed 10 years ago by bburlacu
r11338: Refactored random forest grid search and added support for symbolic classification.
comment:7 Changed 10 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from assigned to reviewing
comment:8 Changed 10 years ago by mkommend
r11343: Corrected newly introduced bug in RandomForestModel and reorganized RandomForestUtil.
comment:9 Changed 10 years ago by mkommend
- Owner changed from mkommend to bburlacu
- Status changed from reviewing to assigned
Review comments:
- every call to createRFModel allocates a double[,] that holds the data => try to reuse the generated data objects
- adapt the method names and signatures to the ones in the SVMUtil class
- remove SVM related methods and objects from the scripts
- include demo problem data in the script, so that it can be started directly
- the scripts should also work with problems and not only with problem data objects
- provide some kind of progress indication (e.g. printed dots, number of calculated combinations, ...)
- the elapsed time is displayed without a unit
- attach the scripts to the start page
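Two of the review points (progress indication and an elapsed time with a unit) could look roughly like the following Python sketch. This is not the script attached to the ticket: `grid_search`, the `evaluate` callback and the `report` parameter are hypothetical names, and the toy error function stands in for training and scoring a forest.

```python
import time
from itertools import product


def grid_search(evaluate, n_values, m_values, r_values, report=print):
    """Evaluate every (n, m, r) combination and return (best_params, best_error, seconds).

    `evaluate` is a hypothetical callback that returns an error (lower is better).
    """
    combinations = list(product(n_values, m_values, r_values))
    best_params, best_error = None, float("inf")
    start = time.perf_counter()
    for done, params in enumerate(combinations, start=1):
        error = evaluate(*params)
        if error < best_error:
            best_params, best_error = params, error
        # Progress indication: number of calculated combinations so far.
        report(f"{done}/{len(combinations)} combinations evaluated")
    elapsed = time.perf_counter() - start
    # Report the elapsed time together with its unit.
    report(f"elapsed: {elapsed:.3f} s")
    return best_params, best_error, elapsed


def toy_error(n, m, r):
    """Stand-in for a real model evaluation; minimal at n=100, m=0.5, r=0.6."""
    return abs(n - 100) / 100 + abs(m - 0.5) + abs(r - 0.6)


best_params, best_error, seconds = grid_search(
    toy_error, [50, 100], [0.5, 1.0], [0.3, 0.6, 0.9]
)
print(best_params, best_error)
```

Passing the reporting function as a parameter keeps the search loop independent of where progress is shown (console, log, or a UI element).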
comment:10 Changed 10 years ago by bburlacu
r11362: Addressed part of the comments above:
- Methods are similar to the ones from SupportVectorMachineUtil
- Cleaned up sample scripts
- Elapsed time is shown in seconds
- Included demo problem
- Added stratified cross-validation (shuffling is turned off by default)
- Added different GridSearch methods with/without cross-validation.
- Fixed bug in fold generation when the number of folds is larger than the number of values
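As a rough illustration of stratified fold generation with shuffling off by default, consider the Python sketch below. It is not the code from r11362: the name `stratified_folds` is hypothetical, and clamping the fold count to the number of rows is only one plausible way to handle the case where more folds are requested than values exist. The ticket does not say how the bug was actually fixed.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, shuffle=False, seed=0):
    """Split row indices into folds with roughly equal class proportions.

    Hypothetical helper for illustration; shuffling is off by default.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Assumption: never create more folds than rows, which would leave folds empty.
    k = min(k, len(labels))
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    rng = random.Random(seed)
    position = 0
    for indices in by_class.values():
        if shuffle:
            rng.shuffle(indices)
        # Deal the rows of each class round-robin across the folds.
        for index in indices:
            folds[position % k].append(index)
            position += 1
    return folds


labels = ["a"] * 6 + ["b"] * 3
for fold in stratified_folds(labels, 3):
    print(fold, [labels[i] for i in fold])
```

With the example labels, each of the three folds receives two rows of class "a" and one row of class "b", matching the 2:1 class ratio of the full data.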
comment:11 Changed 10 years ago by bburlacu
r11426: Fixed thread synchronisation bug. Removed unused variables in GridSearch methods.
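The ticket does not describe the synchronisation bug itself. As a general illustration of the kind of issue that arises when grid points are evaluated in parallel, the Python sketch below guards the shared "best result so far" with a lock. All names (`parallel_grid_search`, `evaluate`, `workers`) are hypothetical and unrelated to the HeuristicLab implementation.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def parallel_grid_search(evaluate, n_values, m_values, r_values, workers=4):
    """Evaluate all (n, m, r) combinations on a thread pool; return (best_params, best_error)."""
    best = {"params": None, "error": float("inf")}
    lock = threading.Lock()

    def run(params):
        error = evaluate(*params)  # the expensive part runs outside the lock
        with lock:
            # Compare and update as one atomic step so that no better result is lost.
            if error < best["error"]:
                best["params"], best["error"] = params, error

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all tasks and re-raises any exception from a worker.
        list(pool.map(run, product(n_values, m_values, r_values)))
    return best["params"], best["error"]


def toy_error(n, m, r):
    """Stand-in for a real model evaluation; minimal at n=100, m=0.5, r=0.6."""
    return abs(n - 100) / 100 + abs(m - 0.5) + abs(r - 0.6)


print(parallel_grid_search(toy_error, [50, 100], [0.5, 1.0], [0.3, 0.6, 0.9]))
```

Without the lock, two threads could both read the old best error, and the later write could overwrite a better result found by the other thread.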
comment:12 Changed 10 years ago by bburlacu
r11443: Made random forest parameters serializable (by deriving from ParameterCollection).
comment:13 Changed 10 years ago by bburlacu
r11445: Fixed cloning and storable constructor access levels.
comment:14 Changed 10 years ago by bburlacu
r11446: Added random forest grid search scripts to the HeuristicLab start page.
comment:15 Changed 10 years ago by bburlacu
r11448: Added script to the start page (forgot to commit changes to StartPage.cs).
comment:16 Changed 10 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from assigned to reviewing
comment:17 Changed 10 years ago by mkommend
- Status changed from reviewing to readytorelease
r11315: Added RandomForestUtil class implementing fold generation, cross-validation and grid search. Overloaded CreateRegressionModel method to accept a user-specified data partition.