Opened 3 years ago
Last modified 3 months ago
#2898 reviewing enhancement
Generalized additive models (GAM)
Reported by: | gkronber | Owned by: | gkronber |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.17 |
Component: | Algorithms.DataAnalysis | Version: | trunk |
Keywords: | | Cc: | |
Description
Generalized additive models would be a great addition to the set of data-based modeling algorithms.
Feature wishlist:
- The base learner for the terms is configurable (default: smoothing spline or penalized regression spline). For example, it would be great if we could use an efficient symbolic regression solver as base learner.
- Individually adjustable smoothing or regularization parameter for each term.
- Automatic selection of smoothing or regularization parameter for each term ideally based on generalized cross-validation (GCV).
- The variables allowed in each term are configurable.
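The GCV-based selection from the wishlist can be sketched for a single term as follows. This is only an illustration in Python with NumPy, not HeuristicLab or alglib code: the truncated power basis, the ridge-style penalty on the knot coefficients, and the names `spline_basis`, `gcv_score` and `select_lambda` are assumptions made for the example. For a linear smoother with hat matrix S(lambda), the score is GCV(lambda) = n * RSS / (n - tr(S))^2.

```python
import numpy as np

def spline_basis(x, knots):
    """Cubic truncated power basis: 1, x, x^2, x^3, (x - k)_+^3 per knot."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def gcv_score(x, y, knots, lam):
    """GCV(lam) = n * RSS / (n - tr(S))^2 for the linear smoother S(lam)."""
    B = spline_basis(x, knots)
    # Penalize only the knot coefficients, not the cubic polynomial part.
    D = np.diag([0.0] * 4 + [1.0] * len(knots))
    S = B @ np.linalg.solve(B.T @ B + lam * D, B.T)  # hat matrix
    resid = y - S @ y
    n = len(y)
    return n * float(resid @ resid) / (n - np.trace(S)) ** 2

def select_lambda(x, y, knots, candidates):
    """Return the candidate penalty with the smallest GCV score."""
    scores = [gcv_score(x, y, knots, lam) for lam in candidates]
    return candidates[int(np.argmin(scores))]

# Usage on synthetic data (illustrative values only).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, 200)
knots = np.linspace(0.1, 0.9, 8)
best_lam = select_lambda(x, y, knots, [1e-6, 1e-4, 1e-2, 1.0, 100.0])
```

Selecting the penalty per term in a full GAM would require repeating this inside the fitting loop, which the sketch does not attempt.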
Idea for a first prototype:
- Only uni-variate terms are allowed
- Use alglib penalized regression spline for each term
- The variables together with penalization parameters for each term are read from a list (algorithm parameter)
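A minimal sketch of such a prototype, assuming the terms are fitted by backfitting (the ticket does not say which fitting procedure is used). It is written in Python with NumPy for illustration and does not call alglib; `fit_smoother` is a hypothetical stand-in for a penalized regression spline, and `terms` plays the role of the list of variables with their penalization parameters. Each variable is assumed to appear in at most one term.

```python
import numpy as np

def fit_smoother(x, r, lam, n_knots=8):
    """Fit a penalized cubic regression spline to (x, r); return a predictor."""
    knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])

    def basis(v):
        cols = [np.ones_like(v), v, v**2, v**3]
        cols += [np.clip(v - k, 0.0, None) ** 3 for k in knots]
        return np.column_stack(cols)

    B = basis(x)
    # Ridge-style penalty on the knot coefficients only.
    D = np.diag([0.0] * 4 + [1.0] * n_knots)
    coef = np.linalg.solve(B.T @ B + lam * D, B.T @ r)
    return lambda v: basis(v) @ coef

def fit_gam(X, y, terms, n_iter=20):
    """Backfitting for uni-variate terms; terms = [(column_index, lambda), ...]."""
    intercept = y.mean()
    funcs = {j: (lambda v: np.zeros_like(v)) for j, _ in terms}
    offsets = {j: 0.0 for j, _ in terms}
    for _ in range(n_iter):
        for j, lam in terms:
            # Partial residual: remove the intercept and all other terms.
            others = sum(funcs[k](X[:, k]) - offsets[k]
                         for k, _ in terms if k != j)
            r = y - intercept - others
            funcs[j] = fit_smoother(X[:, j], r, lam)
            offsets[j] = funcs[j](X[:, j]).mean()  # keep each term centered

    def predict(X_new):
        return intercept + sum(funcs[j](X_new[:, j]) - offsets[j]
                               for j, _ in terms)
    return predict

# Usage on synthetic data with one penalization parameter per term.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(300, 2))
y = np.sin(2 * np.pi * X[:, 0]) + 4 * (X[:, 1] - 0.5) ** 2 \
    + rng.normal(0.0, 0.05, 300)
model = fit_gam(X, y, terms=[(0, 1e-3), (1, 1e-3)])
train_rmse = float(np.sqrt(np.mean((y - model(X)) ** 2)))
```

The input values are kept in [0, 1] here because the truncated power basis becomes poorly conditioned for large inputs; a production implementation would more likely use a B-spline basis.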
Change History (16)
comment:1 Changed 3 years ago by gkronber
- Owner set to gkronber
- Status changed from new to accepted
comment:2 Changed 3 years ago by gkronber
comment:3 Changed 3 years ago by gkronber
- Owner changed from gkronber to lkammere
- Status changed from accepted to reviewing
r15775: added simple implementation of GAM based on uni-variate penalized regression splines with the same penalization factor for each term
comment:4 Changed 3 years ago by gkronber
- Milestone changed from HeuristicLab 4.x Backlog to HeuristicLab 3.3.16
- Owner changed from lkammere to gkronber
- Status changed from reviewing to assigned
comment:5 Changed 2 years ago by gkronber
- Milestone changed from HeuristicLab 3.3.16 to HeuristicLab 3.3.x Backlog
comment:6 Changed 6 months ago by gkronber
r17812: copied implementation from branch to trunk.
comment:7 Changed 6 months ago by gkronber
- Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.17
- Owner changed from gkronber to mkommend
- Status changed from assigned to reviewing
- Version changed from branch to trunk
comment:8 Changed 6 months ago by gkronber
r17813: delete branch
comment:9 Changed 6 months ago by gkronber
r17815: fix header
comment:10 Changed 4 months ago by mkommend
r17839: Fixed base ctor call in Spline1dModel.
comment:11 Changed 4 months ago by mkommend
Review comments
- What's the point of the ToArray call within GetEstimatedValues (Spline1dModel line 74)? (-> this was an artifact from older code, fixed in r17867)
comment:12 Changed 4 months ago by gkronber
r17867: simplified code in Spline1dModel
comment:13 Changed 3 months ago by mkommend
r17888: Corrected calculation of MSE and RMSE in GAMs by implementing helper methods for their calculation.
The previous implementation used the standard deviation or the variance of the residuals. However, stddev(res) == RMSE and var(res) == MSE hold only if mean(res) == 0.0. In practice this is not exactly the case for the training data because of calculation differences, and it is almost never the case for the test data (e.g. data shifts). Hence the RMSE and MSE have to be calculated the traditional way.
@gkronber I am unsure about the value in the RSS table (line 201). Previously it contained var(res), so I changed it to the MSE. However, the name RSS suggests residual sum of squares, so it should contain MSE * n. Please comment on, correct, or rename this. Maybe I am missing something obvious.
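The difference can be illustrated with a small example. This is a Python/NumPy sketch, not the helper methods from r17888, and the function names are chosen for the example. With the population variance, MSE = var(res) + mean(res)^2, so the two agree only when the residuals have zero mean, and RSS = MSE * n.

```python
import numpy as np

def mse(y, pred):
    """Mean squared error, computed directly from the residuals."""
    res = np.asarray(y, float) - np.asarray(pred, float)
    return float(np.mean(res ** 2))

def rmse(y, pred):
    """Root mean squared error."""
    return float(np.sqrt(mse(y, pred)))

def rss(y, pred):
    """Residual sum of squares, i.e. MSE * n."""
    return mse(y, pred) * len(y)

# Predictions with a constant bias, e.g. after a shift in the test data.
y = np.array([1.0, 2.0, 3.0, 4.0])
pred = y - 0.5
res = y - pred

var_res = float(np.var(res))   # 0.0: the variance ignores the bias
mse_val = mse(y, pred)         # 0.25: the MSE includes it
# Identity that links the two (population variance, ddof=0).
identity_ok = np.isclose(mse_val, var_res + res.mean() ** 2)
```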
comment:14 Changed 3 months ago by mkommend
r17889: Minor changes in Spline1dModel (added field for inputVariable, and named model and solution more appropriately).
comment:15 Changed 3 months ago by mkommend
- Owner changed from mkommend to gkronber
comment:16 Changed 3 months ago by gkronber
Reviewed r17888:17889.
r15774: created branch