# Changeset 15465

Timestamp: 11/08/17 20:23:37
Message: Started a summary of findings.

File: 1 edited

• branches/MathNetNumerics-Exploration-2789/HeuristicLab.Algorithms.DataAnalysis.Experimental/Splines.cs

r15457
using MathNet.Numerics.LinearAlgebra.Double;
using MathNet.Numerics.LinearAlgebra.Double.Solvers;

/*
 * Experimenting with various spline implementations.
 *
 * There are many different variants of using splines:
 *  - Spline Interpolation
 *  - Natural Spline Interpolation
 *  - Smoothing Splines (Reinsch)
 *  - P-Splines
 *  - B-Splines
 *  - Regression Splines
 *  - Penalized Regression Splines
 *
 * Generally, a spline is a piecewise function whereby each piece connects two knots.
 * In almost all cases splines are univariate. There are extensions to more dimensions,
 * but these are much less convenient.
 * A linear spline is a piecewise linear function. For cubic splines, the following
 * constraints are usually enforced at each knot point k for the two adjacent pieces
 * fl and fr: fl(k) = fr(k), fl'(k) = fr'(k), fl''(k) = fr''(k); additional constraints
 * (e.g. natural boundary conditions g'' = 0) are often enforced at the boundary knots.
 * Spline interpolation solves the problem of finding a smooth interpolating function
 * which passes through all knots exactly.
 *
 * In spline smoothing one tries to find a function that minimizes the squared residuals
 * while maximizing smoothness. For the smoothing spline g the optimization objective is
 *   || y - g(x) ||² + \lambda integrate (xmin : xmax) g''(x)² dx.
 * The roughness penalty is the integral of the squared second derivative of the
 * smoothing spline over the whole domain of x. Lambda balances the amount of smoothing:
 * if lambda is set to zero, the solution is an interpolating spline; if lambda is set
 * to infinity, the solution is a maximally smooth function (= a line).
 * Interestingly, the estimation of a smoothing spline is a linear operation. As a
 * consequence, a lot of theory from linear models can be reused (e.g. cross-validation
 * and generalized cross-validation) with minimal additional effort. GCV can also be
 * used to tune the smoothing parameter automatically.
 * Tuned algorithms are available that solve smoothing spline estimation efficiently in
 * O(n) or O(m²n), where m is the number of knots and n the number of data points.
 * It is possible to calculate confidence intervals for the smoothing spline and, given
 * an error model, subsequently also confidence intervals for predictions.
 *
 * We are primarily interested in spline smoothing / regression.
 * Alglib and Math.NET provide several different implementations for interpolation,
 * including polynomial (cubic), Hermite, Akima, linear, Catmull-Rom, rational, ...
 *
 * For spline smoothing or regression there are also many different variants.
 *
 * Smoothing splines by Reinsch (Reinsch, Smoothing by Spline Functions, 1967):
 * Pseudo-code for the algorithm is given in the paper. A small change to the algorithm
 * that improves convergence is discussed in (Reinsch, Smoothing by Spline Functions II,
 * 1970). The algorithm uses the properties of the smoothing matrix (bandedness) to
 * calculate the smoothing spline in O(n). Two parameters have to be set (S and rho).
 * If the noise in the data is known, both parameters can be set directly; otherwise
 * S should be set to n-1 and rho should be found via cross-validation.
 * It is not easy to calculate the CV (PRESS) or GCV score in the approach by Reinsch,
 * and therefore the algorithm is not ideal.
 * Wahba and colleagues (1990) introduced CV and GCV estimation.
 * Hutchinson and colleagues improved the algorithm (Fortran implementation available
 * as algorithm 642.f from ACM).
 *
 * CUBGCV (Algorithm 642, Hutchinson):
 * Several improvements over the Reinsch algorithm. In particular, efficient calculation
 * of the
 *
 * Finbarr O'Sullivan BART
 * ...
 *
 * B-Splines
 * ...
 *
 * P-Splines (Eilers and Marx): use the B-spline basis in combination with a penalty
 * for large variations of subsequent B-spline coefficients. Conceptually, if the
 * coefficients of neighbouring B-splines are almost the same, then the resulting
 * smoothing spline is less wiggly. It has been criticized that this is rather
 * subjective.
 * P-Splines also allow calculation of CV and GCV for a given smoothing parameter,
 * which can be used for optimization of lambda.
 * It is easy to use P-Splines within GAMs (no backfitting necessary).
 * The paper also contains MATLAB code for the calculation of the B-spline basis.
 * The computational effort is O(?). The authors argue that it is efficient.
 * The number of knots is not critical; it is possible to use every data point as
 * a knot.
 *
 * Alglib: penalized regression spline
 * ...
 *
 * Alglib: penalized regression spline with support for weights
 * ...
 */
namespace HeuristicLab.Algorithms.DataAnalysis.Experimental {
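The interpolation case described in the comment (a spline that passes through every knot exactly, with natural boundary conditions g'' = 0 at the endpoints) can be sketched in a few lines. The following pure-Python natural cubic spline, built from the standard tridiagonal (Thomas) solve, is an illustrative sketch only; it is not the Math.NET/Alglib code used in Splines.cs.

```python
def natural_cubic_spline(xs, ys):
    """Callable evaluating the natural cubic spline through the knots (xs, ys)."""
    n = len(xs) - 1                                  # number of intervals
    h = [xs[i + 1] - xs[i] for i in range(n)]

    # Tridiagonal system for the second derivatives m_i at the knots;
    # natural boundary conditions fix m_0 = m_n = 0.
    a = [0.0] * (n + 1)   # sub-diagonal
    b = [1.0] * (n + 1)   # diagonal
    c = [0.0] * (n + 1)   # super-diagonal
    d = [0.0] * (n + 1)   # right-hand side
    for i in range(1, n):
        a[i] = h[i - 1]
        b[i] = 2.0 * (h[i - 1] + h[i])
        c[i] = h[i]
        d[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])

    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n + 1):
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    m = [0.0] * (n + 1)
    m[n] = d[n] / b[n]
    for i in range(n - 1, -1, -1):
        m[i] = (d[i] - c[i] * m[i + 1]) / b[i]

    def g(x):
        # Locate the interval [xs[i], xs[i+1]] containing x (clamped at the ends).
        i = n - 1
        for j in range(n):
            if x <= xs[j + 1]:
                i = j
                break
        A = (xs[i + 1] - x) / h[i]
        B = (x - xs[i]) / h[i]
        return (A * ys[i] + B * ys[i + 1]
                + ((A ** 3 - A) * m[i] + (B ** 3 - B) * m[i + 1]) * h[i] ** 2 / 6.0)
    return g
```

On linear data the right-hand side of the tridiagonal system vanishes, so the spline reproduces the line exactly; on wiggly data it still passes through every knot, which is exactly the behaviour that motivates adding a roughness penalty.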
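Because the smoothing estimate is linear in y (ĝ = H·y), GCV can be computed from the trace of the smoother matrix as GCV(lambda) = n·RSS / (n − tr(H))², as the comment notes. The sketch below demonstrates this on a discrete stand-in for the smoothing spline, a Whittaker-style smoother with a second-difference penalty; it uses a dense O(n³) inverse purely for illustration, unlike the O(n) banded algorithms (Reinsch, CUBGCV) discussed above.

```python
def smoother_matrix(n, lam):
    """H = (I + lam * D2^T D2)^(-1) for the discrete second-difference penalty."""
    # Second-difference matrix D2 (shape (n-2) x n).
    D = [[0.0] * n for _ in range(n - 2)]
    for i in range(n - 2):
        D[i][i], D[i][i + 1], D[i][i + 2] = 1.0, -2.0, 1.0
    # A = I + lam * D2^T D2.
    A = [[lam * sum(D[k][i] * D[k][j] for k in range(n - 2)) for j in range(n)]
         for i in range(n)]
    for i in range(n):
        A[i][i] += 1.0
    # Invert A by Gauss-Jordan elimination with partial pivoting to obtain H.
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[p] = A[p], A[col]
        H[col], H[p] = H[p], H[col]
        piv = A[col][col]
        A[col] = [v / piv for v in A[col]]
        H[col] = [v / piv for v in H[col]]
        for r in range(n):
            if r != col and A[r][col] != 0.0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
                H[r] = [x - f * y for x, y in zip(H[r], H[col])]
    return H

def gcv(y, lam):
    """GCV score n*RSS/(n - tr(H))^2 of the linear smoother g_hat = H y."""
    n = len(y)
    H = smoother_matrix(n, lam)
    g = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
    rss = sum((yi - gi) ** 2 for yi, gi in zip(y, g))
    tr = sum(H[i][i] for i in range(n))
    return n * rss / (n - tr) ** 2
```

As in the smoothing-spline case, the penalty null space contains constants and lines, so tr(H) stays between 2 and n, and a straight line is reproduced exactly for any lambda; minimizing gcv(y, lam) over a lambda grid tunes the smoothing automatically.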
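The P-spline idea above combines a B-spline basis with a penalty on differences of neighbouring coefficients. The following sketch shows the standard Cox-de Boor recursion for evaluating the basis and the difference penalty on the coefficients; the function names are illustrative and this is not the MATLAB code from the Eilers and Marx paper.

```python
def bspline_basis(i, degree, knots, x):
    """Value of the i-th B-spline of the given degree at x (Cox-de Boor recursion)."""
    if degree == 0:
        # Half-open indicator on the knot span [knots[i], knots[i+1]).
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = 0.0
    if knots[i + degree] != knots[i]:
        left = ((x - knots[i]) / (knots[i + degree] - knots[i])
                * bspline_basis(i, degree - 1, knots, x))
    right = 0.0
    if knots[i + degree + 1] != knots[i + 1]:
        right = ((knots[i + degree + 1] - x) / (knots[i + degree + 1] - knots[i + 1])
                 * bspline_basis(i + 1, degree - 1, knots, x))
    return left + right

def pspline_penalty(coeffs, order=2):
    """P-spline roughness penalty: squared order-th differences of the coefficients."""
    d = list(coeffs)
    for _ in range(order):
        d = [d[j + 1] - d[j] for j in range(len(d) - 1)]
    return sum(v * v for v in d)
```

The penalty is zero whenever neighbouring coefficients follow a polynomial of degree below the difference order (constants and lines for order 2), which is the sense in which nearly equal coefficients give a less wiggly fit; and because the penalty acts on coefficients rather than an integral, every data point can serve as a knot, as noted above.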