Context Navigation

← Previous Changeset
Next Changeset →

Changeset 15465

Timestamp:

11/08/17 20:23:37 (7 years ago)

Author:

gkronber

Message:

Started a summary of findings.

File:

: 1 edited

branches/MathNetNumerics-Exploration-2789/HeuristicLab.Algorithms.DataAnalysis.Experimental/Splines.cs (modified) (1 diff)

Legend:

: Unmodified
: Added
: Removed

branches/MathNetNumerics-Exploration-2789/HeuristicLab.Algorithms.DataAnalysis.Experimental/Splines.cs

-                      r15457
+                      r15465
 using MathNet.Numerics.LinearAlgebra.Double;
 using MathNet.Numerics.LinearAlgebra.Double.Solvers;
+/*
+ * Experimenting with various spline implementations.
+ *
+ * There are many different variants of using splines.
+ *  - Spline Interpolation
+ *  - Natural Spline Interpolation
+ *  - Smoothing Splines (Reinsch)
+ *  - P-Splines
+ *  - B-Splines
+ *  - Regression Splines
+ *  - Penalized Regression Splines
+ *  -
+ *
+ *
+ *  Generally, a spline is a piecewise function whereby each piece connects two knots.
+ *  In almost all cases splines are univariate. There are extensions to more dimensions
+ *  but this is much less convenient.
+ *  A linear spline is a piecewise linear function. Usually the following constaints are enforced
+ *  for each knot point k and the two piecewise functions fl and fr
+ *  fl(k) = fr(k), fl(k)' = fr(k)', fl(k)''=0, fr(k)''=0, additionally constraints for
+ *  the boundary knots are often enforced.
+ *  Spline Interpolation solves the problem of finding a smooth interpolating function
+ *  which goes through all knots exactly.
+ *
+ *  In spline smoothing one tries to find a function minimizing the squared residuals
+ *  and maximizing smoothness. For the smoothing spline g the optimization objective is
+ *  || y - g(x) ||² + \lambda integrate (xmin : xmax) g(x)'' dx.
+ *  The roughness penality is the integral over the second derivative of the smoothing spline
+ *  over the whole domain of x. Lambda is used to balance the amount of smoothing.
+ *  If lambda is set to zero, the solution is an interpolating spline. If lambda is set to
+ *  infinity the solution is a maximally smooth function (= line).
+ *  Interestingly, the estimation of a smoothing spline is a linear operation. As a consequence
+ *  a lot of theory from linear models can be reused (e.g. cross-validation and generalized
+ *  cross-validation) with minimal additional effort. GCV can also be used to tune the
+ *  smoothing parameter automatically. Tuned algorithms are available to solve smoothing
+ *  spline estimation efficiently O(n) or O(m²n) where m is the number of knots and n the
+ *  number of data points.
+ *  It is possible to calculate confidence intervals for the smoothing spline and given
+ *  an error model subsequently also confidence intervals for predictions.
+ *
+ *  We are primarily interested in spline smoothing / regression.
+ *  Alglib and Math.NET provide several different implementations for interpolation
+ *  including polynomial (cubic), Hermite, Akima, Linear, Catmull-Rom, Rational, ...
+ *
+ *  For spline smoothing or regression there are also many differen variants.
+ *
+ *  Smoothing Splines by Reinsch (Reinsch, Smoothing by Spline Functions, 1967).
+ *  Pseudo-code for the algorithm is given in the paper. A small change in the algorithm
+ *  to improve convergence is discussed in (Reinsch, Smoothing by Spline Functions II, 1970)
+ *  The algorithm uses the properties of the smoothing matrix (bandedness) to calculate the
+ *  smoothing spline in O(n). Two parameters have to be set (S and rho). If the noise in the
+ *  data is known then both parameter can be set directly, otherwise S should be set to n-1
+ *  and rho should be found via cross-validation.
+ *  It is not easy to calculate a the CV (PRESS) or GCV in the approach by Reinsch and
+ *  therefore the algorithm is not ideal.
+ *  Wahba and colleagues (1990) introduced CV and GCV estimation.
+ *  Hutchinson and colleages improved the algorithm (Fortran implementation available as
+ *  algorithm 642.f from ACM).
+ *
+ *  CUBGCV (Algorithm 642 Hutchingson)
+ *  Several improvments over the Reinsch algorithm. In particular efficient calculation
+ *  of the
+ *  Finbarr O'Sullivan BART
+ *  ...
+ *
+ *  B-Splines
+ *  ...
+ *
+ *  P-Splines (Eilers and Marx), use the B-Spline basis in combination with a penality
+ *  for large variations of subsequent B-Spline coefficients. Conceptually, if the
+ *  coefficients of neighbouring B-Splines are almost the same then the resulting smoothing
+ *  spline is less wiggly. It has been critizized that this is rather subjective.
+ *  P-Splines also allow calculation of CV and GCV for a given smoothing parameter which
+ *  can be used for optimization of lambda.
+ *  It is easy to use P-Splines within GAMs (no backfitting necessary).
+ *  The paper contains MATLAB code also for the calculation of the B-Spline basis.
+ *  The computational effort is O(?). The authors argue that it is efficient.
+ *  The number of knots is not critical it is possible to use every data point as knot.
+ *
+ *  Alglib: penalized regression spline
+ *  ...
+ *
+ *  Alglib: penalized regression spline with support for weights
+ *  ...
+ */
 namespace HeuristicLab.Algorithms.DataAnalysis.Experimental {

Note: See TracChangeset for help on using the changeset viewer.

Context Navigation

Changeset 15465

Legend:

branches/MathNetNumerics-Exploration-2789/HeuristicLab.Algorithms.DataAnalysis.Experimental/Splines.cs

Download in other formats: