Index: /branches/MathNetNumerics-Exploration-2789/HeuristicLab.Algorithms.DataAnalysis.Experimental/Splines.cs
===================================================================
--- /branches/MathNetNumerics-Exploration-2789/HeuristicLab.Algorithms.DataAnalysis.Experimental/Splines.cs (revision 15464)
+++ /branches/MathNetNumerics-Exploration-2789/HeuristicLab.Algorithms.DataAnalysis.Experimental/Splines.cs (revision 15465)
@@ -36,4 +36,93 @@
using MathNet.Numerics.LinearAlgebra.Double;
using MathNet.Numerics.LinearAlgebra.Double.Solvers;
+
+
+/*
+ * Experimenting with various spline implementations.
+ *
+ * There are many different spline variants:
+ * - Spline Interpolation
+ * - Natural Spline Interpolation
+ * - Smoothing Splines (Reinsch)
+ * - P-Splines
+ * - B-Splines
+ * - Regression Splines
+ * - Penalized Regression Splines
+ *
+ * Generally, a spline is a piecewise function whereby each piece connects two knots.
+ * In almost all cases splines are univariate. There are extensions to more dimensions
+ * but this is much less convenient.
+ * A linear spline is a piecewise linear function; a cubic spline is piecewise cubic.
+ * For cubic splines the following constraints are usually enforced at each interior
+ * knot point k for the two adjacent pieces fl and fr:
+ * fl(k) = fr(k), fl'(k) = fr'(k), fl''(k) = fr''(k). Additional constraints
+ * (e.g. zero second derivatives for natural splines) are often enforced at the
+ * boundary knots.
+
+ * Spline Interpolation solves the problem of finding a smooth interpolating function
+ * which goes through all knots exactly.
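As an illustration (not part of this code base), natural cubic spline interpolation can be sketched with SciPy: `CubicSpline` with `bc_type='natural'` passes through all knots exactly and enforces zero second derivatives at the boundary knots.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# knots (x, y): the interpolating spline must pass through all of them exactly
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8])

# natural cubic spline: second derivative is zero at both boundary knots
cs = CubicSpline(x, y, bc_type='natural')

# the spline interpolates: g(x_i) == y_i
assert np.allclose(cs(x), y)
# natural boundary conditions: g''(x_first) == g''(x_last) == 0
assert abs(cs(x[0], 2)) < 1e-8 and abs(cs(x[-1], 2)) < 1e-8
```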
+ *
+ * In spline smoothing one tries to find a function that minimizes the squared residuals
+ * while maximizing smoothness. For the smoothing spline g the optimization objective is
+ * || y - g(x) ||² + lambda * integral (xmin..xmax) g''(x)² dx.
+ * The roughness penalty is the integral of the squared second derivative of the
+ * smoothing spline over the whole domain of x. Lambda balances the amount of smoothing.
+ * If lambda is set to zero, the solution is an interpolating spline. If lambda goes to
+ * infinity the solution is a maximally smooth function (= a straight line, whose second
+ * derivative is zero everywhere).
+ * Interestingly, the estimation of a smoothing spline is a linear operation. As a consequence
+ * a lot of theory from linear models can be reused (e.g. cross-validation and generalized
+ * cross-validation) with minimal additional effort. GCV can also be used to tune the
+ * smoothing parameter automatically. Tuned algorithms are available that solve smoothing
+ * spline estimation efficiently in O(n) or O(m²n), where m is the number of knots and n the
+ * number of data points.
+ * It is possible to calculate confidence intervals for the smoothing spline and given
+ * an error model subsequently also confidence intervals for predictions.
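The penalized objective can be sketched with a discrete analogue (a Whittaker-style smoother; this is an illustration, not the estimator used in this project): minimize ||y - z||² + lambda ||D z||² with D the second-difference matrix, so the minimizer solves the linear system (I + lambda DᵀD) z = y, which makes the linearity in y explicit.

```python
import numpy as np

def smooth(y, lam):
    # discrete analogue of the smoothing-spline objective:
    # minimize ||y - z||^2 + lam * ||D2 z||^2  =>  (I + lam * D2'D2) z = y
    n = len(y)
    D2 = np.diff(np.eye(n), n=2, axis=0)   # second-difference matrix
    return np.linalg.solve(np.eye(n) + lam * D2.T @ D2, y)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(50)

# lam = 0: the solution interpolates the data (z = y)
assert np.allclose(smooth(y, 0.0), y)
# lam -> infinity: the solution approaches a straight line
z = smooth(y, 1e10)
assert np.max(np.abs(np.diff(z, 2))) < 1e-3
```

The solution is a linear transformation S(lambda) y of the data, which is why linear-model theory (CV, GCV, confidence intervals) carries over.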
+ *
+ * We are primarily interested in spline smoothing / regression.
+ * Alglib and Math.NET provide several different implementations for interpolation
+ * including polynomial (cubic), Hermite, Akima, Linear, Catmull-Rom, Rational, ...
+ *
+ * For spline smoothing or regression there are also many different variants.
+ *
+ * Smoothing Splines by Reinsch (Reinsch, Smoothing by Spline Functions, 1967).
+ * Pseudo-code for the algorithm is given in the paper. A small change in the algorithm
+ * to improve convergence is discussed in (Reinsch, Smoothing by Spline Functions II, 1970).
+ * The algorithm uses the properties of the smoothing matrix (bandedness) to calculate the
+ * smoothing spline in O(n). Two parameters have to be set (S and rho). If the noise in the
+ * data is known then both parameters can be set directly, otherwise S should be set to n-1
+ * and rho should be found via cross-validation.
+ * It is not easy to calculate the CV (PRESS) or GCV in the approach by Reinsch and
+ * therefore the algorithm is not ideal.
+ * Wahba and colleagues (1990) introduced CV and GCV estimation.
+ * Hutchinson and colleagues improved the algorithm (Fortran implementation available as
+ * algorithm 642.f from ACM).
+ *
+ * CUBGCV (Algorithm 642, Hutchinson)
+ * Several improvements over the Reinsch algorithm, in particular efficient calculation
+ * of the GCV criterion for automatic selection of the smoothing parameter.
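The GCV criterion itself can be sketched as follows (illustration only, using the smoother matrix S(lambda) of the discrete second-difference smoother rather than this project's code, and a hypothetical grid of lambda values): GCV(lambda) = n ||(I - S) y||² / tr(I - S)², minimized over the grid.

```python
import numpy as np

def gcv(y, lam):
    # S(lam) is the smoother matrix of a second-difference penalized fit;
    # GCV(lam) = n * ||(I - S) y||^2 / tr(I - S)^2
    n = len(y)
    D2 = np.diff(np.eye(n), n=2, axis=0)
    S = np.linalg.inv(np.eye(n) + lam * D2.T @ D2)
    r = y - S @ y
    return n * (r @ r) / np.trace(np.eye(n) - S) ** 2

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 60)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(60)

lams = 10.0 ** np.arange(-6, 6)       # hypothetical search grid
best = min(lams, key=lambda l: gcv(y, l))
# GCV picks an interior lambda: neither interpolation nor a straight line
assert best not in (lams[0], lams[-1])
```

Forming S explicitly costs O(n³); the point of the tuned algorithms above is to get the same quantities in O(n) by exploiting the banded structure.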
+
+ * Finbarr O'Sullivan BART
+ * ...
+ *
+ * B-Splines
+ * ...
+ *
+ * P-Splines (Eilers and Marx) use the B-Spline basis in combination with a penalty
+ * for large variations of subsequent B-Spline coefficients. Conceptually, if the
+ * coefficients of neighbouring B-Splines are almost the same then the resulting smoothing
+ * spline is less wiggly. It has been criticized that this is rather subjective.
+ * P-Splines also allow calculation of CV and GCV for a given smoothing parameter, which
+ * can be used for optimization of lambda.
+ * It is easy to use P-Splines within GAMs (no backfitting necessary).
+ * The paper also contains MATLAB code for the calculation of the B-Spline basis.
+ * The computational effort is O(?). The authors argue that it is efficient.
+ * The number of knots is not critical; it is possible to use every data point as a knot.
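A minimal sketch of the Eilers & Marx construction (illustrative only: the knot spacing `h`, penalty weight `lam` and difference order are arbitrary choices here, and SciPy's `BSpline.design_matrix` stands in for the paper's MATLAB basis code): build a uniform B-spline basis B, penalize differences of neighbouring coefficients, and solve the penalized normal equations.

```python
import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(2)
x = np.linspace(0, 1, 80)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(80)

# uniform cubic B-spline basis on equally spaced (extended) knots
k, h = 3, 0.1
t = np.arange(-k, 1 / h + k + 1) * h           # extended knot vector covering [0, 1]
B = BSpline.design_matrix(x, t, k).toarray()   # one column per basis coefficient

# P-spline: penalize second differences of neighbouring coefficients
lam = 0.1
D = np.diff(np.eye(B.shape[1]), n=2, axis=0)
c = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)

fit = B @ c
assert fit.shape == y.shape
assert np.mean((fit - y) ** 2) < 0.05          # fit tracks the noisy data
```

Because B and D'D are banded, the system can be solved in time linear in the number of basis functions, which is the efficiency argument made by the authors.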
+ *
+ * Alglib: penalized regression spline
+ * ...
+ *
+ * Alglib: penalized regression spline with support for weights
+ * ...
+ */
namespace HeuristicLab.Algorithms.DataAnalysis.Experimental {