1 | CUBGCV TEST DRIVER RESULTS:
|
---|
2 |
|
---|
3 | IER = 0
|
---|
4 | VAR = 0.0279
|
---|
5 |
|
---|
6 | GENERALIZED CROSS VALIDATION = 0.0318
|
---|
7 | MEAN SQUARE RESIDUAL = 0.0246
|
---|
8 | RESIDUAL DEGREES OF FREEDOM = 43.97
|
---|
9 |
|
---|
10 | INPUT DATA OUTPUT RESULTS
|
---|
11 |
|
---|
12 | I X(I) F(I) Y(I) SE(I) C(I,1) C(I,2) C(I,3)
|
---|
13 | 1 0.0046 0.2222 0.0342 0.1004 0.3630E+01 0.0000E+00 0.2542E+02
|
---|
14 | 2 0.0360 -0.1098 0.1488 0.0750 0.3705E+01 0.2391E+01 -0.9537E+01
|
---|
15 | 3 0.0435 -0.0658 0.1767 0.0707 0.3740E+01 0.2175E+01 -0.4233E+02
|
---|
16 | 4 0.0735 0.3906 0.2900 0.0594 0.3756E+01 -0.1642E+01 -0.2872E+02
|
---|
17 | 5 0.0955 0.6054 0.3714 0.0558 0.3642E+01 -0.3535E+01 0.2911E+01
|
---|
18 | 6 0.1078 0.3034 0.4155 0.0549 0.3557E+01 -0.3428E+01 -0.1225E+02
|
---|
19 | 7 0.1269 0.7386 0.4822 0.0544 0.3412E+01 -0.4131E+01 0.2242E+02
|
---|
20 | 8 0.1565 0.4616 0.5800 0.0543 0.3227E+01 -0.2143E+01 0.6415E+01
|
---|
21 | 9 0.1679 0.4315 0.6165 0.0543 0.3180E+01 -0.1923E+01 -0.1860E+02
|
---|
22 | 10 0.1869 0.5716 0.6762 0.0544 0.3087E+01 -0.2985E+01 -0.3274E+02
|
---|
23 | 11 0.2149 0.6736 0.7595 0.0542 0.2843E+01 -0.5733E+01 -0.4435E+02
|
---|
24 | 12 0.2356 0.7388 0.8155 0.0539 0.2549E+01 -0.8486E+01 -0.5472E+02
|
---|
25 | 13 0.2557 1.1953 0.8630 0.0537 0.2139E+01 -0.1180E+02 -0.9784E+01
|
---|
26 | 14 0.2674 1.0299 0.8864 0.0536 0.1860E+01 -0.1214E+02 0.9619E+01
|
---|
27 | 15 0.2902 0.7981 0.9225 0.0534 0.1322E+01 -0.1149E+02 -0.7202E+01
|
---|
28 | 16 0.3155 0.8973 0.9485 0.0532 0.7269E+00 -0.1203E+02 -0.1412E+02
|
---|
29 | 17 0.3364 1.2695 0.9583 0.0530 0.2040E+00 -0.1292E+02 0.2796E+02
|
---|
30 | 18 0.3557 0.7253 0.9577 0.0527 -0.2638E+00 -0.1130E+02 -0.3453E+01
|
---|
31 | 19 0.3756 1.2127 0.9479 0.0526 -0.7176E+00 -0.1151E+02 0.3235E+02
|
---|
32 | 20 0.3881 0.7304 0.9373 0.0525 -0.9889E+00 -0.1030E+02 0.4381E+01
|
---|
33 | 21 0.4126 0.9810 0.9069 0.0525 -0.1486E+01 -0.9977E+01 0.1440E+02
|
---|
34 | 22 0.4266 0.7117 0.8842 0.0525 -0.1756E+01 -0.9373E+01 -0.8925E+01
|
---|
35 | 23 0.4566 0.7203 0.8227 0.0524 -0.2344E+01 -0.1018E+02 -0.2278E+02
|
---|
36 | 24 0.4704 0.9242 0.7884 0.0524 -0.2637E+01 -0.1112E+02 -0.4419E+01
|
---|
37 | 25 0.4914 0.7345 0.7281 0.0523 -0.3110E+01 -0.1140E+02 -0.3562E+01
|
---|
38 | 26 0.5084 0.7378 0.6720 0.0524 -0.3500E+01 -0.1158E+02 0.5336E+01
|
---|
39 | 27 0.5277 0.7441 0.6002 0.0525 -0.3941E+01 -0.1127E+02 0.2479E+02
|
---|
40 | 28 0.5450 0.5612 0.5286 0.0527 -0.4310E+01 -0.9980E+01 0.2920E+02
|
---|
41 | 29 0.5641 0.5049 0.4429 0.0529 -0.4659E+01 -0.8309E+01 0.3758E+02
|
---|
42 | 30 0.5857 0.4725 0.3390 0.0531 -0.4964E+01 -0.5878E+01 0.5563E+02
|
---|
43 | 31 0.6159 0.1380 0.1850 0.0531 -0.5167E+01 -0.8307E+00 0.4928E+02
|
---|
44 | 32 0.6317 0.1412 0.1036 0.0531 -0.5157E+01 0.1499E+01 0.5437E+02
|
---|
45 | 33 0.6446 -0.1110 0.0371 0.0531 -0.5091E+01 0.3614E+01 0.3434E+02
|
---|
46 | 34 0.6707 -0.2605 -0.0927 0.0532 -0.4832E+01 0.6302E+01 0.1164E+02
|
---|
47 | 35 0.6853 -0.1284 -0.1619 0.0533 -0.4640E+01 0.6812E+01 0.1617E+02
|
---|
48 | 36 0.7064 -0.3452 -0.2564 0.0536 -0.4332E+01 0.7834E+01 0.4164E+01
|
---|
49 | 37 0.7310 -0.5527 -0.3582 0.0538 -0.3939E+01 0.8141E+01 -0.2214E+02
|
---|
50 | 38 0.7531 -0.3459 -0.4415 0.0540 -0.3611E+01 0.6674E+01 -0.9205E+01
|
---|
51 | 39 0.7686 -0.5902 -0.4961 0.0541 -0.3410E+01 0.6245E+01 -0.2193E+02
|
---|
52 | 40 0.7952 -0.7644 -0.5828 0.0541 -0.3125E+01 0.4494E+01 -0.4649E+02
|
---|
53 | 41 0.8087 -0.5392 -0.6242 0.0541 -0.3029E+01 0.2614E+01 -0.3499E+02
|
---|
54 | 42 0.8352 -0.4247 -0.7031 0.0539 -0.2964E+01 -0.1603E+00 0.2646E+01
|
---|
55 | 43 0.8501 -0.6327 -0.7476 0.0538 -0.2967E+01 -0.4132E-01 0.1817E+02
|
---|
56 | 44 0.8726 -0.9983 -0.8139 0.0538 -0.2942E+01 0.1180E+01 -0.6774E+01
|
---|
57 | 45 0.8874 -0.9082 -0.8574 0.0542 -0.2911E+01 0.8778E+00 -0.1364E+02
|
---|
58 | 46 0.9139 -0.8930 -0.9340 0.0566 -0.2893E+01 -0.2044E+00 -0.8094E+01
|
---|
59 | 47 0.9271 -1.0233 -0.9723 0.0593 -0.2903E+01 -0.5258E+00 -0.1498E+02
|
---|
60 | 48 0.9473 -0.8839 -1.0313 0.0665 -0.2942E+01 -0.1433E+01 0.4945E+01
|
---|
61 | 49 0.9652 -1.0172 -1.0843 0.0766 -0.2989E+01 -0.1168E+01 0.1401E+02
|
---|
62 | 50 0.9930 -1.2715 -1.1679 0.0998
|
---|
63 | Documentation:
|
---|
64 | C COMPUTER - VAX/DOUBLE
|
---|
65 | C
|
---|
66 | C AUTHOR - M.F.HUTCHINSON
|
---|
67 | C CSIRO DIVISION OF MATHEMATICS AND STATISTICS
|
---|
68 | C P.O. BOX 1965
|
---|
69 | C CANBERRA, ACT 2601
|
---|
70 | C AUSTRALIA
|
---|
71 | C
|
---|
72 | C LATEST REVISION - 15 AUGUST 1985
|
---|
73 | C
|
---|
74 | C PURPOSE - CUBIC SPLINE DATA SMOOTHER
|
---|
75 | C
|
---|
76 | C USAGE - CALL CUBGCV (X,F,DF,N,Y,C,IC,VAR,JOB,SE,WK,IER)
|
---|
77 | C
|
---|
78 | C ARGUMENTS X - VECTOR OF LENGTH N CONTAINING THE
|
---|
79 | C ABSCISSAE OF THE N DATA POINTS
|
---|
80 | C (X(I),F(I)) I=1..N. (INPUT) X
|
---|
81 | C MUST BE ORDERED SO THAT
|
---|
82 | C X(I) .LT. X(I+1).
|
---|
83 | C F - VECTOR OF LENGTH N CONTAINING THE
|
---|
84 | C ORDINATES (OR FUNCTION VALUES)
|
---|
85 | C OF THE N DATA POINTS (INPUT).
|
---|
86 | C DF - VECTOR OF LENGTH N. (INPUT/OUTPUT)
|
---|
87 | C DF(I) IS THE RELATIVE STANDARD DEVIATION
|
---|
88 | C OF THE ERROR ASSOCIATED WITH DATA POINT I.
|
---|
89 | C EACH DF(I) MUST BE POSITIVE. THE VALUES IN
|
---|
90 | C DF ARE SCALED BY THE SUBROUTINE SO THAT
|
---|
91 | C THEIR MEAN SQUARE VALUE IS 1, AND UNSCALED
|
---|
92 | C AGAIN ON NORMAL EXIT.
|
---|
93 | C THE MEAN SQUARE VALUE OF THE DF(I) IS RETURNED
|
---|
94 | C IN WK(7) ON NORMAL EXIT.
|
---|
95 | C IF THE ABSOLUTE STANDARD DEVIATIONS ARE KNOWN,
|
---|
96 | C THESE SHOULD BE PROVIDED IN DF AND THE ERROR
|
---|
97 | C VARIANCE PARAMETER VAR (SEE BELOW) SHOULD THEN
|
---|
98 | C BE SET TO 1.
|
---|
99 | C IF THE RELATIVE STANDARD DEVIATIONS ARE UNKNOWN,
|
---|
100 | C SET EACH DF(I)=1.
|
---|
101 | C N - NUMBER OF DATA POINTS (INPUT).
|
---|
102 | C N MUST BE .GE. 3.
|
---|
103 | C Y,C - SPLINE COEFFICIENTS. (OUTPUT) Y
|
---|
104 | C IS A VECTOR OF LENGTH N. C IS
|
---|
105 | C AN N-1 BY 3 MATRIX. THE VALUE
|
---|
106 | C OF THE SPLINE APPROXIMATION AT T IS
|
---|
107 | C S(T)=((C(I,3)*D+C(I,2))*D+C(I,1))*D+Y(I)
|
---|
108 | C WHERE X(I).LE.T.LT.X(I+1) AND
|
---|
109 | C D = T-X(I).
|
---|
110 | C IC - ROW DIMENSION OF MATRIX C EXACTLY
|
---|
111 | C AS SPECIFIED IN THE DIMENSION
|
---|
112 | C STATEMENT IN THE CALLING PROGRAM. (INPUT)
|
---|
113 | C VAR - ERROR VARIANCE. (INPUT/OUTPUT)
|
---|
114 | C IF VAR IS NEGATIVE (I.E. UNKNOWN) THEN
|
---|
115 | C THE SMOOTHING PARAMETER IS DETERMINED
|
---|
116 | C BY MINIMIZING THE GENERALIZED CROSS VALIDATION
|
---|
117 | C AND AN ESTIMATE OF THE ERROR VARIANCE IS
|
---|
118 | C RETURNED IN VAR.
|
---|
119 | C IF VAR IS NON-NEGATIVE (I.E. KNOWN) THEN THE
|
---|
120 | C SMOOTHING PARAMETER IS DETERMINED TO MINIMIZE
|
---|
121 | C AN ESTIMATE, WHICH DEPENDS ON VAR, OF THE TRUE
|
---|
122 | C MEAN SQUARE ERROR, AND VAR IS UNCHANGED.
|
---|
123 | C IN PARTICULAR, IF VAR IS ZERO, THEN AN
|
---|
124 | C INTERPOLATING NATURAL CUBIC SPLINE IS CALCULATED.
|
---|
125 | C VAR SHOULD BE SET TO 1 IF ABSOLUTE STANDARD
|
---|
126 | C DEVIATIONS HAVE BEEN PROVIDED IN DF (SEE ABOVE).
|
---|
127 | C JOB - JOB SELECTION PARAMETER. (INPUT)
|
---|
128 | C JOB = 0 SHOULD BE SELECTED IF POINT STANDARD ERROR
|
---|
129 | C ESTIMATES ARE NOT REQUIRED IN SE.
|
---|
130 | C JOB = 1 SHOULD BE SELECTED IF POINT STANDARD ERROR
|
---|
131 | C ESTIMATES ARE REQUIRED IN SE.
|
---|
132 | C SE - VECTOR OF LENGTH N CONTAINING BAYESIAN STANDARD
|
---|
133 | C ERROR ESTIMATES OF THE FITTED SPLINE VALUES IN Y.
|
---|
134 | C SE IS NOT REFERENCED IF JOB=0. (OUTPUT)
|
---|
135 | C WK - WORK VECTOR OF LENGTH 7*(N + 2). ON NORMAL EXIT THE
|
---|
136 | C FIRST 7 VALUES OF WK ARE ASSIGNED AS FOLLOWS:-
|
---|
137 | C
|
---|
138 | C WK(1) = SMOOTHING PARAMETER (= RHO/(RHO + 1))
|
---|
139 | C WK(2) = ESTIMATE OF THE NUMBER OF DEGREES OF
|
---|
140 | C FREEDOM OF THE RESIDUAL SUM OF SQUARES
|
---|
141 | C WK(3) = GENERALIZED CROSS VALIDATION
|
---|
142 | C WK(4) = MEAN SQUARE RESIDUAL
|
---|
143 | C WK(5) = ESTIMATE OF THE TRUE MEAN SQUARE ERROR
|
---|
144 | C AT THE DATA POINTS
|
---|
145 | C WK(6) = ESTIMATE OF THE ERROR VARIANCE
|
---|
146 | C WK(7) = MEAN SQUARE VALUE OF THE DF(I)
|
---|
147 | C
|
---|
148 | C IF WK(1)=0 (RHO=0) AN INTERPOLATING NATURAL CUBIC
|
---|
149 | C SPLINE HAS BEEN CALCULATED.
|
---|
150 | C IF WK(1)=1 (RHO=INFINITE) A LEAST SQUARES
|
---|
151 | C REGRESSION LINE HAS BEEN CALCULATED.
|
---|
152 | C WK(2) IS AN ESTIMATE OF THE NUMBER OF DEGREES OF
|
---|
153 | C FREEDOM OF THE RESIDUAL WHICH REDUCES TO THE
|
---|
154 | C USUAL VALUE OF N-2 WHEN A LEAST SQUARES REGRESSION
|
---|
155 | C LINE IS CALCULATED.
|
---|
156 | C WK(3),WK(4),WK(5) ARE CALCULATED WITH THE DF(I)
|
---|
157 | C SCALED TO HAVE MEAN SQUARE VALUE 1. THE
|
---|
158 | C UNSCALED VALUES OF WK(3),WK(4),WK(5) MAY BE
|
---|
159 | C CALCULATED BY DIVIDING BY WK(7).
|
---|
160 | C WK(6) COINCIDES WITH THE OUTPUT VALUE OF VAR IF
|
---|
161 | C VAR IS NEGATIVE ON INPUT. IT IS CALCULATED WITH
|
---|
162 | C THE UNSCALED VALUES OF THE DF(I) TO FACILITATE
|
---|
163 | C COMPARISONS WITH A PRIORI VARIANCE ESTIMATES.
|
---|
164 | C
|
---|
165 | C IER - ERROR PARAMETER. (OUTPUT)
|
---|
166 | C TERMINAL ERROR
|
---|
167 | C IER = 129, IC IS LESS THAN N-1.
|
---|
168 | C IER = 130, N IS LESS THAN 3.
|
---|
169 | C IER = 131, INPUT ABSCISSAE ARE NOT
|
---|
170 | C ORDERED SO THAT X(I).LT.X(I+1).
|
---|
171 | C IER = 132, DF(I) IS NOT POSITIVE FOR SOME I.
|
---|
172 | C IER = 133, JOB IS NOT 0 OR 1.
|
---|
173 | C
|
---|
174 | C PRECISION/HARDWARE - DOUBLE
|
---|
175 | C
|
---|
176 | C REQUIRED ROUTINES - SPINT1,SPFIT1,SPCOF1,SPERR1
|
---|
177 | C
|
---|
178 | C REMARKS THE NUMBER OF ARITHMETIC OPERATIONS REQUIRED BY THE
|
---|
179 | C SUBROUTINE IS PROPORTIONAL TO N. THE SUBROUTINE
|
---|
180 | C USES AN ALGORITHM DEVELOPED BY M.F. HUTCHINSON AND
|
---|
181 | C F.R. DE HOOG, 'SMOOTHING NOISY DATA WITH SPLINE
|
---|
182 | C FUNCTIONS', NUMER. MATH. (IN PRESS)
|
---|
183 | C
|
---|
184 | C-----------------------------------------------------------------------
|
---|
185 | C
|
---|
186 |
|
---|
187 |
|
---|
188 | ALGORITHM
|
---|
189 |
|
---|
190 | CUBGCV calculates a natural cubic spline curve which smoothes a given set
|
---|
191 | of data points, using statistical considerations to determine the amount
|
---|
192 | of smoothing required, as described in reference 2. If the error variance
|
---|
193 | is known, it should be supplied to the routine in VAR. The degree of
|
---|
194 | smoothing is then determined by minimizing an unbiased estimate of the true
|
---|
195 | mean square error. On the other hand, if the error variance is not known, VAR
|
---|
196 | should be set to -1.0. The routine then determines the degree of smoothing
|
---|
197 | by minimizing the generalized cross validation. This is asymptotically the
|
---|
198 | same as minimizing the true mean square error (see reference 1). In this
|
---|
199 | case, an estimate of the error variance is returned in VAR which may be
|
---|
200 | compared with any a priori approximate estimates. In either case, an
|
---|
201 | estimate of the true mean square error is returned in WK(5). This estimate,
|
---|
202 | however, depends on the error variance estimate, and should only be accepted
|
---|
203 | if the error variance estimate is reckoned to be correct.
|
---|
204 |
|
---|
205 | If job is set to 1, bayesian estimates of the standard error of each
|
---|
206 | smoothed data value are returned in the array SE. These also depend on
|
---|
207 | the error variance estimate and should only be accepted if the error
|
---|
208 | variance estimate is reckoned to be correct. See reference 4.
|
---|
209 |
|
---|
210 | The number of arithmetic operations and the amount of storage required by
|
---|
211 | the routine are both proportional to N, so that very large data sets may be
|
---|
212 | analysed. The data points do not have to be equally spaced or uniformly
|
---|
213 | weighted. The residual and the spline coefficients are calculated in the
|
---|
214 | manner described in reference 3, while the trace and various statistics,
|
---|
215 | including the generalized cross validation, are calculated in the manner
|
---|
216 | described in reference 2.
|
---|
217 |
|
---|
218 | When VAR is known, any value of N greater than 2 is acceptable. It is
|
---|
219 | advisable, however, for N to be greater than about 20 when VAR is unknown.
|
---|
220 | If the degree of smoothing done by CUBGCV when VAR is unknown is not
|
---|
221 | satisfactory, the user should try specifying the degree of smoothing by
|
---|
222 | setting VAR to a reasonable value.
|
---|
223 |
|
---|
224 | References:
|
---|
225 |
|
---|
226 | 1. Craven, Peter and Wahba, Grace, "Smoothing noisy data with spline
|
---|
227 | functions", Numer. Math. 31, 377-403 (1979).
|
---|
228 | 2. Hutchinson, M.F. and de Hoog, F.R., "Smoothing noisy data with spline
|
---|
229 | functions", Numer. Math. (in press).
|
---|
230 | 3. Reinsch, C.H., "Smoothing by spline functions", Numer. Math. 10,
|
---|
231 | 177-183 (1967).
|
---|
232 | 4. Wahba, Grace, "Bayesian 'confidence intervals' for the cross-validated
|
---|
233 | smoothing spline", J.R.Statist. Soc. B 45, 133-150 (1983).
|
---|
234 |
|
---|
235 |
|
---|
236 | Example
|
---|
237 |
|
---|
238 | A sequence of 50 data points are generated by adding a random variable with
|
---|
239 | uniform density in the interval [-0.3,0.3] to the curve y=sin(3*pi*x/2).
|
---|
240 | The abscissae are unequally spaced in [0,1]. Point standard error estimates
|
---|
241 | are returned in SE by setting JOB to 1. The error variance estimate is
|
---|
242 | returned in VAR. It compares favourably with the true value of 0.03.
|
---|
243 | The IMSL function GGUBFS is used to generate sample values of a uniform
|
---|
244 | variable on [0,1].
|
---|
245 |
|
---|
246 |
|
---|
247 | INPUT:
|
---|
248 |
|
---|
249 | INTEGER N,IC,JOB,IER
|
---|
250 | DOUBLE PRECISION X(50),F(50),Y(50),DF(50),C(49,3),WK(400),
|
---|
251 | * VAR,SE(50)
|
---|
252 | DOUBLE PRECISION GGUBFS,DSEED,DN
|
---|
253 | DATA DSEED/1.2345D4/
|
---|
254 | C
|
---|
255 | N=50
|
---|
256 | IC=49
|
---|
257 | JOB=1
|
---|
258 | VAR=-1.0D0
|
---|
259 | DN=N
|
---|
260 | C
|
---|
261 | DO 10 I=1,N
|
---|
262 | X(I)=(I - 0.5)/DN + (2.0*GGUBFS(DSEED) - 1.0)/(3.0*DN)
|
---|
263 | F(I)=DSIN(4.71238*X(I)) + (2.0*GGUBFS(DSEED) - 1.0)*0.3
|
---|
264 | DF(I)=1.0D0
|
---|
265 | 10 CONTINUE
|
---|
266 | CALL CUBGCV(X,F,DF,N,Y,C,IC,VAR,JOB,SE,WK,IER)
|
---|
267 | .
|
---|
268 | .
|
---|
269 | .
|
---|
270 | END
|
---|
271 |
|
---|
272 | OUTPUT:
|
---|
273 |
|
---|
274 | IER = 0
|
---|
275 | VAR = 0.0279
|
---|
276 | GENERALIZED CROSS VALIDATION = 0.0318
|
---|
277 | MEAN SQUARE RESIDUAL = 0.0246
|
---|
278 | RESIDUAL DEGREES OF FREEDOM = 43.97
|
---|
279 | FOR CHECKING PURPOSES THE FOLLOWING OUTPUT IS GIVEN:
|
---|
280 |
|
---|
281 | X(1) = 0.0046 F(1) = 0.2222 Y(1) = 0.0342 SE(1) = 0.1004
|
---|
282 | X(21) = 0.4126 F(21) = 0.9810 Y(21) = 0.9069 SE(21) = 0.0525
|
---|
283 | X(41) = 0.8087 F(41) = -0.5392 Y(41) = -0.6242 SE(41) = 0.0541
|
---|
284 |
|
---|