Opened 4 years ago

Closed 4 years ago

Gaussian process regression should also calculate the leave-one-out predictive probability

Reported by: gkronber
Owned by: gkronber
Priority: medium
Milestone: HeuristicLab 3.3.15
Component: Algorithms.DataAnalysis
Version: 3.3.14

comment:1 Changed 4 years ago by gkronber

r14899: implemented calculation of LOO predictive probability for Gaussian process regression

comment:2 Changed 4 years ago by gkronber

r14918: removed a comment

comment:3 Changed 4 years ago by gkronber

• Owner set to gkronber
• Status changed from new to accepted

comment:4 Changed 4 years ago by gkronber

• Owner changed from gkronber to mkommend
• Status changed from accepted to reviewing

comment:5 follow-up: ↓ 6 Changed 4 years ago by mkommend

• Owner changed from mkommend to gkronber
• Status changed from reviewing to readytorelease

Reviewed and tested r14899 and r14918.

I haven't verified that the calculation matches GPML exactly. During testing I noticed that the LOO predictive probability (which is not a probability, by the way) is rather large (1E40 or more) for models with little variance in the estimates.
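To illustrate the remark that the value is not a probability: a Gaussian predictive density exceeds 1 whenever the predictive standard deviation is small enough, and the product over many points then grows very quickly. The numbers below (500 points, standard deviation 0.05, perfect predictions) are made up for illustration and are not taken from the tested models. This is only a sketch of the effect and does not by itself show that values such as 1E40 are correct.

```python
import math

def normal_log_density(y, mu, sigma):
    """Log of the Gaussian density N(y | mu, sigma^2)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2 * sigma ** 2)

# Hypothetical setting: every held-out target is predicted exactly,
# with a small predictive standard deviation.
n_points = 500
sigma = 0.05

per_point = normal_log_density(0.0, 0.0, sigma)  # log density at the mean
total = n_points * per_point                     # sum of the log densities

print("density per point:", math.exp(per_point))         # larger than 1
print("log10 of the product:", total / math.log(10))     # a very large exponent
```

Working with the sum of log densities rather than the product of densities avoids floating-point overflow for values of this size.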

comment:6 in reply to: ↑ 5 Changed 4 years ago by gkronber

> Reviewed and tested r14899 and r14918.
>
> I haven't verified that the calculation matches GPML exactly. During testing I noticed that the LOO predictive probability (which is not a probability, by the way) is rather large (1E40 or more) for models with little variance in the estimates.

Thanks for the review. I also noticed that the resulting values are rather strange. I need to re-check this. The GPML book calls this value either 'LOO log predictive probability' or alternatively 'log pseudo-likelihood'.

EDIT: the calculation has been checked and fixed. See comment:8 and comment:9
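For reference, the quantity discussed here can be written as follows. This is the standard formulation from the GPML book (Rasmussen and Williams, chapter 5) as I understand it, and should be checked against the book. The symbols are assumptions of this note and not names from the HeuristicLab code: K_y is the covariance matrix of the noisy targets, K_f the covariance function evaluated on the training inputs, and \sigma_n^2 the noise variance.

```latex
\begin{align}
  K_y &= K_f + \sigma_n^2 I \\
  \mu_i &= y_i - \frac{[K_y^{-1}\mathbf{y}]_i}{[K_y^{-1}]_{ii}},
  \qquad
  \sigma_i^2 = \frac{1}{[K_y^{-1}]_{ii}} \\
  \log p(y_i \mid X, \mathbf{y}_{-i}, \theta)
    &= -\tfrac{1}{2}\log \sigma_i^2
       - \frac{(y_i - \mu_i)^2}{2\sigma_i^2}
       - \tfrac{1}{2}\log 2\pi \\
  L_{\mathrm{LOO}}(X, \mathbf{y}, \theta)
    &= \sum_{i=1}^{n} \log p(y_i \mid X, \mathbf{y}_{-i}, \theta)
\end{align}
```

Each term is the log of a density, not of a probability, so the sum can be positive.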

Last edited 4 years ago by gkronber

comment:7 Changed 4 years ago by gkronber

r15160: renamed LOO log predictive probability to LooCvNegativeLogPseudoLikelihood

comment:8 Changed 4 years ago by gkronber

Compared the results of the LOO calculation with those of the GPML package using the following MATLAB script. The two sets of results do not match at all.

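% First 500 rows: columns 1-25 are the inputs, column 26 is the target.
trainX = towerDatascaled(1:500,1:25);
trainY = towerDatascaled(1:500,26);
% Initial hyperparameters (covariance and likelihood parameters are on a log scale).
hyp.mean = [0];     % meanConst: constant mean
hyp.cov = [0; 0];   % covSEiso: log length-scale, log signal std. dev.
hyp.lik = [-3];     % likGauss: log noise std. dev.
% Optimize the hyperparameters (at most 100 function evaluations).
hyp2 = minimize(hyp, @gp, -100, @infGaussLik, 'meanConst', 'covSEiso', 'likGauss', trainX, trainY);
% Leave-one-out inference; lp holds the log predictive probability of each target.
[ymu, ys2, fmu, fs2, lp] = gp(hyp2, 'infLOO', 'meanConst', 'covSEiso', 'likGauss', trainX, trainY, trainX, trainY);

% Sum over all points to get the value used in the comparison.
sum_lp = sum(lp);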


r15163: changed calculation of LOO log pseudo likelihood (evaluation order) which seemingly improves the results. More testing is necessary.

comment:9 Changed 4 years ago by gkronber

r15165: fixed calculation of log pseudo-likelihood by adding the noise term to the covariance function.

        ki[i] += sqrSigmaNoise;


Several other changes have been made which are effectively only re-arrangements of the formula.

The new calculation has been checked using the MATLAB GPML package for meanConst and covSEiso. The results are the same.
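The role of the noise term can also be checked independently of GPML. The sketch below is not the HeuristicLab implementation. It is a small pure-Python check under stated assumptions: a zero-mean GP (the ticket used meanConst), a squared-exponential kernel with unit hyperparameters, and made-up one-dimensional data. All function names are hypothetical. It compares the closed-form LOO value computed from a single inverse of K_f + sigma_n^2 I with a brute-force version that refits the model once per held-out point.

```python
import math

def se_kernel(a, b, length=1.0, signal=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return signal ** 2 * math.exp(-0.5 * ((a - b) / length) ** 2)

def invert(m):
    """Invert a small square matrix (Gauss-Jordan with partial pivoting)."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def gram(xs, noise_var):
    """Covariance of the noisy targets: kernel matrix plus noise on the diagonal."""
    n = len(xs)
    return [[se_kernel(xs[i], xs[j]) + (noise_var if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def log_normal_pdf(y, mu, var):
    return -0.5 * math.log(2 * math.pi * var) - (y - mu) ** 2 / (2 * var)

def loo_closed_form(xs, ys, noise_var):
    """Sum of LOO log predictive densities from one matrix inverse."""
    n = len(ys)
    k_inv = invert(gram(xs, noise_var))
    alpha = [sum(k_inv[i][j] * ys[j] for j in range(n)) for i in range(n)]
    total = 0.0
    for i in range(n):
        var = 1.0 / k_inv[i][i]
        mu = ys[i] - alpha[i] * var
        total += log_normal_pdf(ys[i], mu, var)
    return total

def loo_brute_force(xs, ys, noise_var):
    """Same quantity, refitting the GP n times with one point held out."""
    total = 0.0
    for i in range(len(xs)):
        xr, yr = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        m = len(xr)
        k_inv = invert(gram(xr, noise_var))
        ks = [se_kernel(xs[i], x) for x in xr]
        w = [sum(k_inv[a][b] * ks[b] for b in range(m)) for a in range(m)]
        mu = sum(wa * ya for wa, ya in zip(w, yr))
        # Predictive variance of the noisy target, so the noise is added here too.
        var = se_kernel(xs[i], xs[i]) + noise_var - sum(wa * ka for wa, ka in zip(w, ks))
        total += log_normal_pdf(ys[i], mu, var)
    return total

# Made-up data for illustration only.
xs = [0.0, 0.5, 1.1, 1.7, 2.4, 3.0]
ys = [0.1, 0.6, 0.9, 0.8, 0.4, -0.2]
noise_var = 0.01

closed = loo_closed_form(xs, ys, noise_var)
brute = loo_brute_force(xs, ys, noise_var)
print(closed, brute)                     # the two values should agree
print(loo_closed_form(xs, ys, 0.0))      # without the noise term the value differs
```

The brute-force loop is the slower but more obviously correct reference. The closed form only agrees with it when the noise variance is included on the diagonal of the covariance matrix.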

comment:10 Changed 4 years ago by gkronber

r15187: renamed remaining fields and properties referring to 'PredictiveProbability'

comment:11 Changed 4 years ago by gkronber

r15188: merged r14899,r14918,r15160,r15163,r15165,r15187 from trunk to stable

comment:12 Changed 4 years ago by gkronber

all changesets merged to stable

comment:13 Changed 4 years ago by gkronber

• Resolution set to done
• Status changed from readytorelease to closed