Tutorial 09: Bootstrap and Pareto smoothed importance sampling
Finn Lindgren
Source:vignettes/Tutorial09.Rmd
Tutorial09.Rmd
Introduction
In this lab session you will explore
- Exchangeability test for prediction scores
- Pareto smoothed importance sampling
- Open one of your github lab repository clone projects
- During this lab, you can work in a
.R
file, but working with code chunks in a.Rmd
is recommended. - Make sure you update the
StatCompLab
package to version 21.9.0 or higher.
Solution: The accompanying
Tutorial09Solutions
tutorial/vignette documents contain the
solutions explicitly, to make it easier to review the material after the
workshops.
Filement model prediction comparison
Leave-one-out prediction
Revisit the Tutorial 6 and load the 3D printer filament data:
data("filament1", package = "StatCompLab")
In tutorial 6, the two models A and B were estimated with
Predictions and scores were done with
pred_A <- filament1_predict(fit_A, newdata = filament1)
pred_B <- filament1_predict(fit_B, newdata = filament1)
score_A <- cbind(pred_A, filament1) %>%
mutate(
se = proper_score("se", Actual_Weight, mean = mean),
ds = proper_score("ds", Actual_Weight, mean = mean, sd = sd),
interval = proper_score("interval", Actual_Weight,
lwr = lwr, upr = upr, alpha = 0.1)
)
score_B <- cbind(pred_B, filament1) %>%
mutate(
se = proper_score("se", Actual_Weight, mean = mean),
ds = proper_score("ds", Actual_Weight, mean = mean, sd = sd),
interval = proper_score("interval", Actual_Weight,
lwr = lwr, upr = upr, alpha = 0.1)
)
as well as with a 50% estimation/prediction data split.
Implement a function leave1out(data, model)
that
performs leave-one-out cross validation for the selected
model
. the two models; for each \(i=1,\dots,N\), estimate the model
parameters using \(\{(x_j,y_j), j\neq
i\}\), compute the prediction information based on \(x_i\) for prediction model \(F_i\), and compute the required scores
\(S(F_i,y_i)\).
The output should be a data.frame
with \(N\) rows that extends the original data
frame with four additional columns mean
, sd
,
se
and ds
of leave-one-out prediction means,
standard deviations, and prediction scores for the Squared Error and
Dawid-Sebastiani scores.
The following code should work if your code is correct:
score_A <- leave1out(filament1, model = "A")
score_B <- leave1out(filament1, model = "B")
ggplot() +
geom_point(aes(CAD_Weight, se, colour = "A"), data = score_A) +
geom_point(aes(CAD_Weight, se, colour = "B"), data = score_B)
ggplot() +
geom_point(aes(CAD_Weight, ds, colour = "A"), data = score_A) +
geom_point(aes(CAD_Weight, ds, colour = "B"), data = score_B)
Exchangeability test
We want to investigate whether one model is better at predicting than the other. If they are equivalent, it should not matter, on average, if we randomly swap the A and B prediction scores within each leave-one-out score pair \((S^A_i, S^B_i)\). Use the test statistic \(\frac{1}{N}\sum_{i=1}^N (S^A_i - S^B_i)\), and make a Monte Carlo estimate of the p-value for a test of exchangeability between model predictions from A and from B against the alternative hypothesis that B is better than A. First compute the test statistic for the two prediction scores.
Hints: For this particular test statistic, one possible approach is to first compute the pairwise score differences and then generate randomisation samples by sampling random sign changes. Compute the test statistic for each randomly altered set of values and compare with the original test statistic. See the lecture on exchangeability for more information. Start with \(J=1000\) iterations when testing your code. Increase to 10000 for more precise results.
Pareto smoothed importance sampling
Explore the Pareto smoothed importance sampling (PSIS) method. Use a t-distribution with 10 degrees of freedom, and importance sampling distribution \(x\sim \mathsf{Normal}(0, \sigma^2)\) for different values of \(\sigma\). Use \(N=100\) or more samples.
Use the loo::psis()
function to compute Pareto smoothed
importance log-weights and diagnostics. What effect does different
values of \(\sigma\) have on the \(k\) parameter estimate? (the \(\hat{k}\) value is in the
$diagnostics$pareto_k
field of the loo::psis()
output.) Plot the original and smoothed log-weights as a function of
\(x\), and with stat_ecdf
,
for different values of \(\sigma\),
including \(\sigma=1\) and \(\sigma=2\).