Vickers A. A comparison of the performance of a range of trial-level surrogacy methods. Poster presented at: ISPOR Europe 2022; November 7, 2022; Vienna, Austria. [Abstract] Value Health. 2022 Dec;25(12):S357. doi: 10.1016/j.jval.2022.09.1771


OBJECTIVES: Several trial-level surrogacy methods have been proposed in the literature, but in practice results from only one method are often presented. This research demonstrates the value of comparing predictions from a range of models by plotting trial-level associations with prediction intervals and by presenting results from cross-validation procedures.

METHODS: Two oncology data sets were used as examples; one contained 34 studies with a moderate surrogate association and the other contained 14 studies with a strong association. The models fitted included weighted linear regression, meta-regression, and Bayesian random-effects bivariate meta-analysis (BRMA).
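To make the model classes concrete, the sketch below fits the simplest of them, a frequentist fixed-effects weighted linear regression of trial-level treatment effects, with trials weighted by the inverse variance of the final-outcome effect as described in the RESULTS. It is a minimal illustration rather than the author's code: the data, variable names, and effect scale (log hazard ratios) are hypothetical.

```python
# Minimal sketch (hypothetical data; not the author's code): a frequentist
# fixed-effects weighted linear regression of trial-level treatment effects.
import numpy as np
import statsmodels.api as sm

# Hypothetical trial-level treatment effects (log hazard ratios)
effect_surrogate = np.array([-0.35, -0.10, -0.50, -0.22, -0.05])  # e.g. effect on PFS
effect_final     = np.array([-0.28, -0.04, -0.41, -0.18, -0.02])  # e.g. effect on OS
se_final         = np.array([0.12, 0.15, 0.10, 0.14, 0.18])       # SE of final-outcome effect

# Weight each trial by the inverse variance of the target (final-outcome) effect.
weights = 1.0 / se_final**2

X = sm.add_constant(effect_surrogate)
model = sm.WLS(effect_final, X, weights=weights).fit()

# Predicted final-outcome effect, with a 95% prediction interval, for a new
# trial observing a surrogate effect of -0.30 on the log hazard ratio scale.
new_X = np.array([[1.0, -0.30]])
print(model.get_prediction(new_X).summary_frame(alpha=0.05))
```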

RESULTS: Predictions from the models investigated showed a high degree of variation when the association was moderate, with surrogate threshold effects (STEs) ranging from 0.398 to 0.906, and less variation when the association was strong, with STEs of 0.822 to 0.887. Methods that did not account for all the heterogeneity underestimated the error when trial-level associations were moderate, whereas methods that modelled all the heterogeneity had difficulty converging with the smaller sample size. With many studies present, BRMA gave the most robust results, whereas with a smaller number of studies random-effects meta-regression gave the most robust results. Frequentist fixed-effects weighted regression models gave reasonable predictions in both examples when weighted using the inverse variance of the target variable. However, the results from meta-analytic techniques demonstrated greater uncertainty in predictions.
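The surrogate threshold effect quoted above is commonly defined as the largest (least extreme) treatment effect on the surrogate for which the 95% prediction interval for the final-outcome effect still excludes no effect. Assuming that definition, and continuing the hypothetical weighted regression sketched under METHODS, the STE can be read off a simple grid search.

```python
# Sketch of an STE calculation under the assumed definition above; continues
# the hypothetical `model` fitted in the METHODS sketch.
import numpy as np

grid = np.linspace(-1.0, 0.0, 2001)                   # candidate surrogate effects (log HR)
X_grid = np.column_stack([np.ones_like(grid), grid])
frame = model.get_prediction(X_grid).summary_frame(alpha=0.05)

# Surrogate effects whose 95% prediction interval for the final outcome stays
# below zero, i.e. a benefit on the final outcome is still predicted with confidence.
conclusive = grid[frame["obs_ci_upper"].to_numpy() < 0]
if conclusive.size:
    ste_log_hr = conclusive.max()                     # least extreme qualifying effect
    print(f"STE on the hazard-ratio scale: {np.exp(ste_log_hr):.3f}")
else:
    print("No surrogate effect on the grid gives a conclusive prediction.")
```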

CONCLUSIONS: Frequentist fixed-effects weighted regression provides a useful reference model because its prediction intervals represent 95% of the variance in the data. Because meta-analysis methods model the heterogeneity separately, prediction error is partly determined by how the random-effects part of the model is specified, which can result in a high degree of variation in the results from such models. Plots of trial-level associations with prediction intervals should be presented for all models, so that predictions can be compared across models and the results can be applied to future studies.
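A plot of the kind recommended here can be produced directly from any of the fitted models; the sketch below does so for the hypothetical weighted regression used in the earlier snippets, overlaying the observed trial-level effects, the fitted association, and the 95% prediction interval.

```python
# Sketch of a trial-level association plot with a 95% prediction interval,
# continuing the hypothetical weighted regression from the earlier snippets.
import numpy as np
import matplotlib.pyplot as plt

grid = np.linspace(effect_surrogate.min() - 0.1, effect_surrogate.max() + 0.1, 200)
frame = model.get_prediction(np.column_stack([np.ones_like(grid), grid])).summary_frame(alpha=0.05)

plt.scatter(effect_surrogate, effect_final, s=100 * weights / weights.max(), label="Trials")
plt.plot(grid, frame["mean"], label="Fitted association")
plt.fill_between(grid, frame["obs_ci_lower"], frame["obs_ci_upper"],
                 alpha=0.2, label="95% prediction interval")
plt.xlabel("Treatment effect on surrogate (log HR)")
plt.ylabel("Treatment effect on final outcome (log HR)")
plt.legend()
plt.show()
```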
