Table 4 Summary of key characteristics of the several modelling approaches

From: Comparison of Bayesian approaches for developing prediction models in rare disease: application to the identification of patients with Maturity-Onset Diabetes of the Young

Columns: Modelling Approaches | Advantages | Disadvantages

Original

Advantages:

- Model can be built in one training dataset (e.g. case-control) but probabilities can be updated based on information of prior probability from other studies.

Disadvantages:

- Individuals cannot have a recalibrated probability lower than the estimated prevalence in the general population.

- Probabilities are aggregated into groups.

- The recalibrated probabilities can be sensitive to the choice of grouping used.

Albert Offset

Advantages:

- Model can be built in one training dataset (e.g. case-control) and probabilities can be updated based on information of prior probability from other studies.

Disadvantages:

- Assumes the likelihood ratio for any given set of covariates is the same in the training data and calibration/target population.

Re-estimation

Advantages:

- Model can be built on one training dataset (population-representative), so no recalibration is necessary.

Disadvantages:

- Requires a large sample size from the general population if model development is required.

- High uncertainty in a rare disease setting because of the very low number of positive cases.

Recalibration

Advantages:

- Scales odds ratios, allowing for easy recalibration so that the model can be used in different settings (e.g. general population or lab referral usage).

Disadvantages:

- Requires two datasets: a training dataset (e.g. case-control) and a recalibration dataset (e.g. general population).

Re-estimation mixture

Advantages:

- The model can be specified to include additional information on biomarker testing.

- Model can be built on one training dataset (population-representative), so no recalibration is necessary.

Disadvantages:

- Requires a large sample size from the general population.

- High uncertainty in a rare disease setting because of the very low number of positive cases.

Recalibration mixture

Advantages:

- The model can be specified to include additional information on biomarker testing.

- Scales odds ratios, allowing for easy recalibration so that the model can be used in different settings (e.g. general population or lab referral usage).

Disadvantages:

- Requires two datasets: a training dataset (e.g. case-control) and a recalibration dataset (e.g. general population).
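
Two ideas in the table lend themselves to a short numerical illustration: updating a probability with a different prior probability under the constant likelihood ratio assumption (the Albert Offset row), and scaling the odds ratios of an existing model (the Recalibration rows). The sketch below shows one common way to express each on the log-odds scale; it is not necessarily the exact formulation used in the source. The function names, the prevalence values and the intercept and slope arguments are all assumptions made for illustration.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def update_for_prevalence(p_train: float, prev_train: float, prev_target: float) -> float:
    """Prior-probability update (hypothetical helper).

    Assumes the likelihood ratio for a given covariate pattern is the same
    in the training data and the target population, so only the prior odds
    change: posterior odds = prior odds * likelihood ratio.
    """
    log_likelihood_ratio = logit(p_train) - logit(prev_train)
    return expit(logit(prev_target) + log_likelihood_ratio)


def recalibrate(linear_predictor: float, intercept: float, slope: float) -> float:
    """Logistic recalibration (hypothetical helper).

    The slope multiplies the original linear predictor, which scales every
    log odds ratio by the same factor; the intercept shifts the baseline.
    Both values would be estimated in a separate recalibration dataset.
    """
    return expit(intercept + slope * linear_predictor)


if __name__ == "__main__":
    # Illustrative values only: a 1:1 case-control design and a rare target setting.
    p_case_control = 0.80
    print(update_for_prevalence(p_case_control, prev_train=0.5, prev_target=0.01))
    print(recalibrate(logit(p_case_control), intercept=-4.0, slope=0.9))
```

In this sketch the prior update needs only a prevalence estimate for the target setting, whereas the recalibration step needs a second dataset to estimate the intercept and slope, which mirrors the trade-off listed in the table.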