Warnings on the inclusion of cluster randomized trials in meta-analysis: results of a simulation study
BMC Medical Research Methodology, volume 25, Article number: 133 (2025)
Abstract
Background
Consolidation of treatment effects from randomized controlled trials (RCT) is considered one of the highest forms of evidence in research. Cluster randomized trials (CRT) are increasingly used in the assessment of the effectiveness of interventions when individual-level randomization is impractical. In meta-analyses, CRTs that address the same clinical question as RCTs can be pooled in the same analysis; however, they need to be analyzed with appropriate statistical methods. This study examined the extent to which meta-analysis results are influenced by the inclusion of incorrectly analyzed CRTs through a series of simulations.
Methods
RCT and CRT datasets were generated with a continuous treatment effect of zero, two trial arms, and an equal number of participants per arm. CRT datasets were generated with varying numbers of clusters (10, 20 or 40), observations per cluster (10, 30 or 50), total variance (1, 5 or 10) and ICC (0.05, 0.10 or 0.20). Each simulated CRT dataset (n = 1000 for each scenario) was analyzed using standard linear regression and mixed-effects regression with clusters treated as random effects, representing the incorrectly and correctly analyzed CRTs, respectively. Meta-analytic datasets were created by varying the total number of studies (4, 8 or 12), the number of CRTs out of the total number of studies (single, half or all), and the number of correctly analyzed CRTs (none, half or all). Model performance was summarized from 1000 random-effects meta-analyses for each scenario.
Results
The percentage of statistically significant results (at p < 0.05) was consistently lower when all CRTs were correctly analyzed. The alpha threshold (5%) was exceeded in 6 (2.47%) of 243 scenarios when all CRTs were correctly analyzed, compared to 177 (72.84%) and 195 (80.25%) scenarios when half or none of the CRTs were correctly analyzed, respectively. Coverage probabilities and model-based SEs were higher when all CRTs were correctly analyzed, while the estimated effect sizes and bias averaged across iterations showed no differences regardless of the number of correctly analyzed CRTs.
Conclusions
Ignoring clustering in CRTs leads to inflated false-positive conclusions about the efficacy of treatments, highlighting the need for caution and proper analytical methods when incorporating CRTs into meta-analyses.
Background
The emergence of the infodemic [1], coupled with the surge in both clinical trials and observational studies, has heightened the need for meta-analyses to synthesize and summarize evidence. Meta-analyses play a key role in combining information from diverse sources, providing an overall assessment of the treatment effect, and informing decision-making within the framework of evidence-based practice [2]. The growing demand for meta-analyses underscores the importance of a robust and systematic approach to evidence synthesis amid a progressively expanding research landscape.
In many fields of research, the consolidation of treatment effects from well-conducted randomized controlled trials (RCT) is generally considered one of the highest forms of evidence [3]. Cluster randomized trials (CRT), a form of randomized trial, are increasingly being utilized in the assessment of the effectiveness of interventions [4]. CRTs differ from conventional RCTs in that clusters, rather than individual participants, are randomized to treatment groups [5]. This design is particularly useful when randomization at the individual level is not feasible for practical or logistical reasons, or when the risk of contamination between treatment groups is high, such as when the effects of the interventions are expected to extend beyond individual participants [6]. The use of CRTs has grown as researchers acknowledge their relevance for tackling specific challenges associated with implementing interventions at the group level [7].
In the context of meta-analysis, CRTs that address the same clinical question as RCTs can be pooled in the same analysis. However, it is crucial to consider and address the methodological differences between these two types of trials. CRTs are generally more complex to analyze than RCTs because observations within a cluster tend to be more similar to each other than to observations in other clusters, a dependence quantified by the intra-cluster correlation coefficient (ICC) [4]. Failing to account for this lack of independence of observations within clusters results in a unit-of-analysis error: the CRT is analyzed as if individuals, rather than clusters, had been randomized to treatment groups [8]. This error leads to artificially small standard errors and may increase the rate of false-positive findings about treatment effects. The issue extends to meta-analysis, since CRTs in which clustering has not been adequately addressed receive inflated weights in the pooling procedure, introducing bias into the analysis.
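To make the consequence of ignoring clustering concrete, the standard design effect, DEFF = 1 + (m − 1) × ICC for clusters of size m, quantifies how much the effective sample size shrinks. The short R sketch below uses illustrative numbers (not taken from this article) to show the scale of the problem.

```r
# Design effect for clusters of size m at a given ICC:
# DEFF = 1 + (m - 1) * ICC
deff <- function(m, icc) 1 + (m - 1) * icc

# Illustrative example: 600 participants in 20 clusters of 30, ICC = 0.10
deff(30, 0.10)        # 3.9: naive SEs are too small by a factor of sqrt(3.9)
600 / deff(30, 0.10)  # effective sample size of about 154, not 600
```

In other words, an analysis that treats these 600 observations as independent behaves as if it had almost four times more information than it actually does.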
Statistical methods for including CRTs in aggregate data meta-analysis have been developed and documented [8]. Yet the inclusion of CRTs that fail to account for clustering remains common in systematic reviews and meta-analyses. An evaluation of CRTs in reviews highlights this issue, revealing that reporting of cluster adjustment methods is often lacking, risk-of-bias assessments are frequently incomplete, and unadjusted CRT data are still routinely included in meta-analyses [9]. Possible explanations include a lack of awareness among researchers of appropriate statistical methods, limitations of statistical software that does not support methods for analyzing clustered data, and differences in journal reporting standards, which may not consistently enforce proper analytical approaches.
Although in many meta-analyses only a small proportion of the included studies are CRTs, their incorrect analysis can still influence the overall meta-analysis results [10]. While it may seem intuitive that incorporating biased estimates from CRTs in meta-analyses leads to biased results, the magnitude and direction of such bias are not always straightforward. The extent of bias depends on several factors. For example, the degree of clustering plays a critical role: when higher ICC values are ignored, the bias tends to be more pronounced because the design effect is underestimated to a greater extent [8]. The proportion of incorrectly analyzed CRTs also matters: if incorrect methods are applied in only a small fraction of CRTs, their impact on the overall meta-analytic estimate may be limited. Furthermore, variations in cluster size can amplify bias, particularly when large clusters dominate the sample, potentially skewing results in one direction. The type of outcome measure can also be relevant, as adjustments to standard errors may differ across outcome types. Prior studies have underscored issues with improper handling of CRTs in individual analyses [8, 11, 12], yet the propagation of these biases in meta-analyses remains relatively underexplored. Thus, this study aimed to examine the extent to which meta-analysis results are influenced by the inclusion of incorrectly analyzed CRTs, by exploring different scenarios through a series of simulations.
Methods
Data generating model and simulation scenarios
The simulation plan was designed to mimic, as closely as possible, realistic conditions for obtaining meta-analytic data. Data-generating models were specified for RCTs and CRTs. For the RCTs, datasets were generated corresponding to a balanced design with two trial arms representing treatment and control, a normally distributed continuous treatment effect of 0 (i.e. a mean difference of 0 between treatment and control groups), and a fixed sample size of 1000 per study. CRT datasets were generated with varying numbers of clusters (10, 20 or 40), observations per cluster (10, 30 or 50), total variance (1, 5 or 10; the sum of the cluster variance and the residual variance) and ICC (0.05, 0.10 or 0.20; generated by adjusting the cluster and residual variances accordingly). These ICC values were chosen to represent a wide range of plausible values typically observed in primary care and public health studies. For simplicity, only two trial arms were simulated, with an equal number of clusters and observations per arm. Table 1 shows the 81 data-generating scenarios used for simulating the CRT datasets. Under each scenario, 1000 datasets were produced using the simstudy package in R [13]. In all scenarios, the true treatment effect was set to be the same across studies. The simulation codes are available in Additional file 1.
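As an illustration of one such data-generating scenario (total variance 5, ICC 0.10, 20 clusters of 30 observations, true effect 0), the sketch below shows the general simstudy approach; it is not the article's code, which is provided in Additional file 1.

```r
library(simstudy)

total_var <- 5
icc       <- 0.10
s2_b <- icc * total_var    # between-cluster variance
s2_e <- total_var - s2_b   # residual variance, so ICC = s2_b / total_var

# 20 clusters with a normally distributed cluster effect,
# randomized 1:1 to two trial arms
defc <- defData(varname = "ceff", formula = 0, variance = s2_b,
                dist = "normal", id = "cluster")
dc <- genData(20, defc)
dc <- trtAssign(dc, nTrt = 2, balanced = TRUE, grpName = "arm")

# 30 individuals per cluster; the true treatment effect is set to 0
di <- genCluster(dc, cLevelVar = "cluster", numIndsVar = 30, level1ID = "id")
defi <- defDataAdd(varname = "y", formula = "ceff + 0 * arm",
                   variance = s2_e, dist = "normal")
di <- addColumns(defi, di)
```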
Generating and analyzing the meta-analytic datasets
The meta-analytic datasets were generated based on a two-stage approach. First, each simulated CRT dataset described above was fitted using two models: a standard linear regression model and a mixed-effects regression model with clusters treated as random effects using the lme4 package [14]. Throughout this work, the first model was treated as incorrectly specified, given that it ignored clustering, while the second model was treated as correctly specified [11]. Second, the beta-coefficient (i.e. mean difference between the treatment and control arms) and the standard error were extracted from each model. The above steps allowed the generation of pairs of aggregate outcome data which were used to create the meta-analytic datasets (with correctly and incorrectly analyzed CRTs), varying in the number of studies (4, 8 or 12), the number of CRTs out of the total number of studies (single, half or all), and the number of correctly analyzed CRTs (none, half or all). Table 2 provides a summary of the combinations of studies in the meta-analytic datasets.
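A minimal sketch of this second stage, reusing the simulated dataset di from the previous block (object names are illustrative), might look as follows.

```r
library(lme4)

# Incorrectly specified: ordinary linear regression ignoring the clusters
fit_ols <- lm(y ~ arm, data = di)

# Correctly specified: mixed-effects model with a random cluster intercept
fit_mlm <- lmer(y ~ arm + (1 | cluster), data = di)

# Extract the aggregate outcome data for the meta-analytic dataset:
# the mean difference (coefficient for arm) and its standard error
wrong <- summary(fit_ols)$coefficients["arm", c("Estimate", "Std. Error")]
right <- summary(fit_mlm)$coefficients["arm", c("Estimate", "Std. Error")]
```

The OLS standard error will typically be smaller than the mixed-model one, which is precisely what gives the incorrectly analyzed CRT an inflated weight in the pooling step.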
Thus, the intersection of the scenarios outlined in Tables 1 and 2 produced a total of 1944 (81 × 24) meta-analytic scenarios. For each scenario, random-effects meta-analysis was carried out 1000 times, employing the Sidik-Jonkman estimator of the between-study heterogeneity tau [15]. This method was chosen for its better performance with larger between-study heterogeneity and its computational efficiency, although alternative estimators such as the DerSimonian-Laird or the restricted maximum likelihood estimator could also be considered [16].
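The article does not name the software used for the pooling step; in R, the metafor package is one widely used option whose rma() function implements the Sidik-Jonkman estimator. A self-contained sketch with made-up study-level inputs:

```r
library(metafor)

set.seed(2025)
k   <- 8
yi  <- rnorm(k, mean = 0, sd = 0.15)  # illustrative mean differences
sei <- runif(k, 0.05, 0.25)           # illustrative standard errors

# Random-effects meta-analysis with the Sidik-Jonkman tau^2 estimator
res <- rma(yi = yi, sei = sei, method = "SJ")
c(pooled = as.numeric(res$beta), se = res$se, p = res$pval)
```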
Assessing model performance
To assess model performance, the estimated effect size averaged over all simulation iterations was obtained, together with the bias (i.e. the difference between the mean pooled effect size and the treatment effect specified during data generation), the model-based standard error (i.e. the average of the standard errors across repetitions), the coverage (i.e. the percentage of repetitions in which the 95% confidence interval contained the treatment effect), and the percentage of statistically significant meta-analytic results at p < 0.05 out of 1000 pooled effect sizes [17]. It is important to note that the percentage of statistically significant meta-analytic results represents the type I error in the main analysis (where the true treatment effect was set to 0) and the power in the sensitivity analysis (where the true treatment effect was set to 0.3). In addition, the Monte Carlo standard error (MCSE; a measure of the uncertainty around the estimates from repeated sampling in simulations) was reported for each performance measure. The simulation results were analyzed using the siman suite in Stata [18], while plots were generated using the ggplot2 package in R [19, 20].
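The article computed these measures with the siman suite in Stata; purely as an illustration, the R function below re-expresses them under a normal approximation for the 95% intervals and significance tests.

```r
# Performance measures over n_sim repetitions of one scenario.
# est, se: pooled effects and their SEs; theta: true treatment effect
performance <- function(est, se, theta = 0) {
  n_sim   <- length(est)
  covered <- (est - 1.96 * se) <= theta & theta <= (est + 1.96 * se)
  signif  <- abs(est / se) > 1.96
  c(mean_est   = mean(est),
    bias       = mean(est) - theta,         # mean error relative to truth
    bias_mcse  = sd(est) / sqrt(n_sim),     # Monte Carlo SE of the bias
    model_se   = mean(se),                  # model-based SE
    coverage   = mean(covered),             # 95% CI coverage probability
    pct_signif = mean(signif))              # type I error when theta = 0
}
```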
Sensitivity analysis
As a sensitivity analysis, a mean difference of 0.3 between treatment and control groups was considered. This effect size was chosen because: (1) although it may be small at the individual level, it reflects a practical and achievable outcome in real-world interventions, for example in the context of nutrition intervention studies [21]; and (2) it provides a balanced power profile, avoiding the saturation that might occur with larger effects and facilitating meaningful comparisons across meta-analyses where none, half, or all CRTs were correctly analyzed.
Results
We primarily present the simulation results from scenarios where all studies included in the meta-analysis were CRTs (i.e. 4 CRTs of 4 studies, 8 CRTs of 8 studies, and 12 CRTs of 12 studies), covering a total of 729 scenarios. While such scenarios are less common in practice (a review found that only 10% of reviews include only CRTs, whereas the remaining 90% include a mix of RCTs and CRTs [9]), they represent the upper bound of potential biases and undercoverage, allowing for a clearer assessment of the impact of including incorrectly analyzed CRTs in meta-analyses. To provide a more comprehensive picture, a summary of results for scenarios where only one (n = 486 scenarios) or half (n = 729 scenarios) of the included studies were CRTs is presented in the succeeding subsection. All performance measure estimates and their MCSEs are available in the supplementary file.
Percentage of statistically significant results
Across diverse scenarios encompassing varying numbers of clusters, numbers of observations per cluster, and total variance, the percentage of statistically significant results (at p < 0.05) was consistently lower when all CRTs were correctly analyzed (Fig. 1). Out of 243 scenarios with 1000 repetitions each, the alpha threshold of 5% was exceeded in 6 (2.47%) scenarios when all CRTs were correctly analyzed, in 177 (72.84%) scenarios when half of the CRTs were correctly analyzed, and in 195 (80.25%) scenarios when none of the CRTs were correctly analyzed, suggesting substantial inflation of the type I error rate when CRTs are not properly accounted for. In terms of the absolute difference in the percentage of statistically significant results, the scenarios where none or half of the CRTs were correctly analyzed produced higher type I error than when all CRTs were correctly analyzed, on average by 4.4% (maximum 12.4%) and by 4.0% (maximum 11.6%) across the 243 scenarios, respectively (Table 3). Higher percentage differences were observed with higher ICCs and a smaller number of studies. A similar pattern was observed in the sensitivity analysis with a treatment effect of 0.3. Notably, the percentage differences in power were systematically higher than when the treatment effect was zero, and the total variance showed an influence on power (Table 3; Additional file 2).
Coverage
The coverage probabilities ranged from 84.2% to 98.7%. Overall, coverage was higher when all CRTs were correctly analyzed (mean coverage across 243 scenarios: 97.1%; range: 94.2% to 98.7%) compared to when only half (93.2%; 84.8% to 97.2%) or none (92.7%; 84.2% to 96.9%) of the CRTs were correctly analyzed (Fig. 2). Higher coverage probabilities were observed with a higher number of studies included in the meta-analysis and with lower ICC, while coverage remained consistent across the different variances. The observed trends in coverage probabilities were similar in the sensitivity analysis with a treatment effect of 0.3 (Additional file 3), suggesting that coverage patterns were not influenced by the magnitude of the treatment effect.
Mean of the pooled effect sizes, bias, and model-based SE
The estimated effect size averaged over all simulation iterations showed a notable level of consistency with the treatment effect (0 and 0.3) set during the data generation process. Consequently, the mean bias was 0.00, ranging from −0.02 to 0.02, in both the main (Fig. 3) and sensitivity analyses (Additional file 4). For both the mean and the bias, higher MCSEs were observed with higher variance, fewer studies, and higher ICC; however, all MCSE estimates remained below 0.02. The model-based SE followed this trend, with higher SEs observed with higher variance, fewer studies, and higher ICC (Fig. 4). As such, the largest model-based SEs were observed in the scenarios including 4 studies, a variance of 10, and an ICC of 0.20. Conversely, smaller SEs were generated with a larger number of clusters and a larger number of observations per cluster, although this trend was more pronounced for the former than the latter. There were no marked differences in the mean of the pooled effect sizes or in the bias according to the number of correctly analyzed CRTs; however, larger model-based SEs were observed when all CRTs were correctly analyzed compared to when only half or none of the CRTs were correctly analyzed.
Results of scenarios where only one or half of the included studies were CRTs
Compared to scenarios where all included studies were CRTs, the inflation of the percentage of statistically significant results and the reduction in coverage were less pronounced in scenarios where only one or half of the studies were CRTs, although the overall pattern according to the number of correctly analyzed CRTs persisted. Specifically, when none of the CRTs were correctly analyzed, the nominal 5% alpha threshold was exceeded in 78 out of 243 scenarios (32.1%) with half CRTs, and in 9 out of 243 scenarios (3.7%) with one CRT (Table 4; Additional files 5 and 6). In contrast, when all CRTs were correctly analyzed, none of the scenarios (0 out of 243) exceeded the threshold. Higher coverage probabilities were observed when all CRTs were correctly analyzed, regardless of the number of CRTs included in the meta-analyses (Additional files 7 and 8). As in the scenarios where all included studies were CRTs, the effect size averaged over all simulation iterations and the mean bias were zero, suggesting that the primary impact of incorporating incorrectly analyzed CRTs in meta-analyses manifests through statistical significance and coverage. Finally, the trends for the model-based SE were similar to those observed in the all-CRT scenarios, but less prominent (Additional files 9 and 10).
Discussion
This simulation study explored the impact of including incorrectly analyzed CRTs in meta-analyses. By systematically varying key parameters (the total number of studies, the variance, the number of CRTs, the number of clusters, the number of observations per cluster, and the ICC), this study provided insights into how meta-analysis results vary across scenarios where none, half, or all CRTs were correctly analyzed. The results highlighted the importance of properly accounting for clustering effects in CRTs to avoid biased estimates and misleading conclusions in meta-analyses [8].
The study revealed important trends in the model performance measures. Across diverse scenarios, the estimated effect size averaged across all simulation iterations and the bias exhibited remarkable consistency, suggesting that despite several variations in study characteristics, the overall treatment effect estimation was relatively stable across repeated iterations. This also suggests that when incorrectly analyzed CRTs are included in a meta-analysis, the issue is one of increased type I error and reduced coverage, rather than systematic under- or overestimation of treatment effects. However, it is important to note that, for both metrics, the MCSE increased with the number of CRTs making up the total number of studies (i.e. higher MCSE when all of the studies included in the meta-analyses were CRTs than when only one or half were CRTs), suggesting greater uncertainty and potential variability in the estimates when more CRTs were included. In practice, this issue is expected to be more severe, as there will likely be a broader array of potential sources of heterogeneity when incorporating CRTs into meta-analyses, for example, the type of unit randomized (household, community, school, or other forms of clusters), the size of the clusters, and the composition of the sample [22].
The trends observed for the model-based SEs were as anticipated. Higher model-based SEs were generated when all CRTs were correctly analyzed, since these scenarios have smaller effective sample sizes than when none or only half of the CRTs were correctly analyzed [23]. It was interesting to see that the model-based SEs decreased more markedly with an increasing number of clusters than with an increasing number of observations per cluster (i.e. cluster size). This is consistent with previous findings [23, 24], which show that a high number of clusters (holding the cluster size constant) yields a smaller design effect and a larger effective sample size than a high cluster size (holding the number of clusters constant), as the numeric check below illustrates. This pattern underscores the influence of study design factors on the precision of treatment effect estimates, which are important to consider when designing CRTs.
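A small numeric check of this point, using the standard effective-sample-size formula rather than anything from the article's code:

```r
# Effective sample size for k clusters of size m at a given ICC
ess <- function(k, m, icc) (k * m) / (1 + (m - 1) * icc)

ess(k = 40, m = 10, icc = 0.10)  # many small clusters: ~211 effective of 400
ess(k = 10, m = 40, icc = 0.10)  # few large clusters:  ~82 effective of 400
```

With the same 400 observations and the same ICC, spreading them over more, smaller clusters preserves far more information.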
The scenarios where all CRTs were correctly analyzed yielded higher coverage probabilities, indicating that more confidence intervals encompassed the true value of the parameter being estimated. The lower coverage probabilities in the scenarios where half or none of the CRTs were correctly analyzed suggest potential issues with the accuracy and precision of the estimates from these scenarios. We provided some indication of the extent of the problem (only about 1 in 4 and 1 in 5 of the scenarios achieved good coverage when half and none of the CRTs were correctly analyzed, respectively), although how this applies in practice requires further investigation. In terms of the percentage of statistically significant results, a clear pattern was observed: the scenarios where none or only half of the CRTs were correctly analyzed produced higher proportions of statistically significant findings. This overestimation can lead to false-positive findings about treatment effects, and it reflects the underestimated uncertainty and inflated type I error rates caused by insufficient adjustment for clustering effects in CRTs. The widening of the differences in the percentage of statistically significant results with increasing ICC points to the importance of also considering the magnitude of clustering effects. In practice, compared to our simulation scenarios, several other factors can potentially increase the type I error rate in CRTs, such as a small number of clusters and variable cluster sizes [25, 26]. These findings highlight that meta-analytic results are contingent upon proper handling of clustering in CRTs.
Our sensitivity analysis using a treatment effect of 0.3 demonstrated that even with this relatively small effect, the biases associated with the incorrect analysis of CRTs were evident, leading to an inflated rate of statistically significant findings (overstated power). We anticipate that these distortions would be further amplified with larger treatment effects, potentially leading to an even greater overestimation of significance when CRTs are not correctly analyzed. This highlights the importance of appropriate analytical strategies when including CRTs in meta-analyses, particularly in the presence of large treatment effects.
Some limitations of this simulation study warrant consideration. First, the scenarios were based on certain assumptions and conditions, and may not fully capture the complexity of real-world data. For example, in aggregate data meta-analysis, researchers often lack access to the ICC, as it is frequently not reported in primary studies. As a result, they typically assume a particular value (usually derived from another source) or a range of values to calculate the design effect and adjust the effective sample size [8]. Alternatively, some authors conduct sensitivity analyses that exclude CRTs. These approaches could lead to different outcomes than those observed in this study. Second, while the study explored a broad range of scenarios, it did not include other factors such as unequal numbers of clusters, unequal cluster sizes, and various mixtures of sample sizes, which are expected to affect model performance. Third, we only considered a random-effects model in our analysis and did not assess a common-effect model, which may have yielded different observations. This decision was made a priori, given that meta-analyses including CRTs often exhibit substantial between-study heterogeneity. Fourth, the study was limited to a continuous treatment effect and did not consider other forms of outcome measures (e.g. binary outcomes). It is well recognized that the analysis of binary outcomes in CRTs, particularly when estimating odds ratios, is more complex due to differences in interpretation between marginal and cluster-specific estimands and their implications for effect estimation. Compared to continuous outcomes, the choice of estimand becomes more important and should be guided by a clear research question: whether the researcher is interested in a population-averaged or a cluster-specific effect. This topic is currently under investigation [27], which may help inform the appropriate strategy for incorporating CRTs with binary outcomes into meta-analyses. Finally, future research could explore alternative simulation scenarios and evaluate the performance of different meta-analytic models under varying conditions, further enhancing our understanding of methodological considerations in meta-analyses involving CRTs.
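For the ICC-assumption approach mentioned in the first limitation, a common approximate correction (described in the Cochrane Handbook [8]) is to multiply the reported, cluster-ignoring standard error by the square root of the design effect computed from an assumed ICC. The sketch below uses hypothetical numbers.

```r
# Inflate an unadjusted CRT standard error by sqrt(design effect),
# using an ICC assumed from an external source
adjust_se <- function(se_unadjusted, m, icc_assumed) {
  se_unadjusted * sqrt(1 + (m - 1) * icc_assumed)
}

# Hypothetical CRT: reported SE = 0.08, average cluster size 30, assumed ICC 0.05
adjust_se(0.08, m = 30, icc_assumed = 0.05)  # ~0.125
```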
Conclusion
In conclusion, the findings from this simulation study have important implications for practice, emphasizing the need for robust methodological approaches to evidence synthesis, particularly when incorporating CRTs in meta-analyses. The inclusion of incorrectly analyzed CRTs in meta-analyses leads to false-positive conclusions about the efficacy of a treatment. Researchers should handle clustering effects appropriately to ensure the validity and reliability of meta-analytic findings.
Data availability
The materials and codes used are available as additional files.
Abbreviations
- RCT:
-
Randomized controlled trial
- CRT:
-
Cluster randomized trial
- ICC:
-
Intra-cluster correlation coefficient
- MCSE:
-
Monte Carlo standard error
References
1. Zhang J, Pan Y, Lin H, Sun Z, Wu P, Tu J. Infodemic: challenges and solutions in topic discovery and data process. Arch Public Health. 2023;81(1):166.
2. Gurevitch J, Koricheva J, Nakagawa S, Stewart G. Meta-analysis and the science of research synthesis. Nature. 2018;555(7695):175–82.
3. Deaton A, Cartwright N. Understanding and misunderstanding randomized controlled trials. Soc Sci Med. 2018;210:2–21.
4. Dron L, Taljaard M, Cheung YB, Grais R, Ford N, Thorlund K, et al. The role and challenges of cluster randomised trials for global health. Lancet Glob Health. 2021;9(5):e701–10.
5. Moberg J, Kramer M. A brief history of the cluster randomised trial design. J R Soc Med. 2015;108(5):192–8.
6. Torgerson DJ. Contamination in trials: is cluster randomisation the answer? BMJ. 2001;322(7282):355–7.
7. Allanson ER, Tunçalp Ö, Vogel JP, Khan DN, Oladapo OT, Long Q, et al. Implementation of effective practices in health facilities: a systematic review of cluster randomised trials. BMJ Glob Health. 2017;2(2):e000266.
8. Higgins JPT, Thomas J, Chandler J, Cumpston M, Li T, Page MJ, Welch VA, editors. Cochrane Handbook for Systematic Reviews of Interventions version 6.5 (updated August 2024). Cochrane; 2024. Available from: www.training.cochrane.org/handbook.
9. Richardson M, Garner P, Donegan S. Cluster randomised trials in Cochrane reviews: evaluation of methodological and reporting practice. PLoS ONE. 2016;11(3):e0151818.
10. Bolland MJ, Avenell A, Grey A. Analysis of cluster randomised trials as if they were individually randomised. Lancet Diabetes Endocrinol. 2023;11(2):75.
11. Hemming K, Taljaard M. Key considerations for designing, conducting and analysing a cluster randomized trial. Int J Epidemiol. 2023;52(5):1648–58.
12. Billot L, Copas A, Leyrat C, Forbes A, Turner EL. How should a cluster randomized trial be analyzed? J Epidemiol Popul Health. 2024;72(1):202196.
13. Goldfeld KS, Wujciak-Jens J. simstudy: illuminating research methods through data generation. J Open Source Softw. 2020;5:2763.
14. Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J Stat Softw. 2015;67(1):1–48.
15. Sidik K, Jonkman JN. Robust variance estimation for random effects meta-analysis. Comput Stat Data Anal. 2006;50(12):3681–701.
16. Veroniki AA, Jackson D, Viechtbauer W, Bender R, Bowden J, Knapp G, et al. Methods to estimate the between-study variance and its uncertainty in meta-analysis. Res Synth Methods. 2016;7(1):55–79.
17. Morris TP, White IR, Crowther MJ. Using simulation studies to evaluate statistical methods. Stat Med. 2019;38(11):2074–102.
18. Marley-Zagar E, White IR, Morris TP. A suite of Stata programs for analysing simulation studies. London Stata Conference 2022 02, Stata Users Group; 2022. Available from: https://ideas.repec.org/p/boc/lsug22/02.html.
19. Wickham H. ggplot2: elegant graphics for data analysis. New York: Springer-Verlag; 2016.
20. Weber F, Knapp G, Glass Ä, Kundt G, Ickstadt K. Interval estimation of the overall treatment effect in random-effects meta-analyses: recommendations from a simulation study comparing frequentist, Bayesian, and bootstrap methods. Res Synth Methods. 2021;12(3):291–315.
21. Candel M, van Breukelen GJP. Best (but oft forgotten) practices: efficient sample sizes for commonly used trial designs. Am J Clin Nutr. 2023;117(6):1063–85.
22. Donner A, Klar N. Issues in the meta-analysis of cluster randomized trials. Stat Med. 2002;21(19):2971–80.
23. Killip S, Mahfoud Z, Pearce K. What is an intracluster correlation coefficient? Crucial concepts for primary care researchers. Ann Fam Med. 2004;2(3):204–8.
24. Hemming K, Eldridge S, Forbes G, Weijer C, Taljaard M. How to design efficient cluster randomised trials. BMJ. 2017;358:j3064.
25. Kahan BC, Forbes G, Ali Y, Jairath V, Bremner S, Harhay MO, et al. Increased risk of type I errors in cluster randomised trials with small or medium numbers of clusters: a review, reanalysis, and simulation study. Trials. 2016;17(1):438.
26. Rutterford C, Copas A, Eldridge S. Methods for sample size determination in cluster randomized trials. Int J Epidemiol. 2015;44(3):1051–67.
27. Hemming K, Thompson JY, Taljaard M, Watson SI, Kasza J, Thompson JA, et al. Re-analysis of data from cluster randomised trials to explore the impact of model choice on estimates of odds ratios: study protocol. Trials. 2024;25(1):818.
Acknowledgements
Not applicable.
Funding
No funding was received for this work.
Author information
Authors and Affiliations
Contributions
JS and GLDT conceptualized the study. JS analyzed the data, reviewed by ER and GLDT. JS drafted the initial version of the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 2. Percentage of statistically significant results across scenarios with all CRTs when treatment effect is 0.3
Additional file 11. Performance measures across scenarios based on correctly analyzed CRTs (None vs Half vs All) when all included studies are CRTs
Additional file 12. Performance measures across scenarios based on correctly analyzed CRTs (None vs Half vs All) when half of the included studies are CRTs
Additional file 13. Performance measures across scenarios based on correctly analyzed CRTs (None vs Half vs All) when one of the included studies is a CRT
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Santos, J.A.R., Riggi, E. & Di Tanna, G.L. Warnings on the inclusion of cluster randomized trials in meta-analysis: results of a simulation study. BMC Med Res Methodol 25, 133 (2025). https://doi.org/10.1186/s12874-025-02586-2
DOI: https://doi.org/10.1186/s12874-025-02586-2