For instance, in an intense academic debate,5-11 one camp maintained that effects of propranolol on death differed in two groups of study centres, whereas the other remained highly sceptical.

For example, consider the effect of statin therapy on major coronary events (that is, non-fatal myocardial infarction and coronary heart disease death) in patients with varying coronary risks. A 45 year old non-smoking woman without a family history of heart disease and without diabetes presents with a raised serum cholesterol (5.2 mmol/L) and a blood pressure of 130/85 mm Hg.

This example shows how subgroup effects are often present when using the absolute risk reduction, but rarely present when using a relative effect measure. Indeed, in the presence of known prognostic factors that allow definition of groups at varying risk, if no subgroup effect is associated with these factors for relative measures of effect, a subgroup effect for absolute measures must exist.

Subgroups defined according to post-randomisation characteristics might be influenced by the tested interventions; that is, the apparent difference in treatment effect between subgroups can be explained by the intervention itself, or by differing prognostic characteristics in subgroups that emerge after randomisation, rather than by the subgroup characteristic itself.

Thus, the credibility of subgroup hypotheses based on post-randomisation characteristics is severely compromised, and can be rejected simply on this criterion.

In this article, we describe these new criteria, use real-world examples to show how they influence the strength of inference of subgroup hypotheses, and discuss their implications.

Finally, we propose a restructured checklist of items addressing study design, analysis, and context.

These limitations became apparent when deciding on the credibility of a subgroup hypothesis from a large multi-centre randomised trial.24 On the basis of this experience, a review of published methodological articles addressing subgroup analyses, and consultation with clinician and epidemiologist colleagues, we identified four new criteria that could further aid differentiation between spurious and real subgroup effects.

We now believe that failure to consider these criteria could result in misleading inferences about subgroup hypotheses.

A crucial issue in subgroup analyses is that the effects should be examined with relative rather than absolute measures.
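
The arithmetic behind this point can be illustrated with a minimal sketch. The baseline risks and the relative risk reduction below are hypothetical values chosen for illustration only; they are not taken from any trial or from this article. The sketch assumes a relative risk reduction that is identical in every risk group and shows that the absolute risk reduction (and so the number needed to treat) then necessarily differs between groups.

```python
def absolute_risk_reduction(baseline_risk, relative_risk_reduction):
    """Absolute risk reduction implied by a given baseline (control) risk
    and a relative risk reduction: ARR = baseline risk x RRR."""
    return baseline_risk * relative_risk_reduction


# Hypothetical baseline risks of a major coronary event (illustrative only).
baseline_risks = {"low risk": 0.02, "moderate risk": 0.10, "high risk": 0.30}

# Assumed relative risk reduction, held constant across all groups
# (an illustrative assumption, not an estimate from the article).
RRR = 0.30

results = {}
for group, risk in baseline_risks.items():
    arr = absolute_risk_reduction(risk, RRR)
    treated_risk = risk - arr           # risk in the treated group
    relative_risk = treated_risk / risk  # equals 1 - RRR in every group
    nnt = 1 / arr                       # number needed to treat
    results[group] = {"arr": arr, "relative_risk": relative_risk, "nnt": nnt}
    print(f"{group}: relative risk {relative_risk:.2f}, "
          f"ARR {arr:.3f}, NNT {nnt:.0f}")
```

In this sketch the relative risk is the same in all three groups, so there is no subgroup effect on the relative scale, yet the absolute risk reduction rises with baseline risk. An apparent subgroup effect on the absolute scale is therefore expected whenever baseline risk varies, even if the treatment works equally well in relative terms.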