It just don’t work on you

How to interpret a hazard ratio

Dec 06, 2023

In the blues stomper Got My Mojo Workin’, Muddy Waters shouts:


Got my mojo working,
Got my mojo working,
Got my mojo working, but it just don’t work on you.

However dazzling a treatment looked in its trial, it didn’t work on everyone and, like the mojo, may not work on you.1 You have to down the elixir and see. But the odds that you’ll improve are part of the trial’s results, captured in the hazard ratio — the HR. Yes, the mysterious HR. We'll unpack it in this post.

Be patient — an HR of 0.5 does not mean you have a 50% chance and an HR of 1.5 does not mean you have a 150% chance. For the tl;dr crowd, HR literally gives racing odds on which trial arm will reach endpoints2 first. A dud has HR=1.

Knowing whether a drug will work doesn’t tell you how much good it will do. Perhaps few patients in the trial responded but did so miraculously. Or nearly everyone responded but gained only weeks. Since HR and treatment benefit are free to go in separate directions, you need additional information to judge a trial.3

The intuition

Let's conduct a poor-man’s clinical trial:

On January 1, 2010, we give 100 men standard treatment and 100 men a new drug. On January 1, 2020, we count deaths in each group.

If 50 men died under standard treatment and 25 died on the drug, we’d report that the death rate in the drug group was 50% of the death rate in the standard-of-care group.

This result (called risk ratio or relative risk) gives the essence of a hazard ratio: Compared to someone on the control arm, how much less likely am I to experience some unwelcome event (death, PSA rise, new metastasis) in any given period?
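The arithmetic of the toy trial can be sketched in a few lines of Python. The function name `risk_ratio` and its arguments are made up for this illustration; the numbers are the ones from the example above.

```python
def risk_ratio(events_treated, n_treated, events_control, n_control):
    """Risk in the treated arm divided by risk in the control arm."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    return risk_treated / risk_control

# Toy trial from the text: 25 of 100 men died on the drug,
# 50 of 100 died under standard treatment.
rr = risk_ratio(25, 100, 50, 100)
print(f"risk ratio = {rr:.2f}")  # risk ratio = 0.50
```

A risk ratio of 0.50 is the "death rate was 50% of the control rate" statement in number form.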

Real trials have complications we haven’t accounted for. Participants start at different times. Some drop out.4 But it's still possible to do a careful analysis and get the same kind of information, and the result is the hazard ratio. We first discuss what it is that HR measures, then how to convert HR into a number interpretable as relative risk.

On your mark

An imaginary 20-month trial created to illustrate HR. Dots are deaths; blue is experimental, red is control. Deaths occurred randomly, and each had a 75% probability of being red, so a control patient had a 75% probability of being the one closer to the endpoint and an experimental patient had a 25% probability. We then use the definition of odds (probability for divided by probability against) to compute the HR. HR gives the odds that an experimental patient died earlier. That’s 25/75, or 1/3, so the hazard ratio is 0.33.
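Here is a rough sketch of the same calculation in Python, assuming the setup in the caption (each death is red with probability 0.75). The helper `odds`, the seed, and the number of simulated dots are choices made for this illustration only.

```python
import random

def odds(p):
    """Odds = probability for divided by probability against."""
    return p / (1 - p)

# From the caption: any given death is a control (red) patient with probability 0.75.
P_RED = 0.75

# Exact value: odds that the earlier death belongs to the experimental (blue) arm.
hr_exact = odds(1 - P_RED)
print(f"HR from the caption = {hr_exact:.2f}")  # HR from the caption = 0.33

# Rough check by simulation: colour 100,000 deaths at random.
rng = random.Random(0)  # fixed seed so the run is repeatable
n_dots = 100_000
n_blue = sum(rng.random() >= P_RED for _ in range(n_dots))
share_blue = n_blue / n_dots
hr_simulated = odds(share_blue)
print(f"simulated blue share = {share_blue:.3f}, simulated HR = {hr_simulated:.2f}")
```

The simulated HR should land close to 0.33 but will wobble a little from run to run if you change the seed.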

In our simpleminded trial we said the rate of death had been halved. A clinical trial5 is in fact a race: patients in each arm are heading for their endpoint, and success hinges on which group reaches its endpoints first.

If a disease is curable — the endpoint could be fever returning to normal — success means that patients in the experimental arm reach the endpoint earlier than control patients.

In cancer, where an endpoint means the party’s over — rising PSA, new metastasis, death — success means that treated patients take their time and let the early endpoints be hit by controls.

This leads us to the hazard ratio: A hazard ratio is the odds6 that a patient on the experimental arm reaches the endpoint before a patient on the control arm.7

HR should exceed 1 if a trial’s endpoint is good news. Higher numbers mean more effective treatment — higher probability that a treatment-arm subject will beat a control-arm subject to recovery.

Whether the HR in a cancer trial should be less than 1 or greater than 1 depends on the investigators. If they defined HR in the usual way, HR should be as close as possible to zero: this is a race we don’t want to win. Alternatively, investigators may define it as control-vs-treatment odds so the HR won’t be fractional (an HR of 5.9 seems clearer than 0.17), and then bigger is again better.

In any kind of trial, it’s easy to see why HR = 1 is a nonstarter. 1:1 odds means you’ll have the same chances whether you take the drug or not.

Beware the CI

Because randomized trials are subject to chance, running them again may not yield exactly the same results, including the same HR. A notation like

HR, 0.35; 95% CI 0.15 to 0.82

means the reported HR of 0.35 is an estimate, and the true HR might plausibly lie, with 95% confidence8, anywhere between 0.15 and 0.82.

When you read an HR, check that the confidence interval does not cross 1 — an interval like

CI 0.75 to 1.7

is a red flag. Crossing 1 means the result falls short of the predetermined threshold of certainty (statistical significance), so the real HR might be an ineffective 1.0 or even point the wrong way, meaning treatment hurts rather than helps.
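The check is simple enough to write down. This is a minimal sketch; `ci_crosses_one` is a hypothetical helper name, and the two intervals are the ones quoted above.

```python
def ci_crosses_one(lower, upper):
    """True if the confidence interval includes HR = 1 (no effect)."""
    return lower <= 1.0 <= upper

print(ci_crosses_one(0.15, 0.82))  # False: the whole interval sits below 1
print(ci_crosses_one(0.75, 1.7))   # True: red flag, the interval straddles 1
```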

From odds to probabilities

Odds aren’t identical to probabilities, but conversion is simple.9 The formula to convert HR to probability is

\[ p = \frac{HR}{1+HR} \]
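For anyone curious where the formula comes from, here is the algebra. It assumes, as this post does, that HR is read as odds, with \(p\) the probability that the experimental patient reaches the endpoint first and \(1-p\) the probability that the control patient does.

\[
\begin{aligned}
HR &= \frac{p}{1-p} \\
HR\,(1-p) &= p \\
HR &= p\,(1+HR) \\
p &= \frac{HR}{1+HR}
\end{aligned}
\]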

We’ll show how the formula can make a paper’s hazard ratio more interpretable. Our example is taken from a published paper on hazard ratios.10 (Drug names in the example are unfamiliar because the cancer is metastatic renal-cell carcinoma.)

The HR paper quotes this outcome from a real trial:

The median progression-free survival was significantly longer in the sunitinib group (11 months) than in the interferon alfa group (5 months), corresponding to a hazard ratio of 0.42 (95% confidence interval, 0.32 to 0.54; P < 0.001).

The paper then offers this translation:

The median progression-free survival was significantly longer in the sunitinib group (11 months) than in the interferon alfa group (5 months). The probability that the progression-free survival period was longer for a patient in the sunitinib group compared to a patient in the interferon alfa group was 70.4% (95% confidence interval, 64.9% to 75.8%; P < 0.001).

Sunitinib is the experimental drug and interferon alfa is the control.

Plugging into the formula, we find the probability associated with HR=0.42 is (0.42/1.42) = 0.296, or 29.6%.

But that isn’t the number we want. Remember how HR is defined: it’s based on an arm arriving earlier. So even though the paragraph talks about lengthening survival, the HR is saying “the probability that the survival period was shorter is 29.6%.”

What’s the probability of it being longer? The probability of something not happening is always 1 minus the probability of it happening, so the probability of a longer period is 100% - 29.6% or 70.4% — which is the number that appears in the rewrite.11
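The sunitinib numbers can be reproduced with a short script. This is a sketch under the odds reading of HR used in this post; `hr_to_probability` is a name invented here, and the confidence-interval endpoints are converted the same way as the HR itself.

```python
def hr_to_probability(hr):
    """Convert odds (the HR) into a probability: p = HR / (1 + HR)."""
    return hr / (1 + hr)

# Values quoted from the sunitinib trial.
hr, ci_low, ci_high = 0.42, 0.32, 0.54

p_shorter = hr_to_probability(hr)  # sunitinib patient progresses first
p_longer = 1 - p_shorter           # sunitinib patient lasts longer
print(f"shorter: {p_shorter:.1%}")  # shorter: 29.6%
print(f"longer:  {p_longer:.1%}")   # longer:  70.4%

# The interval endpoints swap sides: the larger HR gives the smaller probability.
p_longer_low = 1 - hr_to_probability(ci_high)
p_longer_high = 1 - hr_to_probability(ci_low)
print(f"95% CI: {p_longer_low:.1%} to {p_longer_high:.1%}")  # 95% CI: 64.9% to 75.8%
```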

Alternative calculation

Another way to calculate this probability is to flip the meaning of the HR so it becomes odds against being earlier, which we do by taking its reciprocal (1/0.42 = 2.38). Plugging this new HR into the probability formula (2.38/3.38 = 0.704) gives 70.4%.
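The two routes have to agree, because 1 - HR/(1+HR) and (1/HR)/(1 + 1/HR) both simplify to 1/(1+HR). A quick check in Python, using a helper name (`hr_to_probability`) made up for this sketch:

```python
import math

def hr_to_probability(hr):
    """Convert odds (the HR) into a probability: p = HR / (1 + HR)."""
    return hr / (1 + hr)

hr = 0.42

# Route 1: convert the HR, then take the complement.
p_via_complement = 1 - hr_to_probability(hr)

# Route 2: flip the HR first, then convert.
p_via_reciprocal = hr_to_probability(1 / hr)

print(f"{p_via_complement:.3f} {p_via_reciprocal:.3f}")  # 0.704 0.704
print(math.isclose(p_via_complement, p_via_reciprocal))  # True
```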

Final thoughts

The dot illustration above helps show why an impressive hazard ratio needn’t mean long survival. Imagine all dots were squeezed around a particular month. Because dot order hasn’t changed, the hazard ratio is unaffected, but everyone dies at nearly the same time.
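The squeezing thought experiment can be tried out with made-up numbers. Everything in this sketch is invented for illustration (the death times, the helper names, the choice of month 10 as the squeeze point); it only shows that "who dies first" comparisons survive when the timeline is compressed.

```python
def share_blue_first(blue_times, red_times):
    """Fraction of (blue, red) pairs in which the blue patient dies first."""
    pairs = [(b, r) for b in blue_times for r in red_times]
    return sum(b < r for b, r in pairs) / len(pairs)

def squeeze(times, centre=10.0, factor=0.05):
    """Pull every death time towards `centre` without changing the order."""
    return [centre + factor * (t - centre) for t in times]

# Made-up death times in months (illustrative only, not from any trial).
blue = [4, 11, 17]                    # experimental arm
red = [1, 2, 3, 5, 6, 8, 9, 12, 14]   # control arm

before = share_blue_first(blue, red)
after = share_blue_first(squeeze(blue), squeeze(red))
print(f"blue-first share: {before:.3f} before, {after:.3f} after squeezing")

spread_before = max(blue + red) - min(blue + red)
spread_after = max(squeeze(blue + red)) - min(squeeze(blue + red))
print(f"first-to-last death: {spread_before} months before, "
      f"{spread_after:.1f} months after")
```

The blue-first share is identical before and after, while the gap between the first and last death shrinks from 16 months to under one.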

Since HR is a single number, it implies that odds didn’t change during the race (the proportional hazards assumption). Trial results don’t always turn out this way, and those trials should not quote an HR.

We’ve tried to explain the hazard ratio in a way that yields quantitative insight, but the definition may still feel strange. That’s OK. What’s important to us as patients is that we understand its intention and can interpret high and low values.

Several authors discovered independently that hazard ratio is identical to odds, and other authors have proved it12, but it’s not what’s customarily presented. Textbooks explain hazard ratio as a ratio of hazard functions, and I’m grateful we didn’t have to go there.

1. One problem is that we don’t understand drugs or disease in enough detail to make better predictions; this is the hope of precision medicine. Another reason is practical. To yield statistically valid results, trials need a large number of patients. That can mean lumping together men who are different in ways the investigators hope won’t be meaningful.

2. The endpoint is the clinical event (for instance, a PSA rise) that was chosen to measure a treatment’s effectiveness. A trial subject is followed till he reaches the endpoint and then leaves the trial.

3. That’s not all: you also care about cost, side effects, and hospital time. The good news in trial results that HR glosses over will be the focus of a future post.

4. The same post will also discuss how these trial contingencies are accounted for.

5. This is true of the kind of trials we look at as cancer patients though not necessarily of all trials.

6. The difference between odds and probability is explained below.

7. Some trials reverse this so HR odds are for the control arm arriving first. We discuss this below.

8. What “95% confidence” means is a great question that won’t be answered here.

9. Odds are the ratio of the probabilities for and against, or p/(1-p). The formula is then easy to solve, letting HR = odds.

10. The paper is paywalled, but aside from this nice example it’s too technical for the likes of us anyway. De Neve, J, Gerds, TA. On the interpretation of the hazard ratio in Cox regression. Biometrical Journal. 2020; 62: 742–750. https://doi.org/10.1002/bimj.201800255

11. The same calculations work for the two ends of the confidence interval, but because a confidence interval lists the smaller number first, the probability for 0.54 appears on the left as 64.9% and the probability for 0.32 appears on the right as 75.8%.

12. This paper offers a proof (no one’s asking you to follow it) and cites papers that use the odds explanation.

Progressions: A deep look at prostate cancer
