Serious Questions Raised About Integrity Of International Trials

–TOPCAT analysis is the ‘smoking gun’ for trouble from ‘offshoring’ trials.

Large international randomized controlled trials, the cornerstone of modern medicine, are in big trouble.

As clinical trials have become a global enterprise, many observers have become increasingly worried about the integrity of data from certain geographic areas, in particular from Russia and other countries in the former Soviet Union. Now, a new paper published in the New England Journal of Medicine has provided “smoking gun” evidence validating such worries in the major TOPCAT trial. As a result, millions of patients who have heart failure with preserved ejection fraction may not be taking an inexpensive drug that could benefit them, when there are no available alternative therapies.

The new paper provoked expressions of concern not only from well-known skeptics but also from those deeply committed to clinical research. Former FDA commissioner Rob Califf said the finding “is disturbing and it’s good to get it published. I’ll just say that globalization of trials is much needed since 96% of people don’t live in the U.S., but ‘offshoring’ for financial reasons is bad because it raises the risk of malfeasance.” Noted skeptic John Ioannidis (Stanford University) went further, saying that the finding “opens a can of worms on how much we can trust clinical evidence from these settings.”

Who Killed TOPCAT?

The new paper is about the NIH’s TOPCAT trial, which in 2014 concluded that spironolactone (Aldactone) did not improve clinical outcomes in heart failure patients with preserved ejection fraction. However, a post-hoc analysis of the trial turned up a geographic difference. Spironolactone was significantly better than placebo in patients in North and South America, but there was no difference in Russia and Georgia.

The TOPCAT investigators have been wrestling with the regional variation issue since the trial’s original publication. Before the new NEJM paper, it was the subject of three papers.

Patients enrolled in the trial were about equally divided between the Americas and Russia/Georgia, but the event rate was three times as high in the Americas, leading the investigators to wonder at first whether the patients in Russia and Georgia actually had HFpEF. A 2015 Circulation paper by the trial investigators was unable to identify genuine differences in medical practice that might explain the variation in outcome.

A subsequent and highly unusual paper by the members of the Data and Safety Monitoring Board helped turn up the heat. Then last year Marc Pfeffer (Brigham and Women’s Hospital), the trial’s principal investigator, and Eugene Braunwald elevated the issue by publicly asking whether the TOPCAT patients in Russia and Georgia “actually had HFpEF and even whether one-half of them received spironolactone.”

The Smoking Gun

This simmering problem now appears to have reached full boil in the new paper in NEJM, which makes clear that the geographic variability in TOPCAT cannot be explained by normal regional differences in diagnosis and management of heart failure. Instead, it offers compelling evidence that something went seriously wrong in TOPCAT: many patients in Russia and Georgia did not even receive the study drug. (Some observers privately speculate that the study investigators in these regions may have sold the study drug on the open market.)

In the new paper, the TOPCAT investigators analyzed stored blood samples from the trial. They found that concentrations of canrenone (an active metabolite of the study drug spironolactone) were undetectable in 30% of patients in Russia, compared with 3% of patients in the U.S. and Canada. Further, the intended effects of spironolactone (increases in serum potassium and aldosterone) were found only in those patients with detectable canrenone levels. The findings, wrote the authors, “arouse concerns regarding study conduct at some sites in Russia and… Georgia, where event rates and response to spironolactone were also of concern.”

Sanjay Kaul (Cedars-Sinai) said that the new paper presented firm evidence to confirm earlier suspicions. “There were already some suspicions raised previously about possible malfeasance, but we now appear to have the incriminating evidence of a ‘smoking gun’… I am not aware of any ethnic or genetic differences in the metabolism of spironolactone or propensity for hyperkalemia or creatinine elevation secondary to it.”

Missed Opportunity?

The most immediate impact of the story is that it is possible, and perhaps even probable, that millions of patients are not taking an inexpensive drug from which they could derive significant benefit. Currently, there are no drugs that are known to improve outcomes for HFpEF patients.

In an interview, Pfeffer was particularly concerned about this aspect of the story. “When you practice medicine, you use the best available data,” he said. “I would like people to be more aware” that spironolactone may be an option for their HFpEF patients. He acknowledged, though, that this sort of post-hoc analysis can be dangerous. “If this had been a pharma trial and I were doing this, you’d probably shoot me and I’d shoot myself,” he said. But, he pointed out, these concerns weigh less heavily given that spironolactone is an inexpensive generic pill that costs seven cents a day.

“My major intent is to alert physicians to the post-hoc data indicating the benefits of treating individuals with heart failure and preserved ejection fraction with spironolactone,” said Pfeffer. “With the caveats of monitoring renal function and potassium, I believe this is the best currently available data for physicians and patients to consider at this time.”

Pfeffer’s support for the use of spironolactone was echoed by other experts.

“I am generally very skeptical about subgroup analyses,” said Milton Packer (Baylor University Medical Center, Dallas). “There are very few subgroup or post hoc analyses that have credibility and are likely to be replicated. However, when I first saw the TOPCAT regional heterogeneity analysis (at an investigators’ meeting), it was the first subgroup analysis I had ever seen that had some real credibility. Subsequent analyses (including the one published this week) strongly support the conclusion that it is no longer reasonable to include the results of Russia and Georgia in interpreting the primary endpoint results of the TOPCAT trial. I am sure that there are people who will remain skeptical about excluding these regions, and they may think that my willingness to exclude Russia and Georgia violates the ‘rules’. My only response is: In general, the reason to follow the rules is because they make sense and they keep you out of trouble. When following the rules no longer makes sense, then continuing to follow them is insanity.”

Kaul said he also now supports the use of spironolactone for HFpEF, though he is slightly less impressed by TOPCAT. “Even if one only accepts the results from the North America cohort of TOPCAT to be unbiased and reliable, the 18% relative risk reduction in the primary endpoint is not large and the P-value of 0.026 is not robust enough to pass the strict regulatory muster of a claim on the basis of a single trial,” said Kaul. “On the other hand, a pragmatist could argue that, given the lack of effective disease-modifying therapies for HFpEF currently, there are reasonable grounds to include the favorable subgroup results from North America in the updated guidelines as a Class II recommendation. This is the best we have until the results of ongoing trials in HFpEF with sacubitril/valsartan [Entresto] (PARAGON-HF) or empagliflozin [Jardiance] (EMPEROR-Preserved) become available.”

Larger Implications

The new paper could also raise the temperature in the already-heated debate over the integrity of clinical trials, which are the underlying edifice of evidence-based medicine.

Ioannidis said that “the data are indeed troubling and they suggest that in this trial, the results from sites in Russia/Georgia cannot be trusted. A very large number of trials currently are trying to recruit participants from multiple sites around the world, including an increasing number of sites from countries where there is either no strong tradition in clinical research or no sufficient oversight. I suspect that errors, sloppiness, questionable research practices, and even occasional fraud may infiltrate such trials. We have mounting evidence to suggest this problem is common (although it is difficult to say with certainty which exact reason is the explanation).”

In a 2013 BMJ paper, Ioannidis and co-authors found that results from trials done in less developed countries with no strong tradition in clinical research were systematically different from those done in countries with a strong tradition of clinical research, even for hard endpoints like mortality.

“Regional differences in clinical trials is an important topic with potentially big implications for doing research globally and for how we interpret data from global cardiovascular trials,” said Robert Harrington (Stanford University). “Second, kudos to the investigators for pursuing these analyses, which add insight into the regional differences but also call into question issues around the overall quality monitoring of the trial. These findings are certainly troubling since they do suggest a serious problem with compliance with study medication despite investigators indicating that patients were taking the study medications.”

In the future, said Harrington, “we need to be sure that when enrolling patients around the globe that robust methods be in place to ensure selection of experienced, high-quality investigators.”

Economic Considerations and the NIH

“You get what you pay for” may be another lesson from TOPCAT. Some observers believe that the trial’s problems were exacerbated by its sponsor, the NIH. The NIH is not known for spending top dollar on trials, and it does not perform its own clinical trials. TOPCAT, like other NIH-sponsored trials, was performed by a for-profit clinical research organization (CRO) under contract with the NIH.

Observers point out that CROs perform these trials with little oversight and are driven by the need to make a profit. They are under intense pressure to recruit patients rapidly and inexpensively and may have little motivation to rigorously review sites that enroll large numbers of patients. This constellation of factors, some industry observers believe, creates a hospitable environment for shortcuts and misconduct.

In addition, economic differences between regions make it cheaper to enroll patients in countries such as Russia and Georgia. At the same time, the revenue from enrolling patients can be a key incentive for the sites there. Pfeffer said he “can’t disagree” with this perspective: “You pay less for a patient randomized in Russia or Georgia, so your dollars go further, and those dollars mean more locally there, so the stimulus is greater.”

Another factor is that the NIH differs from pharmaceutical companies in that it does not indemnify trial sites for participating in the trials. According to Pfeffer, this is why there were no western European sites in the trial, though many of them would have been eager to participate.

Industry observers also caution that there is no reason to think that similar problems couldn’t occur in industry-sponsored trials. It’s not widely known, but most pharmaceutical companies no longer perform their own clinical trials. As a result of the dramatic restructuring of the pharmaceutical industry (in response to the patent cliff, the financial crisis, and industry consolidation, among other factors), nearly all major pharmaceutical companies now outsource their clinical trials to CROs. Facing shrinking profits, pharmaceutical companies seek to protect their margins by reducing expenses. This can lead to less money spent on clinical trials and on oversight of those trials.

Pfeffer’s first major trial was SAVE, which was published 25 years ago. “In SAVE, I knew every investigator and coordinator, and we would meet all the time,” said Pfeffer. By contrast, Pfeffer says, he wouldn’t know many of the TOPCAT investigators, including the Russian and Georgian investigators, if he passed them on the sidewalk.

  1. Who would expect conscientious, competent results from the ex-USSR? At mathematics, sure. But on a medical trial like this? Nuts. The endless censorship of the idea that peoples and cultures differ deeply exacts endless costs.

  2. This issue was known from the beginning – Dr. Stuart Pocock presented this at ESC 2014 in Barcelona and I believe on other occasions as well. Geographic differences are one of the important pitfalls in international trials.

  3. Fred J Pane says

    I agree that we need to be careful and do our own research of studies. A number of years ago we were presented with a real-world study from a pharma company, and one of my clinical pharmacists, who was doing a literature review for a P and T Committee presentation, found out that data from two countries had been dropped from the Real World Study. The company didn’t know what to say, because no one had ever caught this fact before, and we ended up not approving the drug. Either the study was Real World and included all the countries studied or it wasn’t.
    We do know that in other countries, patient treatment may be different from that in the US, and that is also why, when we were looking to add drugs to our hospital formulary, we would review what studies were done in different countries.
