Reprinted from: Orthomolecular Medicine News Service, December 7, 2011
(OMNS, Dec 7, 2011) Evidence-based medicine (EBM) is the practice of treating individual patients based on the outcomes of huge medical trials. It is currently the self-proclaimed gold standard for medical decision-making, and yet it is increasingly unpopular with clinicians. Their reservations reflect an intuitive understanding that something is wrong with its methodology. They are right to think this, for EBM breaks the laws of so many disciplines that it should not even be considered scientific. Indeed, from the viewpoint of a rational patient, the whole edifice is crumbling.
The assumption that EBM is good science is unsound from the start. Decision science and cybernetics (the science of communication and control) highlight the disturbing consequences. EBM fosters marginally effective treatments, based on population averages rather than individual need. Its mega-trials are incapable of finding the causes of disease, even for the most diligent medical researchers, yet they swallow up research funds. Worse, EBM cannot avoid exposing patients to health risks. It is time for medical practitioners to discard EBM’s tarnished gold standard, reclaim their clinical autonomy, and provide individualized treatments to patients.
The key element in a truly scientific medicine would be a rational patient. This means that those who set a course of treatment would base their decision-making on the expected risks and benefits of treatment to the individual concerned. If you are sick, you want a treatment that will work for you, personally. Given the relevant information, a rational patient will choose the treatment that will be most beneficial. Of course, the patient does not act in isolation but works with a competent physician, who is there to help. The rational decision-making unit then becomes the doctor-patient collaboration.
The idea of a rational doctor-patient collaboration is powerful. Its main consideration is the benefit of the individual patient. However, EBM statistics are not good at helping individual patients. Rather, they relate to groups and populations.
The Practice of Medicine
Nobody likes statistics. Okay, that might be putting it a bit strongly but, with obvious exceptions (statisticians and mathematical types), many people do not feel comfortable with statistical data. So if you feel inclined to skip this article in favor of something more agreeable, please wait a minute. For although we are going to talk about statistics, our ultimate aim is to make medicine simpler to understand and more helpful to each individual patient.
The current approach to medicine is “evidence-based.” This sounds obvious but, in practice, it means relying on a few large-scale studies and statistical techniques to choose the treatment for each patient. Practitioners of EBM incorrectly call this process using the “best evidence.” In order to restore the authority for decision-making to individual doctors and patients, we need to challenge this orthodoxy, which is no easy task. Remember Linus Pauling: despite being a scientific genius, he was condemned just for suggesting that vitamin C could be a valuable therapeutic agent.
Historically, physicians, surgeons and scientists with the courage to go against prevailing ideas have produced medical breakthroughs. Examples include William Harvey’s theory of blood circulation (1628), which paved the way for modern techniques such as cardiopulmonary bypass machines; James Lind’s discovery that citrus fruit prevents scurvy (1747); John Snow’s work on transmission of cholera (1849); and Alexander Fleming’s discovery of penicillin (1928). Not one of these innovators used EBM. Rather, they followed the scientific method, using small, repeatable experiments to test their ideas. Sadly, practitioners of modern EBM have abandoned the traditional experimental method, in favor of large group statistics.
What Use are Population Statistics?
Over the last twenty years, medical researchers have conducted ever larger trials. It is common to find experiments with thousands of subjects, spread over multiple research centers. The investigators presumably believe their trials are effective in furthering medical research. Unfortunately, despite the cost and effort that go into them, they do not help patients. According to fundamental principles from decision science and cybernetics, large-scale clinical trials can hardly fail to be wasteful, to delay medical progress, and to be inapplicable to individual patients.
Much medical research relies on early twentieth century statistical methods, developed before the advent of computers. In such studies, statistics are used to judge whether two groups of patients genuinely differ from each other. If a treatment group has taken a drug and a control group has not, researchers typically ask whether any benefit was caused by the drug or occurred by chance. The way they answer this question is to calculate the “statistical significance.” This process results in a p-value: the lower the p-value, the less likely it is that a result this large arose by chance alone. Thus, a p-value of 0.05 means a difference this large would occur by chance about one time in 20 if the drug had no effect. Sometimes a value of less than one-in-one-hundred (p < 0.01), or even less than one-in-a-thousand (p < 0.001) is reported. These two p-values are referred to as “highly significant” or “very highly significant” respectively.
Significant Does Not Mean Important
We need to make something clear: in the context of statistics, the term significant does not mean the same as in everyday language. Some people assume that “significant” results must be “important” or “relevant.” This is wrong: the level of significance reflects only the degree to which the groups are considered to be separate. Crucially, the significance level depends not only on the difference between the studied groups, but also on their size. So, as we increase the size of the groups, the results become more significant, even though the effect may be tiny and unimportant.
Consider two populations of people, with very slightly different average blood pressures. If we take 10 people from each, we will find no significant difference between the two groups because a small group varies by chance. If we take a hundred people from each population, we get a low level of significance (p < 0.05), but if we take a thousand, we now find a very highly significant result. Crucially, the magnitude of the small difference in blood pressure remains the same in each case. In this case a difference can be highly significant (statistically), yet in practical terms it is extremely small and thus effectively insignificant. In a large trial, highly significant effects are often clinically irrelevant. More importantly and contrary to popular belief, the results from large studies are less important for a rational patient than those from smaller ones.
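To make the blood pressure example concrete, here is a minimal sketch in Python. The numbers are assumptions chosen purely for illustration, not data from any trial: a fixed 3 mmHg difference between the group averages and a 10 mmHg standard deviation in both groups. The function `two_sided_p` is a hypothetical helper that applies a simple two-sample z-test and treats the observed difference as exactly equal to the assumed one.

```python
from math import sqrt
from statistics import NormalDist

# Illustrative assumptions only (not taken from any study):
ASSUMED_DIFF_MMHG = 3.0   # fixed difference between the two group averages
ASSUMED_SD_MMHG = 10.0    # spread of blood pressure within each group


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p-value of a two-sample z-test with a shared, known SD."""
    standard_error = sd * sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Same difference every time; only the group size changes.
p_values = {
    n: two_sided_p(ASSUMED_DIFF_MMHG, ASSUMED_SD_MMHG, n)
    for n in (10, 100, 1000)
}

for n, p in p_values.items():
    print(f"n = {n:>4} per group: difference = {ASSUMED_DIFF_MMHG} mmHg, p = {p:.2g}")
```

Under these assumptions the 10-person groups show no significant difference, the 100-person groups fall below p < 0.05, and the 1,000-person groups fall far below p < 0.001, although the difference itself never changes.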
Large trials are powerful methods for detecting small differences. Furthermore, once researchers have conducted a pilot study, they can perform a power calculation, to make sure they include enough subjects to get a high level of significance. Thus, over the last few decades, researchers have studied ever bigger groups, resulting in studies a hundred times larger than those of only a few decades ago. This implies that the effects they are seeking are minute, as larger effects (capable of offering real benefits to actual patients) could more easily be found with the smaller, old-style studies.
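The link between effect size and trial size can be sketched with the standard textbook power calculation for comparing two group means. This is only an illustration under stated assumptions: the function name `required_n_per_group`, the 10 mmHg standard deviation, the 5% significance level and the 80% power are all choices made here, not figures from any particular trial.

```python
from math import ceil
from statistics import NormalDist


def required_n_per_group(mean_diff, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed per group for a two-sided, two-sample z-test.

    Uses the usual normal-approximation formula
    n = 2 * ((z_(1 - alpha/2) + z_power) * sd / mean_diff) ** 2.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2.0 * ((z_alpha + z_power) * sd / mean_diff) ** 2)


ASSUMED_SD = 10.0  # illustrative spread within each group

# The smaller the effect being sought, the larger the trial must be.
for diff in (10.0, 3.0, 1.0):
    n = required_n_per_group(diff, ASSUMED_SD)
    print(f"difference of {diff:>4} units: about {n} subjects per group")
```

In this approximation the required group size grows with the inverse square of the effect, so a trial a hundred times larger is what it takes to detect an effect one tenth the size.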
Now, tiny differences – even if they are “very highly significant” – are nothing to boast about, so EBM researchers need to make their findings sound more impressive. They do this by using relative rather than absolute values. Suppose a drug halves your risk of developing cancer (a relative value). Although this sounds great, the reported 50% reduction may lessen your risk by just one in ten thousand: from two in ten thousand (2/10,000) to one in ten thousand (1/10,000) (absolute values). Such a small benefit is typically irrelevant, but when expressed as a relative value, it sounds important. (By analogy, buying two lottery tickets doubles your chance of winning compared to buying one; but either way, your chances are minuscule.)
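The arithmetic behind the cancer example can be checked in a few lines of Python. The two risks are the ones given above; the helper `risk_reduction` is a hypothetical name used here, and the "number needed to treat" (one divided by the absolute risk reduction) is a standard measure added for illustration rather than something the example itself mentions.

```python
from fractions import Fraction


def risk_reduction(baseline_risk, treated_risk):
    """Return (relative reduction, absolute reduction, number needed to treat)."""
    absolute = baseline_risk - treated_risk
    relative = absolute / baseline_risk
    number_needed_to_treat = 1 / absolute
    return relative, absolute, number_needed_to_treat


# Risks from the example: 2 in 10,000 untreated, 1 in 10,000 treated.
relative, absolute, nnt = risk_reduction(Fraction(2, 10_000), Fraction(1, 10_000))

print(f"relative risk reduction: {float(relative):.0%}")
print(f"absolute risk reduction: {float(absolute):.2%}")
print(f"people treated for one to benefit: {nnt}")
```

The same drug can therefore be described as cutting risk by 50% or by 0.01%, and on these figures ten thousand people would need to take it for one of them to benefit.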
The Ecological Fallacy
There is a further problem with the dangerous assertion implicit in EBM that large-scale studies are the best evidence for decisions concerning individual patients. This claim is an example of the ecological fallacy, which wrongly uses group statistics to make predictions about individuals. There is no way round this; even in the ideal practice of medicine, EBM should not be applied to individual patients. In other words, EBM is of little direct clinical use. Moreover, as a rule, the larger the group studied, the less useful will be the results. A rational patient would ignore the results of most EBM trials because they aren’t applicable.
To explain this, suppose we measured the foot size of every person in New York and calculated the mean value (total foot size/number of people). Using this information, the government proposes to give everyone a pair of average-sized shoes. Clearly, this would be unwise: the shoes would be either too big or too small for most people. Individual responses to medical treatments vary by at least as much as their shoe sizes, yet despite this, EBM relies on aggregated data. This is technically wrong; group statistics cannot predict an individual’s response to treatment.
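A toy version of the shoe example shows how few people the average actually fits. The ten shoe sizes below are invented for illustration, and the half-size tolerance for a "good fit" is likewise an assumption.

```python
from statistics import fmean

# Hypothetical shoe sizes for ten people (made-up illustrative data).
shoe_sizes = [5, 6, 7, 7, 8, 9, 10, 10, 11, 12]

# Mean value: total foot size / number of people.
mean_size = fmean(shoe_sizes)

# Assume a shoe fits if it is within half a size of the wearer's own size.
FIT_TOLERANCE = 0.5
well_fitted = sum(1 for size in shoe_sizes if abs(size - mean_size) <= FIT_TOLERANCE)

print(f"average size: {mean_size}")
print(f"people fitted by the average shoe: {well_fitted} of {len(shoe_sizes)}")
```

In this made-up group the average describes the group perfectly well, yet the average shoe fits only two of the ten people in it.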
EBM Selects Evidence
Another problem with EBM’s approach of trying to use only the “best evidence” is that it cuts down the amount of information available to doctors and patients making important treatment decisions. The evidence allowed in EBM consists of selected large-scale trials and meta-analyses that attempt to make a conclusion more significant by aggregating results from wildly different groups. This constitutes a tiny percentage of the total evidence. Meta-analysis rejects the vast majority of data available, because it does not meet the strict criteria for EBM. This conflicts with yet another scientific principle, that of not selecting your data. Rather humorously in this context, science students who select only their best data (to draw a graph of their results, for example) are penalized and told not to do it again.