In Defense of Epidemiological Methods Including Relative Risk
William F. Simpson, PhD, FACSM
The College of St. Scholastica
Division of Health Sciences
Department of Exercise Physiology
Duluth, MN 55811
RECENTLY, Birnbaum published an editorial in PEPonline entitled “The Case Against Relative Risk” [1].
Based on his premise that the use of "relative risk" [RR] may actually
be fraudulent and inappropriate, I feel compelled to respond to his thesis.
I do not intend to defend RR or any other statistic, only to shed light
on the fact that labeling RR a “bad” statistic is incorrect when one
does not consider the big picture with respect to the use and application
of epidemiology and biostatistics.
Birnbaum’s description of RR is correct
with regard to the use of the standard 2 x 2 contingency table to quantify
data. Further, it is typically accepted that a RR of 1.0 is the baseline
or reference point for a condition. When the RR is reported
below 1.0, the interpretation is that the overall risk of developing the
disease is lower. When the RR is reported above 1.0, the risk of disease
is increased. However, as Birnbaum suggested, changing policies or
affecting economic factors because of this statistic is a major mistake.
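To make the 2 x 2 calculation concrete, the following is a minimal sketch in Python. The function name `relative_risk` and the cell counts are hypothetical, chosen only for illustration; they are not taken from Birnbaum's editorial or from any study cited here.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a standard 2 x 2 contingency table.

    a: exposed, disease          b: exposed, no disease
    c: unexposed, disease        d: unexposed, no disease
    """
    risk_exposed = a / (a + b)      # incidence among the exposed
    risk_unexposed = c / (c + d)    # incidence among the unexposed
    return risk_exposed / risk_unexposed


# Hypothetical counts: 30 of 100 exposed and 15 of 100 unexposed develop disease.
rr = relative_risk(30, 70, 15, 85)
print(f"RR = {rr:.2f}")  # a value above 1.0 indicates increased risk
```

A RR computed this way is only a point estimate. It says nothing by itself about chance, bias, or confounding, which is the concern of the paragraphs that follow.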
First and foremost, one must not
rely entirely on any one statistical test to support a research hypothesis.
The notion that statistics tell all reflects an improper use of statistics in research. I
recall the philosophy of Dr. Helen Abby, my biostatistics professor, who
often said “Statistics means never having to say that you are certain”.
Research data represent a collection of numbers, which inherently have
error. Further, the recording, analysis, and interpretation of the
data are loaded with the potential for error. In any investigation,
measurement errors are part of the research process.
The goal of the investigators regarding
error is to anticipate and correct where possible and appropriate. When
interpreting results, a number of items must be taken into account. For
example, the tendency of a measure to deviate from its true value [i.e.,
bias] should be accounted for in an investigative report. Both observer
bias and selection bias may sway the results. Confounding factors, which
cannot be completely controlled by the design of the study, are adjusted
for within the statistical analysis. This is a common practice in
longitudinal studies. Another example is chance, which has a role
in the results of any investigation. Finally, correlation is another
item that needs careful interpretation. Two factors may be highly
correlated [thus identifying a relationship between them], but it is inappropriate
to assign a cause and effect relationship between the two on that basis alone.
Sir Bradford Hill, as reported by
Hennekens and Buring [2], suggested that one should not
take the statistically significant result to heart without first having
a biological explanation to support what the statistical treatment of the
data is suggesting. Dr. Brian Whipp recently suggested, during his
presentation at the ACSM annual meeting, that the bottom line of any research
investigation is the answer to “SO WHAT” [3]. If
the results of an investigation are nonsense or do not make physiological
sense, then the value of the science is called into question. As an example, if an observational
study found that short men have a higher RR for depression, does this mean
that all short men should be placed on anti-depressants? Can a physiological
association or explanation or any other explanation shed light on the research?
Is it not likely that the association could have been the result of selection
bias or, perhaps, just chance?
When interpreting research findings
with a statistic such as RR, the following guidelines [4]
should be adhered to:
1. Strength of Association:
The power and sample size must be sufficient to demonstrate a statistically
significant difference.
2. Consistency: The observation
of the association must be repeatable in different populations at different
times.
3. Temporality: The cause
must precede the effect.
4. Plausibility: The explanation
must be biologically plausible.
5. Biological Gradient:
The presence of a dose-response relationship.
These criteria are drawn from the
works of Sir Bradford Hill and Sir Richard Doll and their 1950 study, “Smoking
and Carcinoma of the Lung: Preliminary Report.” Hennekens and Buring
[2] have also suggested the following framework for the
interpretation of an epidemiological study.
1. Is there a valid statistical
association?
a. Is the association likely
to be due to chance?
b. Is the association likely to
be due to bias?
c. Is the association likely to
be due to confounding?
2. Can this valid statistical association
be judged as cause and effect?
a. Is there a strong association?
b. Is there biological credibility
to the hypothesis?
c. Is there consistency with other
studies?
d. Is the time sequence compatible?
e. Is there evidence of a dose-response
relationship?
One must appreciate that when a conclusion
is based on RR, an odds ratio [OR], or a correlation coefficient [r], the
statistic itself should not stand alone. It is the expectation and the
responsibility of the investigator to interpret the data based upon the
above stated criteria. Often, my students will cite one or two studies
in a laboratory write-up. To assert that their experiment was “correct,”
they use a statement such as, “...the works of Simpson and Coady
proved....” My response is simply, “No single study or
even a handful of studies ‘prove’ anything.” Also, it is important
to remember that science is fluid. What is believed to be correct to say
today may be obsolete 20 years from now.
Birnbaum’s example with the use of
cholesterol studies may be misleading. If one were to use his interpretation
of the RR based on the small sample size, I would also be very suspicious.
However, in reality, the example of the association of elevated cholesterol
and coronary heart disease [CHD] is a more complex analysis. First
of all, it may not be correct to state that it is common knowledge that
elevated cholesterol is the cause of CHD. There are a number of
major and minor risk factors associated with this disease.
Since the Framingham study [5]
suggested that elevated cholesterol is associated with CHD, there have
been numerous large and well designed controlled observational studies
that have found varying degrees of association. In fact, the most
recent updated National Cholesterol Education Program [NCEP] guidelines
have introduced the third Adult Treatment Panel [ATP III] summary, which
specifically targets LDL-C [6]. These guidelines
are based on observational studies. Further, there is a biological
association. It is well established that increased levels of low-density
lipoproteins, in concert with low levels of high-density lipoprotein, disturb
the natural functioning of the arterial lumen and intima by disrupting
secretion of Endothelium-Derived Relaxing Factor and Nitric Oxide
[EDRF-NO], causing subsequent plaque formation [7].
Further, this cascade in association with other risk factors [e.g., smoking,
hypertension, sedentary lifestyle, and diabetes] increases the risk of
CHD.
By contrast, there are individuals
who are hypertensive, overweight, and sedentary, who have elevated
cholesterol, and who yet have no evidence of CHD. There are also
individuals who have no known risk factors and die of CHD at an early age.
At the present time, epidemiology, statistics, or biology cannot explain
these types of conditions. As previously stated, science is fluid
and continues to evolve. Science and medicine are imperfect.
Researchers, scientists, and clinicians accept this fact. Patients
and their families will always ask “Why me?” and often the answer will
elude the physician and others involved in the research.
Based upon 50+ years of careful investigations
from Boston to Stanford to Helsinki, the conclusions regarding lipids and
CHD have been made after careful and meticulous review and are considered
current in respect to clinical practice. They were not made on one
or two RRs or any other statistical analysis. In fact, if the reader
reviews the data from the Multiple Risk Factor Intervention Trial [MRFIT],
which led to the current cholesterol guidelines, it is clear that they
are not as stringent as they might be. In other words, <200 mg/dl
may not be the most prudent level, based on the deciles in which the data
fall in regards to RR [8]. Yet a consensus of scientists,
researchers, and clinicians by conventional thinking established the guidelines.
The basic or applied scientist who
routinely works with small sample sizes [< 15] and desires alpha levels
at 0.05 or less may become very uncomfortable with the conclusions made
from observational studies. Personally, I was shocked during doctoral
work when an epidemiologist presented results from a study that was considered
“significant” based on correlation coefficients of .29 to .33. I
was astounded that one could even consider the data relevant. However,
as stressed to me, this particular result was one contributing bit of evidence
that, in concert with other “bits,” forms a larger picture when placed together.
Going back to basic statistical training, most readers have probably heard
that the best way to study a population is to study every person.
Since it is not possible to do so, researchers must take representative
samples from the population. This is what these observational studies
attempt to accomplish. Determining what is the truth in science is
not easy.
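The role of sample size in judging a modest correlation can be illustrated with a short sketch. It uses the Fisher z-transformation with a normal approximation to test whether a correlation differs from zero. The function name and the two sample sizes [15 and 500] are assumptions chosen for illustration; they are not the figures from the study described above.

```python
import math


def correlation_p_value(r, n):
    """Approximate two-sided p value for H0: true correlation = 0.

    Uses the Fisher z-transformation, atanh(r) * sqrt(n - 3), which is
    approximately standard normal under H0. Requires n > 3 and |r| < 1.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical comparison: the same r = 0.30 in a small and a large sample.
for n in (15, 500):
    print(f"r = 0.30, n = {n}: p = {correlation_p_value(0.30, n):.4g}")
```

Under these assumptions the same coefficient is far from conventional significance in the small sample and well beyond it in the large one, which is consistent with the view that such a result is one contributing piece of a larger body of evidence rather than proof by itself.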
Confidence intervals [CI] are also
used to appreciate “where the truth lies”. Like RR, they are another
common statistical tool used in observational and longitudinal studies.
For example, if the RR is calculated as 2.3 with a 95% CI of 1.9 to 2.8,
the researcher is likely to interpret this as “I am 95% certain that the
truth is between 1.9 and 2.8”. The researcher is not 100% certain
that the actual RR is 2.3, but one can be 95% certain of the range it falls
in. To the researcher who relies on small sample sizes and precise
results within a controlled environment, this may appear unrealistic.
Yet with large data sets from observational studies, this is a reliable
means of estimating the true number. Here again, it may be useful
to revisit the p value. For example, a p value of 0.05 still tells
the researcher that, if there were truly no effect, results at least this
extreme would be expected by chance about 5 times out of 100. So,
where is the exact truth? In all likelihood, it lies within error.
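As a sketch of how such an interval is commonly obtained, the following Python example computes a RR with an approximate 95% CI using the standard error of the natural log of the RR. This log-based method is one widely used approach and is an assumption here; the studies cited in this editorial may have used other methods. The function name and all counts are hypothetical.

```python
import math


def rr_with_ci(a, b, c, d, z=1.96):
    """Relative risk with an approximate CI [z = 1.96 gives about 95%].

    a: exposed, disease          b: exposed, no disease
    c: unexposed, disease        d: unexposed, no disease
    """
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of ln(RR) for cumulative-incidence data.
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical data: identical proportions, but 100 versus 10,000 per group.
for counts in [(30, 70, 15, 85), (3000, 7000, 1500, 8500)]:
    rr, lo, hi = rr_with_ci(*counts)
    print(f"n per group = {counts[0] + counts[1]}: "
          f"RR = {rr:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

With the same underlying proportions, the interval from the larger sample is much narrower. This is why large observational data sets can give a usable estimate of the range in which the true RR falls, even though the point estimate itself is never certain.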
In summary, as stated in the introduction,
this editorial is not meant to debate the merits of RR. Perhaps that
is best left to the biostatisticians or individuals who have a major interest
in its statistical use and application. It is reasonable, however,
to conclude that scientific findings are preserved even with the use of
RR. The value of science and doing research is, if there is disagreement
with a stated hypothesis or conclusion from a particular paper, the researcher
has the freedom, if not the responsibility, to conduct an investigation
to refute published work. That is the true meaning and value of scientific
research and peer-reviewed papers. It is hoped that this paper sheds
light on some of the basic realities that are associated with research
reports in various journals. Fortunately, the time-honored peer review
process helps to keep science and authors “honest”. It is not perfect
by any means and the truth is not always what is published, but it is the
best we have. The researcher’s job is to find where the truth lies
and thus, where appropriate, convince the skeptical audience.
References
1. Birnbaum, L. [2002]. The Case Against Relative Risk. Professionalization of Exercise
Physiology-Online. Vol 5 [2]. http://www.asep.org/asep/asep/relativeRISK.html
2. Hennekens, C.H. and Buring, J.E. [1987]. Epidemiology in Medicine. Little Brown Publishing.
3. Whipp, B. [2002]. American College of Sports Medicine Annual Meeting. Wolfe Memorial Lecture.
4. Supercourse in Epidemiology. http://www.pitt.edu/~super1/lecture/lec4201/019.htm.
Accessed on February 4, 2002.
5. Kannel, W.B. [1985]. Epidemiologic Insights into Atherosclerotic Cardiovascular Disease
from the Framingham Study. In: Pollock, M.L. and Schmidt, D.H. [eds.]: Heart Disease
and Rehabilitation, ed. 2, Human Kinetics, Champaign, pp. 2-16.
6. Stevenson, M.M. [2001]. Updated NCEP Guidelines Set New Decision Points for Managing
Dyslipidemia. Geriatrics. Volume 56 No. [7].
7. Sabatine, M.S., O’Gara, P.T., and Lilly, L.S. [1998]. Ischemic Heart Disease.
In: Lilly, L.S.: Pathophysiology of Heart Disease, ed. 2, Lippincott Williams
and Wilkins, pp. 119-123.
8. Stamler, J., Wentworth, D., and Neaton, J.D. [1986]. Is Relationship Between Serum
Cholesterol and Risk of Premature Death from Coronary Heart Disease Continuous and
Graded? Findings in 356,222 Primary Screens of the Multiple Risk Factor
Intervention Trial (MRFIT). Journal of the American Medical Association.
Volume 256 No. [20] pp. 2823-8.
Copyright ©1997-2007 American Society of Exercise Physiologists. All Rights Reserved.