PEPonline
Professionalization of Exercise Physiologyonline

An international electronic
journal for exercise physiologists
ISSN 1099-5862

Vol 5 No 7 July 2002

 


In Defense Of Epidemiological Methods Including Relative Risk
William F. Simpson, PhD, FACSM
The College of St. Scholastica
Division of Health Sciences
Department of Exercise Physiology
Duluth, MN 55811



Recently, Birnbaum published an editorial in PEPonline entitled “The Case Against Relative Risk” [1].  Based on his premise that the use of "relative risk" [RR] may actually be fraudulent and inappropriate, I feel compelled to respond to his thesis.  I do not intend to defend RR or any other statistic, only to point out that labeling RR a “bad” statistic is incorrect when one does not consider the big picture with respect to the use and application of epidemiology and biostatistics.

Birnbaum’s description of RR is correct with regard to the use of the standard 2 x 2 contingency table to quantify data.  Further, it is generally accepted that a RR of 1.0 is the baseline or reference point for a condition.  When the RR is reported below 1.0, the interpretation is that the overall risk of developing the disease in question is reduced.  When the RR is reported above 1.0, the risk of disease is increased.  However, as Birnbaum suggested, changing policies or economic decisions on the basis of this statistic alone is a major mistake.
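To make the arithmetic behind the 2 x 2 table concrete, the short sketch below computes a RR from hypothetical counts.  The numbers, the function name, and the use of Python are choices of this illustration, not anything reported by Birnbaum or in the studies discussed later.

```python
# A minimal sketch of relative risk from a standard 2 x 2 contingency table.
# The counts are hypothetical and chosen only to illustrate the arithmetic.

def relative_risk(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed

# Hypothetical counts: 30 of 1,000 exposed and 15 of 1,000 unexposed
# subjects developed the disease.
print(round(relative_risk(30, 970, 15, 985), 2))   # 2.0 -> risk doubled (RR > 1.0)
```

An RR of 2.0 here simply describes the ratio of the two observed risks; whether it justifies any policy or clinical action is exactly the interpretive question taken up below.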

First and foremost, one must not rely entirely on any one statistical test to evaluate a research hypothesis.  The notion that statistics tell all is a misconception in research. I recall the philosophy of Dr. Helen Abby, my biostatistics professor, who often said, “Statistics means never having to say that you are certain.”  Research data represent a collection of numbers, which inherently contain error.  Further, the recording, analysis, and interpretation of the data are loaded with the potential for error.  In any investigation, measurement errors are part of the research process.

The investigators’ goal regarding error is to anticipate it and correct for it where possible and appropriate. When interpreting results, a number of items must be taken into account. For example, the tendency of a measure to deviate from its true value [i.e., bias] should be accounted for in an investigative report.  Both observer bias and selection bias may sway the results. Confounding factors, which cannot be completely controlled by the design of the study, are adjusted for within the statistical analysis; this is common practice in longitudinal studies.  Chance, too, plays a role in the results of an investigation.  Finally, correlation is another item that requires careful interpretation.  Two factors may be highly correlated [thus identifying a relationship between them], but it is inappropriate to assign a cause-and-effect relationship between the two.

Sir Bradford Hill, as reported by Hennekens and Buring [2], suggested that one should not take a statistically significant result to heart without first having a biological explanation to support what the statistical treatment of the data suggests.  Dr. Brian Whipp recently suggested, during his presentation at the ACSM annual meeting, that the bottom line of any research investigation is the answer to “SO WHAT?” [3].  If the results of an investigation are nonsense or do not make physiological sense, then the value of the science is called into question. As an example, if an observational study found that short men have a higher RR for depression, does this mean that all short men should be placed on anti-depressants?  Can a physiological association or explanation, or any other explanation, shed light on the research? Is it not likely that the association could have been the result of selection bias or, perhaps, just chance?

When interpreting research findings with a statistic such as RR, the following guidelines [4] should be adhered to:

1. Strength of Association: The power and sample size must be sufficient to demonstrate a statistically significant difference.
2. Consistency: The observation of the association must be repeatable in different populations at different times.
3. Temporality: The cause must precede the effect.
4. Plausibility: The explanation must be biologically plausible.
5. Biological Gradient: A dose-response relationship should be present.
These criteria derive from the works of Sir Bradford Hill and Sir Richard Doll, including their 1950 study, “Smoking and Carcinoma of the Lung: Preliminary Report.”  Hennekens and Buring [2] have also suggested the following framework for the interpretation of an epidemiological study.
1. Is there a valid statistical association?
a. Is the association likely to be due to chance?
b. Is the association likely to be due to bias?
c. Is the association likely to be due to confounding?
2. Can this valid statistical association be judged as cause and effect?
a. Is there a strong association?
b. Is there biological credibility to the hypothesis?
c. Is there consistency with other studies?
d. Is the time sequence compatible?
e. Is there evidence of a dose-response relationship?
One must appreciate that when a conclusion is based on a RR, an odds ratio [OR], or a correlation coefficient [r], the statistic itself should not stand alone. It is the expectation and the responsibility of the investigator to interpret the data based upon the above stated criteria.  Often, my students will cite one or two studies in a laboratory write-up.  To assert that their experiment was “correct,” they use a statement such as, “...the works of Simpson and Coady proved....”   My response is simply, “No single study or even a handful of studies ‘prove’ anything.”  Also, it is important to remember that science is fluid. What is believed to be correct today may be obsolete 20 years from now.

Birnbaum’s example with the use of cholesterol studies may be misleading.  If one were to interpret the RR as he did, based on the small sample size, I too would be very suspicious.  However, in reality, the association of elevated cholesterol with coronary heart disease [CHD] rests on a more complex analysis.  First of all, it may not be correct to state that it is common knowledge that elevated cholesterol is the cause of CHD; there are a number of major and minor risk factors associated with this disease.

Since the Framingham study [5] suggested that elevated cholesterol is associated with CHD, there have been numerous large, well-designed, controlled observational studies that have found varying degrees of association.  In fact, the most recent update of the National Cholesterol Education Program [NCEP] guidelines introduced the third Adult Treatment Panel [ATP III] summary, which specifically targets LDL-C [6].  These guidelines are based on observational studies.  Further, there is a biological association.  It is well established that increased levels of low-density lipoprotein, in concert with low high-density lipoprotein, disturb the natural functioning of the arterial lumen and intima by disrupting secretion of endothelium-derived relaxing factor/nitric oxide [EDRF-NO], with subsequent plaque formation [7].  Further, this cascade, in association with other risk factors [e.g., smoking, hypertension, sedentary lifestyle, and diabetes], increases the risk of CHD.

By contrast, there are individuals who are hypertensive, overweight, and sedentary, who have elevated cholesterol, and yet who show no evidence of CHD.  There are also individuals who have no known risk factors and die of CHD at an early age.  At the present time, epidemiology, statistics, and biology cannot explain these conditions.  As previously stated, science is fluid and continues to evolve.  Science and medicine are imperfect.  Researchers, scientists, and clinicians accept this fact.  Patients and their families will always ask “Why me?” and, often, the answer will elude the physician and others involved in the research.

Based upon 50+ years of careful investigation from Boston to Stanford to Helsinki, the conclusions regarding lipids and CHD were made after meticulous review and are considered current with respect to clinical practice.  They were not made on one or two RRs or any other single statistical analysis.  In fact, if the reader reviews the data from the Multiple Risk Factor Intervention Trial [MRFIT], which led to the current cholesterol guidelines, it is clear that the guidelines are not as stringent as they might be.  In other words, <200 mg/dl may not be the most prudent level, based on the deciles into which the data fall with regard to RR [8].  Yet a consensus of scientists, researchers, and clinicians, reasoning conventionally, established the guidelines.

The basic or applied scientist who routinely works with small sample sizes [<15] and desires alpha levels of 0.05 or less may become very uncomfortable with the conclusions drawn from observational studies.  Personally, I was shocked during my doctoral work when an epidemiologist presented results from a study that were considered “significant” based on correlation coefficients of 0.29 to 0.33.  I was astounded that one could even consider the data relevant.  However, as was stressed to me, this particular result was one contributing bit of evidence that, in concert with other “bits,” forms a larger picture.  Going back to basic statistical training, most readers have probably heard that the best way to study a population is to study every person.  Since it is not possible to do so, researchers must take representative samples from the population.  This is what these observational studies attempt to accomplish.  Determining the truth in science is not easy.
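The sketch below, using entirely hypothetical numbers, illustrates why a modest correlation such as r = 0.29 can be “significant” in a large observational sample while the same r would not be in a typical small laboratory study.  It uses the standard t statistic for a Pearson correlation and assumes the SciPy library is available; the sample sizes are invented for illustration.

```python
# A minimal sketch: the t statistic for a Pearson correlation grows with
# sqrt(n - 2), so a small r becomes "significant" once n is large enough.
from math import sqrt
from scipy import stats

def correlation_p_value(r, n):
    """Two-sided p value for a Pearson correlation r from n observations."""
    t = r * sqrt(n - 2) / sqrt(1 - r**2)
    return 2 * stats.t.sf(abs(t), df=n - 2)

print(correlation_p_value(0.29, 30))     # ~0.12: not significant in a small lab sample
print(correlation_p_value(0.29, 2000))   # far below 0.05 in a large observational cohort
```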

Confidence intervals [CI] are also used to appreciate “where the truth lies.”  Like RR, the CI is another common statistical tool used in observational and longitudinal studies.  For example, if the RR is calculated as 2.3 with a 95% CI of 1.9 to 2.8, the researcher is likely to interpret this as, “I am 95% certain that the truth is between 1.9 and 2.8.”  The researcher is not 100% certain that the actual RR is 2.3, but can be 95% certain of the range in which it falls.  To the researcher who relies on small subject numbers and precise results in a controlled environment, this may appear unrealistic.  Yet with the large data sets from observational studies, this is a reliable way of estimating the true number.  Here again, it may be useful to revisit the p value.  For example, a p value of 0.05 still tells the researcher that 5 times out of 100 the results would arise by chance. So, where is the exact truth?  In all likelihood, it lies within error.
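As an illustration of how such an interval is typically obtained, the sketch below applies the standard large-sample, log-scale confidence interval for a RR to hypothetical 2 x 2 counts chosen so that the result lands near the RR of 2.3 and CI of 1.9 to 2.8 used in the example above.  The counts and the Python code are assumptions of this illustration, not data from any study cited here.

```python
# A minimal sketch of the usual large-sample 95% CI for a relative risk,
# computed on the log scale.  All counts are hypothetical.
from math import exp, log, sqrt

def rr_with_ci(a, b, c, d, z=1.96):
    """a, b = exposed cases/non-cases; c, d = unexposed cases/non-cases."""
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))
    return rr, exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)

rr, lower, upper = rr_with_ci(360, 2640, 156, 2844)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")   # RR = 2.31, 95% CI 1.93 to 2.77
```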

In summary, as stated in the introduction, this editorial is not meant to debate the merits of RR.  Perhaps that is best left to the biostatisticians or to individuals who have a major interest in its statistical use and application.  It is reasonable, however, to conclude that scientific findings are preserved even with the use of RR.  The value of science and of doing research is that, if there is disagreement with a stated hypothesis or conclusion from a particular paper, the researcher has the freedom, if not the responsibility, to conduct an investigation to refute the published work.  That is the true meaning and value of scientific research and peer-reviewed papers.  It is hoped that this paper sheds light on some of the basic realities associated with research reports in various journals. Fortunately, the time-honored peer-review process helps to keep science and authors “honest.”  It is not perfect by any means, and the truth is not always what is published, but it is the best we have.  The researcher’s job is to find where the truth lies and thus, where appropriate, convince the skeptical audience.

References
1. Birnbaum, L. [2002]. The Case Against Relative Risk. Professionalization of Exercise Physiology-Online. Vol 5 [2]. http://www.asep.org/asep/asep/relativeRISK.html
2. Hennekens, C.H. and Buring, J.E. [1987]. Epidemiology in Medicine. Little Brown Publishing. 
3. Whipp, B. [2002]. American College of Sports Medicine Annual Meeting. Wolfe Memorial Lecture.
4. http://www.pitt.edu/~super1/lecture/lec4201/019.htm. Supercourse in Epidemiology. Accessed on February 4, 2002.
5. Kannel, W.B. [1985]. Epidemiologic Insights into Atherosclerotic Cardiovascular Disease from the Framingham Study. In: Pollock, M.L. and  Schmidt, D.H. [eds.]: Heart Disease and Rehabilitation, ed 2, Human Kinetics, Champaign, pp. 2-16.
6. Stevenson, M.M. [2001]. Updated NCEP Guidelines Set New Decision Points for Managing Dyslipidemia. Geriatrics. Volume 56 No. [7].
7. Sabatine, M.S., O’Gara, P.T., and Lilly, L.S. [1998]. Ischemic Heart Disease. In: Lilly, L.S.: Pathophysiology of Heart Disease, ed. 2, Lippincott Williams and Wilkins, pp. 119-123.
8. Stamler, J., Wentworth, D., and Neaton, J.D. [1986]. Is Relationship Between Serum Cholesterol and Risk of Premature Death from Coronary Heart Disease Continuous and Graded? Findings in 356,222 Primary Screenees of the Multiple Risk Factor Intervention Trial (MRFIT). Journal of the American Medical Association. Volume 256 No. [20] pp. 2823-8.


Copyright ©1997-2007 American Society of Exercise Physiologists   All Rights Reserved.