Statistics 1.2     When Statistics?

By Arthur Johnson, Ph.D.


Course posted July 30, 1999
Course expires: July 30, 2001

Learning Objectives
Upon completion of this 1 unit course the learner will be able to:
  1. Define when a group of observations is well behaved.
  2. List how many questions a well-controlled study can answer.
  3. Describe four conditions that must be met before a subject can be enrolled in a study.
  4. Describe the importance of having a written protocol before beginning a study.

Where are We Going?
The arithmetic of statistical analysis can be applied to any set of numbers. It is the responsibility of both the investigator and the reader to decide if the numbers accumulated during a study are appropriate for statistical analysis. One of the first questions a person trained in statistical analysis asks is: are the observations well behaved? Well-behaved observations can only arise from studies that have been thoroughly planned, carefully executed and thoughtfully reported. Unfortunately, ready access to computer software often leads to frivolous analysis and the vacuous hope that elaborate analytic schemes will make up for poor planning and casual execution.

When Observations are Well-behaved
Scientific work demands that observation and observed events be reproducible. An event recorded by a single investigator requires confirmation, not only by the original investigator but also by other investigators in an independent manner. It is the responsibility of the statistician to decide whether the observations are well-behaved, reproducible, and hence appropriate for statistical assessment.

There is no formal definition of well-behaved observations beyond perhaps saying that they should look linear when plotted on normal probability paper. In other words, the numbers should be more or less evenly distributed and most observations should group together. This is usually the case. Stray observations or outliers far from the group indicate poor statistical behavior. Observations on too coarse a scale such as blood pressures reported to the nearest five units present a problem.
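
As a rough illustration of what "linear on normal probability paper" means, the following Python sketch computes the correlation between the sorted observations and the normal quantiles they would be plotted against. A value close to 1 corresponds to points lying near a straight line. The function name, the plotting positions (i - 0.5)/n, and the blood pressure values are all assumptions made for this illustration and are not part of the course material.

```python
from math import sqrt
from statistics import NormalDist, mean

def probability_plot_correlation(observations):
    """Correlation between sorted data and theoretical normal quantiles.

    Plotting positions (i - 0.5) / n are one common convention, assumed here.
    Not defined when all observations are identical.
    """
    ordered = sorted(observations)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    q_mean, o_mean = mean(quantiles), mean(ordered)
    cross = sum((q - q_mean) * (o - o_mean) for q, o in zip(quantiles, ordered))
    q_ss = sum((q - q_mean) ** 2 for q in quantiles)
    o_ss = sum((o - o_mean) ** 2 for o in ordered)
    return cross / sqrt(q_ss * o_ss)

# Hypothetical diastolic pressures (mm Hg), invented for illustration.
well_behaved = [72, 75, 77, 78, 79, 80, 81, 82, 83, 85, 88]
with_outlier = [72, 75, 77, 78, 79, 80, 81, 82, 83, 85, 140]

r_clean = probability_plot_correlation(well_behaved)
r_outlier = probability_plot_correlation(with_outlier)
print(round(r_clean, 3), round(r_outlier, 3))
```

In this made-up example a single stray reading pulls the correlation well below that of the clean set, which is the numerical counterpart of a point falling far off the line on the plot.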

Observations made on a coarse scale and then converted to a finer scale with more digits per observation are also a problem. Observations that tail off towards the high end, such as some blood chemistries, may be an accurate indication of true behavior or may be an indication that observations from normal subjects have been contaminated by observations from diseased subjects.
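
One simple way to spot readings recorded on too coarse a scale, such as blood pressures reported to the nearest five units, is to tabulate their final digits. The sketch below is only an illustration: the function name and the readings are hypothetical. If pressures were truly recorded to the nearest unit, roughly two in ten would be expected to end in 0 or 5.

```python
from collections import Counter

def terminal_digit_share(readings, digits=(0, 5)):
    """Fraction of whole-number readings whose last digit is in `digits`."""
    counts = Counter(abs(int(r)) % 10 for r in readings)
    return sum(counts[d] for d in digits) / len(readings)

# Hypothetical systolic pressures (mm Hg), invented for illustration.
rounded_to_five = [120, 125, 130, 135, 140, 120, 115, 130]
recorded_to_unit = [118, 123, 131, 136, 142, 121, 117, 129]

share_coarse = terminal_digit_share(rounded_to_five)   # every reading ends in 0 or 5
share_fine = terminal_digit_share(recorded_to_unit)    # no reading ends in 0 or 5
print(share_coarse, share_fine)
```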

Example One. Suppose we are looking at the body weights of a group of subjects and trying to determine whether the weights are well behaved. After we finish weighing each individual we notice that some of the weights are far below the average and some are far above. In other words, the weights vary widely and this leads us to question the uniformity of the subjects at least with respect to weight. In order to address this lack of uniformity the usual procedure is to rewrite the specifications to exclude doubtful individuals and restrict the scope of the study. The investigator should make this decision when setting up the study with input from the statistician.

Example Two. You are administering a stress test to a patient with congestive heart failure and notice that the test values are erratic from one test period to another. You are not sure if the erratic values are due to a worsening of the heart condition or to a deficiency in the implementation of the test procedure. Was the subject sufficiently responsive to motivation, for instance? Proper training of test personnel is critical to the success of a study. Careful training and auditing of test procedures are essential to generating well-behaved data. Proper procedures must be in place before the study is underway - otherwise the study will have to be restarted.

Observations representing success/failure or presence/absence of a condition take the value 1 or the value 0. There is no question of variability in this case. However, the suitability of the observations for analysis - how well behaved the observations are - is still an issue. In this instance the question of well-behaved data is determined by the selection criteria and the procedure for implementing the criteria.

It is important to note that procedures that are adequate for patient management may not be adequate for clinical trials. This is because patient management is an on-going process. There is an opportunity for considering a spectrum of tests that allows for correction of the treatment regimen over time. In a clinical trial the test procedure is unique and must stand alone.

Results must never be set aside simply because they do not seem consistent with other results. The human mind is infinitely inventive when explaining why unwanted results should be ignored. This is as true in the field of statistical analysis as it is in any other field. In a properly planned study there will be a sufficient number of subjects so that an occasional outlier should not cloud the results. If suspect data does cloud the results then the only sound procedure is to review the plan, modify as appropriate and repeat the study.

When There is a Prior Commitment to a Question

Statistical Rule
An adequate and well-controlled clinical trial can answer only one question.

Consider a clinical trial that wants to look at treatment efficacy. Evidence of efficacy will require statistical support and this is available only for a question that has been precisely formulated. A statistical rule of thumb is that an adequate and well-controlled clinical trial can answer one question, at best. Questions suggested by the data are not scientific conclusions, but they may lead to further study. There can be no support for such statements as "I really meant to evaluate at 4 weeks, not at 6 weeks", "Well, the disease process wasn't affected but the quality of life improved", "There was not much overall improvement but younger subjects seemed to benefit more than older subjects". Such suggestions are truly in the data for anyone to see but they must lead to further study if there is real interest. It is not possible to provide statistical support for a question that is only suggested by the data.
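
One way to see why a question suggested by the data cannot claim the same statistical support as the question committed to in advance: when several questions are each examined at the same significance level, the chance that at least one looks "significant" by chance alone grows quickly. The short Python sketch below assumes independent tests and a 0.05 level. Both are assumptions chosen for illustration and are not taken from the course.

```python
def chance_of_false_positive(num_questions, alpha=0.05):
    """Probability of at least one false positive among independent tests
    when no real effect exists: 1 - (1 - alpha) ** num_questions."""
    return 1 - (1 - alpha) ** num_questions

for k in (1, 3, 5, 10):
    print(k, round(chance_of_false_positive(k), 3))
# 1 0.05
# 3 0.143
# 5 0.226
# 10 0.401
```

Under these assumptions, asking ten questions of the same data gives about a 40 percent chance of at least one spurious finding, compared with 5 percent for the single question posed in advance.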

When there are Appropriate Subjects
In a clinical trial, the subjects are used as instruments to measure the effectiveness of the therapy. Several conditions must be met before a subject is accepted for inclusion into a study:

  1. Subjects must be shown to have the condition the treatment is proposed to treat.
  2. Subjects should not be entered into a clinical trial until their eligibility has been determined.
  3. Subjects should be uniform with regard to characteristics such as age, duration of disease and severity.
  4. No person should be entered in the study unless the following issues are resolved: a) subjects must give their informed consent, b) subjects must have a full understanding of the risks and benefits of the program, and c) subjects must be assured that unwillingness to enter the study will not compromise their treatment.

Condition One. Subjects must be shown to have the condition the treatment is proposed to treat. This may seem self-evident but a number of studies have shown that specific symptoms are not always an indication of a specific pathology. Review of submissions to the FDA (Food and Drug Administration) shows that one of the primary reasons a drug is refused approval is that the investigators failed to show proof that the subjects actually had the condition treated by the drug.

Studies on ulcers should use imaging techniques to show the presence of an ulcer and not rely solely on symptoms. Earlier work on ulcer treatment often yielded confusing results; more recent studies with adequate visualization have shown that symptoms and the presence or absence of an ulcer are not closely related. Medication for the treatment of hypertension must be discontinued prior to the start of a study to determine whether the subject is still hypertensive. Review of a number of studies has shown that a substantial number of subjects checked in this manner did not manifest hypertension.

Condition Two. Subjects should not be entered into a clinical trial until their eligibility has been determined. Trials under emergency conditions such as infarcts require special planning beyond the scope of this course.

Condition Three. Subjects should be uniform with regard to characteristics such as age, duration of disease and severity. Uniformity is most obviously required regarding characteristics that may influence the disease or the therapy. The age range should exclude the very young and the very old. If there is a concern for the response of the geriatric patient to the therapy then a separate study should be carried out.

If both men and women are included then the assumption is that the disease process is the same in both sexes and the effect of the therapy will be similar. Tossing in a few subjects different from the core subjects will not do. When a therapy has been demonstrated to be effective then the therapy can be expanded to include the variety of patients encountered in clinical practice.

Condition Four. No person should be entered in the study unless the following issues are resolved: a) subjects must give their informed consent, b) subjects must have a full understanding of the risks and benefits of the program, and c) subjects must be assured that unwillingness to enter the study will not compromise their treatment. Many trials involve subjects from public clinics or institutions. Language barriers, ethnic customs, status differences and other factors present a challenge when trying to obtain informed consent.

When The Route to the Answer Exists
Formulating a question to be investigated does not ensure that an answer can be found. There is extensive interplay between the question posed and possible means of answering the question. For example, in the anti-hypertensive study mentioned above, there might be two (or more) ways to answer the question. Are we looking at change in blood pressure or at the percentage of subjects who return to normal range after therapy? In a rheumatic disease study are we concerned with a reduction of tender and/or swollen joints or an improvement in overall quality of life? In a post-MI study are we just looking for clot resolution or are we measuring long term survival? Are we looking for ulcer resolution in 4 weeks or 6 weeks?
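
The two ways of answering the anti-hypertensive question can give quite different pictures of the same data. The Python sketch below computes both the mean change in diastolic pressure and the percentage of subjects who return to a normal range. The readings, the 90 mm Hg cutoff, and the function names are all hypothetical and serve only to illustrate the distinction.

```python
from statistics import mean

NORMAL_UPPER_LIMIT = 90  # assumed diastolic cutoff in mm Hg, for illustration only

# Hypothetical (before, after) diastolic pressures for eight subjects.
pairs = [(100, 94), (98, 92), (104, 96), (96, 91),
         (102, 95), (99, 93), (97, 89), (105, 97)]

def mean_change(pairs):
    """Average of (after - before); negative values mean pressure fell."""
    return mean(after - before for before, after in pairs)

def percent_normalized(pairs, limit=NORMAL_UPPER_LIMIT):
    """Percentage of subjects whose post-therapy reading is below `limit`."""
    normalized = sum(1 for _, after in pairs if after < limit)
    return 100 * normalized / len(pairs)

print(mean_change(pairs))         # -6.75
print(percent_normalized(pairs))  # 12.5
```

In this invented data set every subject's pressure falls, yet only one subject in eight returns to the assumed normal range. Which of the two summaries answers the study question has to be settled before the study begins.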

We may look for information at a variety of end points in the study but answering the study question with a unique answer requires prior commitment. Variables such as blood pressure, blood chemistries, clot resolution, etc., will not answer a question if long term results are a concern and may, in fact, be of little interest.

What are some of the roadblocks to finding an answer? Are there enough subjects for the trial? At Cook County Hospital in Chicago a study was proposed on the prophylactic use of antibiotics in pregnant women whose membranes ruptured prematurely. Despite the thousands of women giving birth at County each year there would only have been a sufficient number of eligible subjects if the recruitment continued for seven years. This was clearly impractical.
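
The arithmetic behind such a feasibility check is simple and worth doing before the protocol is written. The course does not report the actual figures for the proposed study, so every number in the Python sketch below is hypothetical and was chosen only so that the result matches the seven years mentioned above.

```python
import math

def years_to_recruit(required_subjects, deliveries_per_year,
                     eligible_fraction, consent_fraction):
    """Whole years of recruitment needed to reach the required sample size."""
    enrolled_per_year = deliveries_per_year * eligible_fraction * consent_fraction
    return math.ceil(required_subjects / enrolled_per_year)

# All inputs are invented for illustration, not taken from the proposed study.
years = years_to_recruit(required_subjects=700,
                         deliveries_per_year=5000,
                         eligible_fraction=0.04,
                         consent_fraction=0.5)
print(years)  # 7
```

Even with thousands of deliveries a year, a narrow eligibility criterion and a realistic consent rate can shrink the yearly enrollment to a small fraction of the required sample.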

In the past, when the number of subjects was not tied to the number required for the answer, studies could be small. Such a limited compass promoted uniformity of implementation and the clinic of an individual investigator sufficed. More recently, as quantitative requirements for the number of subjects have increased, a single study may involve several widely scattered investigators. The problems associated with training of investigators and uniformity of implementation become acute in this case.

Geographic variability produces different demographic characteristics for potential subjects. If a study is carried out across national borders there will be differences in medical practices and attitudes as well. European medical practices are much less drug-oriented than US practices. For example, an anti-hypertensive drug that had been used without problems in France for several years had to be withdrawn from the US market when it was shown to cause liver damage.

Having a Written Protocol
In the previous section we discussed some of the issues the investigator faces with the planning and execution of a study. To assure success it is essential to begin with a written document detailing the resolution of these and other issues. Such a document is called a protocol. Using a contemporary phrase, we need a protocol to make sure everyone is on the same page. Do all the participants in the study have the same understanding of what is to be done? When there are several investigators it sometimes occurs that one investigator has interaction with many more (or many fewer) subjects. Is this due to a misunderstanding of the protocol requirements? A protocol assists the team in uniform implementation of the study.

Protocol guidelines have been promulgated by the Food and Drug Administration. (Food and Drug Administration. Adequate and Well-controlled Clinical Trials. The Federal Register. Volume 47, No 202 section 314.126 C; 10/19/82.) Other sources worth consulting on this subject and statistics in general are the columns of Jane Brody in the New York Times.


(From the Federal Register)

  1. The study uses a design that permits a valid comparison with a control to provide a quantitative assessment of drug effect.
  2. The method of selection of subjects provides adequate assurance that they have the disease or condition being studied.
  3. There is a clear statement of the objectives of the investigation - and in the report of the results.
  4. The method of assigning patients to treatment and control groups minimizes bias and is intended to show comparability of the groups. Ordinarily assignment is by randomization.
  5. Adequate measures are taken to minimize bias - such as blinding.
  6. The methods of assessment of subjects' response are well defined and reliable. The report should explain the variables measured, the methods of observation and the criteria used to assess the response.
  7. There is an analysis of the results of the study adequate to assess the effects of the drug.
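
Item 4 above notes that assignment is ordinarily by randomization. As a minimal sketch of one common scheme, the Python code below produces a permuted-block schedule that keeps the treatment and control groups the same size after every complete block. The guideline does not prescribe this particular scheme. The function name, block size, and seed are assumptions for illustration.

```python
import random

def permuted_block_assignments(num_subjects, block_size=4, seed=None):
    """Assign subjects to 'treatment' or 'control' in shuffled blocks so
    the two groups are balanced after every complete block."""
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for two equal arms")
    rng = random.Random(seed)  # a fixed seed makes the schedule reproducible
    assignments = []
    while len(assignments) < num_subjects:
        block = ["treatment", "control"] * (block_size // 2)
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:num_subjects]

# Hypothetical schedule for 12 subjects.
schedule = permuted_block_assignments(12, block_size=4, seed=1999)
for subject_number, arm in enumerate(schedule, start=1):
    print(subject_number, arm)
```

A schedule of this kind would normally be prepared before enrollment begins, so that the person enrolling a subject cannot influence which arm that subject enters.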

Copyright 1999-2000 Wild Iris Medical Education