Statistics 1.3 & 1.4

Statistics 1.3 and 1.4   Comparing Proportions

By Arthur Johnson, Ph.D.

2.0 contact hours

Course Objectives 

At the conclusion of this 2.0 unit course the learner will be able to:

1.  Describe why estimates will vary from sample to sample about the population proportion.

2.  Explain how the standard error of an estimated proportion relates the estimate to the population proportion.

3.  Calculate the standard error of a proportion using the formula provided.

4.  Describe the statistical test that is used to compare the proportion of success in the control group with the proportion of success in the new treatment.

5.  Describe the difference between a Type I error and a Type II error.

Questions of Current Interest
Are there enough women in the fire department? Are there enough African Americans in graduate schools? Should the Supreme Court set aside a seat based on racial background? Are the members of city council representative of the community they serve? Statistical analysis can sometimes be of assistance in answering questions such as these but sometimes, as we will see, statistics may also be irrelevant.

Estimates and Standard Errors
Populations arise in a number of ways. The demographic populations of a city or a country are examples. We can try to estimate the characteristics of various subgroups within a population using census enumeration. For example, is there a higher proportion of single mothers in one subgroup as compared to the overall population or to another subgroup? Assuming we have access to all members of the population, enumeration is conceptually error free. In reality we may have limited access to members of a given population group.

For example, members of a group who are homeless are difficult to include in a count or members of a household may go intentionally unreported. There is also some disagreement about the classification of individual members of a group. How do we decide when an individual is Latino or when another is Asian-American or Caucasian?

Since enumerating the entire subgroup or the entire population is not ordinarily feasible, we may take a random sample of subjects from the population and determine the proportion of interest in the sample and in subgroups within this sample. We still need to have access to all members of the population in order to draw a truly random sample, and the question of classification of individuals remains. We are then faced with relating the proportion of interest in the sample to the proportion in the population, and with relating the proportions in the subgroups to the proportion in the sample or to each other.

As we noted in Statistics 1.1, when we estimate a proportion by taking random samples of subjects from a population we recognize that our estimates will vary from sample to sample about the population proportion. Let us say that the overall proportion of Asian-Americans in several state university freshman classes is determined to be 0.2 or 20% by enumeration of records. Random samples of freshmen at different schools will show proportions of Asian-Americans distributed about the value 0.2.

Some schools will be closer to and some further from this population value. (Note that the freshmen in a particular unit of the university system may not constitute a random sample, and so the proportion of Asian-Americans will not reflect the 0.2 figure. The number will, however, reflect the proportion of Asian-Americans in the catchment area for the particular unit of the university system, e.g. rural, urban, etc.)

Distribution of Estimates

1.     Distribution of estimates

2.     Standard error of estimate

3.     Confidence limit on the proportion 

The manner in which estimates scatter about the population value is called the distribution of estimates.

As we noted earlier, when we start to estimate a proportion by taking random samples of subjects from a population we recognize that our estimates will vary from sample to sample. The manner in which estimates scatter about the population value is the distribution of estimates.
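The scatter of estimates can be illustrated with a small simulation. This is only a sketch: the population proportion 0.2 comes from the example above, while the sample size of 400, the 1000 repeated samples, the fixed seed, and the function name simulate_sample_proportions are assumptions made for illustration.

```python
import random
import statistics


def simulate_sample_proportions(true_p, sample_size, n_samples, seed=0):
    """Draw repeated random samples and return the proportion seen in each."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    proportions = []
    for _ in range(n_samples):
        # Each subject belongs to the group of interest with probability true_p.
        hits = sum(1 for _ in range(sample_size) if rng.random() < true_p)
        proportions.append(hits / sample_size)
    return proportions


# Population proportion 0.2 as in the text; the other settings are assumed.
estimates = simulate_sample_proportions(true_p=0.2, sample_size=400, n_samples=1000)

print("smallest estimate:", min(estimates))
print("largest estimate: ", max(estimates))
print("mean of estimates:", round(statistics.mean(estimates), 3))
print("spread of estimates (standard deviation):", round(statistics.stdev(estimates), 3))
```

The individual estimates differ from one another, but they cluster around the population value, which is the behavior the distribution of estimates describes.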

Many distributions of estimates, including the distribution of proportions, closely follow the normal distribution. The amount by which a proportion differs from the population value is measured in terms of the standard error of estimate of the proportion.


The properties of the normal distribution are well known. In a normal distribution about half the estimates will lie within two thirds of a standard error of the population proportion, and about two thirds of the estimates will lie within one standard error of the population proportion.

Using the standard error of estimate as a guideline, we can state that in 95% of cases (19 out of 20) the population value will fall within +/- 1.96 standard errors of the estimate. This region of +/- 1.96 standard errors is called the 95% confidence limits on the estimated proportion.
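These rules of thumb can be checked against the standard normal distribution. The sketch below uses Python's standard library; the helper name coverage_within is an assumption made for illustration.

```python
from statistics import NormalDist


def coverage_within(z):
    """Fraction of a normal distribution lying within +/- z standard errors."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(z) - std_normal.cdf(-z)


# Two thirds of a standard error, one standard error, and 1.96 standard errors.
for z in (2 / 3, 1.0, 1.96):
    print(f"within +/- {z:.2f} standard errors: {coverage_within(z):.3f}")
```

The three results should come out close to one half, two thirds, and 95% respectively.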

Calculating Standard Errors
The formula for the standard error of a proportion is:
SEp = SQRT((p)(1-p)/n)

Where SE = standard error

SQRT = square root

p = proportion

n = sample size

This formula says that the standard error of a proportion (p) is equal to the square root of the product of p and 1-p divided by n, the sample size. Multiply the proportion (p) by its complement (1-p) and divide by the number of subjects in the sample (n). Then take the square root to arrive at the standard error of the proportion. (Square roots are available on even the simplest, cheapest hand-held calculators.)

Thus if we have a sample of 50 subjects of which 27 are women, then the proportion of women is 0.54, the complement of the proportion is (1 - 0.54) = 0.46, and the standard error of this proportion would be

SE = SQRT((0.54)(0.46)/50) or 0.0705

and the confidence limits would be +/- 1.96(0.0705) or +/- 0.138, and we would say that the true value of the proportion being estimated lies between 0.402 and 0.678. (You will often see 1.96 approximated by 2.0.) There is no evidence from the data of this sample that the proportion of women is different from 0.5 since the confidence limits include the value 0.5.
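For readers who prefer to check the arithmetic by machine, here is a minimal sketch that recomputes the example of 27 women in a sample of 50. The function names standard_error and confidence_limits are assumptions made for illustration.

```python
import math


def standard_error(p, n):
    """Standard error of a proportion: SQRT(p(1 - p)/n)."""
    return math.sqrt(p * (1 - p) / n)


def confidence_limits(p, n, z=1.96):
    """Lower and upper 95% confidence limits on an estimated proportion."""
    margin = z * standard_error(p, n)
    return p - margin, p + margin


p = 27 / 50  # 27 women in a sample of 50
se = standard_error(p, 50)
lower, upper = confidence_limits(p, 50)

print(f"proportion = {p:.2f}, standard error = {se:.4f}")
print(f"confidence limits: {lower:.3f} to {upper:.3f}")
print("limits include 0.5:", lower <= 0.5 <= upper)
```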

Evidence or No Evidence
Continuing with this situation we would say that when a sample estimate and its confidence limits include the value 0.5 there is no evidence that the population value is other than 0.5. On the other hand, if the estimate and its confidence limits fail to include the value 0.5 then there is evidence the population value is not 0.5.

It is both useful and important to focus on the phrases evidence or no evidence for a true value other than 0.5 rather than to fall into the habit of saying the true value IS or IS NOT 0.5.

Suppose we change the admission procedures for the state university system mentioned above and wish to know whether the proportion of Asian-American students has been affected. The proportion of Asian-American students from the previous year is known to be 0.2 or 20%. We take a sample of 400 students and find that the number of Asian-American students is 100 - this gives a proportion of 0.25 or 25%. Do we have evidence that the proportion of Asian-American students has increased more than might be expected by chance alone? To find out we can use the above formula and calculate as follows:

1.     Multiply the proportion (0.25) by its complement (1 - 0.25). This equals 0.1875.

2.     Divide 0.1875 by the sample size (400), giving 0.00046875.

3.     Take the square root of 0.00046875, which is 0.022. This is the standard error.


The estimate and its confidence limits:

0.25 +/- 1.96(0.022) or 0.25 +/- 0.042

This figure covers the region 0.208 to 0.292. The previous year's proportion was 0.2 or 20% - which does not lie within this region. Thus we may say that we have evidence that the proportion of Asian-American students has increased slightly more than would be expected from chance alone.
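The same decision can be sketched in a few lines of code. The function name evidence_of_difference is an assumption made for illustration; the figures (100 of 400 students, previous proportion 0.2) are those of the example.

```python
import math


def evidence_of_difference(successes, n, reference_p, z=1.96):
    """Return the confidence limits on the sample proportion and whether the
    reference proportion falls outside them (evidence of a difference)."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    lower, upper = p - z * se, p + z * se
    has_evidence = not (lower <= reference_p <= upper)
    return (lower, upper), has_evidence


limits, has_evidence = evidence_of_difference(successes=100, n=400, reference_p=0.2)

print(f"confidence limits: {limits[0]:.3f} to {limits[1]:.3f}")
print("evidence that the proportion differs from 0.2:", has_evidence)
```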

You may remark that 0.208 is very close to 0.2 and this is certainly the case.  The statistical decision rule demands either evidence or no evidence as a conclusion.

Also the region from 0.208 to 0.292 may seem wider than you feel comfortable with. If you want narrower confidence limits it would be necessary to employ a larger sample. Considering the formula for the standard error, you can see that in order to cut the confidence interval in half you would have to take a sample four times as large.
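The square-root relationship is easy to see numerically. In this sketch the proportion 0.25 and the starting sample size of 400 come from the example; the larger sample sizes and the function name interval_half_width are assumptions made for illustration.

```python
import math


def interval_half_width(p, n, z=1.96):
    """Half-width of the 95% confidence interval: 1.96 standard errors."""
    return z * math.sqrt(p * (1 - p) / n)


# Each step quadruples the sample size, which halves the half-width.
for n in (400, 1600, 6400):
    print(f"n = {n:5d}: 0.25 +/- {interval_half_width(0.25, n):.4f}")
```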

In another course we will consider how to determine in the planning stage how large a sample needs to be. If you wish to play by the statistical rules you can't say "well, I got this result with 100 subjects - I'll just try 300 more subjects and see what happens". There are step by step statistical procedures that must be followed and these procedures will be covered in another course.

Comparing Two Proportions (p)  (p1 and p2)

Taking a sample of size n1 from the control group in a clinical trial, we observe a proportion of success called p1. Taking a sample of size n2 from the new treatment group in the same clinical trial, we observe a different proportion of success called p2. Since we want to know which treatment is better, we are interested in the difference between the new treatment and the control (p2 - p1).

The standard error of the difference between these two proportions is calculated using the following formula involving the variability of both proportions:

SE(p2 - p1) = SQRT{ p2(1 - p2)/n2 + p1(1 - p1)/n1 }

Where SE = standard error, in this case of the difference in proportions

SQRT = square root

p = proportion

n1 = control sample size

n2 = new treatment sample size

Suppose now that we observe 20 out of 50 successes in the control group (p1 = 0.4) and 30 out of 60 successes in the new treatment group (p2 = 0.5). Then

SE = SQRT{ 0.5(0.5)/60 + 0.4(0.6)/50 } = 0.09

(p2 - p1) +/- 1.96 SE = 0.1 +/- 1.96(0.09) = 0.1 +/- 0.18

0.1 + 0.18 = (+) 0.28

0.1  - 0.18 = (-) 0.08

If the value zero lies between the two confidence limits (+0.28 and -0.08), then there is no evidence that the difference is due to anything other than chance. The above range does include the value zero, so it is concluded that there is no evidence of a difference between the two treatment regimens.
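The comparison of two proportions can be sketched as follows. The counts are taken from the calculation above (0.4 from 20 of 50 in the control group, and 0.5 with 60 subjects, that is 30 of 60, in the new treatment group); the function name difference_limits is an assumption made for illustration. Because the code does not round the standard error to 0.09 before multiplying by 1.96, its limits are slightly wider than the hand calculation, but the conclusion is the same.

```python
import math


def difference_limits(x1, n1, x2, n2, z=1.96):
    """Difference p2 - p1, its standard error, and 95% confidence limits."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p2 * (1 - p2) / n2 + p1 * (1 - p1) / n1)
    diff = p2 - p1
    return diff, se, (diff - z * se, diff + z * se)


# Control: 20 of 50 successes. New treatment: 30 of 60 successes.
diff, se, (lower, upper) = difference_limits(20, 50, 30, 60)

print(f"difference = {diff:.2f}, standard error = {se:.4f}")
print(f"confidence limits: {lower:.3f} to {upper:.3f}")
print("limits include zero (no evidence of a difference):", lower <= 0 <= upper)
```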

The range of uncertainty - the confidence interval - does include zero but also includes other values up to 0.28, which might well be of considerable interest. How could we make sure that, if there were a true difference of, for example, 0.25, we would find evidence for it? This is a question of sample size, which we address in the next section.

Sample Size

Suppose now we want to conduct a trial that will distinguish between "no difference" and a difference of 0.25. We have already indicated how we can be correct in a decision of no evidence when there is no difference in 19 out of 20 trials by using the confidence limits (+/-) 1.96.  Let us now assume that we would like to be correct in our decision of evidence when there is a difference of 0.25 in 19 out of 20 trials.

The formula for this situation is:

n per group = 13{p2(1 - p2) + p1(1 - p1)} / (p2 - p1)^2

Where n = sample size

p1 = proportion of successes in the control group

p2 = proportion of successes in the new treatment group

We usually have a reasonable idea of p1 as it represents the proportion of success expected with the usual procedure and our interest is in an improvement of 0.25.  Let us suppose p1= 0.35 and hence p2 = 0.6.

n per group = 13{(0.6)(0.4) + (0.35)(0.65)} / (0.6 - 0.35)^2 = 97 subjects per group

It would thus be sensible to plan for 100 subjects per group.  (The assumptions made about p1 and p2 - p1 make rounding off acceptable.)
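The sample size formula is simple enough to wrap in a small function. This is a sketch only; the name n_per_group and the multiplier argument are assumptions made for illustration, with the default of 13 taken from the formula above.

```python
import math


def n_per_group(p1, p2, multiplier=13):
    """Subjects needed per group: 13{p2(1 - p2) + p1(1 - p1)} / (p2 - p1)^2."""
    return multiplier * (p2 * (1 - p2) + p1 * (1 - p1)) / (p2 - p1) ** 2


n = n_per_group(p1=0.35, p2=0.6)

print(f"formula result: {n:.2f} subjects per group")
print("rounded up to a whole subject:", math.ceil(n))
```

The formula gives a little over 97. Rounding up to whole subjects would suggest 98, and either figure sits comfortably within the 100 per group recommended above.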

Confidence and Risk

We have talked about the use of confidence limits on estimates to assure ourselves that on 19 of 20 occasions the limits will include the true value.  We have talked about calculating the sample size from the difference between treatments so that we are confident that on 19 of 20 occasions we will provide evidence for a difference. Statisticians talk about risk or confidence.  If we are confident on 19 of 20 occasions then this represents a 95% confidence and if we have 95% confidence we accept a 5% risk. 

If we are assured that 19 out of 20 occasions are correct then we are taking a risk that 1 out of 20 occasions will be in error. The 1 in 20 occasions in which the confidence intervals on an estimate will not include the true value is termed a Type I error and the 1 in 20 occasions when we fail to provide evidence for a difference is termed a Type II error. Risk choices of other than 5% can be made but there is little evidence to support the use of different risks.  In my own practice over the years I have used a 5% risk factor and this is quite standard practice.  In the literature you may observe that Type II risks are often chosen greater than 5% in order to keep sample sizes down.  This is rarely an acceptable maneuver.
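The link between sample size and Type II risk can be sketched with a normal approximation. Several things here are assumptions rather than part of the course material: the function name approximate_power, the decision to ignore the negligible chance of finding evidence in the wrong direction, and the suggestion that the multiplier 13 in the sample size formula corresponds to the square of (1.96 + 1.645), the normal deviates for a 5% Type I risk and a 5% Type II risk. The figures used (p1 = 0.35, p2 = 0.6, 97 subjects per group) come from the sample size example.

```python
import math
from statistics import NormalDist


def approximate_power(p1, p2, n_per_group, z_crit=1.96):
    """Approximate chance of finding evidence for a true difference p2 - p1
    when each group has n_per_group subjects (normal approximation)."""
    se = math.sqrt(p2 * (1 - p2) / n_per_group + p1 * (1 - p1) / n_per_group)
    shift = abs(p2 - p1) / se  # true difference measured in standard errors
    return NormalDist().cdf(shift - z_crit)


power = approximate_power(p1=0.35, p2=0.6, n_per_group=97)
type_ii_risk = 1 - power

# Assumed origin of the multiplier 13: (1.96 + 1.645) squared.
multiplier = (1.96 + NormalDist().inv_cdf(0.95)) ** 2

print(f"chance of finding evidence for the difference: {power:.3f}")
print(f"Type II risk: {type_ii_risk:.3f}")
print(f"(1.96 + 1.645)^2 = {multiplier:.2f}")
```

Under these assumptions the Type II risk with 97 subjects per group comes out close to 5%, in line with the 19 out of 20 target.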

Some statisticians and contemporary software programs have adopted the practice of calculating exact risks - for example, calculating an exact risk of 7.13%. It is my feeling that this is specious and there is no practical use for this information.

Confidence Limits About the Target Proportion
Sometimes it is easier to interpret your data by working from the target proportion rather than the sample proportion. In this case we use the sample size and the target proportion and its complement to proceed to confidence intervals or limits around the target proportion. The decision then is made on the basis of whether the sample proportion falls within or outside of the confidence limits.
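As a sketch of this approach, the code below reuses the admissions example: the target proportion is the previous year's 0.2, the sample size is 400, and the observed proportion is 100 out of 400. The function name limits_about_target is an assumption made for illustration.

```python
import math


def limits_about_target(target_p, n, z=1.96):
    """95% confidence limits centered on the target proportion."""
    se = math.sqrt(target_p * (1 - target_p) / n)
    return target_p - z * se, target_p + z * se


lower, upper = limits_about_target(target_p=0.2, n=400)
observed = 100 / 400
outside = not (lower <= observed <= upper)

print(f"limits about the target: {lower:.3f} to {upper:.3f}")
print(f"observed proportion {observed:.2f} falls outside the limits:", outside)
```

Note that the standard error is now computed from the target proportion rather than the sample proportion, so the limits differ slightly from those centered on the sample estimate.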

There is one hazard to be wary of and that is that confidence intervals as we have discussed them should only be used for a single decision. If you were interested in considering the proportion of women in several departments of a specific company it is tempting to set confidence intervals on the proportion of 0.5 for departments of various sizes and then consider the observed proportions department by department. This will result in increasing the risk of false positive results. For example, if you consider two departments the risk of one or more false positive is about 10% rather than 5% in a single application of the confidence intervals.
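The growth in false positive risk can be sketched under the assumption that the decisions are independent of one another. The function name risk_of_any_false_positive is hypothetical, and the department counts beyond two are chosen only for illustration.

```python
def risk_of_any_false_positive(n_decisions, risk_per_decision=0.05):
    """Chance of at least one false positive across independent decisions."""
    return 1 - (1 - risk_per_decision) ** n_decisions


for departments in (1, 2, 5, 10):
    risk = risk_of_any_false_positive(departments)
    print(f"{departments:2d} departments: risk of one or more false positives = {risk:.3f}")
```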

The procedures discussed above are, strictly speaking, applicable in the central range of proportions, say, 0.2 to 0.8 or perhaps 0.1 to 0.9. With values outside the central region one had best consult a professional.

When These Confidence Intervals are Inapplicable
There are times when the use of confidence limits is inadvisable. An example of this is when asking a question such as - are there enough minority members on a committee? Most committees are small. If the numbers are small then the confidence limits we have considered will be very wide, perhaps ridiculously wide, i.e. extending to negative values or beyond 1.0. Even with appropriate statistical procedures the probabilities of 'unusual' results will tend to be large and decision making in this manner fuzzy at best.
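A hypothetical committee shows how quickly the limits become meaningless. The committee of 5 with 1 minority member is an invented illustration, not data from the course, and the function name small_sample_limits is likewise an assumption.

```python
import math


def small_sample_limits(successes, n, z=1.96):
    """95% confidence limits on a proportion using the formula from this course."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Hypothetical committee: 1 minority member out of 5.
lower, upper = small_sample_limits(successes=1, n=5)

print(f"confidence limits: {lower:.2f} to {upper:.2f}")
print("lower limit is negative (an impossible proportion):", lower < 0)
```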

For such problems one should turn attention to the ways in which committee members are recruited or the hiring practices of an organization rather than to statistical analysis. In this case it is important to evaluate whether equality of opportunity is present and operable.


1.     We have shown how to determine the statistical uncertainty in a proportion calculated from data, how to compare a proportion with a standard or expected proportion, and how to compare two different proportions.

2.     We have pointed out aspects of data collection that may introduce considerable additional uncertainty.

3.     We have shown how to determine whether to attribute an observed difference to statistical uncertainty or to a genuine effect.

4.     We have further shown how to determine the size of samples so that we can be confident of detecting a difference of substantive importance in proportions.

Copyright © 1999-2000 Wild Iris Medical Education
