Here is a collection of formulas that are useful for computing chance errors for simple random samples.
> Esum := proc() ndraws*AVEbox; end;
Esum := proc() ndraws*AVEbox end
> SEsum := proc() sqrt(ndraws)*SDbox; end;
SEsum := proc() sqrt(ndraws)*SDbox end
> SEave := proc() SDbox/sqrt(ndraws); end;
SEave := proc() SDbox/sqrt(ndraws) end
> SEpercent := proc() SDbox*100/sqrt(ndraws); end;
SEpercent := proc() 100*SDbox/sqrt(ndraws) end
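For readers who want to check the numbers outside Maple, the four formulas can be sketched in Python. This is only an illustration: the function names, and the choice to pass ndraws, ave_box and sd_box as explicit arguments rather than reading globals as the Maple procedures do, are assumptions of this sketch and not part of the worksheet.

```python
import math

def e_sum(ndraws, ave_box):
    """Expected value of the sum of the draws."""
    return ndraws * ave_box

def se_sum(ndraws, sd_box):
    """Standard error of the sum of the draws."""
    return math.sqrt(ndraws) * sd_box

def se_ave(ndraws, sd_box):
    """Standard error of the average of the draws."""
    return sd_box / math.sqrt(ndraws)

def se_percent(ndraws, sd_box):
    """Standard error of a sample percentage (0-1 box), in percentage points."""
    return 100 * sd_box / math.sqrt(ndraws)

# Example: a 0-1 box with 40% ones, 1000 draws.
sd_box = math.sqrt(0.4 * 0.6)
print(round(se_percent(1000, sd_box), 3))  # 1.549
```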
Problem 1:
In a town there are 30,000 registered voters, 40% of whom
are Democrats (according to a previous census).
Compute the chance of observing 42% or more Democrats in
a sample of 1,000 registered voters from this town.
Repeat with a sample of 5,000 people.
Answer:
All we have to do is convert the interval from 42% to the right into standard units for percents. In standard units, 42% is:
> su42 := (42-40)/SEpercent();
                             1/2
                       ndraws
          su42 := 1/50 ---------
                         SDbox
> ndraws := 1000: SDbox := sqrt(.4*.6): 'su42' = evalf(su42,3);
su42 = 1.29
so the chance of observing 42% or more Democrats in the sample
of 1000 voters is about the area under the normal curve to the
right of 1.29. The table gives (Area is the area between -z and z,
in percent; we use the nearest row, z = 1.30):
  z    Height  Area       z    Height  Area       z    Height  Area
 ___________________     ___________________     ___________________
 ....
 1.25  18.26   78.87     2.75  0.91    99.40     4.25  0.005   99.9979
 1.30  17.14   80.64     2.80  0.79    99.49     4.30  0.004   99.9983
 1.35  16.04   82.30     2.85  0.69    99.56     4.35  0.003   99.9986
 1.40  14.97   83.85     2.90  0.60    99.63     4.40  0.002   99.9989
> chance := (100 - 80.64)/2.;
chance := 9.680000000
so the chance is about 10%.
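The table lookup can be cross-checked numerically. The sketch below is in Python and uses the standard library's math.erf; the helper names are hypothetical. It assumes, as the table values suggest, that Area is the percent of area between -z and z.

```python
import math

def central_area_percent(z):
    """Percent of area under the standard normal curve between -z and z."""
    return 100 * math.erf(z / math.sqrt(2))

def right_tail_percent(z):
    """Percent of area to the right of z: half of what the central area leaves."""
    return (100 - central_area_percent(z)) / 2

print(round(central_area_percent(1.30), 2))  # 80.64, matching the table row
print(round(right_tail_percent(1.29), 1))    # 9.9
```

Using z = 1.29 directly instead of the nearest table row moves the answer only slightly, so "about 10%" stands either way.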
Now let us repeat the computation with the number of draws (sample size) increased to 5,000.
> ndraws := 5000: 'SEpercent'= evalf(SEpercent(),3);
SEpercent = .691
and now 42% in standard units increases to:
> su42 := (42-40)/0.691;
su42 := 2.894356006
The area to the right of 2.89 is found from the table above again, using the nearest row, z = 2.90:
> chance := (100 - 99.63)/2.;
chance := .1850000000
The chance is now about 0.2%, or 2 in 1000.
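The whole calculation can be wrapped in one function to see how the chance shrinks as the sample grows. This is a minimal Python sketch, assuming draws with replacement and the normal approximation; chance_at_least is a hypothetical helper name. Because it carries full precision rather than three digits and reads no table, its answers differ slightly from the rounded ones above.

```python
import math

def chance_at_least(observed_pct, true_pct, ndraws):
    """Approximate chance (in percent) that the sample percentage is at least
    observed_pct, for draws with replacement from a 0-1 box."""
    p = true_pct / 100
    sd_box = math.sqrt(p * (1 - p))
    se_pct = 100 * sd_box / math.sqrt(ndraws)
    z = (observed_pct - true_pct) / se_pct          # standard units
    return 100 * 0.5 * math.erfc(z / math.sqrt(2))  # right-tail area, in percent

for n in (1000, 5000):
    print(n, round(chance_at_least(42, 40, n), 2))
```

The two printed values should come out near 10% and 0.2%, in line with the table-based answers.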
Problem 2:
Consider Problem 1 again, but now suppose that the sample is drawn WITHOUT REPLACEMENT and that the number of registered voters in the town is only 6,000.
Answer 2:
Now we need to correct SEpercent by multiplying it by the correction factor CF, given by:
> CF := proc() sqrt( (N - ndraws)/(N-1) ); end;
CF := proc() sqrt((N - ndraws)/(N - 1)) end
In the first case, when the number of draws is 1000, we have
> ndraws := 1000: N := 6000:
> correction_factor := evalf(CF(),4);
correction_factor := .9127
> NewSE := evalf( SEpercent()*correction_factor, 4);
NewSE := 1.414
> NewSU42 := (42-40)/1.414;
NewSU42 := 1.414427157
> NewChance := (100 - 84)/2.;
NewChance := 8.000000000
so now the chance is about 8% instead of 10% (taking the table area at z = 1.40, 83.85, as roughly 84): just a small change.
But when the number of draws is 5000, things change considerably:
> ndraws := 5000: correction_factor := evalf(CF(),4);
correction_factor := .4082
> NewSE := evalf( SEpercent()*correction_factor, 4);
NewSE := .2828
> NewSU42 := (42-40)/0.2828;
NewSU42 := 7.072135785
> NewChance := (100 - 100)/2;
NewChance := 0
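The effect of the correction factor can be tabulated directly. The following Python sketch is an illustration under the same normal approximation; the function names and the variable n_box (standing in for N) are hypothetical. The N = 30,000 rows use the town size from Problem 1 for comparison.

```python
import math

def correction_factor(n_box, ndraws):
    """Finite population correction for draws without replacement."""
    return math.sqrt((n_box - ndraws) / (n_box - 1))

def se_percent(ndraws, sd_box):
    """SE of a sample percentage for draws with replacement, in points."""
    return 100 * sd_box / math.sqrt(ndraws)

sd_box = math.sqrt(0.4 * 0.6)  # 0-1 box with 40% ones

for n_box in (6000, 30000):
    for ndraws in (1000, 5000):
        cf = correction_factor(n_box, ndraws)
        new_se = se_percent(ndraws, sd_box) * cf
        z = (42 - 40) / new_se                     # standard units
        chance = 50 * math.erfc(z / math.sqrt(2))  # right tail, in percent
        print(f"N={n_box} ndraws={ndraws} CF={cf:.4f} SE={new_se:.4f} "
              f"z={z:.2f} chance={chance:.3g}%")
```

With N = 30,000 and 1000 draws the factor is about 0.98, so the correction barely changes the answer; with N = 6,000 and 5000 draws it is about 0.41 and the chance is negligible.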
Moral: When the number of draws is small relative to the number of tickets in the box, it hardly matters whether we draw the tickets with or without replacement.