Brad Ediger

"Advanced Rails"

Using the population standard deviation on our
sample would underestimate our population's actual standard deviation. Here is the
Ruby code for the sample standard deviation, which we will use from here on out:
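module Enumerable
  # Sample standard deviation: divides by n - 1 rather than n.
  def stdev
    m = mean  # compute the mean once, not once per element
    Math.sqrt( map{|x| (x - m) ** 2}.sum / (length-1) )
  end
end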
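To see the difference between the two formulas, here is a minimal standalone sketch. The helper names (mean_of, population_stdev, sample_stdev) and the data values are made up for illustration and are not part of the code above; it assumes Ruby 2.4 or later, where Enumerable#sum accepts a block.

```ruby
# Hypothetical helpers, written as plain methods so that nothing is
# monkey-patched onto Enumerable.
def mean_of(values)
  values.sum.to_f / values.size
end

# Population formula: divide the summed squared deviations by n.
def population_stdev(values)
  m = mean_of(values)
  Math.sqrt(values.sum { |x| (x - m) ** 2 } / values.size)
end

# Sample formula: divide by n - 1 instead.
def sample_stdev(values)
  m = mean_of(values)
  Math.sqrt(values.sum { |x| (x - m) ** 2 } / (values.size - 1))
end

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustrative values

population_stdev(data)  # => 2.0
sample_stdev(data)      # larger: the square root of 32/7, about 2.14
```

Dividing by n - 1 always gives the larger of the two results, and the gap shrinks as the number of samples grows.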
The standard deviation is a very useful way to get a feel for the amount of variation
in a data set. We see that the second set of samples from above has a much larger
standard deviation than the first:
samples1.stdev # => 1.13529242439509
samples2.stdev # => 6.7954232964384
The standard deviation has the same units as the sample data; if the original data were
in milliseconds, then the samples have standard deviations of 1.1 ms and 6.8 ms,
respectively.
We can use the standard deviations to estimate a confidence interval. Together, the
confidence interval and the mean give us a good idea of the limits of the data. Assuming
a normal distribution,* the following guidelines apply:
• Approximately 68% of the data points lie within one standard deviation (σ) of
the mean.
• 95% of the data is within 2σ of the mean.
• 99.7% of the data is within 3σ of the mean.
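These percentages can be checked numerically. For a normal distribution, the fraction of values within k standard deviations of the mean is erf(k / sqrt(2)); the sketch below evaluates that with Ruby's built-in Math.erf. The method name fraction_within is a hypothetical helper chosen for this illustration.

```ruby
# Fraction of a normal distribution lying within k standard deviations
# of the mean, computed as erf(k / sqrt(2)).
def fraction_within(k)
  Math.erf(k / Math.sqrt(2))
end

(1..3).each do |k|
  puts format("within %d sigma: %.1f%%", k, fraction_within(k) * 100)
end
# prints:
# within 1 sigma: 68.3%
# within 2 sigma: 95.4%
# within 3 sigma: 99.7%
```

The 95% figure is therefore a slight rounding down: two standard deviations cover a little over 95% of a normal distribution.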
Using the second rule, we will generate a 95% confidence interval from the statistics
we have generated. This Ruby code uses the mean and standard deviation to return a
range in which 95% of the data should lie:
module Enumerable
  def confidence_interval
    (mean - 2*stdev) .

