Fwd: Re: <OT> unix math function: norm
Klaus-Peter Schrage
kpschrage
Mon May 17 11:31:56 PDT 2004
On Tuesday, 28 May 2002 02:42, Joel Hammer wrote:
> I ran the data both on Excel and with my own bash script. They get close
> results.
> I have attached my data.txt and a better ps file, with the labels better
> spaced out.
Sorry, Joel, my statistics used to be quite good 25 years ago, but it has
gotten a bit rusty. I fed your data into OpenOffice Calc and got results
similar to yours (only my variance is a bit different, var=0.49772 ...).
Now Alan Jackson's posting gave me the clue: the discrepancy in your graph
between the plotted bar graph and the normal curve is a problem of scaling.
One idea behind the normal curve is that probabilities or relative frequencies
are represented as areas under the curve, so the total area under the whole
curve is 1.
In a bar graph, relative frequencies often are not represented as AREAS, but
as HEIGHTS of the respective rectangles. All these heights (in your graph you
have 21 of them) then add up to 1. Now your data are grouped into groups of
width 0.2, which can be seen as the width of each of the rectangles. Thinking
of areas, that means that the rectangles have a total area of 1 * 0.2 = 0.2.
So, to bring the bar graph and the curve together, you can either divide
all the bar heights by 0.2, or multiply the curve by that factor. In gnuplot,
plotting 0.2*f(x) instead of your f(x) should give you a reasonable result.
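
The scaling argument above can be checked numerically. The following is only
a rough sketch in Python, not Joel's actual script or data: the mean of 0, the
bin centers, and the idealized bar heights are assumptions made for
illustration; only the bin width 0.2, the 21 bars, and var=0.49772 come from
this thread.

```python
import math

BIN_WIDTH = 0.2                  # group width from the thread
N_BINS = 21                      # number of bars from the thread
MU = 0.0                         # ASSUMPTION: mean is not given in the thread
SIGMA = math.sqrt(0.49772)       # from the variance quoted above


def normal_pdf(x, mu, sigma):
    """Normal density f(x); the area under it is 1."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# ASSUMPTION: 21 bins centered on MU, i.e. centers -2.0, -1.8, ..., 2.0.
centers = [MU + (i - N_BINS // 2) * BIN_WIDTH for i in range(N_BINS)]

# Idealized relative frequencies (hypothetical, not the real data):
# probability mass per bin, renormalized so the bar heights add up to 1.
masses = [
    normal_cdf(c + BIN_WIDTH / 2, MU, SIGMA)
    - normal_cdf(c - BIN_WIDTH / 2, MU, SIGMA)
    for c in centers
]
heights = [m / sum(masses) for m in masses]

# Heights sum to 1, so the rectangles cover an area of 1 * 0.2 = 0.2.
total_area = sum(h * BIN_WIDTH for h in heights)

# Fix 1: multiply the curve by the bin width (the 0.2*f(x) suggestion).
scaled_curve = [BIN_WIDTH * normal_pdf(c, MU, SIGMA) for c in centers]

# Fix 2: divide the bar heights by the bin width instead.
density_heights = [h / BIN_WIDTH for h in heights]

# Largest gap between bars and curve, without and with the scaling.
gap_unscaled = max(
    abs(h - normal_pdf(c, MU, SIGMA)) for h, c in zip(heights, centers)
)
gap_scaled = max(abs(h - s) for h, s in zip(heights, scaled_curve))

if __name__ == "__main__":
    print("sum of heights:", sum(heights))
    print("total bar area:", total_area)
    print("max gap, f(x):     ", gap_unscaled)
    print("max gap, 0.2*f(x): ", gap_scaled)
```

With these assumed values the unscaled curve sits far above the bars, while
0.2*f(x) follows them closely. Either fix works; scaling the curve just leaves
the bar graph itself untouched.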
Klaus