Fwd: Re: <OT> unix math function: norm

Klaus-Peter Schrage kpschrage
Mon May 17 11:31:56 PDT 2004


On Tuesday, 28 May 2002 02:42, Joel Hammer wrote:
> I ran the data both on Excel and with my own bash script. They get close
> results.
> I have attached my data.txt and a better ps file, with the labels better
> spaced out.

Sorry, Joel, my statistics used to be quite good 25 years ago, but they have
gotten a bit rusty since. I fed your data into OpenOffice Calc and got results
similar to yours (only my variance is a bit different, var=0.49772 ...).

Now Alan Jackson's posting gave me the clue: the discrepancy in your graph 
between the plotted bar graph and the normal curve is a problem of scaling.

One idea behind the normal curve is that probabilities or relative frequencies 
are represented as areas under the curve; that yields an area of 1 under the 
total curve.
In a bar graph, relative frequencies are often represented not as AREAS, but
as HEIGHTS of the respective rectangles. All these heights (in your graph you
have 21 of them) then add up to 1. Now your data are grouped into groups of
width 0.2, which can be seen as the width of each of the rectangles. Thinking
of areas, that means that the rectangles have a total area of 1 * 0.2 = 0.2,
not 1.
 
So, to bring the bar graph and the curve together, you can either divide
all the bar heights by 0.2, or multiply the curve by 0.2. In gnuplot,
plotting 0.2*f(x) instead of your f(x) should give you a reasonable result.
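
If you want to check the scaling argument numerically, here is a rough
Python sketch. It does not use your data.txt; the sample is made up (normal,
mean 0, sd 0.7), and the bin range, the names normal_pdf and
relative_frequencies are just my own choices for illustration. Only the bin
width of 0.2 and the 21 bins come from your graph.

```python
import math
import random

BIN_WIDTH = 0.2   # group width, as in the graph discussed above
N_BINS = 21       # number of bars, as in the graph discussed above
LOW_EDGE = -2.1   # assumed left edge of the first bin (illustration only)


def normal_pdf(x, mean, sd):
    """Density of the normal distribution; the area under it is 1."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def relative_frequencies(values, low_edge, n_bins, width):
    """Fraction of the in-range values that falls into each bin."""
    counts = [0] * n_bins
    for v in values:
        i = math.floor((v - low_edge) / width)
        if 0 <= i < n_bins:
            counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts]


random.seed(0)
# Made-up sample, NOT the data from the original post.
sample = [random.gauss(0.0, 0.7) for _ in range(10000)]

freqs = relative_frequencies(sample, LOW_EDGE, N_BINS, BIN_WIDTH)

# Bars drawn with relative frequency as height cover an area of 0.2.
bar_area = sum(f * BIN_WIDTH for f in freqs)

# Option 1: divide the bar heights by the bin width, giving area 1.
density = [f / BIN_WIDTH for f in freqs]
density_area = sum(d * BIN_WIDTH for d in density)

# Option 2: leave the bars alone and plot 0.2 * f(x) instead of f(x).
scaled_curve_at_0 = BIN_WIDTH * normal_pdf(0.0, 0.0, 0.7)

print("sum of bar heights:", round(sum(freqs), 6))
print("area of the bars:", round(bar_area, 6))
print("area after dividing by 0.2:", round(density_area, 6))
print("middle bar vs. scaled curve:", freqs[N_BINS // 2], scaled_curve_at_0)
```

The middle bar and the scaled curve value at 0 should come out close to each
other, which is the same effect you should see in the plot.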
Klaus



