[Thread Prev][Thread Next][Index]

Re: [ferret_users] Gaussian distribution

Hi Andre,
Well, the equation for the normal distribution of data with mean U and standard deviation S is  given by a formula,


So start by translating that into Ferret expressions.  The independent axis is the variable xx, coordinates on an axis.
yes? ! General definitions for a function for normal distribution
yes? ! with mean u and standard deviation s.

yes? let pi = 2.* acos(0.)

yes? let s = 1
yes? let u = 0

yes? let fac = (s*(2*pi)^0.5)
yes? let arg = ((xx-u)^2)/(2*s*s)
yes? let norm = exp(-1*arg)/fac

yes? def axis/x=-20:20:0.1 xax
yes? let xx = x[gx=xax]
yes? plot norm
Now look at it with different mean and std. The formula is normalized so the area under the curve is always 1. 
yes? let s = 2
yes? let u = 8
yes? plot/over/color=red norm
So, how to define the right curve to add to your frequency_histogram plot to show the normal distribution that has the same mean and std as your data?

Those who are statisticians may want to chime in on what I've done here with the scaling; I think it's correct.

We'll want to reset U and S in the definition of the variable "norm", so that they are the the mean and standard deviation of the variable from the histogram.  Say we're doing the histogram of sst[L=1] from coads_climatology. The mean and standard deviation of the variable can be set using the STAT command:
yes? use coads_climatology
yes? stat sst[L=1]

yes? let u = ($stat_mean)
yes? let s = ($stat_std)
The independent axis of the curve needs to match the histogram plot.  The frequency_histogram scripts define a set of bins and compute the data counts data in those bins.  After you run frequency_histogram2.jnl,  you'll see the script wrote a file to store those.
yes? GO  frequency_histogram2 sst[L=1] -2, 32, 2

yes? show data

     currently SET data sets:
    1> /home/users/tmap/ferret/linux/fer_dsets/data/coads_climatology.cdf  (default)
 name     title                             I         J         K         L         M         N
 SST      SEA SURFACE TEMPERATURE          1:180     1:90      ...       1:12      ...       ...
 AIRT     AIR TEMPERATURE                  1:180     1:90      ...       1:12      ...       ...
    2> ./bar_plot.dat
 name     title                             I         J         K         L         M         N
 BAR_X    BAR_X                            1:21      1:1       ...       ...       ...       ...
 BAR_Y    BAR_Y                            1:21      1:1       ...       ...       ...       ...

The second dataset contains the histogram bins and counts.  So we'll plot normal distribution for the the sample data using an x-axis defined from bar_x.  Redefine our axis xax, which will redefine xx in our variable norm.
yes? define axis/x xax = bar_x[d=2]
If you plot "norm" now, you'll see its vertical scale is very different from the histogram plot.  The normal distribution curve is scaled so its integral is 1.  We need to rescale it to have the same area under the curve as the frequency histogram.

The histogram counts are represented by the numbers in bar_y.  Use this variable to get the right scale factor to plot the normal distribution and the histogram plot together:
yes? let y_hist = bar_y[d=2,gx=xax@asn]
yes? let area_hist = y_hist[i=@din]
Finally we can plot the curve scaled to match the histogram plot
yes? plot/over/color=red/thick/nolabel norm*area_hist
I've attached that plot. This particular data is far from being normally distributed!

11/20/2014 10:36 AM, Andre Paim wrote:
Dear ferret community,

Is anyone aware of a way to plot a Gaussian distribution? I've plotted the histogram of my data set (using go frequency_histogram) and would like to plot the Gaussian curve which approaches the most the distribution of my data.

Andre Rodrigues

Attachment: histo_with_curve.gif
Description: GIF image

[Thread Prev][Thread Next][Index]
Contact Us
Dept of Commerce / NOAA / OAR / PMEL / Ferret

Privacy Policy | Disclaimer | Accessibility Statement