[Thread Prev][Thread Next][Index]

Re: [ferret_users] [FERRET]calculate spatial correlation of 2 variables



Hi Muyin,
Well, this is interesting. The missing values are what make the sums different in each direction. Create a test variable with some missing values

yes? def axis/x=1:8:1 xax
yes? def axis/y=1:5:1 yax
yes? let a = x[gx=xax] + 1.1*y[gy=yax]
yes? let b = if a gt 6 then a
yes? let bxavg = b[x=@ave]
yes? let byavg = b[y=@ave]

yes? list b
VARIABLE : IF A GT 6 THEN A
SUBSET : 8 by 5 points (X-Y)
1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
1 / 1: .... .... .... .... 6.10 7.10 8.10 9.10
2 / 2: .... .... .... 6.20 7.20 8.20 9.20 10.20
3 / 3: .... .... 6.30 7.30 8.30 9.30 10.30 11.30
4 / 4: .... 6.40 7.40 8.40 9.40 10.40 11.40 12.40
5 / 5: 6.50 7.50 8.50 9.50 10.50 11.50 12.50 13.50

yes? list bxavg
VARIABLE : B[X=@AVE]
SUBSET : 5 points (Y)
X : 0.5 to 8.5
1 / 1: 7.60
2 / 2: 8.20
3 / 3: 8.80
4 / 4: 9.40
5 / 5: 10.00

yes? list byavg
VARIABLE : B[Y=@AVE]
SUBSET : 8 points (X)
Y : 0.5 to 5.5
1 / 1: 6.50
2 / 2: 6.95
3 / 3: 7.40
4 / 4: 7.85
5 / 5: 8.30
6 / 6: 9.30
7 / 7: 10.30
8 / 8: 11.30

yes? list bxavg[y=@ave], byavg[x=@ave]
X: 0.5 to 8.5
Y: 0.5 to 5.5
Column 1: BXAVG[Y=@AVE] is B[X=@AVE]
Column 2: BYAVG[X=@AVE] is B[Y=@AVE]
BXAVG BYAVG
I / *: 8.800 8.488


Now, when the data is averaged in X first, there are 4 function values that go into the first value, 5 that are used to make the second, and 8 that go into the last average. When the data is averaged in Y first, there is only one that's used to form the first average, 2 for the second, and so on. So the original function values are in effect weighted differently when you compute the averages in each of the two different orders. Trace through what happens to the 6.5 in the first column of the listing of b. In the case where we first form the X sum, it is one of 8 numbers averaged with all the other values in its row before that result is used to make the XY sum. In the case where we first form the Y sum, it becomes the j=1 result, and is weighted equally with the other y-averages which were computed with more data. We wind up over-weighting data in columns with a lot of missing data when forming the Y sum first, and over-weighting data in rows with a lot of missing data when forming the X sum first.

The correct answer is found either with a 2-dimensional average, or by averaging the data after unravelling it into a 1-dimensional list.

yes? list b[x=@ave,y=@ave]
VARIABLE : IF A GT 5 THEN A
X : 0.5 to 8.5 (XY ave)
Y : 0.5 to 5.5 (XY ave)
8.667
yes? let c = xsequence(b)
yes? list c[x=@ave]
VARIABLE : XSEQUENCE(B)
X : 0.5 to 40.5 (averaged)
8.667

If the data have no missing values, any of these ways of forming the sum will be the same.

yes? use etopo120
yes? let rxavg = rose[x=@ave]
yes? let ryavg = rose[y=@ave]
yes? list rxavg[y=@ave], ryavg[x=@ave]
DATA SET: /home/porter/tmap/ferret/linux/fer_dsets/data/etopo120.cdf
LONGITUDE: 20E to 20E(380)
LATITUDE: 90S to 90N
Column 1: RXAVG[Y=@AVE] is ROSE[X=@AVE]
Column 2: RYAVG[X=@AVE] is ROSE[Y=@AVE]
RXAVG RYAVG
I / *: -1896. -1896.



muyin wang wrote:
Hi, Andy:
Thank you for your regressxy script. One thing I don't understand is why does the order of averaging matter? See the example blow. The results are so different! What was wrong here?
Thanks,
Muyin



let p=eofs[d=1,l=1]
let q=eofs[d=2,l=1]

let pq=p*q
let pqmsk=pq-pq

let pmsk=p+pqmsk
let qmsk=q+pqmsk

let pxavg=pmsk[x=@ave]
let pyavg=pmsk[y=@ave]

let pxya=pxavg[y=@ave]
let pyxa=pyavg[x=@ave]

list pxya,pyxa

Column 1: PXYA is PXAVG[Y=@AVE]
Column 2: PYXA is PYAVG[X=@AVE]
PXYA PYXA
I / *: -0.01525 0.01854


Andy Chiodi wrote:

Hi Zhen,

The spatial correlation of two varaibles can be found with the following ferret script. This is simply the regresst script re-written to consider space rather than time.

To use this, copy the regressxy script below to a .jnl file and then type

yes? go regressxy
yes? let P = myvar1
yes? let Q = myvar2

in ferret. The variable "corr" is then the spatial correlation between myvar1 and myvar2.

I hope this helps,

Andy


regressxy.jnl script:

\CANCEL MODE VERIFY
! Description: define FERRET variables for regression along the X and Y axis

say ... Linear Regression In the XY Plane
say ... Instructions:
say Use the LET command to define new variables
say Define the variable P as your independent (X) variable
say Define the variable Q as your dependent (Y) variable
say Results will be variables "SLOPE", "INTERCEP" and "RSQUARE"
say QHAT will be the regression estimate
say Note: If "Y" is your independent variable then
say ... "SET GRID Q" after defining Q.
say ...

let pq = p*q
let pqmask = pq-pq ! 0 or "missing" so all variables share the same missing
let pmasked = p + pqmask
let qmasked = q + pqmask
let pp = pmasked*pmasked
let qq = qmasked*qmasked

let pxave = pmasked[x=@ave]
let qxave = qmasked[x=@ave]
let pave = pxave[y=@ave]
let qave = qxave[y=@ave]
let pqxave = pq[x=@ave]
let ppxave = pp[x=@ave]
let qqxave = qq[x=@ave]
let pqave = pqxave[y=@ave]
let ppave = ppxave[y=@ave]
let qqave = qqxave[y=@ave]
let pvar = ppave - pave*pave
let qvar = qqave - qave*qave
let pqvar = pqave - pave*qave

let slope = pqvar / pvar
let intercep = qave - slope*pave
let qhat = slope*p + intercep
let rsquare = (pqvar*pqvar) / (pvar*qvar)
let corr = pqvar/(pvar*qvar)^0.5

SET MODE/LAST VERIFY





[Thread Prev][Thread Next][Index]

Dept of Commerce / NOAA / OAR / PMEL / TMAP

Contact Us | Privacy Policy | Disclaimer | Accessibility Statement