# Re: [ferret_users] EOF: area weighting?

```Hi Andreas,

| Baldwin, Mark P.; Stephenson, David B.; Jolliffe, Ian T., 2009.	Spatial
| Weighting and Iterative Projection Methods for EOFs, Journal of Climate,
| 22: 234-243, DOI: 10.1175/2008JCLI2147.1

I was writing the following message when your message came in.
I've just browsed the reference above and found that probably
it's basically the same.  I've simplified the derivation
for a special case where the metric tensor is diagonal.

Well, instead of throwing it away, I just dump below what
I wrote.  There may be (or likely to be) errors, but the
basic idea should hold.

Regards,
Ryo
------------------------------------------------
| Good question. The area represented by each gridpoint should indeed be
| taken into account in computing an EOF. However, in the present EOF
| functions, it is not.
|
| You should therefore do the weighting explicitly before computing
| EOFs.
|
| But is your expression for the weighting (cos(y/(180/pi))^(0.5))
| correct?

I think the weight should be 1/sqrt(cosine).

Here's a rough sketch of the formulation and solution of
the problem.  To write linear algebra all in ASCII characters,
let me introduce some ad hoc notation:

* Capital alphabets (A, B, . . .) represent Matrices.
* Small alphabets: Column vectors
* "+" symbol as in "A+": the transpose of the matrix.
* "-" symbol as in "A-": the inverse of the matrix.
* B_{ij}: The (i,j) component of matrix B.
* [ a(i) ]: Row vector composed of components a(1), a(2), . . .
* [ a(i,j) ]: Matrix composed of components a(1,1), a(2,1), . . .

So, u = "[u(1), u(2), . . . , u(N)]+" is a column vector. (Notice
the "+" symbol at the end.)

Suppose we have observational data on the surface of the Earth
u(x,y,t).  Let us rearrange our N datapoints from the two spacial
dimensions to one: u = [u(1), u(2), . . . , u(N)]+.  Let us define
a weight function (to be determined later) for the datapoints and denote
it f(1), f(2), . . . , f(N).  Let F = diag(f(1), . . . , f(N)),
a diagonal matrix constructed with f's.

Then the weighted covariance matrix B is

B_{ij} = < f(i) u(i) f(j) u(j) >

That is

B = F [ <u(i) u(j)> ]  F
= F A F

where "< >" is an ensemble or time average and
A is the unweighted covariance matrix.

Note that B is symmetric, so it's eigenvectors
q_1, q_2, . . . , q_N form an orthogonal basis.  Let us construct
an orthonormal matrix Q from q's:

Q = [q_1, q_2, . . . , q_N]

where q_1, q_2, . . . , q_N are normalized in the sense that

(Q+) Q = I.     ---- (1)

With this, B is diagonalized:

L = (Q+) B Q,   ---- (2)

where L = diag(a(1), a(2), . . . , a(N)) is a diagonal matrix
constructed from the eigenvalues a(1), . . . , a(N),
and "I" is the identity matrix.

Now, let us relate (1) and (2) to unweighted variables.
First, let p = F q, so that

P = F Q,

where P = [p_1, p_2, . . . , p_N].  From (1),

I = (((F-)P)+) (F-)P
= (P+)(F-)(F-)P
= (P+) (G-) P

because F is symmetric.  Here G = F F  = F^2 .
That is,

p_i+ (G-) p_j = delta_{i j},   ---(3)

where "delta" is Kronecker's delta.
That is, p's are orthogonal with respect to the
"metric" G-.

>From (2),

L = (((F-)P)+) (F A F) (F-) P
= (P+) A P

That is, P diagonalizes A.  Good!

Now, we have to determine F .  Eq. (3) should correspond
to an area integral over the Earth's surface.  So, "G-" must
be the weight function containing cosines.  Since G = F^2,
F- must have the square root of cosines.  Therefore,
F must have 1/sqrt(cosine) in it.

So, the recipe is

1. Transform your variable u with v = F u,
where F = diag(1/sqrt(cosine), . . .)

2. Do the usual EOF on B = < v v+ >.  Normalize the
eigenvectors q's as usual: q_i q_j = delta_{i j}

3. The EOFs you want are p's, where p = F q .
p's are orthogonal with respect to the usual
area integral over (a portion of) the sphere (Eq. 3).

4. p's diagonalize the unweighted covariance matrix.
(That is, if you project the variable u onto the
p axes, via w = (P+)u, the new variables w's are
statistically independent of each other.)

```