National Oceanic and
Atmospheric Administration
United States Department of Commerce


 

FY 1983

Data Intercomparison Theory—Vol. I, Minimal spanning tree tests for location and scale differences

Preisendorfer, R.W., and C.D. Mobley

NOAA Tech. Memo ERL PMEL-38, NTIS: PB83-182311, 45 pp (1982)


When two data sets are intercompared, each consisting of n samples of some field at p points in space, the question often arises of how their time averages and their variances compare. In this note we consider two geometric ways of answering this question. The basic geometric concept is that of a minimal spanning tree (MST) built from the union of the data sets when each is considered as an n-point swarm in Euclidean p-space Ep. The MST is the network of straight line segments in Ep that connects all points of the pooled swarms with the least possible total segment length. The MST test of relative location uses a generalized notion of a run, which measures how much the points of the two sets intermingle in their MST, while the MST scale test for variance is based on the simple intuitive idea that the set with the greater variance will have the branches of its part of the tree spread beyond those of the other. Power tests were run for the MST location and scale tests, and the MST scale test was found to be relatively powerful and useful. The MST scale test was applied to the problem of defining natural seasons over the U.S. mainland using a 46-year temperature record. The result is a novel partition of the 12 months of the year into new seasons based on months with comparable temperature variances.
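
The pooled-swarm MST and the idea of a run can be illustrated with a minimal sketch. It builds the tree with Prim's algorithm, then counts the tree edges that join points from different sets. The run statistic used here (cross-set edges plus one, in the style of Friedman and Rafsky) is an assumption made for illustration and may differ from the exact statistic defined in the memo; the function names minimal_spanning_tree and run_count are hypothetical. The scale test is not sketched, since the abstract does not specify its statistic.

```python
import math


def minimal_spanning_tree(points):
    """Return the MST edges (i, j) of the complete Euclidean graph on `points`.

    Uses Prim's algorithm in O(n^2) time, which is adequate for small swarms.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # shortest known distance from the tree to each point
    best_from = [-1] * n         # tree point that realizes that distance
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Add the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        # Update the distances from the enlarged tree to the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def run_count(set_a, set_b):
    """Assumed run statistic: (number of MST edges joining the two sets) + 1.

    Few runs suggest the swarms sit in different locations; many runs
    suggest they intermingle.
    """
    pooled = list(set_a) + list(set_b)
    labels = [0] * len(set_a) + [1] * len(set_b)
    edges = minimal_spanning_tree(pooled)
    cross = sum(1 for i, j in edges if labels[i] != labels[j])
    return cross + 1


# Two well-separated swarms in the plane: a single cross edge, so 2 runs.
separated = run_count([(0, 0), (1, 0), (0, 1)], [(10, 10), (11, 10), (10, 11)])

# Two fully interleaved swarms on a line: every MST edge is a cross edge, so 6 runs.
interleaved = run_count([(0,), (2,), (4,)], [(1,), (3,), (5,)])
```

In this sketch each point is a tuple of p coordinates, so a data set of n samples at p spatial points is passed as a list of n such tuples. Turning the run count into a significance test would also require its null distribution, which this sketch does not attempt.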




