
problems with big data sets



I'm running Ferret on a Sun Ultra 60 (360 MHz) with 384 MB of RAM and 1 GB of swap, started with "ferret -memsize 260" (and yes, I confirmed with "show memory" that 260 MB is allocated).  I'm analyzing a variable "A" with dimensions 255x407x58x12 (XxYxZxT), 72,234,360 values in all.  In particular, I'm attempting to shade a variable "B" derived from "A" by
 
let B = A[k=@max,l=@max]
 
After about 10 minutes of disk thrashing, I get the message
 
XgksDuplicatePrimi()   300 Storage overflow has occurred in GKS
 
followed by a segmentation fault.  If I replace "l=@max" with "l=1", for example, it works fine.  Clearly I'm running into some sort of memory limit; if each value takes 4 bytes, "A" alone comes to nearly 290 million bytes, more than the 260 MB I allocated.  This raises some questions about efficiency.  How does Ferret go about computing "B" from "A"?  It seems to me that an inordinate amount of memory is being used for what is just a maximum over two dimensions.  Is there some way I (or Ferret) can do this more efficiently?
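 
One workaround I've been considering is to collapse one axis at a time and push the intermediate result out to a scratch file, so that only one time step of "A" has to be resident at once.  Something along these lines (the variable name KMAX and the file name kmax.nc are just placeholders, and I don't know whether Ferret's internal splitting would actually make this cheaper):
 
    let KMAX = A[k=@max]                          ! depth maximum; still a function of i, j, and l
    repeat/l=1:12 save/append/file=kmax.nc KMAX   ! write one 255x407 slice per time step
    cancel variable KMAX                          ! so the file variable isn't shadowed by the LET
    use kmax.nc
    shade KMAX[l=@max]                            ! time maximum of the per-step depth maxima
 
Is staging the calculation like this the recommended approach for fields that don't fit in the memory cache, or does Ferret already break such requests up internally?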
 
On a related topic: has any thought been given to allowing some sort of data compression for sparse arrays?  For example, indirect addressing via an index array IND[i,j,k,l], such that the value of A[i,j,k,l] is looked up at position IND[i,j,k,l] in a compact array holding only the stored values.  Can anything like this be done now?
 
 


