
Ferret how-to: speed of large calculations

(for advanced users) How to get the best performance from Ferret on large


On Aug 31, 11:48am, Billy Kessler wrote:

	> Subject: why is this code so slow?
	> Steve,
	> I have a go script which takes more than 6 minutes
	> to execute. Is there something I could do to speed
	> it up, or is that inevitable for this [large] calculation?
	> Thanks, Billy


Hi Billy,

Here's how you figure out where Ferret is spending its time during a
large calculation: turn on MODE DIAGNOSTIC before running it.

For the commands that follow, Ferret will list diagnostic information
describing its internal actions.  I'll paste a fragment of the output, with
explanation, below.
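
A minimal sketch of such a session, assuming the mode is switched on with
SET MODE DIAGNOSTIC and off again with CANCEL MODE DIAGNOSTIC.  The script
name my_script is hypothetical and stands for whatever go script is slow:

```ferret
yes? SET MODE DIAGNOSTIC        ! start listing internal actions
yes? GO my_script               ! hypothetical name for the slow go script
yes? CANCEL MODE DIAGNOSTIC     ! stop the listing
```

Bracketing only the slow command keeps the listing short enough to read.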

Watch for messages about "gathering".  These indicate that Ferret judged the
calculation too large to fit into memory and has broken it into slices.  This
can slow down the IO considerably (though the calculation does finish without
any intervention).  You can influence how Ferret decides to use gathering: see
section 22.6.4 (MODE DESPERATE), Users Guide V3.1.
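
As a sketch of what adjusting that decision might look like (the threshold
below is an arbitrary illustration, not a recommendation; section 22.6.4
explains what the value means):

```ferret
yes? SHOW MODE DESPERATE        ! display the current state and value
yes? SET MODE DESPERATE:500000  ! arbitrary example value, not a recommendation
```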

Use SET MEMORY/SIZE=<value> to give Ferret more memory (see
$FER_DIR/release_notes*; version 3.2 or later).  In conjunction with the
information in section 22.6.4, this can greatly enhance performance by making
full use of the physical memory of your system.
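
For example, a sketch assuming the machine has physical memory to spare.  The
size 16 is arbitrary; I believe the units of <value> are megawords, but check
the release notes before relying on that:

```ferret
yes? SHOW MEMORY                ! report the current memory allocation
yes? SET MEMORY/SIZE=16         ! arbitrary example value (assumed megawords)
yes? SHOW MEMORY                ! confirm the new size
```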

For maximum performance be explicit in telling Ferret the region of interest
(i.e. don't leave the limits of axes unspecified to imply the full domain).
When limits are not explicit, Ferret misses opportunities to reuse cached
results; the outcome may be slower processing and, because of fragmentation,
more memory use.
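
To illustrate with the SST variable from the diagnostic fragment in the P.S.
(its full extents, I=1:180 and J=1:90, are taken from that output), the second
form below states the region explicitly instead of leaving X and Y implied:

```ferret
yes? LOAD SST[L=1:400@AVE]                    ! X and Y limits implied
yes? LOAD SST[I=1:180,J=1:90,L=1:400@AVE]     ! X and Y limits given explicitly
```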

Explicit limits are especially important when you see "gathering" (above).
Watch the diagnostic output to see if Ferret is re-reading the same data.  If
so, consider using LOAD (and LOAD/PERMANENT) to read file variables into memory
prior to requesting a result.  This is especially useful when calculations
involve derivatives and shift operators.  For example, using MY_VAR[I=101:110]
and MY_VAR[I=101:110@SHF:2] in the same calculation may require two separate
reading operations; pre-reading with LOAD MY_VAR[I=101:112] will cache a single
block of data big enough for both.
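
The pre-reading example above might look like this in a session.  MY_VAR is
the placeholder name from the text, and the sketch assumes the limits on its
other axes are already set:

```ferret
yes? LOAD MY_VAR[I=101:112]     ! one read covering I=101:110 and the shifted I=103:112
yes? LIST MY_VAR[I=101:110@SHF:2] - MY_VAR[I=101:110]
```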

	- steve


P.S. Here's a fragment of MODE DIAGNOSTIC output and some explanation:

yes? LOAD SST[l=1:400@AVE]
 getgrid EX#1   5 D: 2  I:    1    1  J:    1    1  K:    1    1  L:    1    1
 eval    EX#1   4 D: 2  I: -111 -111  J: -111 -111  K: -111 -111  L: -111 -111
 strip gathering SST on Y axis:     1    90
 strip --> SST[T=01-JAN-1946:01-MAY-1979@AVE,D=2]
 reading SST    3 D: 2  I:    1  180  J:    1    4  K: -111 -111  L:    1  400
 doing --> SST[T=01-JAN-1946:01-MAY-1979@AVE,D=2]
 doing gathering SST on Y axis:     1     4


- "D: 2" tells us that the operations are from data set #2

- the "getgrid" pass determines the grid of the result

- "eval EX#1" is the start of evaluation for the first expression (there may be
	multiple comma-separated expressions)

- "I: -111 -111" means that the limits on the I (X) axis are IMPLIED (unknown).
	Use explicit limits for better performance and memory management.

- "strip" means that Ferret has detected the need for an operation and
	allocated space for it on a stack.

- "strip gathering SST on Y axis" means this calculation is being broken into
	slices along the Y axis (to fit into memory)

- "reading SST ..." means IO is in progress. The I,J,K,L limits are shown.
	You may see delays during large IO operations.

- "doing" tells us the operation that was "stripped" above is being done

- "gathering  SST on Y axis:  1  4" tells us that the J=1:4 slice of the
	calculation has been completed.  J=5:8 will probably follow.


		|  NOAA/PMEL               |  ph. (206) 526-6080  
Steve Hankin	|  7600 Sand Point Way NE  |  FAX (206) 526-6744
		|  Seattle, WA 98115-0070  |  hankin@pmel.noaa.gov
