
Re: [las_users] Aggregation with Ferret



Hi Ying,

Just adding a little to Ansley's response. We no longer recommend descriptor files for aggregating netCDF files because we are phasing out our support for them. The alternative approach, aggregating netCDF with the XML flavor called ncML, works for applications written in Java and in the most popular OPeNDAP servers. **We have been promised by Unidata** that it will be supported in a future version of the netCDF C libraries, at which time we can truly phase out Ferret's descriptor files.
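For readers who have not seen ncML, here is a minimal sketch (not part of the original message) of generating a "joinExisting" aggregation along the time axis from a list of files. The function name and the file names are made up for illustration, and the dimension is assumed to be called "time"; check the result against your own TDS configuration before relying on it.

```python
import xml.etree.ElementTree as ET

# Namespace used by ncML 2.2 documents.
NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def build_join_existing_ncml(file_paths, dim_name="time"):
    """Return an ncML string that joins the given files along an existing dimension.

    file_paths and dim_name are illustrative inputs; the files are listed
    in the order given, so pass them sorted by time.
    """
    # Emit the ncML namespace as the default namespace (no prefix).
    ET.register_namespace("", NCML_NS)
    root = ET.Element(f"{{{NCML_NS}}}netcdf")
    aggregation = ET.SubElement(
        root,
        f"{{{NCML_NS}}}aggregation",
        {"dimName": dim_name, "type": "joinExisting"},
    )
    for path in file_paths:
        ET.SubElement(aggregation, f"{{{NCML_NS}}}netcdf", {"location": path})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical model-run files.
    print(build_join_existing_ncml(["run_2009.nc", "run_2010.nc"]))
```

The resulting XML plays roughly the role that a descriptor file plays for Ferret: it names the member files and the axis along which they are concatenated.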

It is entirely possible that Ferret descriptor files are faster than access to the same data via TDS, since TDS delivers the data to Ferret via OPeNDAP. We have not had reason to time the two approaches against each other. (If you try this, please let us know.)

It has been some time since we have used aggregations of netCDF created with descriptors. They should presumably support "native strides" on the X, Y, and Z axes, but probably not on the time axis (the aggregation axis). Thus in the LAS context descriptor-based aggregation would be suitable only for time axes of modest length, for which striding will never be requested by LAS. Since these descriptors are old and poorly supported, use of them should be in a "buyer beware" mode.
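To make "striding" concrete: a strided request asks for every Nth point along an axis rather than every point. Over OPeNDAP this is commonly written as a DAP2 hyperslab of the form [start:stride:stop], one bracket per dimension. The sketch below is only an illustration of that syntax, not a description of how Ferret or LAS builds its requests; the helper names, the variable name "temp", and the URL are assumptions.

```python
def dap2_hyperslab(var_name, index_ranges):
    """Build a DAP2 constraint for one variable.

    index_ranges is a list of (start, stride, stop) index tuples, one per
    dimension, in the variable's dimension order. stop is inclusive.
    """
    parts = []
    for start, stride, stop in index_ranges:
        if start < 0 or stride < 1 or stop < start:
            raise ValueError(f"invalid range: {(start, stride, stop)}")
        parts.append(f"[{start}:{stride}:{stop}]")
    return var_name + "".join(parts)


def strided_url(base_url, var_name, index_ranges):
    """Append a strided constraint to a dataset URL (DAP2 binary data response)."""
    return f"{base_url}.dods?{dap2_hyperslab(var_name, index_ranges)}"


if __name__ == "__main__":
    # Every 10th time step, one depth level, every 2nd point in Y and X.
    ranges = [(0, 10, 999), (0, 1, 0), (0, 2, 179), (0, 2, 359)]
    # Hypothetical server and dataset path.
    print(strided_url("http://example.org/thredds/dodsC/model/agg", "temp", ranges))
```

In this example the time axis (the first bracket) is the one being strided, which is the case the descriptor-based aggregation probably does not handle natively.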

   - Steve

==============================================

Ansley Manke wrote:
hi Ying,

On 7/14/2010 12:34 PM, Roland Schweitzer wrote:
Ying at NCCS wrote:
Hi Roland and Ansley

I remember many years ago I used a Ferret jnl to aggregate model run files and configured it into LAS. Does the new Ferret/LAS still support this kind of aggregation?
We (at least I) don't recommend this type of aggregation.

You may be thinking of the multi-dataset "descriptor files". There are some Unix command-line tools that create those from a set of netCDF files; you should be able to find them in the Users Guide. But we do not recommend descriptor files for use in LAS. I can't imagine it would be faster. Aggregation via TDS is the preferred method; it allows for the "native striding" that LAS uses for large datasets (see the Ferret Documentation "netCDF and strides" http://ferret.pmel.noaa.gov/Ferret/documentation/users-guide/data-set-basics/NETCDF-DATA#_VPINDEXENTRY_167 ) and also allows for analysis and difference operations done by F-TDS, which I don't believe will work for data that's accessed using a descriptor file.


It seems to me that LAS/Ferret is fast enough, but accessing model data that are aggregated through either TDS or GDS is very slow, in particular for a large number of files. It might be better if we just let Ferret read the files directly. Also I wonder if this new ferretnc4 supports parallel IO (as netCDF-4 does).
With all the new caching and other optimizations in TDS I think the performance is quite good even for large data sets. There are always trade-offs, but I think the performance is adequate, and the other characteristics like ease of installation and maintenance make TDS a better solution than "aggregating" with a Ferret descriptor file.


I am interested in running the LAS backend service on our center's powerful machine with GPFS support, so it may speed up the data analysis functions on the server side.

If your LAS F-TDS is on this backend, then you will be able to do server-side analysis on that powerful machine, and LAS will take advantage of those server-side functions for regridding when doing comparisons.

   Thanks again for all these years of support to LAS users

Roland


--
Steve Hankin, NOAA/PMEL -- Steven.C.Hankin@xxxxxxxx
7600 Sand Point Way NE, Seattle, WA 98115-0070
ph. (206) 526-6080, FAX (206) 526-6744

"The only thing necessary for the triumph of evil is for good men
to do nothing." -- Edmund Burke


