
Re: Best strategy for long-time series



  GDS seems to handle aggregation over thousands of files very well.  For example, we used GDS to aggregate ecmwf_ds111.1.  The files are in GRIB format and each file contains 15-30 days of data.  Because the GrADS template mechanism does not cover this kind of file, we used symbolic links to "break down" each file into daily files that the template can understand.  That comes to over 3000 files for 10 years of data, and GDS seems to do the job very well.  If all the files are local, I would recommend using GDS to do the aggregation.
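
For anyone who wants to try the symbolic-link trick, here is a minimal Python sketch of how I read it: one date-named link per day, all pointing at the same multi-day file. The function name, the link directory and the "ecmwf.%Y%m%d.grb" naming pattern are assumptions made up for illustration, not the names we actually used. A GrADS template along the lines of ecmwf.%y4%m2%d2.grb should then be able to match the links, but please check that against your own setup.

```python
import os
from datetime import date, timedelta


def link_daily_names(source, first_day, n_days, link_dir,
                     pattern="ecmwf.%Y%m%d.grb"):
    """Create one date-named symlink per day, all pointing at `source`.

    `pattern` is a hypothetical naming convention (strftime syntax);
    adapt it to whatever your template expects.
    """
    os.makedirs(link_dir, exist_ok=True)
    target = os.path.abspath(source)
    links = []
    for offset in range(n_days):
        day = first_day + timedelta(days=offset)
        link = os.path.join(link_dir, day.strftime(pattern))
        # Skip links that already exist so the script can be re-run safely.
        if not os.path.lexists(link):
            os.symlink(target, link)
        links.append(link)
    return links
```

Calling link_daily_names("ds111.1_part1.grb", date(1993, 1, 1), 15, "daily_links") would create 15 links (the file name and dates here are placeholders). The start date and number of days per file have to come from your own inventory of the GRIB files.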
----- Original Message -----
Sent: Tuesday, June 10, 2003 5:15 AM
Subject: Re: Best strategy for long-time series

Jean-Francois,

You have touched on a problem that we in the development group haven't yet developed much expertise in.  I would like to rephrase the question to the group (and John Caron):

Can the DODS aggregation server efficiently serve up time series that require opening hundreds or thousands of files?

Here are various potential solutions to the problem, each with its own drawbacks:

  • Set up LAS to only allow selection of XY views while still describing the dataset as XYT so that the time selector appears.  Then use custom code to map requests for a particular time onto a particular file name to access.  You should get a nice usable interface but will lose the ability to create time series.  The documentation on customizing LAS code has a section that is relevant:


    http://ferret.pmel.noaa.gov/Ferret/LAS/Documentation/manual/customize.html#Customizing_LAS_code

    Clearly, I'll need to create an FAQ that addresses your problem more specifically.
     

  • Create files that contain more than a single time step and aggregate those.


    I suspect that the DODS aggregation server is slow in part because it is being asked to do file I/O on thousands of files when you ask for a time series.  If this is the case, creating yearly or even monthly files out of your snapshots and then stringing those together with aggregation should solve the problem.  The drawback here is that you have to reformat your data.
     

  • Any other suggestions out there?
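
To make the first two options a little more concrete, here is a rough Python sketch. It is not LAS or DODS code, and the "sst_YYYYMMDDHH.nc" naming convention is purely an assumption for illustration. file_for_time() shows the kind of time-to-file-name mapping that option 1 needs, and group_by_month() shows how the snapshot files could be bucketed before merging them into monthly files for option 2. The merge itself is left to whatever netCDF tool you prefer.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical naming convention for the 3-hourly snapshots.
NAME_FORMAT = "sst_%Y%m%d%H.nc"


def file_for_time(when, step_hours=3):
    """Option 1: map a requested time to the snapshot file covering it."""
    # Round the hour down to the start of its 3-hour slot.
    slot = when.replace(hour=when.hour - when.hour % step_hours,
                        minute=0, second=0, microsecond=0)
    return slot.strftime(NAME_FORMAT)


def group_by_month(filenames):
    """Option 2: bucket snapshot files by (year, month) before merging."""
    groups = defaultdict(list)
    for name in sorted(filenames):
        stamp = datetime.strptime(name, NAME_FORMAT)
        groups[(stamp.year, stamp.month)].append(name)
    return dict(groups)
```

Since some maps are sometimes missing, a real mapping for option 1 would also have to check that the file exists and decide what to return when it does not.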


-- Jon
 
 

The advantage of this

Jean-Francois PIOLLE wrote:

I would like to serve a long time series of SST maps over the
Atlantic with LAS: we have up to 8 maps a day (one every 3 hours, though
some are sometimes missing). This time series is updated daily with new
maps. So far, we have more than 4500 netCDF map files to serve.
I tried to serve them through a DODS aggregation server and set up
LAS (v6.0) to access the data through this DODS server. But it appears
that the first access to a map (actually the first access to a DODS
dataset) is really slow, exceeding the LAS default time-out. This is
because (I guess) each time the DODS catalog is updated, a DODS dataset
has to be re-aggregated, and this first step is very time-consuming.
My question is: is DODS the best strategy to serve a long time series
through LAS?  It seems that a Ferret descriptor file can't be used here
since it is limited to 500 files. What other strategy is there? I guess
many people have already experienced this problem...

regards

Jean-francois Piolle

--
-------------------------------------------------------------
Jean-Francois PIOLLE
CERSAT  French ERS Processing and Archiving Facility
IFREMER
BP70
29280 Plouzane
FRANCE

Tel.:  (+33) 2 98 22 46 91   email: jfpiolle@ifremer.fr
Fax:   (+33) 2 98 22 45 33   WWW:   http://www.ifremer.fr/cersat

-------------------------------------------------------------



Dept of Commerce / NOAA / OAR / PMEL / TMAP