[Thread Prev][Thread Next][Index]

Re: Best strategy for long-time series

Hi Yingshuo,

Just for the record, if you want to add your own metadata to GDS datasets, you can do so by providing a supplementary DAS file: http://www.iges.org/grads/gds/doc/tag-ref.html#dataset
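For reference, a supplemental DAS file is just an attribute block in the OPeNDAP Dataset Attribute Structure syntax. A rough sketch (the variable name `sst` and its attributes here are hypothetical; see the tag reference above for the details of wiring it into GDS):

```
Attributes {
    sst {
        String long_name "sea surface temperature";
        String units "degC";
    }
}
```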

- Joe

On Tuesday, June 10, 2003, at 11:53 AM, yingshuo shen wrote:

   GDS is doing a great job aggregating GRIB and other binary files. Caron's aggregation server can do things that GDS cannot, such as adding metadata during aggregation (aggType="joinNew"), and it can also aggregate remote data (via DODS). These two OPeNDAP servers are becoming the APDRC data service infrastructure (http://apdrc.soest.hawaii.edu:9090). I am looking forward to the release of Caron's NcML, which might help us put HDF files onto our OPeNDAP service.
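For readers unfamiliar with the joinNew aggregation mentioned above: in NcML, it stacks each input file along a new outer dimension. A sketch of what such a description might look like (this follows the later public NcML syntax and may differ from the release being discussed here; the filenames and coordinate values are hypothetical):

```xml
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <!-- joinNew: stack the listed files along a new "time" dimension -->
  <aggregation type="joinNew" dimName="time">
    <variableAgg name="sst"/>
    <netcdf location="sst_20030610_00.nc" coordValue="0"/>
    <netcdf location="sst_20030610_03.nc" coordValue="3"/>
  </aggregation>
</netcdf>
```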

----- Original Message -----
From: jma
To: Jonathan Callahan
Cc: Jean-Francois PIOLLE ; las_users@ferret.wrc.noaa.gov ; John Caron
Sent: Tuesday, June 10, 2003 7:56 AM
Subject: Re: Best strategy for long-time series

I agree with Shen Yingshuo. It would work if you put the SST data behind a GrADS DODS Server (GDS) and then serve the GDS dataset via LAS. GrADS can handle the aggregation of the 3-hourly files with ease, and missing files and daily updates are no problem. We have similar setups here at COLA that are operational; I'd be happy to help you get started. When you open a GDS dataset, only the metadata is read initially; there is no I/O until you actually make a data request. A long time-series request might be a little slow, but perhaps not slower than the LAS time-out. Jon's suggestion of aggregating the 3-hourly files into daily or monthly chunks might speed things up. There are some handy netCDF operator programs that could do this pretty easily (http://nco.sourceforge.net).
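The chunking step can be scripted: group the 3-hourly files by day and hand each batch to an NCO tool such as ncrcat, which concatenates inputs along the record (time) dimension. A minimal sketch, assuming a hypothetical sst_YYYYMMDD_HH.nc naming scheme:

```python
from collections import defaultdict

def daily_batches(filenames):
    """Group 3-hourly files (hypothetical naming: sst_YYYYMMDD_HH.nc)
    into one batch per day, and build an ncrcat command per batch.
    ncrcat (from NCO) concatenates along the record dimension."""
    batches = defaultdict(list)
    for name in sorted(filenames):
        day = name.split("_")[1]  # the YYYYMMDD part of the name
        batches[day].append(name)
    return {day: f"ncrcat {' '.join(files)} sst_{day}_daily.nc"
            for day, files in batches.items()}

# Example: two days, with some 3-hourly snapshots missing
# (as happens in the dataset described in this thread)
files = ["sst_20030610_00.nc", "sst_20030610_03.nc",
         "sst_20030610_09.nc", "sst_20030611_00.nc"]
for cmd in daily_batches(files).values():
    print(cmd)
```

Missing snapshots are harmless here: each day's batch simply contains fewer files, and ncrcat concatenates whatever is listed.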

Jennifer Miletta Adams
4041 Powder Mill Road, Suite 302
Calverton, MD 20705

On Tuesday, June 10, 2003, at 11:15 AM, Jonathan Callahan wrote:


You have touched on a problem that we in the development group haven't yet developed much expertise in.  I would like to rephrase the question to the group (and John Caron):

Can the DODS aggregation server efficiently serve up time series that require opening hundreds or thousands of files?

Here are various potential solutions to the problem, each with its own drawbacks:

• Set up LAS to only allow selection of XY views while still describing the dataset as XYT so that the time selector appears.  Then use custom code to map requests for a particular time onto a particular file name to access.  You should get a nice usable interface but will lose the ability to create time series.  The documentation on customizing LAS code has a section that is relevant:

http://ferret.pmel.noaa.gov/Ferret/LAS/Documentation/manual/customize.html#Customizing_LAS_code
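The time-to-filename mapping itself is simple to sketch. A minimal illustration, assuming a hypothetical sst_YYYYMMDD_HH.nc naming scheme (real custom code would live in the LAS layer described in the link above):

```python
from datetime import datetime

def file_for_time(t):
    """Map a requested time to the 3-hourly snapshot containing it,
    by rounding the hour down to the nearest multiple of 3.
    The sst_YYYYMMDD_HH.nc naming scheme is an assumption."""
    hour = (t.hour // 3) * 3
    return f"sst_{t:%Y%m%d}_{hour:02d}.nc"

# A request at 11:53 falls in the 09:00 snapshot's 3-hour window:
print(file_for_time(datetime(2003, 6, 10, 11, 53)))  # sst_20030610_09.nc
```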

Clearly, I'll need to create an FAQ that addresses your problem more specifically.

• Create files that contain more than a single time step and aggregate those.

I suspect that the DODS aggregation server is slow in part because it is asked to do file I/O on thousands of files when you request a time series.  If so, creating yearly or even monthly files out of your snapshots and then stringing those together with aggregation should solve the problem.  The drawback is that you have to reformat your data.

• Any other suggestions out there?

-- Jon


Jean-Francois PIOLLE wrote:

I would like to serve with LAS a long time series of SST maps over the
Atlantic: we have up to 8 maps a day (every 3 hours, but some are
occasionally missing). The time series is updated daily with new maps. So
far, we have more than 4500 netCDF map files to serve.
I tried serving them through a DODS aggregation server and set up
LAS (v6.0) to access the data through this DODS server. But it appears
that the first access to a map (actually, the first access to a DODS
dataset) is really slow, exceeding the LAS default time-out. This is
because (I guess) each time the DODS catalog is updated, a DODS dataset
has to be re-aggregated, and this first step is very time-consuming.
My question is: is DODS the best strategy for serving a long time series
through LAS?  It seems that a Ferret descriptor file can't be used here,
since it is limited to 500 files. What other strategy is there? I guess many
people have already run into this problem...


Jean-Francois Piolle

Jean-Francois PIOLLE
CERSAT  French ERS Processing and Archiving Facility
29280 Plouzane

Tel.:  (+33) 2 98 22 46 91   email: jfpiolle@ifremer.fr
Fax:   (+33) 2 98 22 45 33   WWW:   http://www.ifremer.fr/cersat


Joe Wielgosz / joew@cola.iges.org
Center for Ocean-Land-Atmosphere Studies (COLA)
Institute for Global Environment and Society (IGES)
