
Re: When...



Adam,

The script that converts GDS xml output into LAS xml input may be working, but we first need to do some data aggregation.  The GDS as set up does not automatically aggregate the data files so the GDS-->LAS script is trying to create a separate configuration file for each entry.

The whole point of LAS is to allow users to slice and dice the data any way they want.  This will require you (or whoever is managing these datasets) to create aggregations of the individual files.  This can be done by creating large files or by using an aggregation tool like the OPeNDAP Aggregation Server or our own FDS.  I expect that GDS can be instructed to do these aggregations as well, but I haven't done this myself.
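To make this concrete, here is a rough sketch of what a time-series ("joinExisting") aggregation looks like in NcML, the configuration language used by the Unidata/OPeNDAP aggregation tools.  The file names are only placeholders for your yearly GHCN files, and the exact dialect your aggregation server accepts may differ, so treat this as illustrative rather than something to paste in verbatim:

  <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
    <aggregation dimName="time" type="joinExisting">
      <netcdf location="ghcn_1880.nc"/>
      <netcdf location="ghcn_1881.nc"/>
      <!-- ...one element per yearly file, continuing through the series... -->
    </aggregation>
  </netcdf>

The point is that the server then presents the whole collection as one dataset with a single continuous time axis, which is what LAS wants to see.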

Looking at the GHCN directory
http://nomads.ncdc.noaa.gov:9090/dods/NCDC_CLIMATE_GHCN
There are 128 separate files that differ only in the year they represent.  At the bottom of this list are two composite files for individual variables.  These composite files are what you want to use when generating xml files for LAS.  (The composite files in this directory aren't working at the moment.)

I'm afraid there will not be any magic bullet scripts that get you around the need to aggregate the data.  Once that job is done you should have a small number of datasets (or dataset aggregations) that you can point addXml.pl at to generate the LAS dataset configuration files.  You may need (or want) to modify these configuration files by hand to create the best possible presentation of your data.
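Once you have a handful of composite URLs, generating the LAS configuration is just a matter of pointing addXml.pl at each one.  I'm writing this from memory, so the argument order (and any flags) is an assumption you should check against the usage notes at the top of the script, but the invocation is roughly:

  addXml.pl las_ghcn.xml http://nomads.ncdc.noaa.gov:9090/dods/<composite-GHCN-dataset>

where las_ghcn.xml is just an example name for the output configuration file and the URL is the OPeNDAP address of one of the composite datasets mentioned above.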

Sorry we can't automate everything for you, but this is the state of the art so far.


-- Jon


Adam Baxter wrote:
Jon,
   The XML files were generated by the script I was pointed to.  I then pointed that script at our first GDS server, which resides at http://nomads.ncdc.noaa.gov:9090/dods/.
   The script read from http://nomads.ncdc.noaa.gov:9090/dods/xml, and after close to a day of processing (I killed it this morning) the resulting gds-datasets.xml weighs in at around 310 MB.

Let's see. Where to begin?
We have the GHCN - 128 entries, with 12 months' worth of monthly max, min, and mean temperatures for each entry.
The GHCNP - 106 entries, with 12 months' worth of monthly precipitation anomaly for each entry.
The Extended Reconstructed Global SST - 153 entries, with 12 months' worth of temperature in Celsius for each entry.
The NOAAPort ETA - 22 months' worth; every day has ~16 entries with a variable count ranging from 11 to 53, possibly higher.
The NOAAPort GFS is similar: 22 months' worth; every day has ~23 entries with a variable count ranging from 18 to 60.
The NOAAPort RUC has 22 months' worth, with every day having ~26 entries and variable counts starting at 30.
The SRRS has 20 months' worth, with every day having ~17 entries with one variable each: pwatclm, rhprs, hgtprs (1000 mb), hgtprs (500 mb), and cwdi.

3 - I'll have to get back to you on that.

4 - I'm going to try that now. Here's hoping.
Thanks,
Adam

Jonathan S Callahan wrote:

Adam,

I'm surprised that you have 300 megabytes of .xml files.  Our NVODS server, the largest collection of data in an LAS that we are aware of, has only 1.8 megabytes of .xml files, which represent several terabytes of data.

Can you point us to the data repository you are trying to set up in LAS so that we have some idea what you're up against?  It would also be helpful to know the following:

   1. What version of LAS are you using?
   2. How many datasets and variables are you trying to set up in LAS?
   3. Can you describe the grids these variables are on?
   4. Have you gotten LAS set up with a subset of your data?

Thanks,


-- Jon


Adam Baxter wrote:

Oh fun,
When your set of XML files adds up to roughly 300 MB (and that's still not all of it), what does one do? genLas dies after sucking up around 3 GB of system memory, complaining about being out of memory. This takes roughly six minutes.
Any ideas?

Adam

