
[ferret_users] Re: [las_users] limit on storage



On Aug 23, 2006, at 3:47 PM, Jonathan Callahan wrote:
> <rant>
> This topic has me immediately climbing on my soapbox to talk about data management in general. The following opinions are therefore my own and not necessarily shared by others in the LAS group.
The same applies to my comments.

> [...]
> In the best of all possible worlds, data managers would take the data that is created by data providers and, where necessary, reformat it so as to provide optimal performance for data users. After all, the work of reformatting only has to be done once, but the work of opening 10K separate snapshot files has to be done every single time a user makes a time-series request.
I concur - however, there are a number of issues involved in transposing data from a many-fields-one-time format to a one-field-many-times format. One concern for us, as data providers, is that many of our analysis packages require multiple fields for each time sample processed. It is not a trivial exercise to rewrite all of our codes.

The bigger concern is archival storage cost. Basically, what we end up with is 2X the data volume: the original data and the transposed data. Considering that, as a data manager, I have to keep track of literally hundreds of terabytes of data, and that we are charged for each and every byte, it is generally just not practical at this time for us to double our storage charges.

As usual, there's nothing technically difficult about creating long time-series files from single-time, multi-field files; however, there are policy and other issues that make the "best of all possible worlds" a difficult one to attain.
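For what it's worth, a minimal sketch of that transpose step in Python (xarray), assuming the snapshots are netCDF files that each hold many fields plus a proper time coordinate; the file pattern and variable name below are made up for illustration:

import xarray as xr

# Open all per-timestep snapshot files lazily and line them up along their time coordinate.
snapshots = xr.open_mfdataset("snapshot_*.nc", combine="by_coords")

# Pull out one field and write it as a single long time-series file, so a later
# time-series request touches one file instead of thousands.
snapshots["temperature"].to_netcdf("temperature_timeseries.nc")

(The catch, as noted above, is that the transposed copy then has to be stored alongside the originals.)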

Gary Strand
strandwg@ucar.edu
http://www.cgd.ucar.edu/ccr/strandwg


