
Re: [ferret_users] Reduce the size of the data



Nicolas, and all,

There are a couple more things that can explain a larger file size on output.  So just to complete the discussion:

1) The original file may have been written with netCDF compression; see "LIST/DEFLATE" in the Ferret Users Guide. If so, the netCDF-4 library that is linked to Ferret decompresses the data as it is read in. Ferret doesn't write compressed data unless the SAVE command includes qualifiers that request it, so the output can take more space than the compressed original. The ncdump "-s" option lists the special virtual attributes that describe the netCDF-4 settings used when the file was created. Note the _Storage, _ChunkSizes, _DeflateLevel, and _Endianness attributes on the variables, and the global attribute _Format = "netCDF-4" ;


> netcdf nc4_deflate4 {
dimensions:
        yaxis = 180 ;
        xaxis = 90 ;
        TIME = UNLIMITED ; // (12 currently)
variables:
        double yaxis(yaxis) ;
                yaxis:units = "degrees_east" ;
                yaxis:modulo = " " ;
                yaxis:point_spacing = "even" ;
                yaxis:axis = "X" ;
                yaxis:standard_name = "longitude" ;
                yaxis:_Storage = "contiguous" ;
                yaxis:_Endianness = "little" ;
        double xaxis(xaxis) ;
                xaxis:units = "degrees_north" ;
                xaxis:point_spacing = "even" ;
                xaxis:axis = "Y" ;
                xaxis:standard_name = "latitude" ;
                xaxis:_Storage = "contiguous" ;
                xaxis:_Endianness = "little" ;
        double TIME(TIME) ;
                TIME:units = "hour since 0000-01-01 00:00:00" ;
                TIME:time_origin = "01-JAN-0000 00:00:00" ;
                TIME:modulo = " " ;
                TIME:axis = "T" ;
                TIME:standard_name = "time" ;
                TIME:_Storage = "chunked" ;
                TIME:_ChunkSizes = 512 ;
                TIME:_Endianness = "little" ;
        float SST(TIME, xaxis, yaxis) ;
                SST:missing_value = -1.e+34f ;
                SST:_FillValue = -1.e+34f ;
                SST:long_name = "deflate_x30_y30" ;
                SST:history = "From /home/users/ansley/infile.nc" ;
                SST:units = "Deg C" ;
                SST:_Storage = "chunked" ;
                SST:_ChunkSizes = 1, 30, 30 ;
                SST:_DeflateLevel = 1 ;
                SST:_Endianness = "little" ;

// global attributes:
                :history = "FERRET V7.1 (optimized) 20-Mar-17" ;
                :Conventions = "CF-1.6" ;
                :_Format = "netCDF-4" ;
}
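
To illustrate point 1, here is a minimal Python sketch (not Ferret code) of why a deflated variable grows when it is read and then rewritten without compression. It assumes a made-up, smooth field with the same shape as SST in the listing above (12 x 90 x 180 four-byte floats) and uses the standard zlib module, which implements the same deflate algorithm that netCDF-4 applies. As a simplification it compresses the whole array at once, whereas netCDF-4 compresses chunk by chunk. The data values and variable names are illustrative only.

```python
import struct
import zlib

# Hypothetical stand-in for SST: 12 time steps on a 90 x 180 grid,
# with values that vary only along one grid direction.
nt, ny, nx = 12, 90, 180
values = [round(10.0 + 0.1 * j, 1)
          for _t in range(nt) for j in range(ny) for _i in range(nx)]

# Little-endian 4-byte floats, matching "float SST" and _Endianness = "little".
raw = struct.pack(f"<{len(values)}f", *values)

# Deflate at level 1, the level shown by SST:_DeflateLevel = 1.
deflated = zlib.compress(raw, 1)

# Reading the data back restores every byte of the uncompressed array.
restored = zlib.decompress(deflated)

print("uncompressed bytes:", len(raw))       # 12 * 90 * 180 * 4 = 777600
print("deflated bytes:    ", len(deflated))
print("round trip intact: ", restored == raw)
```

How much smaller the deflated copy is depends entirely on the data; the point is only that the size on disk and the size after reading can differ a lot.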


2) A second kind of compression is "packed data", which uses scale_factor and add_offset attributes on individual variables. This lets the file creator store the data in an integer format, saving space in the file. If Ferret reads a variable that has these attributes, it applies the scale and offset to the data as it is read in. When a SAVE command is then used to write out a subset of a file variable, Ferret re-packs the data and uses the original data type on output. That is not done for user-defined variables unless you define scale_factor and add_offset attributes for the variable before writing it.
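
The arithmetic behind packing can be sketched in a few lines of Python (again, not Ferret code). The unpacking rule, unpacked = packed * scale_factor + add_offset, is the standard netCDF/CF convention. The helper names, the choice of a 16-bit integer, and the sample temperature range are assumptions made for illustration.

```python
def packing_params(vmin, vmax, nbits=16):
    """Choose scale_factor and add_offset so [vmin, vmax] spans a signed integer.

    Uses the symmetric range -(2**(nbits-1) - 1) .. +(2**(nbits-1) - 1),
    leaving the most negative integer free (for example, for a fill value).
    """
    scale_factor = (vmax - vmin) / (2 ** nbits - 2)
    add_offset = (vmax + vmin) / 2.0
    return scale_factor, add_offset


def pack(value, scale_factor, add_offset):
    # What a writer stores in the file: a small integer.
    return round((value - add_offset) / scale_factor)


def unpack(packed, scale_factor, add_offset):
    # What a reader reconstructs: a floating-point value.
    return packed * scale_factor + add_offset


# Hypothetical sea-surface temperatures in Deg C, assumed to lie in [-2, 35].
sst = [-1.8, 0.0, 12.34, 28.5, 31.9]
scale, offset = packing_params(-2.0, 35.0)

packed = [pack(v, scale, offset) for v in sst]
unpacked = [unpack(p, scale, offset) for p in packed]

for v, p, u in zip(sst, packed, unpacked):
    print(f"{v:8.3f} -> {p:7d} -> {u:10.5f}")
```

Each value then takes 2 bytes instead of 4 (float) or 8 (double), at the cost of a rounding error of at most half of scale_factor. A variable that is read, unpacked, and written back as floating point loses that saving.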

-Ansley


On 3/17/2017 10:14 AM, Ansley C. Manke wrote:

Hi Nicolas,

I think you've got the answer you need, but you can always check what's in the file. Change the repeat-loop range to write out just the first few values. Then cancel all the definitions, open the file in Ferret, and list the data. Or exit Ferret and use ncdump to write out everything in the file.

  > ncdump Output.nc

will write the full header, the coordinates, and the data values, showing the data types, attributes, and so forth. (This is why I suggested writing just a few values.) This way you'd have confidence in your definitions and in what is being written.

-Ansley


On 3/17/2017 3:26 AM, Nicolas Freychet wrote:
Hi all,

I computed the daily minimum temperature from 3-hourly datasets (8 values per day) using the following method: stepping through the data 8 time steps at a time, I take the minimum temperature over each group of 8 and assign it to the first time step of the group.


define axis/calendar=julian/t=01-jan-1979:31-dec-1995:3/units=hours tmodel
repeat/range=1:49672:8/name=it (  let tmin=temp[l=`it`]*0+tmin[l=`it`:`it+7`@min] ; save/append/file=Output.nc Tmin[l=`it`,gt=tmodel@asn] )

The problem is that my output file is much bigger than it should be. I get the impression that Ferret is recording the whole "formula" and not just the values of Tmin. Is there a way to solve that?

(My way of defining Tmin may not be optimal, though, and that may be causing the problem...)

Thanks,
Nicolas



----------------------------------------------------
Nicolas Freychet
PDRA, School of Geosciences
University of Edinburgh
----------------------------------------------------


