
Re: [ferret_users] How to read ASCII data accurately



Hi Martin,
I think you've nailed the problem. The delimited read does not read the data as double precision, whereas the plain set data/ez read does. Prior to Ferret v6.8, the non-delimited read would also have stored the data in single-precision variables, so it too would not have given you enough accuracy.
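
To make the size of the loss concrete, here is a minimal sketch in Python (not Ferret code, and not a description of Ferret's internals) that rounds the value from the original question to IEEE single precision. The helper names `to_float32` and `float32_spacing` are made up for this illustration.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def float32_spacing(x: float) -> float:
    """Gap between neighbouring single-precision values near x (normal range only)."""
    return 2.0 ** (math.floor(math.log2(abs(x))) - 23)


julian_day = 734707.010069          # value quoted in the original question
as_single = to_float32(julian_day)  # what a real*4 variable can hold
spacing_days = float32_spacing(julian_day)

print(f"double : {julian_day:.6f}")
print(f"single : {as_single:.6f}")
print(f"spacing: {spacing_days} days = {spacing_days * 24 * 60:.0f} minutes")
```

With day numbers of this magnitude, single precision can only resolve steps of 1/16 day, which is why the fractional part of the time disappears.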

We will get that fixed, but in the meantime there is another workaround. If you need the delimited read to handle the data types of other columns in your file, you can read the date as a text field and convert it to a double-precision numeric variable using the STRFLOAT function:

set data/ez/columns=28/skip=1/format=deli/del=" "/type="text,..." "RAW/davis_msm184_30min.dat"
let v1double = strfloat(v1)
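
The idea behind this workaround can be sketched outside Ferret as well. The Python below is only an analogy under stated assumptions: `parse_row` is a hypothetical helper, the sample line is invented apart from its first field, and numeric columns are deliberately squeezed through single precision to mimic the behaviour reported in this thread. The point is that a field kept as text loses nothing until it is converted, and the conversion can then be done in double precision.

```python
import struct


def to_float32(x: float) -> float:
    """Round a double to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def parse_row(line: str, text_columns=(0,)):
    """Split a space-delimited line; keep selected columns as text,
    store the rest as single-precision numbers (mimicking the reported bug)."""
    row = []
    for i, field in enumerate(line.split()):
        if i in text_columns:
            row.append(field)                     # untouched text, no rounding
        else:
            row.append(to_float32(float(field)))  # real*4-style storage
    return row


# Hypothetical sample line: only the first field is taken from the thread.
sample_line = "734707.010069 23.58 1012.3"

row = parse_row(sample_line)
time_double = float(row[0])  # text -> double, analogous in spirit to STRFLOAT

print(row[0], "->", f"{time_double:.6f}")
```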



On 5/3/2013 2:30 AM, Martin Schmidt wrote:
The following works! It should not, but it does.

!Read only the first column into variable v1. Define the time axis
set data/ez/skip=1/var=v1 "RAW/davis_msm184_30min.dat"
define axis/t/units=day/t0="1-jan-0000 00:00"/from_data time=v1[d=1]
define grid/t=time tgrid
let tdummy = t[gt=tgrid]

!Open the same data set again, now as a delimited data set. Now only real*4 accuracy is used
set data/ez/columns=28/skip=1/format=deli/del=" " "RAW/davis_msm184_30min.dat"
let v21  = reshape(v2,tdummy)
let v31  = reshape(v3,tdummy)
let v41  = reshape(v4,tdummy)
...
....

!Listing v1 and time gives different results
list/i=1/prec=12 v1
             VARIABLE : V1
             FILENAME : davis_msm184_30min.dat
             FILEPATH : RAW/
             X        : 1
          734707.000000
This is wrong!
yes? list/l=1/prec=12 t[gt=time]
             VARIABLE : T
                        axis TIME
             TIME     : 24-JUL-2011 00:14
          734707.010069
This is correct, but v1 is defined twice! Here the first definition (correct) is used.

list/l=1:10 tair
 24-JUL-2011 00:14:29 /  1:  23.58
 24-JUL-2011 00:45:00 /  2:  23.32
 24-JUL-2011 01:14:59 /  3:  23.22
 24-JUL-2011 01:44:29 /  4:  23.22
 24-JUL-2011 02:15:00 /  5:  23.18
 24-JUL-2011 02:44:59 /  6:  23.12
 24-JUL-2011 03:14:29 /  7:  23.01
 24-JUL-2011 03:45:00 /  8:  22.98
 24-JUL-2011 04:14:59 /  9:  22.80
 24-JUL-2011 04:44:29 / 10:  22.69

gives correct temperature and correct times.

yes? sh/brief data
     currently SET data sets:
    1> RAW/davis_msm184_30min.dat  (default)

The data set is only opened once.

Something is going on in the background that is not Ferret-like. Usually I would expect that changing v1
would also change the definition of the time axis.
____________

Nevertheless, I think the fact that a line like
set data/ez/columns=28/skip=1/format=deli/del=" " "RAW/davis_msm184_30min.dat"
truncates the data should be considered as a bug?

Martin

Martin Schmidt wrote:
Thanks Russ and Akshay for the suggestions.

It turns out that I need to be more specific.

The file contains a space delimited list of 28 variables in 28 columns. The first column is the critical time variable.
My command to open the file is:

set data/ez/format=delimited/col=28/skip=1/del=" " "RAW/davis_msm184_30min.dat"
define axis/t/units=day/t0="1-jan-0000 00:00"/from_data time=v1
 *** NOTE: Axis has repeated values -- micro-adjusting ...

This comes from the lost precision of the time variable in v1. Adding the /type="numeric" qualifier does not change this.
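
Why truncated times show up as repeated axis values can be sketched in a few lines of Python (an illustration only, not Ferret code). The exact 30-minute step is an assumption here; the real file has slightly irregular sampling. The helper `to_float32` is made up for this sketch.

```python
import struct


def to_float32(x: float) -> float:
    """Round a double to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


start = 734707.010069  # first time stamp, as quoted in the thread
step = 1.0 / 48.0      # assumed nominal 30-minute step, in days

times_double = [start + i * step for i in range(6)]
times_single = [to_float32(t) for t in times_double]

distinct_double = len(set(times_double))
distinct_single = len(set(times_single))

for d, s in zip(times_double, times_single):
    print(f"{d:.6f} -> {s:.6f}")
print(f"distinct values: double={distinct_double}, single={distinct_single}")
```

Several neighbouring time stamps collapse onto the same single-precision value, so an axis built from them has repeated coordinates.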

Alternatively,

set data/ez/col=28/skip=1/var=v1,v2,v3,v4.v5,v6.v7.v8.v9,v10,v11,v12,v13,v14,v15,v16,v17,v18,v19,v20,v21,v22,v23,v24,v25,v26,v27,v28 "RAW/davis_msm184_30min.dat"
 *** NOTE: attempt to initalize 24 variables
 *** NOTE: maximum allowed is 20 variables
However, reading data this way
define axis/t/units=day/t0="1-jan-0000 00:00"/from_data time=v1

generates the desired time axis. Hence, reading the variables separately without a format specifier gives the correct answer. But now I have to split the file.

Generating a testfile with one column only

set data/ez test.dta

reads data correctly.

So my conclusion is that the missing precision is tied to opening the file in the delimited format.

Cheers,
Martin





Russ Fiedler wrote:
Hi Martin,

Version 6.82 RH6 reads this to full precision, so it looks like either a
difference between the OS builds or a bug that has crept in.

russ-hf% ferret -nojnl
         NOAA/PMEL TMAP
         FERRET v6.82
         Linux 2.6.32-279.1.1.el6.x86_64 64-bit - 08/03/12
          3-May-13 09:55
yes? file/var=xx dummy
yes? list/prec=10 xx
              VARIABLE : XX
              FILENAME : dummy
              X        : 1
           734707.0101

There is also the /TYPE=r8 qualifier, which might work, or try specifying the
Fortran format explicitly.

Russ

On Thu, 2013-05-02 at 17:50 +0200, Martin Schmidt wrote:
Hi,
I have an ASCII input file. One variable is the Julian day (from 0000)
and looks like

734707.010069

Reading this with set data/ez I find

list/prec=10 v1

1   / 1:  734707.0000

which is not what I would like to get.

I am using ferret_v6842_rh5, which should use real*8 accuracy for
operations.

Is there a way to read ASCII data with the same accuracy?

I do not know who had the idea to use this far-away time origin; the
data are as they are. Possibly it was chosen
for compatibility with EXCEL.
For sure, I could edit the input file, remove the first four digits of
the time variable and add this
after reading the data. This workaround is fine for my example, but for
a larger project I am interested in a
more general solution with ferret - if it exists.
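
The offset workaround described above can be checked numerically. This is a small Python sketch (not Ferret code), assuming that "removing the first four digits" means subtracting 734700 and that the reduced value is stored in single precision before the offset is added back in double precision. The names `offset`, `reduced_text` and `to_float32` are chosen for this illustration.

```python
import struct


def to_float32(x: float) -> float:
    """Round a double to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


true_value = 734707.010069
offset = 734700.0          # assumed: the four leading digits, as a day offset
reduced_text = "7.010069"  # what would remain in the edited input file

reduced_single = to_float32(float(reduced_text))  # stored as real*4
restored = offset + reduced_single                # offset added back in double

error_seconds = abs(restored - true_value) * 86400.0
direct_error_seconds = abs(to_float32(true_value) - true_value) * 86400.0

print(f"error with offset   : {error_seconds:.4f} s")
print(f"error without offset: {direct_error_seconds:.1f} s")
```

Under these assumptions the offset trick brings the error down from many minutes to well below a second, but it still depends on editing every input file by hand.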

Many thanks,
Martin Schmidt





