# Re: [ferret_users] R-square problem

Hi everyone,

Ana sent me her data, with P and Q. These are variables that look quite well correlated, but one is of much larger magnitude than the other. This is causing a loss of numeric accuracy: Ferret only operates in single precision, and so the variations in the data of smaller magnitude are overwhelmed by the larger numbers. The result is improved with:

    yes? use pq
    yes? set var/name=q_in q  ! I am going to re-define q
    yes? let q = q_in/100
    yes? go regresst
    yes? stat rsquare

I'm asking some of our data-analysis experts to chime in: what more can we do to work with this data? I'm attaching the netCDF file.

    yes? can data/all; use pq.nc
    yes? stat p

                 VAR_1
                 LONGITUDE: 118.8E to 61.2W
                 LATITUDE: 36.2S to 16.2N
                 Z:  N/A
                 TIME: 01-JUN-1958 00:00 to 01-JUN-2002 00:00
                 DATA SET: ./pq.nc

     Total # of data points: 798336 (72*21*1*528)
     # flagged as bad  data: 0
     Minimum value: 1000
     Maximum value: 1031.3
     Mean    value: 1012.9 (unweighted average)
     Standard deviation: 3.723

    yes? stat q

                 VAR_2
                 LONGITUDE: 118.8E to 58.8W
                 LATITUDE: 36.2S to 16.2N
                 Z:  N/A
                 TIME: 01-JUN-1958 00:00 to 01-JUN-2002 00:00
                 DATA SET: ./pq.nc

     Total # of data points: 809424 (73*21*1*528)
     # flagged as bad  data: 0
     Minimum value: 100000
     Maximum value: 103278
     Mean    value: 101288 (unweighted average)
     Standard deviation: 394.51

    yes? set var/name=q_in q
    yes? let q = q_in/100
    yes? go regresst
    ...
    yes? stat rsquare

                 (PQVAR*PQVAR) / (PVAR*QVAR)
                 LONGITUDE: 118.8E to 61.2W
                 LATITUDE: 36.2S to 16.2N
                 Z:  N/A
                 TIME: 01-JUN-1958 00:00 to 01-JUN-2002 00:00
                 DATA SET: ./pq.nc

     Total # of data points: 1512 (72*21*1*1)
     # flagged as bad  data: 0
     Minimum value: 0.27778
     Maximum value: 1.0833
     Mean    value: 0.8884 (unweighted average)
     Standard deviation: 0.1003

Ana Redondo wrote:

Hi Ansley,

I am attaching the file with the data.

Thank you very much!!

Ana

On 20/10/09 10:30 AM, "Ansley Manke" wrote:

Hi Ana,

Would it be possible for you to send me the data you are using? If you do the following in Ferret, you will produce a file that I should be able to test this with:

    yes? Let P=var_1
    yes? Let Q=var_2
    yes? save/file=pq.nc p,q

and then just attach the netCDF file to an email. This is not a large amount of data, and these files email just fine. We can figure out what's happening and then report back to the Ferret group.

I'm at the end of my day, and I see you're in Australia. This is always the hard part about working with others around the world, but maybe within another day we can get this figured out!

Ansley

Ana Redondo wrote:

Hi Ansley,

Yes, the values of rsquare are larger than 1:

    yes? stat rsquare

     Total # of data points: 1512 (72*21*1*1)
     # flagged as bad  data: 0
     Minimum value: 0.34722
     Maximum value: 1.2
     Mean    value: 0.89056 (unweighted average)
     Standard deviation: 0.10312

Any ideas??

Thanks

Ana

On 20/10/09 2:16 AM, "Ansley Manke" wrote:

Hi Ana,

Yes, the values of rsquare should be between 0 and 1. If you look at the result of

    yes? STAT rsquare

are the values really larger than 1? I wonder if the automatically chosen color fill levels in the FILL command are making the color bar extend higher than 1 even though the values of the data field are not actually larger than 1.

Another possibility is that the accuracy of the numeric calculations is putting values slightly larger than 1. Here too the STAT command might tell us more.

Ansley

Ana Redondo wrote:

Hi,

I am doing a linear regression between two independent variables in order to find the R-square. This is how I am doing it:

    yes? Let P=var_1
    yes? Let Q=var_2
    yes? Go regresst
    yes? show var
    yes? fill rsquare

But when filling the R-square (I also listed some values to test) I realized that some values are greater than 1! Isn't R-square the coefficient of determination? If so, it should be between 0 and 1.
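
The single-precision effect described at the top of this thread can be sketched outside Ferret. The Python below is only an illustration under stated assumptions, not a description of what `regresst` does internally: it assumes the variances are formed as "average of products minus product of averages" (a guess based on the `(PQVAR*PQVAR) / (PVAR*QVAR)` title in the STAT output), it emulates single precision by rounding every intermediate through `struct`, and it uses synthetic series whose magnitudes merely resemble P and Q. All function names are hypothetical.

```python
import math
import struct


def f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def mean32(values):
    """Average a sequence with every addition rounded to single precision."""
    acc = 0.0
    for v in values:
        acc = f32(acc + f32(v))
    return f32(acc / len(values))


def ratio32(pqvar, pvar, qvar):
    """Form pqvar**2 / (pvar * qvar) in single precision (NaN if undefined)."""
    den = f32(pvar * qvar)
    if den == 0.0:
        return float("nan")
    return f32(f32(pqvar * pqvar) / den)


def rsquare_naive32(p, q):
    """r^2 from raw averages, var = ave(x*x) - ave(x)**2 (ASSUMED formula)."""
    pave, qave = mean32(p), mean32(q)
    pvar = f32(mean32([f32(f32(a) * f32(a)) for a in p]) - f32(pave * pave))
    qvar = f32(mean32([f32(f32(b) * f32(b)) for b in q]) - f32(qave * qave))
    pqvar = f32(mean32([f32(f32(a) * f32(b)) for a, b in zip(p, q)])
                - f32(pave * qave))
    return ratio32(pqvar, pvar, qvar)


def rsquare_centered32(p, q):
    """Same quantity, but the mean is subtracted before any products."""
    pave, qave = mean32(p), mean32(q)
    dp = [f32(f32(a) - pave) for a in p]
    dq = [f32(f32(b) - qave) for b in q]
    pvar = mean32([f32(a * a) for a in dp])
    qvar = mean32([f32(b * b) for b in dq])
    pqvar = mean32([f32(a * b) for a, b in zip(dp, dq)])
    return ratio32(pqvar, pvar, qvar)


def rsquare_double(p, q):
    """Double-precision reference using centered sums."""
    n = len(p)
    pave, qave = sum(p) / n, sum(q) / n
    pvar = sum((a - pave) ** 2 for a in p) / n
    qvar = sum((b - qave) ** 2 for b in q) / n
    pqvar = sum((a - pave) * (b - qave) for a, b in zip(p, q)) / n
    return pqvar * pqvar / (pvar * qvar)


def make_data(n=528):
    """Synthetic series (NOT Ana's data): means near 1013 and 101290."""
    p = [1012.9 + 3.7 * math.sin(0.7 * i) for i in range(n)]
    q = [100.0 * a + 40.0 * math.sin(1.3 * i) for i, a in enumerate(p)]
    return p, q


if __name__ == "__main__":
    p, q = make_data()
    print("double reference   :", rsquare_double(p, q))
    print("single, raw sums   :", rsquare_naive32(p, q))
    print("single, mean removed:", rsquare_centered32(p, q))
```

In the raw-sum version each variance is a small difference between two large, nearly equal numbers, so rounding errors of a few parts in ten million in the averages can become errors of several percent in the result, and nothing then keeps the ratio below 1. Subtracting the mean before multiplying avoids that cancellation. Whether the same remedy carries over to the Ferret script depends on how `regresst` is written, which this sketch does not establish.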

Attachment: pq.nc
Description: Binary data