Re: Issue with sortl: [ferret_users] Percentile along time axis for gridded data

To: ferret_users@xxxxxxxx

Subject: Re: Issue with sortl: [ferret_users] Percentile along time axis for gridded data

From: "Ansley C. Manke" <ansley.b.manke@xxxxxxxx>

Date: Mon, 4 Feb 2019 14:42:07 -0800

Hi Peng,
The results are correct. The values SORTL returns tell you how to sample the original list to put the data into ascending order. So,

yes? list sst[i=90,j=45],sorted_by_time[i=90,j=45]
WARNING: Listed variables have ambiguous coordinates on axes: T
             DATA SET: /home/users/tmap/ferret/linux/fer_dsets/data/coads_climatology.cdf
             LONGITUDE: 161W
             LATITUDE: 1S
Column 1: SST[T=01-JAN 00:45:31-DEC 06:34] is SEA SURFACE TEMPERATURE (Deg C)
Column 2: SORTED_BY_TIME[T=0.5:12.5] is sorted indices
          SST SORTED_BY_TIME
L / 1: 26.82    1.00
L / 2: 26.99    2.00
L / 3: 27.47   12.00
L / 4: 27.83   11.00
L / 5: 27.96    9.00
L / 6: 28.22    8.00
L / 7: 27.96    3.00
L / 8: 27.42   10.00
L / 9: 27.33    4.00
L / 10: 27.55    5.00
L / 11: 27.22    7.00
L / 12: 27.02    6.00

says that to sort the data we first take element 1, then element 2, then element 12, then element 11, and so forth, ending with element 7 and then element 6, which is the largest (28.22). Like you, I first thought they were a ranking of the sizes, but they're the index locations of the sorted data.
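To make the difference between sort indices and a ranking concrete, here is a minimal sketch in plain Python (not Ferret). It uses the twelve SST values from the listing above; the helper names `sort_indices` and `ranks` are made up for this illustration and are not Ferret functions. It assumes ties keep their original order, which is what the listed SORTL output shows for the two 27.96 values.

```python
# SST at i=90, j=45 for L=1..12, copied from the listing above.
sst = [26.82, 26.99, 27.47, 27.83, 27.96, 28.22,
       27.96, 27.42, 27.33, 27.55, 27.22, 27.02]


def sort_indices(values):
    """1-based positions of the input, listed in ascending order of value.

    This mirrors what the listing shows for SORTL: the k-th entry is
    *where to look* in the original list for the k-th smallest value.
    Python's sorted() is stable, so tied values keep their original order.
    """
    return sorted(range(1, len(values) + 1), key=lambda k: values[k - 1])


def ranks(values):
    """1-based rank of each element (1 = smallest), in original order.

    This is the inverse permutation of sort_indices().
    """
    result = [0] * len(values)
    for position, k in enumerate(sort_indices(values), start=1):
        result[k - 1] = position
    return result


order = sort_indices(sst)
rank = ranks(sst)
print(order)  # [1, 2, 12, 11, 9, 8, 3, 10, 4, 5, 7, 6]
print(rank)   # [1, 2, 7, 9, 10, 12, 11, 6, 5, 8, 4, 3]
```

The first list matches the SORTED_BY_TIME column. The second is the ranking one might have expected instead: the largest value, at L=6, has rank 12.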

By the way, the note

WARNING: Listed variables have ambiguous coordinates on axes: T

is telling us that the SST data is on a formatted time axis (time in hours since the origin, or whatever the units are for a given axis), whereas the SORTL function returns a simple list L=1:12 on an abstract axis. Ferret is able to reconcile those two time axes, but the warning is there in case the user doesn't want Ferret to assume it can associate the two variables index by index.

Ansley

On 2/4/2019 1:44 PM, Ansley C. Manke wrote:

Hi Peng,

I want to add my thanks to those of you who conversed among yourselves during the time when we were shut down. We're looking into possibilities for keeping the documentation available even when the server where it resides is off-line.

I don't see why those indices are showing up as the result of the sortl function. They do seem to give the right answer when they're used to sample the data and so sort the values themselves; as I'm sure you noticed, the sampled result is correct!

I'll look into what's happening here.

yes? list sst[i=90,j=45],sorted_by_time[i=90,j=45], samplel(sst[i=90,j=45],sorted_by_time[i=90,j=45])
WARNING: Listed variables have ambiguous coordinates on axes: T,T
             DATA SET: /home/users/tmap/ferret/linux/fer_dsets/data/coads_climatology.cdf
             LONGITUDE: 161W
             LATITUDE: 1S
Column 1: SST[T=01-JAN 00:45:31-DEC 06:34] is SEA SURFACE TEMPERATURE (Deg C)
Column 2: SORTED_BY_TIME[T=0.5:12.5] is sorted indices
Column 3: SAMPLEL(SST[I=90,J=45],SORTED_BY_TIME[I=90,J=45])[T=0.5:12.5] is SAMPLEL(SST[I=90,J=45],SORTED_BY_TIME[I=90,J=45])
          SST SORTED_ (C001,V008)
L / 1: 26.82    1.00   26.82
L / 2: 26.99    2.00   26.99
L / 3: 27.47   12.00   27.02
L / 4: 27.83   11.00   27.22
L / 5: 27.96    9.00   27.33
L / 6: 28.22    8.00   27.42
L / 7: 27.96    3.00   27.47
L / 8: 27.42   10.00   27.55
L / 9: 27.33    4.00   27.83
L / 10: 27.55    5.00   27.96
L / 11: 27.22    7.00   27.96
L / 12: 27.02    6.00   28.22
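The third column of the listing can be reproduced outside Ferret with a minimal plain-Python sketch. The helper name `sample_by_index` is hypothetical and only stands in for the idea of sampling along L with a list of 1-based indices; it is not a description of how SAMPLEL is implemented.

```python
# Columns 1 and 2 of the listing above, for L=1..12.
sst = [26.82, 26.99, 27.47, 27.83, 27.96, 28.22,
       27.96, 27.42, 27.33, 27.55, 27.22, 27.02]
sorted_by_time = [1, 2, 12, 11, 9, 8, 3, 10, 4, 5, 7, 6]


def sample_by_index(values, indices):
    """Pick values[k] for each 1-based index k, in the order given."""
    return [values[k - 1] for k in indices]


sampled = sample_by_index(sst, sorted_by_time)
print(sampled)
# [26.82, 26.99, 27.02, 27.22, 27.33, 27.42,
#  27.47, 27.55, 27.83, 27.96, 27.96, 28.22]
```

Sampling the original series with the indices gives the values in ascending order, with the largest value (28.22) last.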

On 2/4/2019 11:23 AM, Ge Peng - NOAA Affiliate wrote:

I am using the method provided by Russ a couple of years ago to compute percentiles along the time axis for gridded data of sea ice concentrations.

The resultant Q-tiles are a bit too scattered in the seasonally varying ice zone. I worked with Russ directly during the shutdown. Thank you, Russ, for your helpful suggestions.

Looking into it in more detail, I am a bit confused by the behavior of the function sortl, which does not seem to sort correctly along the L dimension at each i,j grid cell, in the sense that the largest value should be sorted last. This can be demonstrated by the following Ferret output. (I was using a much earlier version but have reinstalled the latest per Russ’s suggestion. The results are the same.)

Is this correct?

Thanks,

--- Peng

$ ferret
            NOAA/PMEL TMAP
            FERRET v7.44 (optimized)
            Linux 3.10.0-957.1.3.el7.x86_64 64-bit - 12/07/18
            4-Feb-19 14:01

yes? use coads_climatology
yes? let numpoints=`sst,return=lend`
!-> DEFINE VARIABLE numpoints=12
yes? let numlats=`sst,return=jend]`
!-> DEFINE VARIABLE numlats=90
yes? let/title="sorted indices" sorted_by_time=sortl(sst[l=1:`numpoints`])
!-> DEFINE VARIABLE/title="sorted indices" sorted_by_time=sortl(sst[l=1:12])
yes? list sst[i=90,j=45],sorted_by_time[i=90,j=45]
WARNING: Listed variables have ambiguous coordinates on axes: T
             DATA SET: /snfs1/users/gpeng/myFerret/fer_dsets/FerretDatasets-7.4/data/coads_climatology.cdf
             LONGITUDE: 161W
             LATITUDE: 1S
Column 1: SST[T=01-JAN 00:45:31-DEC 06:34] is SEA SURFACE TEMPERATURE (Deg C)
Column 2: SORTED_BY_TIME[T=0.5:12.5] is sorted indices
          SST SORTED_BY_TIME
L / 1: 26.82    1.00
L / 2: 26.99    2.00
L / 3: 27.47   12.00
L / 4: 27.83   11.00
L / 5: 27.96    9.00
L / 6: 28.22    8.00
L / 7: 27.96    3.00
L / 8: 27.42   10.00
L / 9: 27.33    4.00
L / 10: 27.55    5.00
L / 11: 27.22    7.00
L / 12: 27.02    6.00

On Tue, Mar 29, 2016 at 10:16 PM Russ Fiedler <russell.fiedler@xxxxxxxx> wrote:

Hi Peng,

I did this in the dim distant past by saving the sorted data to a file and then picking the appropriate values out.

use coads_climatology
let numpoints=`sst,return=lend`
let numlats=`sst,return=jend`
let/title="sorted indices" sorted_by_time=sortl(sst[l=1:`numpoints`])

! I found that I ran out of memory if I tried to sort too much for some GB sized datasets. It depends on the size of your dataset.
! You might be able to make larger windows or even do it in one go.

save/j=1/jlimits=1:`numlats`/clob/file=sorted.nc sorted_by_time
repeat/j=2:`numlats` save/app/file=sorted.nc sorted_by_time

can var sorted_by_time
use sorted

! Make a variable with time indices

let index_2d=0*x[g=sorted_by_time] + 0*y[g=sorted_by_time] + l[g=sorted_by_time]

! Create an integrating kernel for the percentile wanted: 1 at the index we want and missing elsewhere.

! 25th percentile, say. Make sure at least 1 valid point is used.

let k25 = if l[g=index_2d] eq max(int(0.25*ignore0(sorted_by_time[l=1:`numpoints`@ngd])),1) then 1

! Show the indices used in the sorted data. Note that missing values are accounted for.

shade k25[l=@loc:1]

! Multiply by SST and sum up

let sst_k25 = k25[gl=sst[d=coads_climatology]@asn]*sst[d=coads_climatology]

let/title="sst at 25 percentile" sst25=sst_k25[l=1:`numpoints`@sum]

shade sst25

Cheers,
Russ

On 30/03/16 05:32, Ge Peng - NOAA Affiliate wrote:

Found this message showing how to find percentiles for 2-dimensional gridded data:

http://ferret.pmel.noaa.gov/Ferret/Mail_Archives/fu_2004/msg00632.html

However, I would like to compute quantiles/percentiles along my time axis at each grid cell of the 2-dimensional gridded data. I.e., for each x and y, the percentiles are computed along the time axis using only the valid data points. (We can ignore the z dimension for now.)

I could take the time series at each grid cell using a nested repeat loop in the x and y dimensions, following the above example to sort the data and place the ordered data onto the tiled axis.

It does not sound very efficient to me. Has anyone done something similar in ferret? Is there a better way to do this, perhaps without nested repeat loops in both x and y directions?

Appreciate any help.

--- Peng

--

Ge Peng, Ph.D
Research Scholar

Cooperative Institute for Climate and Satellites, NC (CICS-NC)

North Carolina State University (NCSU) and

NOAA’s National Centers for Environmental Information (NCEI)

Center for Weather and Climate (CWC)

151 Patton Ave, Asheville, NC 28801
ge.peng@xxxxxxxx
o: +1 828 257 3009
f: +1 828 257 3002


--

Ge Peng, PhD
Research Scholar
Cooperative Institute for Climate and Satellites - NC (CICS-NC)/NCSU at

NOAA’s National Centers for Environmental Information (NCEI)

Center for Weather and Climate (CWC)

151 Patton Ave, Asheville, NC 28801
+1 828 257 3009; ge.peng@xxxxxxxx

ORCID: http://orcid.org/0000-0002-1986-9115
