RFROM: Random Forest Regression Ocean Maps


John M. Lyman and Gregory C. Johnson

RFROM is a collection of Random Forest Regression Ocean Maps. These maps are computed from random forests constructed in different ocean basins that have been trained on profile ocean heat content anomaly (OHCA) measurements integrated vertically over select depth levels. RFROM uses satellite sea surface height (SSH), satellite sea surface temperature (SST), time, latitude, and longitude as predictors. The results are near-global ¼-degree x ¼-degree x 7-day resolution maps of OHCA for ten different depth layers.

The OHCA profile data come from the UK Met Office Hadley Centre (EN.4.2.2) and the Argo Program, the delayed-mode SSH maps from the Copernicus Marine Environment Monitoring Service (SEALEVEL_GLO_PHY_L4_MY_008_047), and the SST maps from NOAA (OISST V2.1). Real-time maps use real-time SSH (SEALEVEL_GLO_PHY_L4_NRT_008_046) where delayed-mode SSH is not available.
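The fallback from delayed-mode to real-time SSH can be sketched as below. This is only an illustration, not the RFROM processing code: the function name `ssh_product_for` and the `delayed_mode_end` parameter are hypothetical, and the dates in the usage lines are made up. Only the two product identifiers come from the text above.

```python
from datetime import date

# Product identifiers as named in the text above.
DELAYED_MODE_SSH = "SEALEVEL_GLO_PHY_L4_MY_008_047"
REAL_TIME_SSH = "SEALEVEL_GLO_PHY_L4_NRT_008_046"


def ssh_product_for(map_date, delayed_mode_end):
    """Pick the SSH product for one map date.

    delayed_mode_end is a hypothetical parameter: the last date covered by
    the delayed-mode product at processing time.
    """
    if map_date <= delayed_mode_end:
        return DELAYED_MODE_SSH
    # Delayed-mode SSH is not available yet, so fall back to real time.
    return REAL_TIME_SSH


# Made-up dates, for illustration only.
coverage_end = date(2023, 6, 1)
print(ssh_product_for(date(2020, 1, 1), coverage_end))
print(ssh_product_for(date(2023, 9, 1), coverage_end))
```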

0-2000 m Ocean Heat Content Relative to 1993-2022 Mean

Download animation: avi (220.69 MB) | mp4 (141.26 MB)

RFROM is described in the paper below:

Lyman, J. M. and G. C. Johnson, 2023: Global High-Resolution Random Forest Regression Maps of Ocean Heat Content Anomalies Using in Situ and Satellite Data. J. Atmos. Oceanic Tech., DOI: 10.1175/JTECH-D-22-0058.1.

RFROM V2.1 follows the original methodology described in the paper above, with the following improvements.

  1. SST is only used as a predictor in the upper 100 m, where it is important.
  2. SSH is only used as a predictor in the upper 1000 m, where it is important.
  3. To reduce spatial artifacts, four random forest models are made for each depth level. These models use 5° x 5° tiles instead of exact latitudes and longitudes as predictor variables. Within a given tile, the longitudes and latitudes of profile positions are rounded to the center values of the tile. The tile centers differ among the four models, so the tiles are offset from one another and there is only 25% area overlap between any two models. When constructing the final maps, a weighted average of all four models is used. The weights are set by the distance from the center of each tile to the point where the final map is estimated.
  4. The all-year model for a region is used for years after 2006 if data coverage is low for that year in the region being mapped.
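Items 1 to 3 above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the RFROM implementation. The text does not give the tile offsets or the weighting function, so the half-tile offsets in `OFFSETS` and the linear taper in `model_weight` are assumptions, as are all function names. The sketch treats a layer as inside the "upper 100 m" or "upper 1000 m" when its bottom depth does not exceed that limit, and it ignores longitude wraparound.

```python
import math

TILE = 5.0  # tile size in degrees, from item 3

# Assumed half-tile offsets (lon, lat) for the four models; the text does
# not list the actual offsets.
OFFSETS = [(0.0, 0.0), (2.5, 0.0), (0.0, 2.5), (2.5, 2.5)]


def predictors_for_layer(bottom_depth_m):
    """Predictor names for a layer, following items 1 and 2 (assumed cutoffs)."""
    names = ["time", "latitude", "longitude"]
    if bottom_depth_m <= 1000.0:
        names.append("ssh")  # SSH only in the upper 1000 m
    if bottom_depth_m <= 100.0:
        names.append("sst")  # SST only in the upper 100 m
    return names


def tile_center(coord, offset, size=TILE):
    """Center of the tile containing coord; tile edges sit at offset + k * size."""
    return math.floor((coord - offset) / size) * size + offset + size / 2.0


def model_weight(lon, lat, offset, size=TILE):
    """Assumed weight: 1 at the tile center, tapering linearly to 0 at the edge."""
    half = size / 2.0
    dx = abs(lon - tile_center(lon, offset[0], size))
    dy = abs(lat - tile_center(lat, offset[1], size))
    return (1.0 - dx / half) * (1.0 - dy / half)


def blend(lon, lat, predictions):
    """Weighted average of the four model predictions at one map point."""
    weights = [model_weight(lon, lat, off) for off in OFFSETS]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total


# Made-up predictions from the four models at one point.
print(predictors_for_layer(700.0))
print(blend(3.75, 2.5, [10.0, 20.0, 30.0, 40.0]))
```

With these assumed half-tile offsets, the four weights at any point sum to one, so the blend reduces to a bilinear interpolation between neighboring tile centers and a model contributes nothing along the edges of its own tiles. Other offsets or weighting functions would serve the same purpose of smoothing tile boundaries.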


Feel free to use RFROM for any non-commercial purpose, but do so at your own risk; we cannot accept any liability for the use of RFROM.

Acknowledging use:

Please cite Lyman and Johnson (2023) when using RFROM.

Links to RFROM V2.1 data