Skip to article frontmatterSkip to article content

Project Pythia Logo ECMWF logo Google logo

Interactive Visualization using GeoViews


Overview

A team at Google Research & Cloud are making parts of the ECMWF Reanalysis version 5 (aka ERA-5) accessible in a Analysis Ready, Cloud Optimized (aka ARCO) format.

In this notebook, we will do the following:

  1. Access the ERA-5 ARCO catalog
  2. Select a particular dataset and variable from the catalog
  3. Convert the data from Gaussian to Cartesian coordinates
  4. Plot a map of sea-level pressure contours and 2-meter temperature mesh using Geoviews.

Prerequisites

ConceptsImportanceNotes
CartopyNecessary
XarrayNecessary
[Geoviews]Necessary
  • Time to learn: 30 minutes

Imports

import fsspec
import xarray as xr
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import scipy.spatial
import numpy as np
import geoviews as gv
from geoviews import opts
from pyproj import Transformer
from holoviews.operation.datashader import rasterize

Access the ARCO ERA-5 catalog on Google Cloud

Let’s open the single-level-reanalysis Zarr file and save two variables from it

reanalysis = xr.open_zarr(
    'gs://gcp-public-data-arco-era5/co/single-level-reanalysis.zarr', 
    chunks={'time': 48},
    consolidated=True,
)

msl = reanalysis.msl
t2m = reanalysis.t2m

Regrid to Cartesian coordinates

These reanalyses are in their native, Guassian coordinates. We will define and use several functions to convert them to a lat-lon grid, as documented in the ARCO ERA-5 GCP example notebooks

def mirror_point_at_360(ds):
  extra_point = (
      ds.where(ds.longitude == 0, drop=True)
      .assign_coords(longitude=lambda x: x.longitude + 360)
  )
  return xr.concat([ds, extra_point], dim='values')

def build_triangulation(x, y):
  grid = np.stack([x, y], axis=1)
  return scipy.spatial.Delaunay(grid)

def interpolate(data, tri, mesh):
  indices = tri.find_simplex(mesh)
  ndim = tri.transform.shape[-1]
  T_inv = tri.transform[indices, :ndim, :]
  r = tri.transform[indices, ndim, :]
  c = np.einsum('...ij,...j', T_inv, mesh - r)
  c = np.concatenate([c, 1 - c.sum(axis=-1, keepdims=True)], axis=-1)
  result = np.einsum('...i,...i', data[:, tri.simplices[indices]], c)
  return np.where(indices == -1, np.nan, result)

Select a particular time range from the dataset.

msl93 = msl.sel(time=slice('1993-03-13T18:00:00','1993-03-14T00:00:00')).compute().pipe(mirror_point_at_360)
t2m93 = t2m.sel(time=slice('1993-03-13T18:00:00','1993-03-14T00:00:00')).compute().pipe(mirror_point_at_360)

Regrid to a lat-lon grid.

tri = build_triangulation(msl93.longitude, msl93.latitude)

longitude = np.linspace(0, 360, num=360*4+1)
latitude = np.linspace(-90, 90, num=180*4+1)

mesh = np.stack(np.meshgrid(longitude, latitude, indexing='ij'), axis=-1)

grid_mesh_msl = interpolate(msl93.values, tri, mesh)
grid_mesh_t2m = interpolate(t2m93.values, tri, mesh)

Construct an Xarray DataArray from the regridded array.

da_msl = xr.DataArray(data=grid_mesh_msl,
                dims=["time", "longitude", "latitude"],
                coords=[('time', msl93.time.data), ('longitude', longitude), ('latitude', latitude)],
                name='msl')

da_t2m = xr.DataArray(data=grid_mesh_t2m,
                dims=["time", "longitude", "latitude"],
                coords=[('time', t2m93.time.data), ('longitude', longitude), ('latitude', latitude)],
                name='t2m')

Convert to hPa and deg C.

slp = da_msl/100
t2m = da_t2m-273.15

Plot the data using geoviews

Add an interactive element by using the holoviz/geoviews libraries.

We need to choose the rendering backend that we want to use in Geoviews, in this case we are using Bokeh.


gv.extension('bokeh')

If you want the plot restricted to a specific part of the world, you can specify where, however, we need to transform the coordinates from PlateCarree to WebMercator (aka EPSG-3857), which is the default projection for Geoviews with the Bokeh backend.

lonW, lonE, latS, latN = -130, -60, 20, 55 
transformer = Transformer.from_crs(ccrs.PlateCarree(), "EPSG:3857")
lon1, lat1 = transformer.transform(lonW,latS)
lon2, lat2 = transformer.transform(lonE, latN)

Convert our Xarray datasets/data arrays into Geoviews dataset objects, specifying the different dimensions of the dataset (lat, lon, time) as well as the variable we want to plot. For this first example, we’ll just visualize the first time in the selected time range.

Note:

By default, Geoviews expects a dataset whose coordinates are lat-lon (i.e., using a PlateCarree projection) to have the longitude dimension vary from -180 to 180. In this case, the ERA-5's longitudes range from 0 to 360.
In this case, ensure that the crs of the element declares the actual central_longitude, e.g. 180 degrees rather than the default 0 degrees.
dataset_era = gv.Dataset(slp.isel(time=0), kdims=['longitude','latitude'],vdims='msl') #create a geoviews dataset
contour = gv.project(dataset_era.to(gv.LineContours, ['longitude', 'latitude'],crs=ccrs.PlateCarree(central_longitude=180))) #create a Geoviews contour object

Plotting MSLP

cint = np.arange(900,1080,4)
(gv.tile_sources.OSM * contour).opts(
    opts.LineContours(tools=['hover'], levels=cint, frame_width=700, frame_height=400,show_legend=False, line_width=3,xlim=(lon1,lon2),ylim=(lat1,lat2)))

Use the Bokeh tools to the right of the plot to zoom in/out or reset to the original dimensions. The bottom-most tool tip (the hover tool) will produce a pop-up window that shows the sea-level pressure value at the closest grid point to where your cursor lies.

Info

Notice how this only plots a single time. We can iterate over the time dimension by specifying time as a dimension when we create the Geoviews dataset, as shown below!
dataset_era = gv.Dataset(slp, kdims=['longitude','latitude','time'],vdims='msl') #create a geoviews dataset; here we do not select just a single time
contour = gv.project(dataset_era.to(gv.LineContours, ['longitude', 'latitude'])) #create line contours

Create the interactive visualization. We overlay the SLP contours on a map from a web tile server. We specify various options, such as frame size, spatial extent, and line thickness as well.

(gv.tile_sources.OSM * contour).opts(
    opts.LineContours(tools=['hover'], levels=cint, frame_width=700, frame_height=400,show_legend=False, line_width=3,xlim=(lon1,lon2),ylim=(lat1,lat2)))

Since there are multiple times in the dataset, we now get a time slider that you can manipulate. Notice that as you change the time, the plot automatically updates!

Note

Did you notice that it took a little longer for the graphic to appear? That is because we have loaded seven total time steps, instead of just one.

Plotting T2M

Now let’s plot 2 meter temperatures. This time, we’ll visualize the data via a Quadmesh (gv.Quadmesh) instead of using contour lines (gv.LineContours).

Quadmesh is more computationally expensive, so it is good practice to subset the data from its original global extent to an area that you’re interested in. Here, we select the same region as we did earlier for sea-level pressure.

def lon_to_360(dlon: float) -> float:
    return ((360 + (dlon % 360)) % 360)

cond = (t2m.longitude > lon_to_360(lonW)) & (t2m.latitude > latS) & \
       (t2m.longitude < lon_to_360(lonE)) & (t2m.latitude < latN)

t2m_cropped = t2m.where(cond, drop=True)
dataset_era = gv.Dataset(t2m_cropped, kdims=['longitude','latitude','time'],vdims='t2m')
qm = gv.project(dataset_era.to(gv.QuadMesh, ['longitude', 'latitude']))
(gv.tile_sources.OSM().opts(xlim=(lon1, lon2),ylim=(lat1, lat2),title=f'ERA-5 2mT', frame_width=700, frame_height=400)  * (qm.opts(tools=['hover'], axiswise=True, cmap='coolwarm', colorbar=True, clim=(-50,50), alpha=0.8)))

You probably noticed this visualization took a while to render. A more performant strategy is to rasterize the Geoviews dataset.

Warning

If you plot a large dataset without rasterizing it, e.g. the entire globe for multiple timeplots of data, load times and RAM will scale up in tandem.
qm_raster = rasterize(qm, precompute=True)
(gv.tile_sources.OSM().opts(xlim=(lon1, lon2),ylim=(lat1, lat2),title=f'ERA-5 2mT', frame_width=700, frame_height=400)  * (qm_raster.opts(tools=['hover'], axiswise=True, cmap='coolwarm', colorbar=True, clim=(-50,50), alpha=0.8)))

Time slider bug!

While the visualization loaded much faster, the rasterized dataset does not update as we manipulate the time slider!

What’s next?

In the next notebook, we will add more user-configurable options to our Geoviews-powered visualizations, using another key player in the Holoviz ecosystem, Panel.

Summary

In this notebook, we have accessed one of the ARCO ERA-5 datasets, regridded from the ECMWF native spectral to cartesian lat-lon coordinates, and used geoviews to create interactive maps of sea-level pressure and 2-meter temperature.

Resources and references

References
  1. Stern, C., Abernathey, R., Hamman, J., Wegener, R., Lepore, C., Harkins, S., & Merose, A. (2022). Pangeo Forge: Crowdsourcing Analysis-Ready, Cloud Optimized Data Production. Frontiers in Climate, 3. 10.3389/fclim.2021.782909