Notebook 1: Data Retrieval from FDB and Preprocessing¶
This notebook serves as a guide to accessing data from FDB (Fields Database) object storage and preprocessing it. The first part demonstrates the computation of the ensemble median of precipitation aggregated over 6 hours, followed by a more complex computational process: the computation of potential vorticity.
Installation¶
To access the data from FDB, the kernel of the notebook configures the necessary libraries and environment variables. See the instructions in FDB installation.
Accessing Data from FDB¶
First, we import a few libraries:
from meteodatalab import mars, mch_model_data
We can use fdb-utils to inspect the data present in the ICON-CH1-EPS and ICON-CH2-EPS collections. fdb-utils returns keys and values that can be used in queries to FDB.
!fdb-utils list --filter number=0,step=0
The realtime FDB only stores the latest forecasts. Using fdb-utils you will find ICON-CH1-EPS forecasts with a reference time every 3 hours and ICON-CH2-EPS forecasts with a reference time every 6 hours. We aim to retrieve the latest forecast started ~12 hours ago.
from datetime import datetime, timedelta
# Current time
now = datetime.now()
# Subtract 12 hours
past_time = now - timedelta(hours=12)
# Round down to the nearest multiple of 3
rounded_hour = (past_time.hour // 3) * 3
rounded_time = past_time.replace(hour=rounded_hour, minute=0, second=0, microsecond=0)
# Format as YYYYMMDD and HHMM
date = rounded_time.strftime('%Y%m%d')
time = rounded_time.strftime('%H%M')
date,time
Retrieving Data¶
Use query functions to retrieve the required data.
The request for data is made by specifying the values of MARS keys.
MARS keys are derived from GRIB keys and serve as a base for the FDB index.
The meteodatalab.mars module provides helpers to build valid MARS requests in the context of MeteoSwiss.
Note that the available data in the test FDB instance is typically limited to the last 2 runs, so make sure to update the date and time.
request = mars.Request(
param="TOT_PREC",
date=date,
time=time,
number=tuple(range(11)), # all 11 ensemble members of ICON-CH1-EPS
step=tuple(str(i) for i in range(10)), # lead times in hours
levtype=mars.LevType.SURFACE,
model=mars.Model.ICON_CH1_EPS,
)
The meteodatalab mars.Request fills some of the key:value pairs required by FDB with defaults for ICON-CH1-EPS and ICON-CH2-EPS, and verifies the syntax of the request. You can inspect the raw FDB request:
request.to_fdb()
The meteodatalab.mch_model_data module provides some convenience functions to access model data.
Earthkit-data is used in the background to read the data that is being returned from FDB.
ds = mch_model_data.get_from_fdb(request)
The data is returned as a dictionary of xarray DataArrays, where the keys are set to the param short names.
ds["TOT_PREC"]
Data Preprocessing for Computing Median Ensembles¶
We will compute the ensemble median of total precipitation aggregated over 6 hours:
Data Aggregation¶
For the aggregation of data over 6-hour intervals, meteodatalab implements operators that transform the data.
Total precipitation in the direct model output of ICON is accumulated from the reference time. We use the delta operator to re-aggregate it to 6-hour intervals.
import numpy as np
from meteodatalab.operators import time_operators as time_ops
tot_prec_6h = time_ops.delta(ds["TOT_PREC"], np.timedelta64(6, "h"))
tot_prec_6h
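To illustrate what the delta operator computes, here is a minimal numpy sketch (not the meteodatalab implementation): given precipitation accumulated from the reference time at hourly lead times, the 6-hour re-aggregation is the difference between the accumulated field and itself shifted by 6 steps, i.e. the precipitation that fell within each 6-hour window.

```python
import numpy as np

# Synthetic hourly precipitation rates (mm/h) over 12 lead times
rates = np.array([0.0, 0.5, 1.0, 2.0, 1.5, 0.5, 0.0, 0.0, 1.0, 2.5, 1.0, 0.5])

# Direct model output: precipitation accumulated from the reference time
accumulated = np.cumsum(rates)

# 6-hour delta: accumulated value minus its value 6 steps earlier
step = 6
delta_6h = accumulated[step:] - accumulated[:-step]

print(delta_6h[0])  # 5.5, the precipitation of the first 6h window
```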
Ensemble Calculation¶
Next, we compute the ensemble median using the 6h aggregation:
data = tot_prec_6h.isel(lead_time=8).median(dim="eps").clip(min=0)
Finally, we plot the results using earthkit-plots.
from earthkit.plots.geo import bounds, domains
from earthkit.plots.styles import Style
import earthkit
import cartopy.crs as ccrs
from plot_utils.load_colormaps import load_ncl_rgb_colors
from plot_utils.plot_valid_data_frame import get_valid_data_frame
import pandas as pd
# Define the spatial extent of the target grid
# (we retain the original ICON-CH1/CH2-EPS domain)
xmin, xmax = -0.757, 17.583 # Longitude bounds
ymin, ymax = 42.183, 50.583 # Latitude bounds
bbox = bounds.BoundingBox(xmin, xmax, ymin, ymax, ccrs.Geodetic())
domain = domains.Domain.from_bbox(
bbox=bbox,
name="CH2"
)
# === Load custom colormap with levels ===
colors, levels = load_ncl_rgb_colors("precip_1h_11lev")
chart = earthkit.plots.Map(domain=domain)
chart.grid_cells(data, x="lon", y="lat", style=Style(colors=colors, levels=levels))
# === Add Frame ===
frame_polygon = get_valid_data_frame(data)
if frame_polygon:
x, y = frame_polygon.exterior.xy
chart.ax.plot(x, y, color='black', linewidth=1, transform=ccrs.PlateCarree())
else:
print("No valid frame polygon could be computed.")
# === Add Map Features ===
chart.land()
chart.coastlines()
chart.borders()
chart.gridlines()
# === Annotate Chart ===
ref_time = pd.to_datetime(data.coords["ref_time"].values[0]).strftime("%Y-%m-%d %H:%M UTC")
lead_time = data.coords["lead_time"].values.astype('timedelta64[h]')
title = f"6h Precipitation | {ref_time} (+{lead_time})"
legend_label = "tot_prec_6h (mm)"
chart.title(title)
chart.legend(label=legend_label)
chart.show()
Potential Vorticity Calculation and Wind Field Rotation¶
This section introduces another example that requires a multiple-field FDB request, in order to compute the potential vorticity (PV) and rotate the wind field. This is a more intricate computational process than the first part of this notebook, which primarily focused on straightforward data retrieval and preprocessing.
Querying Data¶
Use the query functions to retrieve eight of the nine required fields, spanning all model levels (the ninth, HHL, is requested separately below).
request = mars.Request(
param=("P", "T", "U", "V", "W", "QV", "QC", "QI"),
date=date,
time=time,
number=0,
step=7,
levtype=mars.LevType.MODEL_LEVEL,
levelist=tuple(range(1, 82)),
model=mars.Model.ICON_CH1_EPS,
)
We also need the height field of the vertical levels. Since this is a constant field, some of its key values differ from the previous fields (for example, step). We therefore issue a separate request to obtain the constant HHL field:
request_hhl_const = mars.Request(
param="HHL",
date=date,
time=time,
number=0,
step=0,
levtype=mars.LevType.MODEL_LEVEL,
levelist=tuple(range(1, 82)),
model=mars.Model.ICON_CH1_EPS,
)
Issue the two requests and obtain the data:
ds = mch_model_data.get_from_fdb(request)
ds |= mch_model_data.get_from_fdb(request_hhl_const)
ds
hhl = ds["HHL"].squeeze(drop=True)
hhl
Computing Potential Vorticity¶
The next cell tackles the computation of potential vorticity, a more complex diagnostic that is not directly output by the model.
from meteodatalab import metadata
from meteodatalab.operators.rho import compute_rho_tot
from meteodatalab.operators.theta import compute_theta
from meteodatalab.operators.pot_vortic import compute_pot_vortic
theta = compute_theta(ds["P"], ds["T"])
rho_tot = compute_rho_tot(ds["T"], ds["P"], ds["QV"], ds["QC"], ds["QI"])
metadata.set_origin_xy(ds, "HHL")
pot_vortic = compute_pot_vortic(ds["U"], ds["V"], ds["W"], theta, rho_tot, hhl)
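For reference, potential temperature follows the standard definition θ = T · (p₀/p)^(R_d/c_p) with p₀ = 1000 hPa. A minimal numpy sketch of that formula (the constants below are textbook values and not necessarily the exact ones used by meteodatalab):

```python
import numpy as np

R_D = 287.05  # gas constant of dry air [J kg-1 K-1] (textbook value)
C_P = 1005.0  # specific heat of dry air at constant pressure [J kg-1 K-1]
P0 = 1.0e5    # reference pressure [Pa]

def theta_from_t_p(t, p):
    """Potential temperature [K] from temperature [K] and pressure [Pa]."""
    return t * (P0 / p) ** (R_D / C_P)

# At the reference pressure, theta equals the temperature itself
print(theta_from_t_p(280.0, 1.0e5))  # 280.0
```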
Interpolate to potential temperature levels¶
It is possible to interpolate the potential vorticity onto surfaces of constant potential temperature (isentropes).
from meteodatalab.operators.destagger import destagger
from meteodatalab.operators.vertical_interpolation import interpolate_k2theta
# use mid-levels of hhl for the interpolation
hfl = destagger(hhl, "z")
theta_values = [310.0, 315.0, 320.0, 325.0, 330.0, 335.0]
pot_vortic_th = interpolate_k2theta(pot_vortic, "low_fold", theta, theta_values, "K", hfl)
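The idea behind this vertical interpolation can be sketched with plain numpy: within a column, find the pair of adjacent model levels whose θ values bracket the target and linearly interpolate the field between them. This is a simplified sketch that assumes a monotonic θ profile; the real operator also handles folds (non-monotonic profiles), which is what the "low_fold" argument controls.

```python
import numpy as np

def interp_to_theta(field, theta, target):
    """Linearly interpolate a column profile onto a target theta value.

    Assumes theta is monotonically increasing along the column.
    """
    k = np.searchsorted(theta, target)  # first level with theta >= target
    if k == 0 or k == len(theta):
        return np.nan  # target outside the profile
    w = (target - theta[k - 1]) / (theta[k] - theta[k - 1])
    return (1 - w) * field[k - 1] + w * field[k]

theta_col = np.array([300.0, 310.0, 320.0, 330.0])  # theta profile [K]
pv_col = np.array([0.5, 1.0, 2.0, 4.0])             # field to interpolate
print(interp_to_theta(pv_col, theta_col, 315.0))  # 1.5
```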
chart = earthkit.plots.Map(domain=domain)
# === Load custom colormap with levels ===
colors, levels = load_ncl_rgb_colors("pot_vortic")
chart.quickplot(pot_vortic_th.sel(z=320)*1e6, x="lon", y="lat", style=Style(colors=colors, levels=levels))
# === Add Frame ===
frame_polygon = get_valid_data_frame(data)
if frame_polygon:
x, y = frame_polygon.exterior.xy
chart.ax.plot(x, y, color='black', linewidth=1, transform=ccrs.PlateCarree())
else:
print("No valid frame polygon could be computed.")
# === Add Map Features ===
chart.land()
chart.coastlines()
chart.borders()
chart.gridlines()
ref_time = pd.to_datetime(pot_vortic_th.coords["ref_time"].values[0]).strftime("%Y-%m-%d %H:%M UTC")
lead_time = pot_vortic_th.coords["lead_time"].values.astype('timedelta64[h]')
title = f"Potential Vorticity at $\\theta$ = 320K | {ref_time} (+{lead_time})"
legend_label = "potential vorticity units (PVU)"
chart.title(title)
chart.legend(label=legend_label)
chart.show()
Compute the mean between pressure levels¶
There's also an option to compute the mean potential vorticity between two isobars (or pressure levels).
from meteodatalab.operators.vertical_interpolation import interpolate_k2p
from meteodatalab.operators.vertical_reduction import integrate_k
isobars = interpolate_k2p(hfl, "linear_in_lnp", ds["P"], [700, 900], "hPa")
h700, h900 = isobars.transpose("z", ...)
pot_vortic_mean = integrate_k(pot_vortic, "normed_integral", "z2z", hhl, (h900, h700))
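The "normed_integral" mode corresponds to a layer mean: the vertical integral of the field between the two bounding heights, divided by the layer thickness. A minimal numpy sketch of that reduction on a single column (an assumed simplification; the real operator works on the staggered HHL grid):

```python
import numpy as np

def layer_mean(values, z, z_bot, z_top):
    """Mean of a profile between z_bot and z_top via a normed trapezoidal integral."""
    mask = (z >= z_bot) & (z <= z_top)
    v, zz = values[mask], z[mask]
    # trapezoidal rule, normalized by the layer thickness
    integral = np.sum(0.5 * (v[1:] + v[:-1]) * (zz[1:] - zz[:-1]))
    return integral / (zz[-1] - zz[0])

z = np.linspace(0.0, 10000.0, 101)  # heights [m]
values = np.full_like(z, 2.0)       # constant profile
print(layer_mean(values, z, 1000.0, 3000.0))  # 2.0 for a constant field
```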
chart = earthkit.plots.Map(domain=domain)
# === Load custom colormap with levels ===
colors, _ = load_ncl_rgb_colors("pot_vortic")
levels = [-10.0,-8.9,-7.4,-5.6,-3.6,-1.6,-0.4,-0.1,0.0,0.1,0.4,1.6,3.6,5.6,7.4,8.9,10.0]
chart.quickplot(pot_vortic_mean*1e6, x="lon", y="lat", style=Style(colors=colors, levels=levels))
# === Add Frame ===
frame_polygon = get_valid_data_frame(data)
if frame_polygon:
x, y = frame_polygon.exterior.xy
chart.ax.plot(x, y, color='black', linewidth=1, transform=ccrs.PlateCarree())
else:
print("No valid frame polygon could be computed.")
# === Add Map Features ===
chart.land()
chart.coastlines()
chart.borders()
chart.gridlines()
ref_time = pd.to_datetime(pot_vortic_th.coords["ref_time"].values[0]).strftime("%Y-%m-%d %H:%M UTC")
lead_time = pot_vortic_th.coords["lead_time"].values.astype('timedelta64[h]')
title = f"Mean potential vorticity between 900 and 700 hPa | {ref_time} (+{lead_time})"
legend_label = "potential vorticity units (PVU)"
chart.title(title)
chart.legend(label=legend_label)
chart.show()
Summary¶
- retrieve data from FDB in Python
- read GRIB data into xarray
- process the data with meteorological operators that are aware of the GRIB metadata
- keep the GRIB metadata consistent throughout operations