dvas.tools.gdps package

Copyright (c) 2020-2022 MeteoSwiss, contributors listed in AUTHORS.

Distributed under the terms of the GNU General Public License v3.0 or later.

SPDX-License-Identifier: GPL-3.0-or-later

This sub-package contains GRUAN-related tools.

Submodules

dvas.tools.gdps.correlations module

This module contains GRUAN-correlations-related utilities, including the means to compute correlation coefficients.

dvas.tools.gdps.correlations.corr_coeff_matrix(sigma_name, step_ids, oids=None, mids=None, rids=None, eids=None)

Computes the correlation coefficient(s) between distinct GDP measurements, for a specific uncertainty type.

Parameters:
  • sigma_name (str) – uncertainty type. Must be one of [‘ucs’, ‘uct’, ‘ucu’].

  • step_ids (1D numpy.ndarray of int|float) – synchronized time, step, or altitude id of each measurement.

  • oids (1D numpy.ndarray of int|str, optional) – object id of each measurement.

  • mids (1D numpy.ndarray of int|str, optional) – GDP model of each measurement.

  • rids (1D numpy.ndarray of int|str, optional) – rig id of each measurement.

  • eids (1D numpy.ndarray of int|str, optional) – event id of each measurement.

Warning

  • If no oids are specified, the function will assume that the data originates from the exact same radiosonde. The same applies to the GDP model ids, rig ids and event ids.

Returns:

numpy.ndarray of float(s) – the square correlation coefficient(s) array, in the range [0, 1], with shape (len(step_ids), len(step_ids)).

The supported uncertainty types are:

  • ‘ucs’: spatial-correlated uncertainty.

    Full correlation between measurements acquired during the same event at the same site, irrespective of the time step/altitude, rig, or serial number. No correlation between distinct radiosonde models.

  • ‘uct’: temporal-correlated uncertainty.

    Full correlation between measurements acquired at distinct sites during distinct events, with distinct rigs and serial numbers. No correlation between distinct radiosonde models.

  • ‘ucu’: un-correlated uncertainty.

    No correlation whatsoever between distinct measurements.
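
The three rules above can be sketched with NumPy broadcasting. This is only an illustrative reading of the list, not the dvas implementation: the function name toy_corr_coeff_matrix is hypothetical, and the sketch assumes that an event id also identifies the site, and that missing ids mean "all identical".

```python
import numpy as np


def toy_corr_coeff_matrix(sigma_name, step_ids, oids=None, mids=None,
                          rids=None, eids=None):
    """Hypothetical stand-in illustrating the 'ucs', 'uct' and 'ucu' rules."""
    step_ids = np.asarray(step_ids)
    n = len(step_ids)

    # Missing ids are treated as identical for all measurements (assumption
    # mirroring the warning in the parameter description).
    oids, mids, rids, eids = (
        np.zeros(n, dtype=int) if item is None else np.asarray(item)
        for item in (oids, mids, rids, eids)
    )

    def same(arr):
        # Pairwise equality: True where measurements i and j share the id.
        return arr[:, None] == arr[None, :]

    if sigma_name == 'ucu':
        # A measurement is only correlated with itself.
        corr = (same(step_ids) & same(oids) & same(mids)
                & same(rids) & same(eids))
    elif sigma_name == 'ucs':
        # Same event (assumed to imply same site) and same model.
        corr = same(eids) & same(mids)
    elif sigma_name == 'uct':
        # Same model, whatever the site, event, rig or serial number.
        corr = same(mids)
    else:
        raise ValueError(f"Unknown uncertainty type: {sigma_name}")

    return corr.astype(float)


# Two sondes (oids 1 and 2) of different models, two steps each.
example = toy_corr_coeff_matrix('uct', [0, 1, 0, 1], oids=[1, 1, 2, 2],
                                mids=['A', 'A', 'B', 'B'])
```

In this example, the result is block-diagonal: the two steps of each sonde are fully correlated with each other, while measurements from the two distinct models are not.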

dvas.tools.gdps.gdps module

This module contains GRUAN-related routines, including correlation rules for GDP uncertainties.

dvas.tools.gdps.gdps.combine(gdp_prfs, binning=1, method='weighted arithmetic mean', mask_flgs=None, chunk_size=150, n_cpus=1)

Combines and (possibly) rebins GDP profiles, with full error propagation.

Parameters:
  • gdp_prfs (dvas.data.data.MultiGDPProfile) – synchronized GDP profiles to combine.

  • binning (int, optional) – the number of profile steps to put into a bin. Defaults to 1.

  • method (str, optional) – combination rule. Can be one of [‘weighted arithmetic mean’, ‘arithmetic mean’, ‘weighted circular mean’, ‘circular mean’, or ‘delta’]. Defaults to ‘weighted arithmetic mean’.

  • mask_flgs (str|list of str, optional) – (list of) flag(s) to ignore when combining profiles.

  • chunk_size (int, optional) – to speed up the computation, profiles are broken up into chunks of this length. Larger chunks require more memory; smaller chunks mean more items to process. Defaults to 150.

  • n_cpus (int|str, optional) – number of cpus to use. Can be a number, or ‘max’. Set to 1 to disable multiprocessing. Defaults to 1.

Returns:

(dvas.data.data.MultiCWSProfile, dict) – the combined working standard profile, and a dictionary with the full covariance matrices for the different uncertainty types.

Note

This function requires profiles that have been resampled (if applicable) and synchronized beforehand. This implies that the _idx index must be identical for all Profiles.
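
To make the idea of a weighted mean with error propagation concrete, here is a minimal sketch for a single level. It is a generic textbook formulation, not the dvas code: the helper name weighted_mean_with_cov is hypothetical, and the use of inverse-variance weights is an assumption about what "weighted" means here.

```python
import numpy as np


def weighted_mean_with_cov(values, sigmas, corr=None):
    """Hypothetical helper: combine N measurements of one level.

    values, sigmas: 1D arrays with the measurements and their uncertainties.
    corr: optional (N, N) correlation matrix; identity if not provided.
    """
    values = np.asarray(values, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    if corr is None:
        corr = np.eye(values.size)  # uncorrelated measurements

    # Inverse-variance weights (assumption), normalized to sum to 1.
    # These are also the partial derivatives of the mean w.r.t. each value.
    weights = 1.0 / sigmas**2
    jac = weights / weights.sum()

    # Covariance matrix of the inputs, then linear error propagation.
    cov = np.asarray(corr, dtype=float) * np.outer(sigmas, sigmas)
    mean = jac @ values
    sigma_mean = np.sqrt(jac @ cov @ jac)
    return mean, sigma_mean


# Two measurements with equal uncertainties, uncorrelated vs fully correlated.
mean_u, sigma_u = weighted_mean_with_cov([10.0, 12.0], [1.0, 1.0])
mean_c, sigma_c = weighted_mean_with_cov([10.0, 12.0], [1.0, 1.0],
                                         corr=np.ones((2, 2)))
```

The comparison shows why the correlation rules matter: uncorrelated uncertainties shrink when profiles are averaged, whereas fully correlated ones do not.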

dvas.tools.gdps.stats module

This module contains statistical routines and tools for handling GDPs.

dvas.tools.gdps.stats.ks_test(gdp_pair, alpha=0.0027, m_val=1, method='arithmetic delta', **kwargs)

Runs a scipy.stats.kstest() two-sided test on the normalized-delta between two GDPProfile instances, against a normal distribution.

The KS test is run on a level-by-level basis.

Note

See the dvas documentation for more details about the scientific motivation for this function.

Parameters:
  • gdp_pair (list of dvas.data.strategy.data.GDPProfile) – GDP Profiles to compare.

  • alpha (float, optional) – The significance level for the KS test. Must be 0<alpha<1. Defaults to 0.27%=0.0027.

  • m_val (int, optional) – Binning strength for the Profile delta (Important: the binning is performed before running the KS test). Defaults to 1 (=no binning).

  • method (str, optional) – ‘arithmetic delta’ or ‘circular delta’ (the latter wraps angles). Defaults to ‘arithmetic delta’.

  • **kwargs – mask_flgs and/or n_cpus and/or chunk_size, that will get fed to dvas.tools.gdps.gdps.combine().

Returns:

pandas DataFrame – a DataFrame containing Delta_pqei, sigma_pqei, k_pqei, pks_pqei, and f_pqei values:

  • Delta_pqei is the (binned) profile delta with sigma_pqei its (total) uncertainty,

  • k_pqei=Delta_pqei/sigma_pqei is the (binned) profile delta normalized by the total uncertainty,

  • pks_pqei contains the corresponding p-value from the KS test, and

  • f_pqei contains 1 where the KS test failed, and 0 otherwise. That is: 1 <=> the p-value of the KS test is <= alpha.
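
The quantities listed above can be illustrated with a small sketch that runs scipy.stats.kstest() on one normalized value at a time. This is an assumption about how the level-by-level test could be carried out, not the dvas implementation; the function name level_by_level_ks is hypothetical, and binning and flag masking are left out.

```python
import numpy as np
from scipy import stats


def level_by_level_ks(delta, sigma, alpha=0.0027):
    """Hypothetical helper returning k, p-values and flags for each level."""
    delta = np.asarray(delta, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    # Profile delta normalized by its total uncertainty.
    k = delta / sigma

    # One two-sided KS test per level, each against a standard normal
    # distribution and each fed a single normalized value.
    pks = np.array([stats.kstest(np.array([k_i]), 'norm').pvalue for k_i in k])

    # 1 where the test failed (p-value <= alpha), 0 otherwise.
    flags = (pks <= alpha).astype(int)
    return k, pks, flags


k_vals, p_vals, f_vals = level_by_level_ks([0.5, -0.5, 4.0], [1.0, 1.0, 1.0])
```

With a single value per test, the p-value only depends on how far the normalized delta sits in the tails of the normal distribution, so the default alpha of 0.0027 behaves much like a 3-sigma cut on k.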

dvas.tools.gdps.stats.gdp_incompatibilities(gdp_prfs, alpha=0.0027, m_vals=None, method='arithmetic delta', rolling=True, do_plot=False, fn_prefix=None, fn_suffix=None, **kwargs)

Runs a series of KS tests to assess the consistency of several GDP profiles.

Parameters:
  • gdp_prfs (dvas.data.data.MultiGDPProfile) – synchronized GDP profiles to check.

  • alpha (float, optional) – The significance level for the KS test. Defaults to 0.27%=0.0027.

  • m_vals (ndarray of int, optional) – The rolling binning sizes “m”. Defaults to None, which is equivalent to [1].

  • method (str, optional) – ‘arithmetic delta’ or ‘circular delta’ (the latter wraps angles).

  • rolling (bool, optional) – if True and len(m_vals)>1, any incompatibility found for a specific m value will be forwarded to the subsequent ones, such that the order of the m_vals list matters. Else, each m value is treated independently. Defaults to True.

  • do_plot (bool, optional) – Whether to create the diagnostic plot, or not. Defaults to False.

  • fn_prefix (str, optional) – if set, the prefix of the plot filename.

  • fn_suffix (str, optional) – if set, the suffix of the plot filename.

  • **kwargs – n_cpus and/or chunk_size and/or mask_flgs, that will get fed to dvas.tools.gdps.gdps.combine().

Returns:

pd.DataFrame – the values of Delta_pqei, sigma_pqei, k_pqei, pks_pqei and f_pqei for each pair of GDPs and each m value.

GDP pairs are identified using their oids, as: oid_1_vs_oid_2.

f_pqei==1 indicates that the p-value of the KS test is <= alpha for this measurement, i.e. that the profiles are incompatible.
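
The effect of the rolling parameter can be pictured with the following sketch. It is a simplified illustration only, not the dvas implementation: the helper name forward_flags is hypothetical, and it assumes that the flags for every m value have already been expressed at the original profile resolution.

```python
import numpy as np


def forward_flags(flags_per_m, m_vals, rolling=True):
    """Hypothetical helper: carry incompatibility flags from one m to the next.

    flags_per_m: dict mapping each m value to a 1D array of 0/1 flags,
                 all with the same length (assumption).
    m_vals: m values, in the order in which they are processed.
    """
    out = {}
    carried = np.zeros(len(flags_per_m[m_vals[0]]), dtype=bool)
    for m in m_vals:
        current = np.asarray(flags_per_m[m], dtype=bool)
        if rolling:
            # Anything flagged for an earlier m stays flagged from now on.
            carried = carried | current
            out[m] = carried.astype(int)
        else:
            # Each m value is treated on its own.
            out[m] = current.astype(int)
    return out


raw = {1: [0, 1, 0, 0], 2: [0, 0, 1, 1]}
rolled = forward_flags(raw, [1, 2], rolling=True)
independent = forward_flags(raw, [1, 2], rolling=False)
```

Reversing the order of m_vals would change the rolled result for each m value, which is why the order matters when rolling is True.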

dvas.tools.gdps.stats.gdp_validities(incompat, m_vals=None, strategy='all-or-none')

Given GDP incompatibilities, identifies valid measurements (suitable for the assembly of a combined working standard) given a specific combination strategy.

Valid strategies include:
  • ‘all-or-none’: either all GDP measurements from a certain bin are compatible with each other, or all of them are dropped.

  • ‘force-all-valid’: combine all GDPs, irrespective of the reported incompatibilities.
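
The two strategies can be sketched as follows, for a single m value. This is an illustrative reading of the list above, not the dvas implementation: the function name toy_validities and the (n_pairs, n_bins) layout of the flags are assumptions.

```python
import numpy as np


def toy_validities(pair_flags, strategy='all-or-none'):
    """Hypothetical helper returning one validity flag per bin.

    pair_flags: 2D array of shape (n_pairs, n_bins) with 1 where a pair of
                GDPs was found to be incompatible, and 0 otherwise.
    """
    pair_flags = np.asarray(pair_flags, dtype=bool)

    if strategy == 'all-or-none':
        # A bin is kept only if no pair of GDPs is incompatible in it.
        return ~pair_flags.any(axis=0)
    if strategy == 'force-all-valid':
        # Reported incompatibilities are ignored altogether.
        return np.ones(pair_flags.shape[1], dtype=bool)
    raise ValueError(f"Unknown strategy: {strategy}")


# Three GDP pairs, three bins; one pair is incompatible in the second bin.
flags = [[0, 1, 0],
         [0, 0, 0],
         [0, 0, 0]]
valid_strict = toy_validities(flags, strategy='all-or-none')
valid_forced = toy_validities(flags, strategy='force-all-valid')
```

Under ‘all-or-none’, a single incompatible pair is enough to drop the second bin for every GDP, whereas ‘force-all-valid’ keeps all bins.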

Parameters:
  • incompat (dict) – outcome of dvas.tools.gdps.stats.gdp_incompatibilities().

  • m_vals (list of int, optional) – list of m values to take into account when checking incompatibilities. Defaults to None, which is equivalent to [1].

  • strategy (str, optional) – name of a validation strategy. Defaults to ‘all-or-none’.

Returns: