dvas.data package

Copyright (c) 2020-2022 MeteoSwiss, contributors listed in AUTHORS.

Distributed under the terms of the GNU General Public License v3.0 or later.

SPDX-License-Identifier: GPL-3.0-or-later

Subpackages

Submodules

dvas.data.data module

Copyright (c) 2020-2023 MeteoSwiss, contributors listed in AUTHORS.

Distributed under the terms of the GNU General Public License v3.0 or later.

SPDX-License-Identifier: GPL-3.0-or-later

Module contents: Data management

class dvas.data.data.MultiProfileAC(load_stgy=None, sort_stgy=None, save_stgy=None, plot_stgy=None, rebase_stgy=None)

Bases: object

Abstract MultiProfile class

REQUIRED_ATTRIBUTES = {'_DATA_TYPES': <class 'type'>}

property profiles

list of Profile

property db_variables

Correspondence between DataFrame and DB parameter

Type:

dict

property var_info

Variable information

Type:

dict

property info

Data info

Type:

List of ProfileManager info

rm_info_tags(val, *, inplace=True)

Remove some tags from all info tag lists.

Parameters:

val (str|list of str) – Tag value(s) to remove

— Decorating function infos —

Parameters:

inplace (bool, optional) – If False, will return a deepcopy. Defaults to True.

— — — — — —
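The "Decorating function infos" block above describes an inplace keyword that is added by a decorator. A minimal sketch of how such a decorator could behave is given below. The names with_inplace and TagHolder are hypothetical and not part of dvas, and returning None for in-place calls is an assumption.

```python
import copy
import functools


def with_inplace(method):
    """Hypothetical decorator adding a keyword-only ``inplace`` argument."""
    @functools.wraps(method)
    def wrapper(self, *args, inplace=True, **kwargs):
        # Work either on the object itself or on a deep copy of it
        target = self if inplace else copy.deepcopy(self)
        method(target, *args, **kwargs)
        # Assumption: only hand back an object when a copy was made
        return None if inplace else target
    return wrapper


class TagHolder:
    """Toy stand-in for an object carrying a list of tags."""

    def __init__(self, tags):
        self.tags = list(tags)

    @with_inplace
    def rm_tags(self, val):
        # Accept a single tag or a list of tags
        vals = [val] if isinstance(val, str) else list(val)
        self.tags = [tag for tag in self.tags if tag not in vals]


holder = TagHolder(['raw', 'sync'])
new_holder = holder.rm_tags('raw', inplace=False)  # holder is left untouched
holder.rm_tags('sync')                             # holder itself is modified
```

With inplace=False the caller receives a modified deep copy, while the default call modifies the instance it is called on.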

add_info_tags(val, *, inplace=True)

Add some tags to all info tag lists.

Parameters:

val (str|list of str) – Tag values to add.

— Decorating function infos —

Parameters:

inplace (bool, optional) – If False, will return a deepcopy. Defaults to True.

— — — — — —

copy()

Return a deep copy of the object

extract(inds)

Return a new MultiProfile instance with a subset of the Profiles.

Parameters:

inds (int|list of int) – indices of the Profiles to extract.

Returns:

dvas.data.data.MultiProfile – the new instance.

load_from_db(*args, inplace=True, **kwargs)

Load data from the database.

Parameters:
  • *args – positional arguments

  • **kwargs – keyword arguments

— Decorating function infos —

Parameters:

inplace (bool, optional) – If False, will return a deepcopy. Defaults to True.

— — — — — —

sort(*, inplace=True)

Sort method

— Decorating function infos —

Parameters:

inplace (bool, optional) – If False, will return a deepcopy. Defaults to True.

— — — — — —

save_to_db(add_tags=None, rm_tags=None, prms=None)

Save method to store the entire content of the MultiProfile instance back into the database with an updated set of tags.

Parameters:
  • add_tags (list of str, optional) – list of tags to add to the entity when inserting it into the database. Defaults to None.

  • rm_tags (list of str, optional) – list of existing tags to remove from the entity before inserting it into the database. Defaults to None.

  • prms (list of str, optional) – list of column names to save to the database. Defaults to None (= save all possible parameters).

Notes

The TAG_ORIGINAL will always be removed and the ‘derived’ tag will always be added by default when saving anything into the database.
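The tag bookkeeping described in this note can be sketched as follows. This is only an illustration: the helper updated_tags is hypothetical, the string 'TAG_ORIGINAL' is a placeholder for whatever value dvas actually uses, and the other tag names are made up.

```python
def updated_tags(current, add_tags=None, rm_tags=None,
                 tag_original='TAG_ORIGINAL', tag_derived='derived'):
    """Return the tag list that would accompany the saved entity.

    ``tag_original`` is a placeholder value; only 'derived' is quoted in the text.
    """
    # The original tag is always removed, on top of any user-requested removals
    removed = set(rm_tags or []) | {tag_original}
    kept = [tag for tag in current if tag not in removed]
    # The derived tag is always added, on top of any user-requested additions
    for tag in list(add_tags or []) + [tag_derived]:
        if tag not in kept:
            kept.append(tag)
    return kept


tags = updated_tags(['TAG_ORIGINAL', 'e:1', 'sync'], add_tags=['gdp'], rm_tags=['sync'])
```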

update(db_df_keys, data)

Update the whole MultiProfile list with new Profiles.

Parameters:
  • db_df_keys (dict) – Relationship between database parameters and Profile.data columns.

  • data (list of Profile) – Data

append(db_df_keys, val)

Append method

Parameters:
  • db_df_keys (dict) – Relationship between database parameters and Profile.data columns.

  • val (Profile) – Data

get_prms(prm_list=None, mask_flgs=None, request_flgs=None, with_metadata=None, pooled=False)

Convenience getter to extract specific columns from the DataFrames and/or class properties of all the Profile instances.

Parameters:
  • prm_list (str|list of str, optional) – names of the column(s) to extract from all the Profile DataFrames. Defaults to None (=returns all the columns from the DataFrame).

  • mask_flgs (str|list of str, optional) – name(s) of the flag(s) to NaN-ify in the extraction process. Defaults to None.

  • request_flgs (str|list of str, optional) – if set, will only return points that have these flag values set (AND rule applied, if multiple values are provided).

  • with_metadata (str|list, optional) – name of the metadata fields to include in the table. Defaults to None.

  • pooled (bool, optional) – if True, all profiles will be gathered together. If False, Profiles are kept distinct using a MultiIndex. Defaults to False.

Returns:

pd.DataFrame – the requested data as a MultiIndex pandas DataFrame.

Warning

The resulting DataFrame has only dvas.hardcoded.PRF_IDX (=’_idx’) as an index. Since the values of dvas.hardcoded.PRF_TDT (=’tdt’) and dvas.hardcoded.PRF_ALT (=’alt’) are not necessarily the same for all the Profiles, these cannot be used as common indexes here.

get_info(prm=None)

Convenience function to extract Info from all the Profile instances.

Parameters:

prm (str, optional) – Info attribute to extract. Defaults to None.

Returns:

dict of list – same as self.profiles, but with only the requested metadata.

has_tag(tag)

Convenience method to check whether each Profile has a specific tag.

Parameters:

tag (str) – tag to search for.

Returns:

list of bool – one bool for each Profile.

plot(**kwargs)

Plot method

Parameters:

**kwargs – Keyword arguments to be passed down to the plotting function.

Returns:

None

rebase(new_lengths, shifts=None, *, inplace=True)

Rebase method, which maps the Profiles onto a new set of integer indices.

This will move the values around, including the non-integer indices (i.e. anything other than ‘_idx’) if applicable.

Parameters:
  • new_lengths (int|list of int) – The length of the DataFrame to rebase upon. If specifying an int, the same length will be applied to all Profiles. Else, the list should specify a length for each Profile.

  • shifts (int|list of int, optional) – row n of the existing data will become row n+shift. If specifying an int, the same shift will be applied to all Profiles. Else, the list should specify a shift for each Profile. Defaults to None (=no shift).

— Decorating function infos —

Parameters:

inplace (bool, optional) – If False, will return a deepcopy. Defaults to True.

— — — — — —
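The effect of new_lengths and shifts on a single Profile can be illustrated with plain lists. This is a minimal sketch, not the dvas implementation: the function rebase below is hypothetical, and dropping rows that fall outside the new range is an assumption.

```python
import math


def rebase(values, new_length, shift=0):
    """Map ``values`` onto ``new_length`` rows, moving row n to row n + shift."""
    out = [math.nan] * new_length
    for n, val in enumerate(values):
        target = n + shift
        # Assumption: rows that would land outside the new range are dropped
        if 0 <= target < new_length:
            out[target] = val
    return out


padded = rebase([1.0, 2.0, 3.0], new_length=5, shift=1)    # NaN-padded on both sides
cropped = rebase([1.0, 2.0, 3.0], new_length=2, shift=-1)  # first row is lost
```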

class dvas.data.data.MultiProfile

Bases: MultiProfileAC

Multi profile base class, designed to handle multiple Profile instances.

class dvas.data.data.MultiRSProfileAC(load_stgy=None, sort_stgy=None, save_stgy=None, plot_stgy=None, rebase_stgy=None, resample_stgy=None)

Bases: MultiProfileAC

Abstract MultiRSProfile class

resample(freq='1s', interp_dist=1, chunk_size=150, n_cpus=1, *, inplace=True)

Resample the profiles (one-by-one) onto regular timesteps using linear interpolation.

Parameters:
  • freq (str) – see pandas.timedelta_range(). Defaults to ‘1s’.

  • interp_dist (int|float) – Distance beyond which no interpolation is performed and NaNs are used instead. Defaults to 1s.

Note

Will unwrap angles if self.var_info[PRF_VAL][‘prm_name’] == ‘wdir’.

— Decorating function infos —

Parameters:

inplace (bool, optional) – If False, will return a deepcopy. Defaults to True.

— — — — — —
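The resampling idea (linear interpolation onto regular time steps, with NaNs where the data is too sparse) can be sketched with the standard library alone. The function resample below is hypothetical. In particular, it assumes that interp_dist is compared against the distance to the nearest original sample, which the text above does not spell out.

```python
import math


def resample(times, values, step=1.0, interp_dist=1.0):
    """Linearly interpolate (times, values) onto multiples of ``step``.

    Assumes ``times`` is sorted, expressed in seconds, and free of duplicates.
    """
    n_steps = int(math.floor(times[-1] / step)) + 1
    out = []
    for i in range(n_steps):
        t = i * step
        # Closest original samples on each side of t (None if there is none)
        left = max((k for k, tk in enumerate(times) if tk <= t), default=None)
        right = min((k for k, tk in enumerate(times) if tk >= t), default=None)
        if left is None or right is None:
            out.append((t, math.nan))
        elif left == right:
            # t coincides with an original time step
            out.append((t, values[left]))
        elif min(t - times[left], times[right] - t) > interp_dist:
            # Assumption: too far from any real measurement, so use NaN
            out.append((t, math.nan))
        else:
            frac = (t - times[left]) / (times[right] - times[left])
            out.append((t, values[left] + frac * (values[right] - values[left])))
    return out


resampled = resample([0.0, 1.5, 6.0], [0.0, 3.0, 12.0], step=1.0, interp_dist=1.0)
```

In this example, the steps at 3 s and 4 s are more than 1 s away from any measurement and therefore come out as NaN.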

class dvas.data.data.MultiRSProfile

Bases: MultiRSProfileAC

Multi RS profile manager, designed to handle multiple RSProfile instances.

class dvas.data.data.MultiGDPProfileAC(load_stgy=None, sort_stgy=None, save_stgy=None, plot_stgy=None, rebase_stgy=None, resample_stgy=None)

Bases: MultiRSProfileAC

Abstract MultiGDPProfile class

property uc_tot

Convenience getter to extract the total uncertainty from all the Profile instances.

Returns:

list of DataFrame – same as self.profiles, but with only the requested data.

class dvas.data.data.MultiGDPProfile

Bases: MultiGDPProfileAC

Multi GDP profile manager, designed to handle multiple GDPProfile instances.

class dvas.data.data.MultiCWSProfile

Bases: MultiGDPProfileAC

Multi CWS profile manager, designed to handle multiple GDPProfile instances.

class dvas.data.data.MultiDeltaProfile

Bases: MultiGDPProfileAC

Multi Delta profile manager, designed to handle multiple DeltaProfile instances.

dvas.data.io module

Copyright (c) 2020-2022 MeteoSwiss, contributors listed in AUTHORS.

Distributed under the terms of the GNU General Public License v3.0 or later.

SPDX-License-Identifier: GPL-3.0-or-later

Module contents: IO management

dvas.data.io.update_db(search, strict=False)

Update database.

Parameters:
  • search (str) – Parameter name search criteria.

  • strict (bool, optional) – If False, match any sub-string. If True, match the entire string. Defaults to False.
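
The difference between the two values of strict can be illustrated as follows. This sketch assumes a plain string comparison (the actual function may treat the search criteria differently), and both the helper match_prm_names and the parameter names are hypothetical.

```python
def match_prm_names(search, prm_names, strict=False):
    """Select parameter names matching ``search``."""
    if strict:
        # The entire string must match
        return [name for name in prm_names if name == search]
    # Any sub-string match is enough
    return [name for name in prm_names if search in name]


names = ['temp', 'temp_flag', 'gph']
loose = match_prm_names('temp', names)
exact = match_prm_names('temp', names, strict=True)
```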


dvas.data.linker module

Copyright (c) 2020-2023 MeteoSwiss, contributors listed in AUTHORS.

Distributed under the terms of the GNU General Public License v3.0 or later.

SPDX-License-Identifier: GPL-3.0-or-later

Module contents: Data linker classes

class dvas.data.linker.Handler

Bases: ABC

The Handler interface declares a method for building the chain of handlers. It also declares a method for executing a request.

Note

Source

abstract set_next(handler)

Method to set next handler

Parameters:

handler (Handler) – Handler class

Returns:

Handler

abstract handle(request, prm_name)

Handle method

Parameters:
  • request (object) – Request

  • prm_name (str) – Parameter name

Returns:

Optional – ‘object’

class dvas.data.linker.AbstractHandler

Bases: Handler

The default chaining behavior can be implemented inside a base handler class.

set_next(handler)

Returning a handler from here will let us link handlers in a convenient way like this: handler1.set_next(handler2).set_next(handler3)

abstract handle(*args)

Super handler behavior
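
The chaining behaviour described for Handler and AbstractHandler follows the chain-of-responsibility pattern. The sketch below is a generic, self-contained illustration of that pattern. It redefines simplified stand-ins for the two classes rather than importing them from dvas, gives handle a default forwarding body (an assumption), and introduces a hypothetical SuffixHandler.

```python
from abc import ABC, abstractmethod


class Handler(ABC):
    """Interface: build the chain and execute a request."""

    @abstractmethod
    def set_next(self, handler):
        ...

    @abstractmethod
    def handle(self, request, prm_name):
        ...


class AbstractHandler(Handler):
    """Default chaining behaviour."""

    _next_handler = None

    def set_next(self, handler):
        self._next_handler = handler
        # Returning the handler allows h1.set_next(h2).set_next(h3)
        return handler

    def handle(self, request, prm_name):
        # Forward the request to the next handler, if there is one
        if self._next_handler is not None:
            return self._next_handler.handle(request, prm_name)
        return None


class SuffixHandler(AbstractHandler):
    """Hypothetical concrete handler that accepts a single file suffix."""

    def __init__(self, suffix):
        self.suffix = suffix

    def handle(self, request, prm_name):
        if request.endswith(self.suffix):
            return {'handled_by': self.suffix, 'prm_name': prm_name}
        return super().handle(request, prm_name)


csv_handler = SuffixHandler('.csv')
csv_handler.set_next(SuffixHandler('.nc')).set_next(SuffixHandler('.txt'))

result = csv_handler.handle('sounding.nc', 'temp')      # picked up by the second handler
unhandled = csv_handler.handle('sounding.bin', 'temp')  # nobody in the chain accepts it
```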

class dvas.data.linker.FileHandler(orig_data_cfg)

Bases: AbstractHandler

File handler

__init__(orig_data_cfg)

Parameters:

orig_data_cfg (config.config.OrigData) – Original data config manager

property origdata_config_mngr

Yaml original metadata manager

Type:

config.config.OrigData

property file_suffix_re

Handled data file suffix.

Type:

re.compile

property prm_re

Handled parameter name.

Type:

re.compile

property file_model_pat

File model pattern. Group #1 must correspond to model name.

Type:

re.compile
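
The requirement that group #1 of the pattern holds the model name can be illustrated with a made-up file naming scheme. Both the pattern and the file name below are assumptions chosen for the example, and get_model is a simplified stand-in for the method of the same name.

```python
import re
from pathlib import Path

# Hypothetical pattern: file names look like '<model>.<anything>'
file_model_pat = re.compile(r'^([A-Za-z0-9-]+)\.')


def get_model(file_path):
    """Return the model name found in group #1, or None if nothing matches."""
    match = file_model_pat.search(Path(file_path).name)
    return match.group(1) if match else None


model = get_model('data/RS41.20210101T120000.csv')
```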

check_file(file)

Check if the file has the correct suffix pattern.

Parameters:

file (pathlib.Path) – File path or file name

Returns:

bool – True if the file name matches

check_prm(prm_name)

Check if the parameter has the correct parameter pattern

Parameters:

prm_name (str) – Parameter name

Returns:

bool – True if the parameter name matches

property data_ok_tags

List of tags to add to the metadata when the data is read successfully

Type:

list of str

check_file_mdl(file)

Check if the file name has the correct model pattern

Parameters:

file (pathlib.Path) – File path or file name

Returns:

bool – True if the file name matches the model pattern

handle(data_file_path, prm_name)

Handle method

Parameters:
  • data_file_path (pathlib.Path) – Data file path

  • prm_name (str) – Parameter name

Returns:

dict

abstract get_data(*args, **kwargs)

Method used to get data

abstract get_metadata_item(*args, **kwargs)

Method to get metadata item

abstract get_metadata_filename(data_file_path)

Method to get metadata file name

abstract get_metadata(file_path, mdl_name, prm_name)

Method to get metadata

get_model(file_path)

Get instrument type from file path

Parameters:

file_path

Returns:

filter_files(path_list, prm_name)

Filter out the files that have already been loaded.

Parameters:
  • path_list (pathlib.Path) – List of file paths to be loaded.

  • prm_name (str) – Corresponding parameter name.

Returns:

list

read_metaconfig_fields(mdl_name, prm_name)

Read fields from the metaconfig

static get_source_unique_id(file_path)

Return the identifier used to determine whether a file has already been read.

Note

The stem is used so that a data file and its flag file share the same hash.

Parameters:

file_path (pathlib.Path) – Original file path

Returns:

int
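
The idea of deriving the identifier from the file stem can be sketched as below. The choice of zlib.crc32 as the hash is an assumption made for reproducibility across runs (the actual hashing used by dvas is not stated here), and the file names are made up.

```python
import zlib
from pathlib import Path


def get_source_unique_id(file_path):
    """Return an integer derived from the file stem only."""
    # The stem drops both the parent folders and the last suffix
    stem = Path(file_path).stem
    return zlib.crc32(stem.encode('utf-8'))


data_id = get_source_unique_id('run1/RS41_001.csv')
flag_id = get_source_unique_id('run1/RS41_001.flg')
other_id = get_source_unique_id('run1/RS41_002.csv')
```

Here data_id and flag_id are identical because the two files only differ by their suffix.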

class dvas.data.linker.CSVHandler(orig_data_cfg)

Bases: FileHandler

CSV Handler class

CFG_FILE_SUFFIX = ['.yml', '.yaml']

__init__(orig_data_cfg)

Parameters:

orig_data_cfg (config.config.OrigData) – Original data config manager

property origmeta_mngr

Yaml original CSV file metadata manager

Type:

config.config.CSVOrigMeta

get_metadata_item(item)

Implementation of abstract method

get_metadata_filename(data_file_path)

Implementation of abstract method

get_metadata(file_path, mdl_name, prm_name)

Implementation of abstract method

get_data(field_id, data_file_path, mdl_name, prm_name)

Implementation of abstract method

class dvas.data.linker.GDPHandler(orig_data_cfg)

Bases: FileHandler

GDP Handler class

__init__(orig_data_cfg)

Parameters:

orig_data_cfg (config.config.OrigData) – Original data config manager

get_metadata_item(item)

Implementation of abstract method

get_metadata_filename(data_file_path)

Implementation of abstract method

get_metadata(file_path, mdl_name, prm_name)

Method to get file metadata

get_data(field_id, data_file_path, mdl_name, prm_name)

Implementation of abstract method

class dvas.data.linker.FlgCSVHandler(orig_data_cfg)

Bases: CSVHandler

CSV flag file handler class

__init__(orig_data_cfg)

Parameters:

orig_data_cfg (config.config.OrigData) – Original data config manager

class dvas.data.linker.FlgGDPHandler(orig_data_cfg)

Bases: GDPHandler

GDP flag file handler class

__init__(orig_data_cfg)

Parameters:

orig_data_cfg (config.config.OrigData) – Original data config manager

get_metadata_filename(data_file_path)

Implementation of abstract method

get_data(field_id, data_file_path, mdl_name, prm_name)

Implementation of abstract method

class dvas.data.linker.DataLinker

Bases: ABC

Data linker abstract class

abstract load(*args, **kwargs)

Data loading method

abstract save(*args, **kwargs)

Data saving method

class dvas.data.linker.LocalDBLinker

Bases: DataLinker

Local DB data linker

load(search, prm_name, filter_empty=True)

Load parameter method

Parameters:
  • search (str) – Data loader search criterion

  • prm_name (str) – Parameter name

  • filter_empty (bool, optional) – Filter empty data from search. Defaults to True.

Returns:

list of dict

save(data_list)

Save data method

Parameters:

data_list (list of dict) – Mandatory dict items are ‘index’ (np.array), ‘value’ (np.array), ‘info’ (InfoManager|dict) and ‘prm_name’ (str). Optional dict keys are ‘source_info’ (str) and ‘force_write’ (bool).

class dvas.data.linker.CSVOutputLinker

Bases: DataLinker

CSV output data linker class

load()

Data loading method

save(data)

Parameters:

data

Returns:

class dvas.data.linker.LoadExprInterpreter

Bases: ABC

Abstract config expression interpreter class

Notes

This class and subclasses construction are based on the interpreter design pattern.
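
For readers unfamiliar with the interpreter design pattern, a generic and self-contained sketch is given below. It mirrors the split between terminal and non-terminal expressions, but all class names (Expr, Get, Const, BinaryExpr, Add, Mul) are hypothetical stand-ins and do not reproduce the dvas classes or their constructors.

```python
from abc import ABC, abstractmethod


class Expr(ABC):
    """Abstract expression: every node knows how to interpret itself."""

    @abstractmethod
    def interpret(self):
        ...


class Get(Expr):
    """Terminal symbol: fetch a value through a user-supplied callable."""

    def __init__(self, key, getter):
        self.key = key
        self.getter = getter

    def interpret(self):
        return self.getter(self.key)


class Const(Expr):
    """Terminal symbol: a literal value."""

    def __init__(self, value):
        self.value = value

    def interpret(self):
        return self.value


class BinaryExpr(Expr):
    """Non-terminal symbol: combine two sub-expressions with ``fct``."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def interpret(self):
        # Interpret the children first, then combine their results
        return self.fct(self.left.interpret(), self.right.interpret())

    @abstractmethod
    def fct(self, a, b):
        ...


class Add(BinaryExpr):
    def fct(self, a, b):
        return a + b


class Mul(BinaryExpr):
    def fct(self, a, b):
        return a * b


columns = {'temp_c': 20.0}
expr = Add(Get('temp_c', columns.__getitem__), Const(273.15))
kelvin = expr.interpret()
```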

classmethod set_callable(fct, *args, **kwargs)

Set strategy

Parameters:

fct (callable) – Function/method called by the ‘get’ expression

abstract interpret()

Interpreter method

static eval(expr, get_fct, *args, **kwargs)

Evaluate str expression

Parameters:
  • expr (str|ConfigExprInterpreter) – Expression to evaluate

  • get_fct (callable) – Function used by ‘get’

Examples

>>> import re
>>> mymatch = re.match(r'^a(\d)', 'a1b')
>>> print(ConfigExprInterpreter.eval("cat('My test', ' ', get(1))", mymatch.group))
My test 1

class dvas.data.linker.NonTerminalLoadExprInterpreter(*args)

Bases: LoadExprInterpreter

Implement an interpreter operation for non terminal symbols in the grammar.

interpret()

Non terminal interpreter method

abstract fct(*args)

Function between expression args

class dvas.data.linker.AddExpr(*args)

Bases: NonTerminalLoadExprInterpreter

Addition

fct(a, b)

Implement fct method

class dvas.data.linker.SubExpr(*args)

Bases: NonTerminalLoadExprInterpreter

Subtraction

fct(a, b)

Implement fct method

class dvas.data.linker.MulExpr(*args)

Bases: NonTerminalLoadExprInterpreter

Multiplication

fct(a, b)

Implement fct method

class dvas.data.linker.DivExpr(*args)

Bases: NonTerminalLoadExprInterpreter

Division

fct(a, b)

Implement fct method

class dvas.data.linker.PowExpr(*args)

Bases: NonTerminalLoadExprInterpreter

Power operator

fct(a, b)

Implement fct method

class dvas.data.linker.SqrtExpr(arg)

Bases: PowExpr

Square root

class dvas.data.linker.TerminalLoadExprInterpreter(arg)

Bases: LoadExprInterpreter

Implement an interpreter operation for terminal symbols in the grammar.

class dvas.data.linker.NoneExpr(arg)

Bases: TerminalLoadExprInterpreter

Apply none interpreter

interpret()

Implement interpret method

class dvas.data.linker.GetExpr(arg, op='nop')

Bases: TerminalLoadExprInterpreter

Get the caught value

interpret()

Implement interpret method

class dvas.data.linker.GetgeomaltExpr(arg, lat=None)

Bases: TerminalLoadExprInterpreter

Geometric altitude to geopotential height converter

__init__(arg, lat=None)

Init function

Parameters:
  • arg (str) – expression to process.

  • lat (float, optional) – geodetic latitude of launch site, in degrees

interpret()

Implement interpret method
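
For orientation, the textbook conversion from geometric altitude to geopotential height can be sketched as follows. This is the simple spherical-Earth approximation with a fixed nominal radius. It ignores the latitude, so it is presumably not the exact formula used by this class, which accepts a lat argument. The function name and constant are illustrative only.

```python
# Nominal Earth radius (in m) used in the 1976 US Standard Atmosphere
EARTH_RADIUS_M = 6356766.0


def geometric_to_geopotential(z_m, radius_m=EARTH_RADIUS_M):
    """Convert a geometric altitude (m) into a geopotential height (m).

    Simplification: spherical Earth, no latitude dependence.
    """
    return radius_m * z_m / (radius_m + z_m)


h_10km = geometric_to_geopotential(10000.0)  # slightly below 10000 m
```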

class dvas.data.linker.GetreldtExpr(arg, fmt=None, round_lvl=None)

Bases: TerminalLoadExprInterpreter

Absolute datetimes to relative seconds

__init__(arg, fmt=None, round_lvl=None)

Initialization function

Parameters:
  • arg (str) – expression to process.

  • fmt (str, optional) – specify the datetime str format. Defaults to None.

  • round_lvl (int, optional) – Specify the time step rounding level, as (1/10)**round_lvl seconds. Defaults to None = full accuracy.

Note

If set, the round_lvl parameter will be fed to the decimals argument of the pandas.round() routine. If the rounding leads to an error larger than (1/10)**(round_lvl+1) seconds, a critical log message will be created.
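
The conversion and the rounding check described in this note can be sketched with the standard library. The function to_relative_seconds is hypothetical. Taking the first time stamp as the reference is an assumption, and the built-in round() stands in for the pandas rounding mentioned above.

```python
from datetime import datetime


def to_relative_seconds(stamps, fmt, round_lvl=None):
    """Convert datetime strings into seconds elapsed since the first one.

    Returns the (possibly rounded) values and the largest rounding error.
    """
    times = [datetime.strptime(stamp, fmt) for stamp in stamps]
    # Assumption: the first time stamp serves as the reference
    rel = [(t - times[0]).total_seconds() for t in times]
    if round_lvl is None:
        return rel, 0.0
    rounded = [round(r, round_lvl) for r in rel]
    max_err = max(abs(a - b) for a, b in zip(rel, rounded))
    return rounded, max_err


stamps = ['2021-01-01T12:00:00.000', '2021-01-01T12:00:01.040',
          '2021-01-01T12:00:02.010']
round_lvl = 1
rel_secs, max_err = to_relative_seconds(stamps, '%Y-%m-%dT%H:%M:%S.%f', round_lvl)

# Situation in which the note above announces a critical log message
exceeds = max_err > (1 / 10) ** (round_lvl + 1)
```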

interpret()

Implement interpret method

exception dvas.data.linker.ConfigInstrIdError

Bases: Exception

Error for missing instrument id

exception dvas.data.linker.OutputDirError

Bases: Exception

Error for bad output directory path

exception dvas.data.linker.OrigConfigError

Bases: Exception

Error for bad orig config