dvas.data package
Copyright (c) 2020-2022 MeteoSwiss, contributors listed in AUTHORS.
Distributed under the terms of the GNU General Public License v3.0 or later.
SPDX-License-Identifier: GPL-3.0-or-later
Subpackages
- dvas.data.strategy package
- Submodules
- dvas.data.strategy.data module
- dvas.data.strategy.load module
- dvas.data.strategy.plot module
- dvas.data.strategy.rebase module
- dvas.data.strategy.resample module
- dvas.data.strategy.save module
- dvas.data.strategy.sort module
Submodules
dvas.data.data module
Copyright (c) 2020-2023 MeteoSwiss, contributors listed in AUTHORS.
Distributed under the terms of the GNU General Public License v3.0 or later.
SPDX-License-Identifier: GPL-3.0-or-later
Module contents: Data management
- class dvas.data.data.MultiProfileAC(load_stgy=None, sort_stgy=None, save_stgy=None, plot_stgy=None, rebase_stgy=None)
Bases:
object
Abstract MultiProfile class
- REQUIRED_ATTRIBUTES = {'_DATA_TYPES': <class 'type'>}
- property profiles
list of Profile
- property db_variables
Correspondence between DataFrame and DB parameter
- Type:
dict
- property var_info
Variable information
- Type:
dict
- property info
Data info
- Type:
List of ProfileManager info
- rm_info_tags(val, *, inplace=True)
Remove some tags from all info tag lists.
- Parameters:
val (str|list of str) – Tag value(s) to remove
— Decorating function infos —
- Parameters:
inplace (bool, optional) – If False, will return a deepcopy. Defaults to True.
— — — — — —
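The `inplace` keyword documented above can be illustrated with a minimal sketch. It is not the dvas implementation: the decorator name `inplace_option` and the toy class `TagHolder` are hypothetical and only show the general pattern of working on a deep copy when `inplace=False`.

```python
import copy
import functools


def inplace_option(method):
    """Hypothetical decorator: act on self, or on a deep copy if inplace=False."""

    @functools.wraps(method)
    def wrapper(self, *args, inplace=True, **kwargs):
        target = self if inplace else copy.deepcopy(self)
        method(target, *args, **kwargs)
        # Only the copy is returned; in-place calls return None.
        return None if inplace else target

    return wrapper


class TagHolder:
    """Toy stand-in for an object carrying a list of tags (assumption)."""

    def __init__(self, tags):
        self.tags = list(tags)

    @inplace_option
    def rm_tags(self, val):
        # Accept a single tag or a list of tags.
        vals = [val] if isinstance(val, str) else list(val)
        self.tags = [tag for tag in self.tags if tag not in vals]


holder = TagHolder(['a', 'b', 'c'])
trimmed = holder.rm_tags('b', inplace=False)  # holder is left untouched
holder.rm_tags(['a', 'c'])                    # holder itself is modified
```

With this pattern, callers that want to keep the original object pass `inplace=False` and use the returned copy.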
- add_info_tags(val, *, inplace=True)
Add tag(s) to all info tag lists.
- Parameters:
val (str|list of str) – Tag values to add.
— Decorating function infos —
- Parameters:
inplace (bool, optional) – If False, will return a deepcopy. Defaults to True.
— — — — — —
- copy()
Return a deep copy of the object
- extract(inds)
Return a new MultiProfile instance with a subset of the Profiles.
- Parameters:
inds (int|list of int) – indices of the Profiles to extract.
- Returns:
dvas.data.data.MultiProfile – the new instance.
- load_from_db(*args, inplace=True, **kwargs)
Load data from the database.
- Parameters:
*args – positional arguments
**kwargs – keyword arguments
— Decorating function infos —
- Parameters:
inplace (bool, optional) – If False, will return a deepcopy. Defaults to True.
— — — — — —
- sort(*, inplace=True)
Sort method
— Decorating function infos —
- Parameters:
inplace (bool, optional) – If False, will return a deepcopy. Defaults to True.
— — — — — —
- save_to_db(add_tags=None, rm_tags=None, prms=None)
Save method to store the entire content of the MultiProfile instance back into the database with an updated set of tags.
- Parameters:
add_tags (list of str, optional) – list of tags to add to the entity when inserting it into the database. Defaults to None.
rm_tags (list of str, optional) – list of existing tags to remove from the entity before inserting it into the database. Defaults to None.
prms (list of str, optional) – list of column names to save to the database. Defaults to None (= save all possible parameters).
Notes
The TAG_ORIGINAL will always be removed and the ‘derived’ tag will always be added by default when saving anything into the database.
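The tag bookkeeping described in the note above could look roughly like the following sketch. It is an illustration only: the function name `tags_for_saving` is hypothetical, and the string `'original'` is an assumed placeholder for the value of TAG_ORIGINAL, which is not given here.

```python
def tags_for_saving(current_tags, add_tags=None, rm_tags=None,
                    tag_original='original', tag_derived='derived'):
    """Return the tag list to store, under the assumptions stated above."""
    tags = set(current_tags)
    tags -= set(rm_tags or [])   # drop the tags the caller asked to remove
    tags |= set(add_tags or [])  # add the tags the caller asked for
    tags.discard(tag_original)   # the original tag is always removed
    tags.add(tag_derived)        # the 'derived' tag is always added
    return sorted(tags)


new_tags = tags_for_saving(['original', 'sync'],
                           add_tags=['resampled'], rm_tags=['sync'])
```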
- update(db_df_keys, data)
Update the whole MultiProfile list with new Profiles.
- Parameters:
db_df_keys (dict) – Relationship between database parameters and Profile.data columns.
data (list of Profile) – Data
- append(db_df_keys, val)
Append method
- Parameters:
db_df_keys (dict) – Relationship between database parameters and Profile.data columns.
val (Profile) – Data
- get_prms(prm_list=None, mask_flgs=None, request_flgs=None, with_metadata=None, pooled=False)
Convenience getter to extract specific columns from the DataFrames and/or class properties of all the Profile instances.
- Parameters:
prm_list (str|list of str, optional) – names of the column(s) to extract from all the Profile DataFrames. Defaults to None (=returns all the columns from the DataFrame).
mask_flgs (str|list of str, optional) – name(s) of the flag(s) to NaN-ify in the extraction process. Defaults to None.
request_flgs (str|list of str, optional) – if set, will only return points that have these flag values set (AND rule applied, if multiple values are provided).
with_metadata (str|list, optional) – name of the metadata fields to include in the table. Defaults to None.
pooled (bool, optional) – if True, all profiles will be gathered together. If False, Profiles are kept distinct using a MultiIndex. Defaults to False.
- Returns:
pd.DataFrame – the requested data as a MultiIndex pandas DataFrame.
Warning
The resulting DataFrame has only dvas.hardcoded.PRF_IDX (=’_idx’) as an index. Since the values of dvas.hardcoded.PRF_TDT (=’tdt’) and dvas.hardcoded.PRF_ALT (=’alt’) are not necessarily the same for all the Profiles, these cannot be used as common indexes here.
- get_info(prm=None)
Convenience function to extract Info from all the Profile instances.
- Parameters:
prm (str, optional) – Info attribute to extract. Defaults to None.
- Returns:
dict of list – idem to self.profiles, but with only the requested metadata.
- has_tag(tag)
Convenience method to check whether each Profile has a specific tag.
- Parameters:
tag (str) – tag to search for.
- Returns:
list of bool – one bool for each Profile.
- plot(**kwargs)
Plot method
- Parameters:
**kwargs – Keyword arguments to be passed down to the plotting function.
- Returns:
None
- rebase(new_lengths, shifts=None, *, inplace=True)
Rebase method, which maps Profiles onto a new set of integer indices.
This will move the values around, including the non-integer indices (i.e. anything other than ‘_idx’) if applicable.
- Parameters:
new_lengths (int|list of int) – The length of the DataFrame to rebase upon. If specifying an int, the same length will be applied to all Profiles. Else, the list should specify a length for each Profile.
shifts (int|list of int, optional) – row n of the existing data will become row n+shift. If specifying an int, the same shift will be applied to all Profiles. Else, the list should specify a shift for each Profile. Defaults to None (=no shift).
— Decorating function infos —
- Parameters:
inplace (bool, optional) – If False, will return a deepcopy. Defaults to True.
— — — — — —
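The "row n becomes row n+shift" rule can be sketched on a plain list. This is not the dvas code: the helper `rebase_rows` is hypothetical, and it assumes that rows shifted outside the new length are dropped and that newly created rows are filled with NaN.

```python
import math


def rebase_rows(rows, new_length, shift=0):
    """Map row n of `rows` onto row n + shift of a list of length `new_length`."""
    out = [math.nan] * new_length  # assumption: empty rows are NaN
    for n, value in enumerate(rows):
        target = n + shift
        if 0 <= target < new_length:  # assumption: out-of-range rows are dropped
            out[target] = value
    return out


shifted = rebase_rows([10.0, 11.0, 12.0], new_length=5, shift=1)
cropped = rebase_rows([10.0, 11.0, 12.0], new_length=2, shift=-1)
```

Here `shifted` gains a leading and a trailing NaN row, while `cropped` loses its first row to the negative shift.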
- class dvas.data.data.MultiProfile
Bases:
MultiProfileAC
Multi profile base class, designed to handle multiple Profile instances.
- class dvas.data.data.MultiRSProfileAC(load_stgy=None, sort_stgy=None, save_stgy=None, plot_stgy=None, rebase_stgy=None, resample_stgy=None)
Bases:
MultiProfileAC
Abstract MultiRSProfile class
- resample(freq='1s', interp_dist=1, chunk_size=150, n_cpus=1, *, inplace=True)
Resample the profiles (one-by-one) onto regular timesteps using linear interpolation.
- Parameters:
freq (str) – see pandas.timedelta_range(). Defaults to ‘1s’.
interp_dist (int|float) – Distance beyond which no interpolation is performed and NaNs are used instead. Defaults to 1s.
Note
Will unwrap angles if self.var_info[PRF_VAL][‘prm_name’] == ‘wdir’.
— Decorating function infos —
- Parameters:
inplace (bool, optional) – If False, will return a deepcopy. Defaults to True.
— — — — — —
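As a rough illustration of resampling onto regular time steps with linear interpolation, consider the sketch below. It is not the dvas algorithm: the helper `resample_linear` is hypothetical, times are plain seconds, and `interp_dist` is interpreted (as an assumption) as the maximum allowed distance between a new time step and its nearest original sample. Angle unwrapping is not covered.

```python
import math


def resample_linear(times, values, step=1.0, interp_dist=1.0):
    """Linearly interpolate (times, values) onto a regular grid of spacing `step`.

    `times` must be strictly increasing and contain at least two samples.
    """
    n_steps = int(math.floor((times[-1] - times[0]) / step)) + 1
    grid = [times[0] + k * step for k in range(n_steps)]
    out = []
    j = 0
    for t in grid:
        # Move to the pair of original samples that brackets t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        if min(t - t0, t1 - t) > interp_dist:
            out.append(math.nan)  # too far from any original sample
        else:
            w = (t - t0) / (t1 - t0)
            out.append(values[j] + w * (values[j + 1] - values[j]))
    return grid, out


grid, resampled = resample_linear([0, 1, 2.5, 6], [0, 10, 25, 60])
```

In this example the step at 4 s lies more than 1 s away from both neighbouring samples (2.5 s and 6 s), so it is set to NaN.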
- class dvas.data.data.MultiRSProfile
Bases:
MultiRSProfileAC
Multi RS profile manager, designed to handle multiple RSProfile instances.
- class dvas.data.data.MultiGDPProfileAC(load_stgy=None, sort_stgy=None, save_stgy=None, plot_stgy=None, rebase_stgy=None, resample_stgy=None)
Bases:
MultiRSProfileAC
Abstract MultiGDPProfile class
- property uc_tot
Convenience getter to extract the total uncertainty from all the Profile instances.
- Returns:
list of DataFrame – idem to self.profiles, but with only the requested data.
- class dvas.data.data.MultiGDPProfile
Bases:
MultiGDPProfileAC
Multi GDP profile manager, designed to handle multiple GDPProfile instances.
- class dvas.data.data.MultiCWSProfile
Bases:
MultiGDPProfileAC
Multi CWS profile manager, designed to handle multiple GDPProfile instances.
- class dvas.data.data.MultiDeltaProfile
Bases:
MultiGDPProfileAC
Multi Delta profile manager, designed to handle multiple DeltaProfile instances.
dvas.data.io module
Module contents: IO management
- dvas.data.io.update_db(search, strict=False)
Update database.
- Parameters:
search (str) – Parameter name search criteria.
strict (bool, optional) – If False, match any sub-string. If True, match the entire string. Defaults to False.
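The difference between the two matching modes can be sketched as follows. The helper `matches` and the parameter names are hypothetical and only illustrate sub-string versus whole-string matching as described above.

```python
def matches(search, prm_name, strict=False):
    """Whole-string match if strict, sub-string match otherwise."""
    return search == prm_name if strict else search in prm_name


names = ['temp', 'temp_flag', 'gph']  # hypothetical parameter names
loose = [name for name in names if matches('temp', name)]
exact = [name for name in names if matches('temp', name, strict=True)]
```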
dvas.data.linker module
Module contents: Data linker classes
- class dvas.data.linker.Handler
Bases:
ABC
The Handler interface declares a method for building the chain of handlers. It also declares a method for executing a request.
- abstract set_next(handler)
Method to set next handler
- Parameters:
handler (Handler) – Handler class
- Returns:
Handler
- abstract handle(request, prm_name)
Handle method
- Parameters:
request (object) – Request
prm_name (str) – Parameter name
- Returns:
Optional – ‘object’
- class dvas.data.linker.AbstractHandler
Bases:
Handler
The default chaining behavior can be implemented inside a base handler class.
- set_next(handler)
Returning a handler from here will let us link handlers in a convenient way like this: handler1.set_next(handler2).set_next(handler3)
- abstract handle(*args)
Super handler behavior
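The chaining idiom `handler1.set_next(handler2).set_next(handler3)` follows the chain-of-responsibility pattern. A minimal, self-contained sketch is given below; the classes `BaseHandler` and `SuffixHandler` are hypothetical and unrelated to the actual dvas handlers.

```python
from abc import ABC, abstractmethod


class BaseHandler(ABC):
    """Hypothetical base handler holding a reference to the next handler."""

    def __init__(self):
        self._next = None

    def set_next(self, handler):
        self._next = handler
        return handler  # returning the handler is what allows chaining

    def handle(self, request):
        if self.can_handle(request):
            return self.process(request)
        if self._next is not None:
            return self._next.handle(request)  # pass the request along
        return None  # nobody in the chain could handle it

    @abstractmethod
    def can_handle(self, request):
        """Return True if this handler is responsible for the request."""

    @abstractmethod
    def process(self, request):
        """Do the actual work."""


class SuffixHandler(BaseHandler):
    """Handles file names ending with a given suffix."""

    def __init__(self, suffix):
        super().__init__()
        self.suffix = suffix

    def can_handle(self, request):
        return request.endswith(self.suffix)

    def process(self, request):
        return f'{self.suffix} handler took {request}'


first = SuffixHandler('.csv')
first.set_next(SuffixHandler('.nc')).set_next(SuffixHandler('.txt'))
result = first.handle('profile.nc')
unhandled = first.handle('profile.bin')
```

The request always enters at the first handler and travels down the chain until one handler accepts it.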
- class dvas.data.linker.FileHandler(orig_data_cfg)
Bases:
AbstractHandler
File handler
- __init__(orig_data_cfg)
- Parameters:
orig_data_cfg (config.config.OrigData) – Original data config manager
- property origdata_config_mngr
Yaml original metadata manager
- Type:
- property file_suffix_re
Handled data file suffix.
- Type:
re.compile
- property prm_re
Handled parameter name.
- Type:
re.compile
- property file_model_pat
File model pattern. Group #1 must correspond to model name.
- Type:
re.compile
- check_file(file)
Check if the file has the correct suffix pattern.
- Parameters:
file (pathlib.Path) – File path or file name
- Returns:
bool – True if the file name matches
- check_prm(prm_name)
Check if the parameter has the correct parameter pattern.
- Parameters:
prm_name (str) – Parameter name
- Returns:
bool – True if the parameter name matches
- property data_ok_tags
Tags list to add to metadata when data reading is successful
- Type:
list of str
- check_file_mdl(file)
Check if the file name has the correct model pattern.
- Parameters:
file (pathlib.Path) – File path or file name
- Returns:
bool – True if the file name matches
- handle(data_file_path, prm_name)
Handle method
- Parameters:
data_file_path (pathlib.Path) – Data file path
prm_name (str) – Parameter name
- Returns:
dict
- abstract get_data(*args, **kwargs)
Method used to get data
- abstract get_metadata_item(*args, **kwargs)
Method to get metadata item
- abstract get_metadata_filename(data_file_path)
Method to get metadata file name
- abstract get_metadata(file_path, mdl_name, prm_name)
Method to get metadata
- get_model(file_path)
Get instrument type from file path
- Parameters:
file_path
Returns:
- filter_files(path_list, prm_name)
Filter out files that have already been loaded.
- Parameters:
path_list (pathlib.Path) – List of file paths to be loaded.
prm_name (str) – Corresponding parameter name.
- Returns:
list
- read_metaconfig_fields(mdl_name, prm_name)
Read field from metaconfig
- static get_source_unique_id(file_path)
Return the string used to determine whether a file has already been read.
Note
The stem is used so that a data file and its flag file share the same hash.
- Parameters:
file_path (pathlib.Path) – Original file path
- Returns:
int
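The idea of deriving the identifier from the file stem can be sketched as below. This is an assumption-laden illustration, not the dvas implementation: the function `source_unique_id`, the use of SHA-256 and the file names are all hypothetical.

```python
import hashlib
from pathlib import Path


def source_unique_id(file_path):
    """Derive an integer id from the file stem (name without its last suffix)."""
    stem = Path(file_path).stem
    digest = hashlib.sha256(stem.encode('utf-8')).hexdigest()
    return int(digest[:12], 16)  # shorten the digest to a manageable int


data_id = source_unique_id('/data/flight_01.csv')   # hypothetical data file
flag_id = source_unique_id('/data/flight_01.flg')   # hypothetical flag file
other_id = source_unique_id('/data/flight_02.csv')
```

Because only the stem enters the hash, the data file and its companion flag file map to the same identifier.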
- class dvas.data.linker.CSVHandler(orig_data_cfg)
Bases:
FileHandler
CSV Handler class
- CFG_FILE_SUFFIX = ['.yml', '.yaml']
- __init__(orig_data_cfg)
- Parameters:
orig_data_cfg (config.config.OrigData) – Original data config manager
- property origmeta_mngr
Yaml original CSV file metadata manager
- get_metadata_item(item)
Implementation of abstract method
- get_metadata_filename(data_file_path)
Implementation of abstract method
- get_metadata(file_path, mdl_name, prm_name)
Implementation of abstract method
- get_data(field_id, data_file_path, mdl_name, prm_name)
Implementation of abstract method
- class dvas.data.linker.GDPHandler(orig_data_cfg)
Bases:
FileHandler
GDP Handler class
- __init__(orig_data_cfg)
- Parameters:
orig_data_cfg (config.config.OrigData) – Original data config manager
- get_metadata_item(item)
Implementation of abstract method
- get_metadata_filename(data_file_path)
Implementation of abstract method
- get_metadata(file_path, mdl_name, prm_name)
Method to get file metadata
- get_data(field_id, data_file_path, mdl_name, prm_name)
Implementation of abstract method
- class dvas.data.linker.FlgCSVHandler(orig_data_cfg)
Bases:
CSVHandler
CSV flag file handler class
- __init__(orig_data_cfg)
- Parameters:
orig_data_cfg (config.config.OrigData) – Original data config manager
- class dvas.data.linker.FlgGDPHandler(orig_data_cfg)
Bases:
GDPHandler
GDP flag file handler class
- __init__(orig_data_cfg)
- Parameters:
orig_data_cfg (config.config.OrigData) – Original data config manager
- get_metadata_filename(data_file_path)
Implementation of abstract method
- get_data(field_id, data_file_path, mdl_name, prm_name)
Implementation of abstract method
- class dvas.data.linker.DataLinker
Bases:
ABC
Data linker abstract class
- abstract load(*args, **kwargs)
Data loading method
- abstract save(*args, **kwargs)
Data saving method
- class dvas.data.linker.LocalDBLinker
Bases:
DataLinker
Local DB data linker
- load(search, prm_name, filter_empty=True)
Load parameter method
- Parameters:
search (str) – Data loader search criterion
prm_name (str) – Parameter name
filter_empty (bool, optional) – Filter empty data from search. Defaults to True.
- Returns:
list of dict
- save(data_list)
Save data method
- Parameters:
data_list (list of dict) – The mandatory dict items are ‘index’ (np.array), ‘value’ (np.array), ‘info’ (InfoManager|dict) and ‘prm_name’ (str). The optional dict keys are ‘source_info’ (str) and ‘force_write’ (bool).
- class dvas.data.linker.CSVOutputLinker
Bases:
DataLinker
CSV output data linker class
- load()
Data loading method
- save(data)
- Parameters:
data
Returns:
- class dvas.data.linker.LoadExprInterpreter
Bases:
ABC
Abstract config expression interpreter class
Notes
This class and subclasses construction are based on the interpreter design pattern.
- classmethod set_callable(fct, *args, **kwargs)
Set strategy
- Parameters:
fct (callable) – Function/method called by the ‘get’ expression
- abstract interpret()
Interpreter method
- static eval(expr, get_fct, *args, **kwargs)
Evaluate str expression
- Parameters:
expr (str|ConfigExprInterpreter) – Expression to evaluate
get_fct (callable) – Function used by ‘get’
Examples
>>> import re
>>> mymatch = re.match('^a(\d)', 'a1b')
>>> print(ConfigExprInterpreter.eval("cat('My test', ' ', get(1))", mymatch.group))
My test 1
- class dvas.data.linker.NonTerminalLoadExprInterpreter(*args)
Bases:
LoadExprInterpreter
Implement an interpreter operation for non terminal symbols in the grammar.
- interpret()
Non terminal interpreter method
- abstract fct(*args)
Function between expression args
- class dvas.data.linker.AddExpr(*args)
Bases:
NonTerminalLoadExprInterpreter
Addition
- fct(a, b)
Implement fct method
- class dvas.data.linker.SubExpr(*args)
Bases:
NonTerminalLoadExprInterpreter
Subtraction
- fct(a, b)
Implement fct method
- class dvas.data.linker.MulExpr(*args)
Bases:
NonTerminalLoadExprInterpreter
Multiplication
- fct(a, b)
Implement fct method
- class dvas.data.linker.DivExpr(*args)
Bases:
NonTerminalLoadExprInterpreter
Division
- fct(a, b)
Implement fct method
- class dvas.data.linker.PowExpr(*args)
Bases:
NonTerminalLoadExprInterpreter
Power operator
- fct(a, b)
Implement fct method
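The relationship between `interpret()` and `fct(a, b)` in the non-terminal expressions above follows the interpreter design pattern. The sketch below illustrates the pattern with hypothetical classes (`Expr`, `Value`, `BinaryExpr`, `Add`, `Mul`, `Pow`); it is not the dvas code.

```python
from abc import ABC, abstractmethod
import operator


class Expr(ABC):
    """Hypothetical expression node."""

    @abstractmethod
    def interpret(self):
        """Evaluate the node and return its value."""


class Value(Expr):
    """Terminal symbol: simply returns the stored value."""

    def __init__(self, value):
        self.value = value

    def interpret(self):
        return self.value


class BinaryExpr(Expr):
    """Non-terminal symbol: interprets both children, then combines them."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def interpret(self):
        return self.fct(self.left.interpret(), self.right.interpret())

    @abstractmethod
    def fct(self, a, b):
        """Operation applied between the two interpreted arguments."""


class Add(BinaryExpr):
    def fct(self, a, b):
        return operator.add(a, b)


class Mul(BinaryExpr):
    def fct(self, a, b):
        return operator.mul(a, b)


class Pow(BinaryExpr):
    def fct(self, a, b):
        return operator.pow(a, b)


# 1 + 2 * 3**2
tree = Add(Value(1), Mul(Value(2), Pow(Value(3), Value(2))))
result = tree.interpret()
```

Each subclass only has to supply `fct`; the recursion over the expression tree lives in the shared `interpret` method.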
- class dvas.data.linker.TerminalLoadExprInterpreter(arg)
Bases:
LoadExprInterpreter
Implement an interpreter operation for terminal symbols in the grammar.
- class dvas.data.linker.NoneExpr(arg)
Bases:
TerminalLoadExprInterpreter
Apply none interpreter
- interpret()
Implement interpret method
- class dvas.data.linker.GetExpr(arg, op='nop')
Bases:
TerminalLoadExprInterpreter
Get catch value
- interpret()
Implement interpret method
- class dvas.data.linker.GetgeomaltExpr(arg, lat=None)
Bases:
TerminalLoadExprInterpreter
Geometric altitude to geopotential height converter
- __init__(arg, lat=None)
Init function
- Parameters:
arg (str) – expression to process.
lat (float, optional) – geodetic latitude of launch site, in degrees
- interpret()
Implement interpret method
- class dvas.data.linker.GetreldtExpr(arg, fmt=None, round_lvl=None)
Bases:
TerminalLoadExprInterpreter
Absolute datetimes to relative seconds
- __init__(arg, fmt=None, round_lvl=None)
Initialization function
- Parameters:
arg (str) – expression to process.
fmt (str, optional) – specify the datetime str format. Defaults to None.
round_lvl (int, optional) – Specify the time step rounding level, as (1/10)**round_lvl seconds. Defaults to None = full accuracy.
Note
If set, the round_lvl parameter will be fed to the decimals argument of the pandas.round() routine. If the rounding leads to an error larger than (1/10)**(round_lvl+1) seconds, a critical log message will be created.
- interpret()
Implement interpret method
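The conversion from absolute datetimes to relative seconds, including the rounding check described in the note, can be sketched with the standard library. The function `to_relative_seconds`, its default format string, and the choice of the first timestamp as the reference are assumptions made for this illustration.

```python
import logging
from datetime import datetime


def to_relative_seconds(stamps, fmt='%Y-%m-%dT%H:%M:%S.%f', round_lvl=None):
    """Convert datetime strings to seconds elapsed since the first one."""
    times = [datetime.strptime(stamp, fmt) for stamp in stamps]
    rel = [(t - times[0]).total_seconds() for t in times]
    if round_lvl is None:
        return rel  # full accuracy
    rounded = [round(x, round_lvl) for x in rel]
    worst = max(abs(a - b) for a, b in zip(rel, rounded))
    if worst > (1 / 10) ** (round_lvl + 1):
        # The rounding error exceeds the documented tolerance.
        logging.critical('Rounding error of %.3f s exceeds tolerance', worst)
    return rounded


stamps = ['2020-01-01T12:00:00.000000',
          '2020-01-01T12:00:01.040000',
          '2020-01-01T12:00:02.500000']
exact = to_relative_seconds(stamps)
rounded = to_relative_seconds(stamps, round_lvl=1)
```

In this example the second time step is rounded from 1.04 s to 1.0 s, an error of 0.04 s that exceeds the tolerance of 0.01 s and therefore triggers the critical log message.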
- exception dvas.data.linker.ConfigInstrIdError
Bases:
Exception
Error for missing instrument id
- exception dvas.data.linker.OutputDirError
Bases:
Exception
Error for bad output directory path
- exception dvas.data.linker.OrigConfigError
Bases:
Exception
Error for bad orig config