Processing
Processings are powerful tools to modify the config at each step of the lifecycle of a configuration. More precisely, you can use processings to modify the full configuration before and after each merge, after loading, and before saving the config.
The processings are applied via a processing object that has five methods (called “processing” to simplify): premerge, postmerge, endbuild, postload and presave. These names correspond to the moment at which they are applied. Each processing has the signature:
def premerge(self, flat_config: Config) -> Config:
    ...
    return flat_config
Where Config is a simple class containing only two attributes (and no methods): dict, the configuration dict, and process_list, the list of processing objects (we discuss this in a section below). Note that it is also the class of the object returned by the make_config function.
They only take a flat config as input, i.e. a config containing a dict of depth 1 with dot-separated keys, and return the modified flat config (which must stay flat!).
In this section, you will learn how they work and how to create your own to do whatever you want with the config (we hope!).
Why a flat dict?
The idea is that when we construct a config, we manipulate dictionaries that contain both nested sub-dictionaries and flat keys simultaneously. To simplify this process, the dictionaries are systematically flattened before merging. This approach makes things simpler and prevents duplicated keys within the same configuration, as shown in the example:
config = {'a': {'b': 1}, 'a.b': 2}
More generally, all config modifications are performed using flat dictionaries during config construction, and the same applies to processings. For processings, it is even more useful, as you have access to the full sub-config names if your processing needs them.
However, it’s important to note that after building your config with make_config, the dict will be unflattened to its normal nested configuration structure.
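To make the flat/nested round trip concrete, here is a minimal sketch in plain Python. The helpers `flatten` and `unflatten` are hypothetical names written for this illustration and are not cliconfig's own functions; the sketch only assumes the dot-separated key convention described above.

```python
from typing import Any, Dict


def flatten(nested: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested dict into a depth-1 dict with dot-separated keys."""
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        full_key = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            sub_items = flatten(value, full_key)
        else:
            sub_items = {full_key: value}
        for sub_key, sub_value in sub_items.items():
            # The same parameter reached twice (nested and flat) is a conflict
            if sub_key in flat:
                raise ValueError(f"duplicated key: {sub_key!r}")
            flat[sub_key] = sub_value
    return flat


def unflatten(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Rebuild the nested dict from dot-separated keys."""
    nested: Dict[str, Any] = {}
    for flat_key, value in flat.items():
        *parents, last = flat_key.split(".")
        node = nested
        for parent in parents:
            node = node.setdefault(parent, {})
        node[last] = value
    return nested


flat = flatten({"model": {"lr": 0.1, "layers": 2}, "data.path": "train.csv"})
nested = unflatten(flat)
```

With these helpers, `flatten({'a': {'b': 1}, 'a.b': 2})` raises a `ValueError`, because both entries resolve to the same flat key `a.b`.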
Processing order
The order in which the processings are triggered is crucial because they modify the config and consequently affect the behavior of subsequent processings. To manage this order, the processing class has five float attributes representing the order of the five processing methods: premerge, postmerge, endbuild, postload, and presave.
Here’s a basic example to illustrate the significance of the order:
# config1.yaml
merge@merge_add@delete: config2.yaml
param: 1
# config2.yaml
param2: 2
In this example, we want to build a global config using config1.yaml. This file contains only half of the parameters, and the other half is in config2.yaml. So we add a key with the name of our choice, here “merge”, tagged with @merge_add to merge config2.yaml before the global config update. We add the @delete tag to delete the key “merge” before merging with the global config because, in this case, there is no key named “merge” in the global config, and it would raise an error as it is not possible to add new keys.
@merge_add and @delete both have only a pre-merge effect. Let’s check the orders: it is -20.0 for merge and 30.0 for delete. So merge triggers first and adds param2, and the “merge” key is deleted after it. If the orders were reversed, the key would have been deleted before the merge processing, so param2 would not have been added with the value 2, and the resulting configuration would potentially not have been the expected one at all.
Therefore, it is crucial to carefully manage the order when creating your own processings!
Some useful ranges to choose your order:
not important: order = 0 (default)
if it checks/modifies the config before applying any processing: order < -25
if it adds new parameters: -25 < order < -5
if it updates a value based on itself: -5 < order < 5
if it updates a value based on other keys: 5 < order < 15
if it checks a property on a value: 15 < order < 25
if it deletes other key(s) but you want to trigger the tags before: 25 < order < 35
final check/modification after all processings: order > 35
Note: a pre-merge order should not be greater than 1000, the order of the default processing ProcessCheckTags that raises an error if tags still exist at the end of the pre-merge step.
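The effect of the order on the config1.yaml/config2.yaml example can be reproduced with a small, self-contained sketch. Everything here is a toy stand-in: `ToyMergeAdd`, `ToyDelete`, `apply_premerge` and the in-memory `FILES` table are hypothetical names, the processings work on a plain dict rather than a Config object, and tag matching is deliberately naive. Only the orders (-20.0 and 30.0) come from the text above.

```python
from typing import Any, Dict, List

# Hypothetical in-memory replacement for the YAML file of the example
FILES: Dict[str, Dict[str, Any]] = {"config2.yaml": {"param2": 2}}


class ToyMergeAdd:
    """Toy '@merge_add': merge the config named by the value, then drop the tag."""

    premerge_order = -20.0

    def premerge(self, flat_dict: Dict[str, Any]) -> Dict[str, Any]:
        for key, value in list(flat_dict.items()):
            if "@merge_add" in key:
                flat_dict.update(FILES[value])
                # Keep the key, without this tag, for later processings
                flat_dict[key.replace("@merge_add", "")] = flat_dict.pop(key)
        return flat_dict


class ToyDelete:
    """Toy '@delete': remove every key carrying the tag."""

    premerge_order = 30.0

    def premerge(self, flat_dict: Dict[str, Any]) -> Dict[str, Any]:
        for key in list(flat_dict):
            if "@delete" in key:
                del flat_dict[key]
        return flat_dict


def apply_premerge(flat_dict: Dict[str, Any], process_list: List[Any]) -> Dict[str, Any]:
    """Apply the pre-merge methods sorted by their float order."""
    for processing in sorted(process_list, key=lambda proc: proc.premerge_order):
        flat_dict = processing.premerge(flat_dict)
    return flat_dict


config1 = {"merge@merge_add@delete": "config2.yaml", "param": 1}

# Normal orders: merge first (-20.0), delete afterwards (30.0)
result = apply_premerge(dict(config1), [ToyDelete(), ToyMergeAdd()])

# Reversed situation: the delete step runs before the merge step
early_delete = ToyDelete()
early_delete.premerge_order = -30.0
reversed_result = apply_premerge(dict(config1), [early_delete, ToyMergeAdd()])
```

In this sketch `result` contains both `param` and `param2`, while `reversed_result` only contains `param`: the tagged key is removed before the merge step can see it.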
Create basic processing
Processing that modifies a single value
One of the most useful kinds of processing looks for parameters whose names match a certain pattern (e.g. a prefix or a suffix) or contain a specific tag, and modifies their values depending on their current ones.
To simplify the creation of such a processing, we provide the cliconfig.create_processing_value function. This function allows you to quickly create a processing that matches a regular expression or a specific tag name (in which case the tag is removed after pre-merging). You specify the function to be applied on the value to modify it, and optionally, the order of the processing. Additionally, there is a persistent argument, a boolean indicating whether encountering the tag (if a tag is used) once in a parameter name will continue to trigger the processing for this parameter even after the tag is removed. By default, it is False. Finally, you can set the processing type (pre-merge, post-merge, etc.) at your convenience. The default is pre-merge.
Here’s an example to illustrate:
proc = create_processing_value(lambda x: str(x), 'premerge', tag_name='convert_str', persistent=True)
config = make_config(default_config, process_list=[proc])
In this example, the config {"subconfig.param@convert_str": 1} will be converted to {"subconfig.param": "1"}. Moreover, the key subconfig.param will be permanently converted to a string before every merge.
It’s worth noting that you can also use functions that have side effects without necessarily changing the value itself. For example, you can use a function to check if a certain condition is met by the value.
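The role of the persistent argument can be illustrated with a toy re-implementation. This is only a sketch of the behavior described above, not the code behind create_processing_value: the class `ToyValueProcessing` is a hypothetical name, it works on a plain flat dict, and it matches tags with a naive string replacement.

```python
from typing import Any, Callable, Dict, Set


class ToyValueProcessing:
    """Toy tag-triggered value processing working on a plain flat dict."""

    def __init__(
        self, func: Callable[[Any], Any], tag_name: str, persistent: bool = False
    ) -> None:
        self.func = func
        self.tag = f"@{tag_name}"
        self.persistent = persistent
        # Clean keys that were tagged at least once (used only when persistent)
        self.matched_keys: Set[str] = set()

    def premerge(self, flat_dict: Dict[str, Any]) -> Dict[str, Any]:
        new_dict: Dict[str, Any] = {}
        for key, value in flat_dict.items():
            clean_key = key.replace(self.tag, "")  # naive tag removal
            tagged = clean_key != key
            if tagged and self.persistent:
                self.matched_keys.add(clean_key)
            if tagged or clean_key in self.matched_keys:
                value = self.func(value)
            new_dict[clean_key] = value
        return new_dict


# Persistent: the key keeps being converted after the tag is gone
persistent_proc = ToyValueProcessing(str, "convert_str", persistent=True)
first = persistent_proc.premerge({"subconfig.param@convert_str": 1})
second = persistent_proc.premerge({"subconfig.param": 2})

# Non-persistent: only the tagged occurrence is converted
one_shot_proc = ToyValueProcessing(str, "convert_str")
third = one_shot_proc.premerge({"subconfig.param@convert_str": 1})
fourth = one_shot_proc.premerge({"subconfig.param": 2})
```

Here `second` holds the string `"2"` because the key was remembered, whereas `fourth` keeps the integer `2`.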
It is also possible to pass the flat config as a second argument to the function. For example:
# config.yaml
param: 1
param2@eval: "config.param + 1"
proc = create_processing_value(
    lambda x, config: eval(x, {"config": config}),
    tag_name="eval",
    persistent=False,
)
# (Note that the `eval` function is not safe and the code above
# should not be used in case of untrusted config)
Here the value of param2 will be evaluated to 2 at the pre-merge step.
Pre-merge/post-merge processing that protects a property from being modified
Another useful kind of processing is one that ensures a certain property of the value is kept. For this kind of processing, you can use cliconfig.create_processing_keep_property. It takes a function that returns the property from the value, the regex or the tag name like the previous function, and the orders of the pre-merge, post-merge and end-build processings.
The pre-merge processing looks for keys that match the tag or the regex, applies the function on the value and stores the result (= the “property”). The post-merge and end-build processings check that the property is the same as the one stored during pre-merge. If not, they raise an error.
Examples:
A processing that enforces the types of all the parameters to be constant (equal to the type of the first value encountered):
create_processing_keep_property(type, regex=".*", premerge_order=15.0,
                                postmerge_order=15.0, endbuild_order=15.0)
A processing that protects parameters tagged with @protect from being changed:
create_processing_keep_property(lambda x: x, tag_name="protect",
                                premerge_order=15.0, postmerge_order=15.0,
                                endbuild_order=15.0)
In both cases we choose the order 15.0 because it is a good value for processings that check values. Indeed, processings that change values, such as ProcessCopy, generally have an order $\leq$ 10.0.
It is also possible to pass the flat config as a second argument to the function, similarly to create_processing_value.
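The store-then-check mechanism can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions, not the implementation of create_processing_keep_property: `ToyKeepProperty` is a hypothetical class, it supports only a tag (no regex), it works on plain flat dicts, and the merge is a simple dict update.

```python
from typing import Any, Callable, Dict


class ToyKeepProperty:
    """Toy processing: store a property at pre-merge, check it at post-merge."""

    def __init__(self, func: Callable[[Any], Any], tag_name: str) -> None:
        self.func = func
        self.tag = f"@{tag_name}"
        self.properties: Dict[str, Any] = {}  # clean key -> stored property

    def premerge(self, flat_dict: Dict[str, Any]) -> Dict[str, Any]:
        new_dict: Dict[str, Any] = {}
        for key, value in flat_dict.items():
            clean_key = key.replace(self.tag, "")  # naive tag removal
            if clean_key != key and clean_key not in self.properties:
                self.properties[clean_key] = self.func(value)
            new_dict[clean_key] = value
        return new_dict

    def postmerge(self, flat_dict: Dict[str, Any]) -> Dict[str, Any]:
        for key, expected in self.properties.items():
            if key in flat_dict and self.func(flat_dict[key]) != expected:
                raise ValueError(f"property of {key!r} changed after merge")
        return flat_dict


protect = ToyKeepProperty(lambda x: x, "protect")
base = protect.premerge({"a@protect": 1, "b": 0})
update = protect.premerge({"a": 2})

# Changing an unprotected key passes the check
allowed = protect.postmerge({**base, "b": 5})

# Changing the protected key is detected after the merge
try:
    protect.postmerge({**base, **update})
    error_raised = False
except ValueError:
    error_raised = True
```

Passing `type` instead of the identity function would turn the same toy class into a type-stability check for the tagged keys.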
Create your processing classes (Advanced)
To create your own processing classes and unlock more possibilities, you simply need to override the methods of the Processing class to modify the config at the desired timings. To do so, you often need to manipulate tags.
Merge, save or load configs in processing
The key concept is that as long as we deal with processings, the elementary operations on the config are not actually to merge, save, and load a config, but rather:
Applying pre-merge processing, then merging, then applying post-merge processing.
Applying pre-save processing and then saving a config.
Loading a config and then applying post-load processing.
These three operations are in cliconfig.process_routines and are called merge_processing, save_processing, and load_processing, respectively. They take as input a Config object that contains, as we have seen, the list of processings.
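The composition of these three elementary operations can be sketched as follows. The names `ToyConfig`, `ToyRecorder`, `toy_merge_processing`, `toy_save_processing` and `toy_load_processing` are hypothetical and only mimic the structure described above; in particular, this sketch assumes that pre-merge is applied to both inputs, uses the first config's processing list, ignores the float orders, and serializes to a JSON string instead of a file.

```python
import json
from typing import Any, Dict, List


class ToyConfig:
    """Toy counterpart of the two-attribute Config described earlier."""

    def __init__(self, flat_dict: Dict[str, Any], process_list: List[Any]) -> None:
        self.dict = flat_dict
        self.process_list = process_list


class ToyRecorder:
    """Toy processing that changes nothing and records which hook ran."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def _record(self, hook: str, config: ToyConfig) -> ToyConfig:
        self.calls.append(hook)
        return config

    def premerge(self, config: ToyConfig) -> ToyConfig:
        return self._record("premerge", config)

    def postmerge(self, config: ToyConfig) -> ToyConfig:
        return self._record("postmerge", config)

    def presave(self, config: ToyConfig) -> ToyConfig:
        return self._record("presave", config)

    def postload(self, config: ToyConfig) -> ToyConfig:
        return self._record("postload", config)


def toy_merge_processing(config1: ToyConfig, config2: ToyConfig) -> ToyConfig:
    """Pre-merge both inputs, merge them, then post-merge the result."""
    process_list = config1.process_list
    for proc in process_list:
        config1 = proc.premerge(config1)
        config2 = proc.premerge(config2)
    merged = ToyConfig({**config1.dict, **config2.dict}, process_list)
    for proc in process_list:
        merged = proc.postmerge(merged)
    return merged


def toy_save_processing(config: ToyConfig) -> str:
    """Apply pre-save processing on a copy, then serialize it."""
    to_save = ToyConfig(dict(config.dict), config.process_list)
    for proc in config.process_list:
        to_save = proc.presave(to_save)
    return json.dumps(to_save.dict)


def toy_load_processing(text: str, process_list: List[Any]) -> ToyConfig:
    """Deserialize a config, then apply post-load processing."""
    config = ToyConfig(json.loads(text), process_list)
    for proc in process_list:
        config = proc.postload(config)
    return config


recorder = ToyRecorder()
merged = toy_merge_processing(ToyConfig({"a": 1}, [recorder]), ToyConfig({"b": 2}, []))
loaded = toy_load_processing(toy_save_processing(merged), [recorder])
```

After this run, `recorder.calls` lists the hooks in the order they were triggered: two pre-merges, one post-merge, one pre-save and one post-load.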
Now, the trick is that sometimes we want to apply these operations inside the processings themselves, particularly when we want to modify a part of the configuration instead of just a single parameter (such as merging two configurations). This is why it is particularly useful to have access to the full Config object and not only the dict.
For example, consider the tag @merge_add, which triggers a processing before merging and merges the config loaded from a specified path (the value) into the current config. We may want to see what happens if we merge a config that also contains a @merge_add tag within it:
# main.yaml
config_path1@merge_add: path1.yaml
# path1.yaml
param1: 1
config_path2@merge_add: path2.yaml
# path2.yaml
param2: 2
Now, let’s say we want to merge the config main.yaml with another config. During the pre-merge processing, we encounter the tag @merge_add. This tag is removed, and the config found at path1.yaml will be merged into the main.yaml config. However, before this, the pre-merge processing is triggered on path1.yaml itself. Therefore, before merging path1.yaml, the processing discovers the key config_path2@merge_add and merges the config found at path2.yaml into path1.yaml. Then, path1.yaml is merged into main.yaml. Finally, the resulting configuration can be interpreted as follows:
{'param1': 1, 'param2': 2, 'config_path1': 'path1.yaml', 'config_path2': 'path2.yaml'}
before being merged itself with another config. Note that this processing is not only a way to organize the configuration across multiple files. It also allows you, for instance, to choose a particular configuration among several by setting its path as the value of the tagged key (as long as this config is in the default configs).
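The nested behavior of the main.yaml/path1.yaml/path2.yaml example boils down to a recursion: pre-merging a loaded config can itself load and pre-merge another one. The sketch below is a toy model of that idea only. The function `toy_premerge` and the in-memory `FILES` table are hypothetical stand-ins for the real processing and the YAML files.

```python
from typing import Any, Dict

# Hypothetical in-memory replacement for the three YAML files of the example
FILES: Dict[str, Dict[str, Any]] = {
    "main.yaml": {"config_path1@merge_add": "path1.yaml"},
    "path1.yaml": {"param1": 1, "config_path2@merge_add": "path2.yaml"},
    "path2.yaml": {"param2": 2},
}

TAG = "@merge_add"


def toy_premerge(flat_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Resolve '@merge_add' keys, pre-merging each loaded config first."""
    result: Dict[str, Any] = {}
    for key, value in flat_dict.items():
        if key.endswith(TAG):
            # The loaded config goes through pre-merge before being merged
            loaded = toy_premerge(dict(FILES[value]))
            result.update(loaded)
            result[key[: -len(TAG)]] = value  # keep the key without its tag
        else:
            result[key] = value
    return result


resolved = toy_premerge(FILES["main.yaml"])
```

In this toy model, `resolved` contains the same four keys as the configuration shown above.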
Change processing list in processing (Still more advanced)
Note that the processing functions receive the list of processing objects as input and return it, since it is an attribute (process_list) of the Config object. This means that it is possible to manually modify this list in processing functions.
Warning: The processing list to apply during pre/post-merge, pre-save and post-load is determined before the first processing is applied. Therefore, you can’t add or remove processings and expect the change to be effective during the current merge/save/load. However, if you modify their internal variables, it will be effective immediately.
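The difference between changing the list and changing the internal state of a processing can be seen in a small model. This sketch assumes that the routine copies the processing list before iterating, which is one simple way to obtain the behavior described in the warning; `ToyConfig`, `ToyCounter`, `ToyMeddler` and `toy_premerge` are hypothetical names.

```python
from typing import Any, Dict, List


class ToyConfig:
    """Toy counterpart of the two-attribute Config described earlier."""

    def __init__(self, flat_dict: Dict[str, Any], process_list: List[Any]) -> None:
        self.dict = flat_dict
        self.process_list = process_list


class ToyCounter:
    """Toy processing that adds its internal increment to the 'count' key."""

    def __init__(self) -> None:
        self.increment = 1

    def premerge(self, config: ToyConfig) -> ToyConfig:
        config.dict["count"] = config.dict.get("count", 0) + self.increment
        return config


class ToyMeddler:
    """Toy processing that alters other processings and the list itself."""

    def premerge(self, config: ToyConfig) -> ToyConfig:
        for processing in config.process_list:
            if isinstance(processing, ToyCounter):
                processing.increment = 10  # internal state: effective immediately
        # New entry: ignored by the operation that is currently running
        config.process_list.append(ToyCounter())
        return config


def toy_premerge(config: ToyConfig) -> ToyConfig:
    """Apply pre-merge with a list fixed before the first processing runs."""
    snapshot = list(config.process_list)
    for processing in snapshot:
        config = processing.premerge(config)
    return config


config = toy_premerge(ToyConfig({}, [ToyMeddler(), ToyCounter()]))
```

After this single pre-merge, `count` equals 10: the existing counter ran with its modified increment, while the appended counter did not run even though the list now holds three processings.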
Here is an example of a processing that removes the type check of a parameter in the ProcessTyping processing. It is then possible, for instance, to force another type (which is not possible otherwise).
from typing import Dict

from cliconfig.processing.builtin import ProcessTyping
# Processing, Config, is_tag_in, clean_tag and clean_all_tags
# must also be imported from cliconfig.


class ProcessBypassTyping(Processing):
    """Bypass type check of ProcessTyping for parameters tagged with "@bypass_typing".

    In pre-merge it looks for a parameter with the tag "@bypass_typing",
    removes it and changes the internal ProcessTyping variables to avoid
    checking the type of the parameter with ProcessTyping.
    """

    def __init__(self) -> None:
        super().__init__()
        self.bypassed_forced_types: Dict[str, tuple] = {}
        # Before ProcessTyping pre-merge to let it change the type
        self.premerge_order = 1.0

    def premerge(self, flat_config: Config) -> Config:
        """Pre-merge processing."""
        # Copy the items because the dict is modified inside the loop
        items = list(flat_config.dict.items())
        for flat_key, value in items:
            if is_tag_in(flat_key, "bypass_typing"):
                new_key = clean_tag(flat_key, "bypass_typing")
                flat_config.dict[new_key] = value
                del flat_config.dict[flat_key]
                clean_key = clean_all_tags(flat_key)
                for processing in flat_config.process_list:
                    if (isinstance(processing, ProcessTyping)
                            and clean_key in processing.forced_types):
                        forced_type = processing.forced_types.pop(clean_key)
                        self.bypassed_forced_types[clean_key] = forced_type
        return flat_config

# Without bypass:
config1 = Config({"a@type:int": 0}, [ProcessBypassTyping(), ProcessTyping()])
config2 = Config({"a@type:str": "a"}, [])
config = merge_flat_processing(config1, config2)
# > Error: try to change the forced type of "a" from int to str
# With bypass:
config1 = Config({"a@type:int": 0}, [ProcessBypassTyping(), ProcessTyping()])
config2 = Config({"a@bypass_typing@type:str": "a"}, [])
config = merge_flat_processing(config1, config2)
# > No error