culebra.trainer.abc.DistributedTrainer class

class DistributedTrainer(fitness_function: FitnessFunction, subtrainer_cls: type[SingleSpeciesTrainer], max_num_iters: int | None = None, custom_termination_func: Callable[[SingleSpeciesTrainer], bool] | None = None, num_subtrainers: int | None = None, representation_size: int | None = None, representation_freq: int | None = None, representation_topology_func: Callable[[int, int, Any], list[int]] | None = None, representation_topology_func_params: dict[str, Any] | None = None, representation_selection_func: Callable[[list[Solution], Any], Solution] | None = None, representation_selection_func_params: dict[str, Any] | None = None, checkpoint_activation: bool | None = None, checkpoint_freq: int | None = None, checkpoint_filename: str | None = None, verbosity: bool | None = None, random_seed: int | None = None, **subtrainer_params: Any)

Bases: Trainer

Create a new trainer.

Parameters:
Raises:
  • TypeError – If any argument is not of the appropriate type

  • ValueError – If any argument has an incorrect value

Class attributes

DistributedTrainer.objective_stats = {'Avg': <function mean>, 'Max': <function max>, 'Min': <function min>, 'Std': <function std>}

Statistics calculated for each objective.

DistributedTrainer.stats_names = ('Iter', 'NEvals')

Statistics calculated each iteration.
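These aggregations can be reproduced with the standard library alone. A minimal sketch of applying such a mapping to a set of per-objective fitness values (the sample values are made up, and statistics.pstdev stands in for NumPy's population std):

```python
from statistics import mean, pstdev

# Hypothetical fitness values of one objective across a population
objective_values = [1.0, 2.0, 3.0, 4.0]

# Mapping mirroring DistributedTrainer.objective_stats
# (statistics.pstdev matches numpy.std's population standard deviation)
objective_stats = {
    "Avg": mean,
    "Max": max,
    "Min": min,
    "Std": pstdev,
}

stats = {name: func(objective_values) for name, func in objective_stats.items()}
print(stats)  # Avg 2.5, Max 4.0, Min 1.0, Std ≈ 1.118
```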

Class methods

classmethod DistributedTrainer.load(filename: str) Base

Load a serialized object from a file.

Parameters:

filename (str) – The file name.

Returns:

The loaded object

Raises:

Properties

property DistributedTrainer.checkpoint_activation: bool

Checkpointing activation.

Returns:

True if checkpointing is active, or False otherwise

Return type:

bool

Setter:

Modify the checkpointing activation

Parameters:

value (bool) – New value for the checkpoint activation. If set to None, _default_checkpoint_activation is chosen

Raises:

TypeError – If value is not a boolean value

property DistributedTrainer.checkpoint_filename: str

Checkpoint file path.

Return type:

str

Setter:

Modify the checkpoint file path

Parameters:

value (str) – New value for the checkpoint file path. If set to None, _default_checkpoint_filename is chosen

Raises:

property DistributedTrainer.checkpoint_freq: int

Checkpoint frequency.

Return type:

int

Setter:

Modify the checkpoint frequency

Parameters:

value (int) – New value for the checkpoint frequency. If set to None, _default_checkpoint_freq is chosen

Raises:

property DistributedTrainer.container: Trainer | None

Container of this trainer.

The trainer container is only used by distributed trainers. For the rest of trainers it defaults to None.

Return type:

Trainer

Setter:

Set a new value for container of this trainer

Parameters:

value (Trainer) – New value for the container or None

Raises:

TypeError – If value is not a valid trainer

property DistributedTrainer.current_iter: int | None

Current iteration.

Returns:

The current iteration or None if the search has not been done yet

Return type:

int

property DistributedTrainer.custom_termination_func: Callable[[Trainer], bool] | None

Custom termination criterion.

Although the trainer always stops when the maximum number of iterations (max_num_iters) is reached, a custom termination criterion can be set to detect convergence and stop the trainer earlier. This custom termination criterion must be a function that receives the trainer as its only argument and returns a boolean value: True if the search should terminate, or False otherwise.

If more than one argument is needed to define the termination condition, functools.partial() can be used:

from functools import partial

def my_crit(trainer, max_iters):
    return trainer.current_iter >= max_iters

trainer.custom_termination_func = partial(my_crit, max_iters=10)

Setter:

Set a new custom termination criterion

Parameters:

func (Callable) – The new custom termination criterion. If set to None, the default termination criterion is used

Raises:

TypeError – If func is not callable

property DistributedTrainer.fitness_function: FitnessFunction

Training fitness function.

Return type:

FitnessFunction

Setter:

Set a new fitness function

Parameters:

func (FitnessFunction) – The new training fitness function

Raises:

TypeError – If func is not a valid fitness function

property DistributedTrainer.index: int

Trainer index.

The trainer index is only used by distributed trainers. For the rest of trainers _default_index is used.

Return type:

int

Setter:

Set a new value for trainer index.

Parameters:

value (int) – New value for the trainer index. If set to None, _default_index is chosen

Raises:

property DistributedTrainer.logbook: Logbook | None

Trainer logbook.

Returns:

A logbook with the statistics of the search or None if the search has not been done yet

Return type:

Logbook

property DistributedTrainer.max_num_iters: int

Maximum number of iterations.

Return type:

int

Setter:

Set a new value for the maximum number of iterations

Parameters:

value (int) – The new maximum number of iterations. If set to None, the default maximum number of iterations, _default_max_num_iters, is chosen

Raises:

property DistributedTrainer.num_evals: int | None

Number of evaluations performed while training.

Returns:

The number of evaluations or None if the search has not been done yet

Return type:

int

property DistributedTrainer.num_subtrainers: int

Number of subtrainers.

Return type:

int

Setter:

Set a new value for the number of subtrainers

Parameters:

value (int) – The new number of subtrainers. If set to None, _default_num_subtrainers is chosen

Raises:

property DistributedTrainer.random_seed: int

Random seed used by this trainer.

Return type:

int

Setter:

Set a new value for the random seed

Parameters:

value (int) – New value

property DistributedTrainer.representation_freq: int

Number of iterations between each sending of representatives.

Return type:

int

Setter:

Set a new value for the frequency

Parameters:

value (int) – The new frequency. If set to None, _default_representation_freq is chosen

Raises:

property DistributedTrainer.representation_selection_func: Callable[[list[Solution], Any], Solution]

Representation selection policy function.

Returns:

A function that chooses which solutions are selected as representatives of each subtrainer

Return type:

Callable

Setter:

Set new representation selection policy function.

Parameters:

func (Callable) – The new function. If set to None, _default_representation_selection_func is chosen

Raises:

TypeError – If func is not callable
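A selection policy function receives a subtrainer's population (plus any extra parameters) and returns one solution to act as a representative. A hedged sketch of a best-fitness policy, assuming solutions expose a scalar fitness attribute (the Sol class is a stand-in for illustration, not culebra's actual Solution API):

```python
from dataclasses import dataclass

@dataclass
class Sol:
    """Stand-in for culebra's Solution, for illustration only."""
    fitness: float

def best_selection(pop, *_args):
    """Pick the highest-fitness solution as the representative."""
    return max(pop, key=lambda sol: sol.fitness)

pop = [Sol(0.2), Sol(0.9), Sol(0.5)]
rep = best_selection(pop)
print(rep.fitness)  # 0.9
```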

property DistributedTrainer.representation_selection_func_params: dict[str, Any]

Parameters of the representation selection function.

Return type:

dict

Setter:

Set new parameters

Parameters:

params (dict) – The new parameters. If set to None, _default_representation_selection_func_params is chosen

Raises:

TypeError – If params is not a dict

property DistributedTrainer.representation_size: int

Representation size.

Returns:

The number of representatives sent to the other subtrainers

Return type:

int

Setter:

Set a new representation size

Parameters:

size (int) – The new size. If set to None, _default_representation_size is chosen

Raises:

property DistributedTrainer.representation_topology_func: Callable[[int, int, Any], list[int]]

Representation topology function.

Return type:

Callable

Setter:

Set new representation topology function

Parameters:

func (Callable) – The new function. If set to None, _default_representation_topology_func is chosen

Raises:

TypeError – If func is not callable
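A topology function decides which subtrainers receive a given subtrainer's representatives: it takes the sender's index, the number of subtrainers and any extra parameters, and returns the destination indices. A hedged sketch of a simple ring topology (illustrative only; culebra ships its own topology functions):

```python
def ring_topology(index, num_subtrainers, offset=1):
    """Each subtrainer sends to its offset-th neighbour on a ring."""
    return [(index + offset) % num_subtrainers]

# With 4 subtrainers, subtrainer 3 wraps around and sends to subtrainer 0
print(ring_topology(3, 4))  # [0]
```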

property DistributedTrainer.representation_topology_func_params: dict[str, Any]

Parameters of the representation topology function.

Return type:

dict

Setter:

Set new parameters

Parameters:

params (dict) – The new parameters. If set to None, _default_representation_topology_func_params is chosen

Raises:

TypeError – If params is not a dict

property DistributedTrainer.representatives: list[list[Solution | None]] | None

Representatives of the other species.

Only used by cooperative trainers. If the trainer does not use representatives, None is returned.

Return type:

list[list[Solution]]

property DistributedTrainer.runtime: float | None

Training runtime.

Returns:

The training runtime or None if the search has not been done yet.

Return type:

float

property DistributedTrainer.subtrainer_checkpoint_filenames: Generator[str, None, None]

Checkpoint file names for all the subtrainers.

Returns:

A generator of the filenames

Return type:

Generator[str, None, None]

property DistributedTrainer.subtrainer_cls: type[SingleSpeciesTrainer]

Trainer class to handle the subtrainers.

Each subtrainer will be handled by a single-species trainer.

Return type:

type[SingleSpeciesTrainer]

Setter:

Set a new trainer class to handle the subtrainers

Parameters:

cls (type[SingleSpeciesTrainer]) – The new class

Raises:

TypeError – If cls is not a valid trainer class

property DistributedTrainer.subtrainer_params: dict[str, Any]

Custom parameters for the subtrainers.

Return type:

dict

Setter:

Set new parameters

Parameters:

params (dict) – The new parameters

Raises:

TypeError – If params is not a dict

property DistributedTrainer.subtrainers: list[SingleSpeciesTrainer] | None

Subtrainers.

One single-species trainer for each subtrainer.

Return type:

list[SingleSpeciesTrainer]

property DistributedTrainer.verbosity: bool

Verbosity of this trainer.

Return type:

bool

Setter:

Set a new value for the verbosity

Parameters:

value (bool) – The verbosity. If set to None, _default_verbosity is chosen

Raises:

TypeError – If value is not boolean

Private properties

property DistributedTrainer._default_checkpoint_activation: bool

Default checkpointing activation.

Returns:

DEFAULT_CHECKPOINT_ACTIVATION

Return type:

bool

property DistributedTrainer._default_checkpoint_filename: str

Default checkpointing file name.

Returns:

DEFAULT_CHECKPOINT_FILENAME

Return type:

str

property DistributedTrainer._default_checkpoint_freq: int

Default checkpointing frequency.

Returns:

DEFAULT_CHECKPOINT_FREQ

Return type:

int

property DistributedTrainer._default_index: int

Default index.

Returns:

DEFAULT_INDEX

Return type:

int

property DistributedTrainer._default_max_num_iters: int

Default maximum number of iterations.

Returns:

DEFAULT_MAX_NUM_ITERS

Return type:

int

property DistributedTrainer._default_num_subtrainers: int

Default number of subtrainers.

Returns:

DEFAULT_NUM_SUBTRAINERS

Return type:

int

property DistributedTrainer._default_representation_freq: int

Default number of iterations between each sending of representatives.

Returns:

DEFAULT_REPRESENTATION_FREQ

Return type:

int

property DistributedTrainer._default_representation_selection_func: Callable[[list[Solution], Any], Solution]

Default selection policy function to choose the representatives.

Returns:

DEFAULT_REPRESENTATION_SELECTION_FUNC

Return type:

Callable

property DistributedTrainer._default_representation_selection_func_params: dict[str, Any]

Default parameters for the representatives selection policy function.

Returns:

DEFAULT_REPRESENTATION_SELECTION_FUNC_PARAMS

Return type:

dict

property DistributedTrainer._default_representation_size: int

Default number of representatives sent to the other subtrainers.

Returns:

DEFAULT_REPRESENTATION_SIZE

Return type:

int

property DistributedTrainer._default_representation_topology_func: Callable[[int, int, Any], list[int]]

Default topology function.

This property must be overridden by subclasses to return a correct value.

Return type:

Callable

Raises:

NotImplementedError – If it has not been overridden

property DistributedTrainer._default_representation_topology_func_params: dict[str, Any]

Default parameters for the default topology function.

This property must be overridden by subclasses to return a correct value.

Return type:

dict

Raises:

NotImplementedError – If it has not been overridden

property DistributedTrainer._default_verbosity: bool

Default verbosity.

Returns:

DEFAULT_VERBOSITY

Return type:

bool

property DistributedTrainer._subtrainer_suffixes: Generator[str, None, None]

Suffixes for the different subtrainers.

Can be used to generate the subtrainers’ names, checkpoint files, etc.

Returns:

A generator of the suffixes

Return type:

Generator[str, None, None]
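These suffixes typically encode each subtrainer's index. A hedged sketch of how such a generator could be combined with the trainer's checkpoint file name to derive per-subtrainer checkpoint paths (the "_N" suffix scheme is an assumption for illustration, not necessarily culebra's exact format):

```python
from os.path import splitext

def subtrainer_suffixes(num_subtrainers):
    """Yield one suffix per subtrainer: '_0', '_1', ..."""
    for index in range(num_subtrainers):
        yield f"_{index}"

def subtrainer_checkpoint_filenames(checkpoint_filename, num_subtrainers):
    """Insert each suffix before the checkpoint file extension."""
    base, ext = splitext(checkpoint_filename)
    for suffix in subtrainer_suffixes(num_subtrainers):
        yield f"{base}{suffix}{ext}"

print(list(subtrainer_checkpoint_filenames("checkpoint.gz", 3)))
# ['checkpoint_0.gz', 'checkpoint_1.gz', 'checkpoint_2.gz']
```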

Static methods

abstract static DistributedTrainer.receive_representatives(subtrainer: SingleSpeciesTrainer) None

Receive representative solutions.

This method must be overridden by subclasses.

Parameters:

subtrainer (SingleSpeciesTrainer) – The subtrainer receiving representatives

Raises:

NotImplementedError – If it has not been overridden

abstract static DistributedTrainer.send_representatives(subtrainer: SingleSpeciesTrainer) None

Send representatives.

This method must be overridden by subclasses.

Parameters:

subtrainer (SingleSpeciesTrainer) – The sender subtrainer

Raises:

NotImplementedError – If it has not been overridden

Methods

DistributedTrainer.best_representatives() list[list[Solution]] | None

Return a list of representatives from each species.

Only used for cooperative trainers.

Returns:

A list of representative lists if the trainer is cooperative, or None otherwise

Return type:

list[list[Solution]]

abstract DistributedTrainer.best_solutions() tuple[HallOfFame]

Get the best solutions found for each species.

This method must be overridden by subclasses to return a correct value.

Returns:

One Hall of Fame for each species

Return type:

tuple[HallOfFame]

Raises:

NotImplementedError – If it has not been overridden

DistributedTrainer.dump(filename: str) None

Serialize this object and save it to a file.

Parameters:

filename (str) – The file name.

Raises:

DistributedTrainer.evaluate(sol: Solution, fitness_func: FitnessFunction | None = None, index: int | None = None, representatives: Sequence[Sequence[Solution | None]] | None = None) None

Evaluate one solution.

Its fitness will be modified according to the fitness function results. Besides, if called during training, the number of evaluations will also be updated.

Parameters:
  • sol (Solution) – The solution

  • fitness_func (FitnessFunction) – The fitness function. If omitted, the default training fitness function (fitness_function) is used

  • index (int) – Index where sol should be inserted in the representatives sequence to form a complete solution for the problem. If omitted, the trainer's index property is used

  • representatives (Sequence[Sequence[Solution]]) – Sequence of representatives of other species or None (if no representatives are needed to evaluate sol). If omitted, the current value of representatives is used

DistributedTrainer.reset() None

Reset the trainer.

Delete the state of the trainer (with _reset_state()) and also all the internal data structures needed to perform the search (with _reset_internals()).

This method should be invoked each time a hyperparameter is modified.

DistributedTrainer.test(best_found: Sequence[HallOfFame], fitness_func: FitnessFunction | None = None, representatives: Sequence[Sequence[Solution]] | None = None) None

Apply the test fitness function to the solutions found.

Update the solutions in best_found with their test fitness.

Parameters:
Raises:
  • TypeError – If any parameter has a wrong type

  • ValueError – If any parameter has an invalid value.

DistributedTrainer.train(state_proxy: DictProxy | None = None) None

Perform the training process.

Parameters:

state_proxy (DictProxy) – Dictionary proxy to copy the output state of the training procedure. Only used if train() is executed within a multiprocessing.Process. Defaults to None

Private methods

DistributedTrainer._default_termination_func() bool

Default termination criterion.

Returns:

True if max_num_iters iterations have been run

Return type:

bool

abstract DistributedTrainer._do_iteration() None

Implement an iteration of the search process.

This abstract method should be implemented by subclasses to provide the desired behavior.

DistributedTrainer._do_iteration_stats() None

Perform the iteration stats.

This method should be implemented by subclasses in order to record the appropriate statistics.

DistributedTrainer._finish_iteration() None

Finish an iteration.

Finish the iteration metrics (number of evaluations, execution time) after each iteration is run.

DistributedTrainer._finish_search() None

Finish the search process.

This method is called after the search has finished. It can be overridden to perform any treatment of the solutions found.

abstract DistributedTrainer._generate_subtrainers() None

Generate the subtrainers.

Also assign an index and a container to each SingleSpeciesTrainer subtrainer, and change the subtrainers’ checkpoint_filename according to the container checkpointing file name and each subtrainer index.

Finally, the _preprocess_iteration() and _postprocess_iteration() methods of the subtrainer_cls class are dynamically overridden, in order to allow solutions exchange between subtrainers, if necessary.

This method must be overridden by subclasses.

Raises:

NotImplementedError – If it has not been overridden

DistributedTrainer._get_state() dict[str, Any]

Return the state of this trainer.

Default state is a dictionary composed of the values of the logbook, num_evals, runtime, current_iter and representatives trainer properties, along with a private boolean attribute that indicates whether the search has finished, and the states of the random and numpy.random modules.

If subclasses use any more properties to keep their state, the _get_state() and _set_state() methods must be overridden to take into account such properties.

Return type:

dict
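The state-dictionary pattern described above can be sketched with a minimal stand-in class (illustration only; culebra's real state also includes the logbook, runtime and numpy.random state):

```python
import random

class TinyTrainer:
    """Minimal stand-in illustrating the _get_state/_set_state pattern."""

    def __init__(self):
        self.current_iter = 0
        self.num_evals = 0

    def _get_state(self):
        return {
            "current_iter": self.current_iter,
            "num_evals": self.num_evals,
            "rnd_state": random.getstate(),
        }

    def _set_state(self, state):
        self.current_iter = state["current_iter"]
        self.num_evals = state["num_evals"]
        random.setstate(state["rnd_state"])

t = TinyTrainer()
t.current_iter, t.num_evals = 5, 120
saved = t._get_state()
value_then = random.random()          # consume the generator once
t._set_state(saved)                   # rewind it
assert random.random() == value_then  # the same stream resumes
```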

DistributedTrainer._init_internals() None

Set up the trainer internal data structures to start searching.

Overridden to create the subtrainers and communication queues.

DistributedTrainer._init_representatives() None

Init the representatives of the other species.

Only used for cooperative approaches, which need representatives of all the species to form a complete solution for the problem. Cooperative subclasses of the Trainer class should override this method to get the representatives of the other species initialized.

DistributedTrainer._init_search() None

Init the search process.

Initialize the state of the trainer (with _init_state()) and all the internal data structures needed (with _init_internals()) to perform the search.

DistributedTrainer._init_state() None

Init the trainer state.

If there is any checkpoint file, the state is initialized from it with the _load_state() method. Otherwise a new initial state is generated with the _new_state() method.
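The load-or-create logic can be sketched as follows (illustrative only; the file format and helper names are assumptions, not culebra's actual implementation):

```python
import os
import pickle

def init_state(checkpoint_filename, new_state):
    """Resume from a checkpoint if one exists, else start fresh."""
    if os.path.isfile(checkpoint_filename):
        # Corresponds to _load_state()
        with open(checkpoint_filename, "rb") as f:
            return pickle.load(f)
    # Corresponds to _new_state()
    return new_state()

# No checkpoint file exists, so a new initial state is generated
state = init_state("missing.chk", lambda: {"current_iter": 0})
print(state)  # {'current_iter': 0}
```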

DistributedTrainer._load_state() None

Load the state of the last checkpoint.

Raises:

Exception – If the checkpoint file can’t be loaded

DistributedTrainer._new_state() None

Generate a new trainer state.

Overridden to set the logbook to None, since the final logbook will be generated from the subtrainers’ logbook, once the trainer has finished.

DistributedTrainer._postprocess_iteration() None

Postprocess after doing the iteration.

Subclasses should override this method to perform any postprocessing after an iteration.

DistributedTrainer._preprocess_iteration() None

Preprocess before doing the iteration.

Subclasses should override this method to perform any preprocessing before an iteration.

DistributedTrainer._reset_internals() None

Reset the internal structures of the trainer.

Overridden to reset the subtrainers and communication queues.

DistributedTrainer._reset_state() None

Reset the trainer state.

If subclasses override the _new_state() method to add any new property to their state, this method should also be overridden to reset the full state of the trainer.

DistributedTrainer._save_state() None

Save the state at a new checkpoint.

Raises:

Exception – If the checkpoint file can’t be written

DistributedTrainer._search() None

Apply the search algorithm.

Execute the trainer until the termination condition is met. Each iteration comprises the _start_iteration(), _preprocess_iteration(), _do_iteration(), _postprocess_iteration(), _finish_iteration() and _do_iteration_stats() steps.

DistributedTrainer._set_cooperative_fitness(sol: Solution, fitness_trials_values: Sequence[tuple[float]]) None

Estimate a solution fitness from multiple evaluation trials.

The fitness is estimated as the average of the fitness trial values. Trainers requiring a different estimation should override this method.

Parameters:
  • sol (Solution) – The solution

  • fitness_trials_values (Sequence[tuple[float]]) – Sequence of fitness trial values. Each trial should be obtained within a different context in a cooperative trainer approach
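The per-objective averaging can be sketched as follows (fitness values are tuples with one component per objective; the helper name is illustrative):

```python
from statistics import mean

def average_fitness(fitness_trials_values):
    """Average each objective component across all trials."""
    return tuple(mean(obj) for obj in zip(*fitness_trials_values))

# Three trials of a two-objective fitness
trials = [(1.0, 4.0), (2.0, 6.0), (3.0, 8.0)]
print(average_fitness(trials))  # (2.0, 6.0)
```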

DistributedTrainer._set_state(state: dict[str, Any]) None

Set the state of this trainer.

If subclasses use any more properties to keep their state, the _get_state() and _set_state() methods must be overridden to take into account such properties.

Parameters:

state (dict) – The last loaded state

DistributedTrainer._start_iteration() None

Start an iteration.

Prepare the iteration metrics (number of evaluations, execution time) before each iteration is run.

DistributedTrainer._termination_criterion() bool

Control the search termination.

Returns:

True if either the default termination criterion or the custom termination criterion is met. The default termination criterion is implemented by the _default_termination_func() method. A custom termination criterion can be set through the custom_termination_func property.

Return type:

bool