culebra.trainer.abc.DistributedTrainer class¶
- class DistributedTrainer(fitness_function: FitnessFunction, subtrainer_cls: Type[SingleSpeciesTrainer], max_num_iters: int | None = None, custom_termination_func: Callable[[SingleSpeciesTrainer], bool] | None = None, num_subtrainers: int | None = None, representation_size: int | None = None, representation_freq: int | None = None, representation_selection_func: Callable[[List[Solution], Any], Solution] | None = None, representation_selection_func_params: Dict[str, Any] | None = None, checkpoint_enable: bool | None = None, checkpoint_freq: int | None = None, checkpoint_filename: str | None = None, verbose: bool | None = None, random_seed: int | None = None, **subtrainer_params: Any)¶
Create a new trainer.
- Parameters:
fitness_function (FitnessFunction) – The training fitness function
subtrainer_cls (any subclass of SingleSpeciesTrainer) – Single-species trainer class to handle the subtrainers
max_num_iters (int, optional) – Maximum number of iterations. If set to None, DEFAULT_MAX_NUM_ITERS will be used. Defaults to None
custom_termination_func (Callable, optional) – Custom termination criterion. If set to None, _default_termination_func() is used. Defaults to None
num_subtrainers (int, optional) – The number of subtrainers. If set to None, the number of CPU cores will be used. Defaults to None
representation_size (int, optional) – Number of representative solutions that will be sent to the other subtrainers. If set to None, DEFAULT_REPRESENTATION_SIZE will be used. Defaults to None
representation_freq (int, optional) – Number of iterations between sendings of representatives. If set to None, DEFAULT_REPRESENTATION_FREQ will be used. Defaults to None
representation_selection_func (Callable, optional) – Policy function to choose the representatives from each subtrainer. If set to None, DEFAULT_REPRESENTATION_SELECTION_FUNC will be used. Defaults to None
representation_selection_func_params (dict, optional) – Parameters of the representation selection policy function. If set to None, DEFAULT_REPRESENTATION_SELECTION_FUNC_PARAMS will be used. Defaults to None
checkpoint_enable (bool, optional) – Enable/disable checkpointing. If set to None, DEFAULT_CHECKPOINT_ENABLE will be used. Defaults to None
checkpoint_freq (int, optional) – The checkpoint frequency. If set to None, DEFAULT_CHECKPOINT_FREQ will be used. Defaults to None
checkpoint_filename (str, optional) – The checkpoint file path. If set to None, DEFAULT_CHECKPOINT_FILENAME will be used. Defaults to None
verbose (bool, optional) – The verbosity. If set to None, __debug__ will be used. Defaults to None
random_seed (int, optional) – The initial seed of the random number generator. Defaults to None
subtrainer_params (keyword variable-length argument list) – Custom parameters for the subtrainers
- Raises:
TypeError – If any argument is not of the appropriate type
ValueError – If any argument has an incorrect value
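Since DistributedTrainer is abstract, it is never instantiated directly. The following is a minimal sketch of how a concrete subclass might be constructed; the MyDistributedTrainer, MyFitnessFunction and MySubtrainer names are hypothetical placeholders, not part of culebra:

# A minimal sketch, assuming hypothetical concrete classes
trainer = MyDistributedTrainer(
    MyFitnessFunction(),      # fitness_function
    MySubtrainer,             # subtrainer_cls
    max_num_iters=100,
    num_subtrainers=4,
    representation_freq=10,
    checkpoint_enable=False,
    random_seed=42,
)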
Class attributes¶
- DistributedTrainer.stats_names = ('Iter', 'NEvals')¶
Statistics calculated each iteration.
- DistributedTrainer.objective_stats = {'Avg': <function mean>, 'Max': <function max>, 'Min': <function min>, 'Std': <function std>}¶
Statistics calculated for each objective.
Class methods¶
- classmethod DistributedTrainer.load_pickle(filename: str) Base ¶
Load a pickled object from a file.
- Parameters:
filename (str) – The file name
- Raises:
TypeError – If filename is not a valid file name
ValueError – If the filename extension is not PICKLE_FILE_EXTENSION
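A short usage sketch, assuming trainer is an already-built trainer and that PICKLE_FILE_EXTENSION includes the leading dot:

# Persist the trainer to disk and restore it later
filename = "my_trainer" + PICKLE_FILE_EXTENSION  # extension is checked
trainer.save_pickle(filename)
restored = DistributedTrainer.load_pickle(filename)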
Properties¶
- property DistributedTrainer.fitness_function: FitnessFunction¶
Get and set the training fitness function.
- Getter:
Return the fitness function
- Setter:
Set a new fitness function
- Type:
FitnessFunction
- Raises:
TypeError – If set to a value which is not a fitness function
- property DistributedTrainer.subtrainer_cls: Type[SingleSpeciesTrainer]¶
Get and set the trainer class to handle the subtrainers.
Each subtrainer will be handled by a single-species trainer.
- Getter:
Return the trainer class
- Setter:
Set a new trainer class
- Type:
A SingleSpeciesTrainer subclass
- Raises:
TypeError – If set to a value which is not a SingleSpeciesTrainer subclass
- property DistributedTrainer.max_num_iters: int¶
Get and set the maximum number of iterations.
- Getter:
Return the current maximum number of iterations
- Setter:
Set a new value for the maximum number of iterations. If set to None, the default maximum number of iterations, DEFAULT_MAX_NUM_ITERS, is chosen
- Type:
int
- Raises:
TypeError – If set to a value which is not an integer
ValueError – If set to a value which is not a positive number
- property DistributedTrainer.custom_termination_func: Callable[[Trainer], bool]¶
Get and set the custom termination criterion.
The custom termination criterion must be a function which receives the trainer as its unique argument and returns a boolean value: True if the search should terminate, or False otherwise.

If more than one argument is needed to define the termination condition, functools.partial() can be used:

from functools import partial

def my_crit(trainer, max_iters):
    if trainer.current_iter < max_iters:
        return False
    return True

trainer.custom_termination_func = partial(my_crit, max_iters=10)
- property DistributedTrainer.num_subtrainers: int¶
Get and set the number of subtrainers.
- Getter:
Return the current number of subtrainers
- Setter:
Set a new value for the number of subtrainers. If set to None, DEFAULT_NUM_SUBTRAINERS is chosen
- Type:
int
- Raises:
TypeError – If set to a value which is not an integer
ValueError – If set to a value which is not a positive number
- property DistributedTrainer.representation_size: int¶
Get and set the representation size.
The representation size is the number of representatives sent to the other subtrainers.
- Getter:
Return the current representation size
- Setter:
Set the new representation size. If set to None, DEFAULT_REPRESENTATION_SIZE is chosen
- Type:
int
- Raises:
TypeError – If set to a value which is not an integer
ValueError – If set to a value which is not positive
- property DistributedTrainer.representation_freq: int¶
Get and set the number of iterations between sendings of representatives.
- Getter:
Return the current frequency
- Setter:
Set a new value for the frequency. If set to None, DEFAULT_REPRESENTATION_FREQ is chosen
- Type:
int
- Raises:
TypeError – If set to a value which is not an integer
ValueError – If set to a value which is not a positive number
- abstract property DistributedTrainer.representation_topology_func: Callable[[int, int, Any], List[int]]¶
Get the representation topology function.
This property must be overridden by subclasses to return a correct value.
- Type:
Callable[[int, int, Any], List[int]]
- abstract property DistributedTrainer.representation_topology_func_params: Dict[str, Any]¶
Get the parameters of the representation topology function.
This property must be overridden by subclasses to return a correct value.
- Type:
Dict[str, Any]
- property DistributedTrainer.representation_selection_func: Callable[[List[Solution], Any], Solution]¶
Get and set the representation selection policy function.
The representation selection policy function chooses which solutions are selected as representatives of each subtrainer.
- Getter:
Return the representation selection policy function
- Setter:
Set a new representation selection policy function. If set to None, DEFAULT_REPRESENTATION_SELECTION_FUNC is chosen
- Type:
Callable
- Raises:
TypeError – If set to a value which is not callable
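For instance, a policy that always picks the best solution of each subtrainer could be set as follows. This is a sketch only, assuming solutions carry a DEAP-style fitness with a values tuple; the my_best_selector name is hypothetical:

def my_best_selector(solutions, *args, **kwargs):
    # Hypothetical policy: return the solution with the highest fitness
    return max(solutions, key=lambda sol: sol.fitness.values)

trainer.representation_selection_func = my_best_selector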
- property DistributedTrainer.representation_selection_func_params: Dict[str, Any]¶
Get and set the parameters of the representation selection function.
- Getter:
Return the current parameters for the representation selection policy function
- Setter:
Set new parameters. If set to None, DEFAULT_REPRESENTATION_SELECTION_FUNC_PARAMS is chosen
- Type:
dict
- Raises:
TypeError – If set to a value which is not a dict
- property DistributedTrainer.checkpoint_freq: int¶
Get and set the checkpoint frequency.
- Getter:
Return the checkpoint frequency
- Setter:
Set a value for the checkpoint frequency. If set to None, DEFAULT_CHECKPOINT_FREQ is chosen
- Type:
int
- Raises:
TypeError – If set to a value which is not an integer
ValueError – If set to a value which is not a positive number
- property DistributedTrainer.checkpoint_filename: str¶
Get and set the checkpoint file path.
- Getter:
Return the checkpoint file path
- Setter:
Set a new value for the checkpoint file path. If set to None, DEFAULT_CHECKPOINT_FILENAME is chosen
- Type:
str
- Raises:
TypeError – If set to a value which is not a valid file name
ValueError – If set to a value whose extension is not
PICKLE_FILE_EXTENSION
- property DistributedTrainer.random_seed: int¶
Get and set the initial random seed used by this trainer.
- Getter:
Return the seed
- Setter:
Set a new value for the random seed
- Type:
int
- property DistributedTrainer.logbook: Logbook | None¶
Get the training logbook.
Return a logbook with the statistics of the search, or None if the search has not been done yet.
- Type:
Logbook or None
- property DistributedTrainer.num_evals: int | None¶
Get the number of evaluations performed while training.
Return the number of evaluations, or None if the search has not been done yet.
- Type:
int or None
- property DistributedTrainer.runtime: float | None¶
Get the training runtime.
Return the training runtime, or None if the search has not been done yet.
- Type:
float or None
- property DistributedTrainer.index: int¶
Get and set the trainer index.
The trainer index is only used by distributed trainers. For the rest of trainers, DEFAULT_INDEX is used.
- Getter:
Return the trainer index
- Setter:
Set a new value for the trainer index. If set to None, DEFAULT_INDEX is chosen
- Type:
int
- Raises:
TypeError – If set to a value which is not an integer
ValueError – If set to a value which is a negative number
- property DistributedTrainer.container: Trainer | None¶
Get and set the container of this trainer.
The trainer container is only used by distributed trainers. For the rest of trainers, it defaults to None.
- property DistributedTrainer.representatives: Sequence[Sequence[Solution | None]] | None¶
Return the representatives of the other species.
Only used by cooperative trainers. If the trainer does not use representatives, None is returned.
- property DistributedTrainer.subtrainer_params: Dict[str, Any]¶
Get and set the custom parameters of the subtrainers.
- property DistributedTrainer.subtrainer_checkpoint_filenames: Generator[str, None, None]¶
Checkpoint file names of all the subtrainers.
- property DistributedTrainer.subtrainers: List[SingleSpeciesTrainer] | None¶
Return the subtrainers.
One single-species trainer for each subtrainer.
- Type:
list of SingleSpeciesTrainer trainers
Static methods¶
- abstract static DistributedTrainer.receive_representatives(subtrainer) None ¶
Receive representative solutions.
This method must be overridden by subclasses.
- Parameters:
subtrainer (SingleSpeciesTrainer) – The subtrainer receiving representatives
- abstract static DistributedTrainer.send_representatives(subtrainer) None ¶
Send representatives.
This method must be overridden by subclasses.
- Parameters:
subtrainer (SingleSpeciesTrainer) – The sender subtrainer
Methods¶
- DistributedTrainer.save_pickle(filename: str) None ¶
Pickle this object and save it to a file.
- Parameters:
filename (str) – The file name
- Raises:
TypeError – If filename is not a valid file name
ValueError – If the filename extension is not PICKLE_FILE_EXTENSION
- DistributedTrainer.reset() None ¶
Reset the trainer.
Delete the state of the trainer (with _reset_state()) and also all the internal data structures needed to perform the search (with _reset_internals()).

This method should be invoked each time a hyperparameter is modified.
- DistributedTrainer.evaluate(sol: Solution, fitness_func: FitnessFunction | None = None, index: int | None = None, representatives: Sequence[Sequence[Solution | None]] | None = None) None ¶
Evaluate one solution.
Its fitness will be modified according to the fitness function results. Besides, if called during training, the number of evaluations will also be updated.
- Parameters:
sol (Solution) – The solution
fitness_func (FitnessFunction, optional) – The fitness function. If omitted, the default training fitness function (fitness_function) is used
index (int, optional) – Index where sol should be inserted in the representatives sequence to form a complete solution for the problem. If omitted, index is used
representatives (Sequence of Sequence of Solution or None, optional) – Sequence of representatives of other species, or None if no representatives are needed to evaluate sol. If omitted, the current value of representatives is used
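A minimal sketch, assuming sol is an existing Solution and trainer has already been initialized:

# Evaluate sol with the trainer's default training fitness function
trainer.evaluate(sol)
print(sol.fitness.values)  # assumes a DEAP-style fitness with a values tuple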
- abstract DistributedTrainer.best_solutions() Sequence[HallOfFame] ¶
Get the best solutions found for each species.
This method must be overridden by subclasses to return a correct value.
- Returns:
A list containing a HallOfFame of solutions, one for each species
- Return type:
list of HallOfFame
- Raises:
NotImplementedError – If it has not been overridden
- DistributedTrainer.best_representatives() List[List[Solution]] | None ¶
Return a list of representatives from each species.
Only used for cooperative trainers.
- DistributedTrainer.train(state_proxy: DictProxy | None = None) None ¶
Perform the training process.
- Parameters:
state_proxy (DictProxy, optional) – Dictionary proxy to copy the output state of the trainer procedure. Only used if train is executed within a multiprocessing.Process. Defaults to None
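A minimal sketch of running the training within a separate process, assuming trainer is a concrete DistributedTrainer instance:

from multiprocessing import Manager, Process

# Run the training in a child process and collect its output state
manager = Manager()
state_proxy = manager.dict()
process = Process(target=trainer.train, args=(state_proxy,))
process.start()
process.join()
# state_proxy now holds the output state of the trainer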
- DistributedTrainer.test(best_found: Sequence[HallOfFame], fitness_func: FitnessFunction | None = None, representatives: Sequence[Sequence[Solution]] | None = None) None ¶
Apply the test fitness function to the solutions found.
Update the solutions in best_found with their test fitness.
- Parameters:
best_found (Sequence of HallOfFame) – The best solutions found for each species. One HallOfFame for each species
fitness_func (FitnessFunction, optional) – Fitness function used to evaluate the final solutions. If omitted, the default training fitness function (fitness_function) will be used
representatives (Sequence of Sequence of Solution, optional) – Sequence of representatives of other species, or None if no representatives are needed. If omitted, the current value of representatives is used
- Raises:
TypeError – If any parameter has a wrong type
ValueError – If any parameter has an invalid value
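A typical workflow sketch, assuming trainer is a concrete DistributedTrainer instance and my_test_func is a hypothetical test fitness function:

trainer.train()                  # run the search
best = trainer.best_solutions()  # one HallOfFame per species
trainer.test(best, fitness_func=my_test_func)  # re-evaluate on test data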
Private methods¶
- abstract DistributedTrainer._generate_subtrainers() None ¶
Generate the subtrainers.
Also assign an index and a container to each subtrainer, and change each subtrainer's checkpoint_filename according to the container checkpointing file name and the subtrainer index.

Finally, the _preprocess_iteration() and _postprocess_iteration() methods of the subtrainer_cls class are dynamically overridden, in order to allow solution exchanges between subtrainers, if necessary.

This method must be overridden by subclasses.
- Raises:
NotImplementedError – If it has not been overridden
- DistributedTrainer._get_state() Dict[str, Any] ¶
Return the state of this trainer.
The default state is a dictionary composed of the values of the logbook, num_evals, runtime, current_iter and representatives trainer properties, along with a private boolean attribute that indicates whether the search has finished, and also the states of the random and numpy.random modules.

If subclasses use any more properties to keep their state, the _get_state() and _set_state() methods must be overridden to take such properties into account.
- Type:
dict
- DistributedTrainer._set_state(state: Dict[str, Any]) None ¶
Set the state of this trainer.
If subclasses use any more properties to keep their state, the _get_state() and _set_state() methods must be overridden to take such properties into account.
- Parameters:
state (dict) – The last loaded state
- DistributedTrainer._save_state() None ¶
Save the state at a new checkpoint.
- Raises:
Exception – If the checkpoint file can’t be written
- DistributedTrainer._load_state() None ¶
Load the state of the last checkpoint.
- Raises:
Exception – If the checkpoint file can’t be loaded
- DistributedTrainer._new_state() None ¶
Generate a new trainer state.
Overridden to set the logbook to None, since the final logbook will be generated from the subtrainers' logbooks once the trainer has finished.
- DistributedTrainer._init_state() None ¶
Init the trainer state.
If there is any checkpoint file, the state is initialized from it with the _load_state() method. Otherwise a new initial state is generated with the _new_state() method.
- DistributedTrainer._reset_state() None ¶
Reset the trainer state.
If subclasses override the _new_state() method to add any new property to keep their state, this method should also be overridden to reset the full state of the trainer.
- DistributedTrainer._init_internals() None ¶
Set up the trainer internal data structures to start searching.
Overridden to create the subtrainers and communication queues.
- DistributedTrainer._reset_internals() None ¶
Reset the internal structures of the trainer.
Overridden to reset the subtrainers and communication queues.
- DistributedTrainer._init_search() None ¶
Init the search process.
Initialize the state of the trainer (with _init_state()) and all the internal data structures needed to perform the search (with _init_internals()).
- DistributedTrainer._search() None ¶
Apply the search algorithm.
Execute the trainer until the termination condition is met. Each iteration is composed of the per-iteration steps described below: _start_iteration(), _preprocess_iteration(), _do_iteration(), _postprocess_iteration(), _finish_iteration() and _do_iteration_stats().
- DistributedTrainer._finish_search() None ¶
Finish the search process.
This method is called after the search has finished. It can be overridden to perform any treatment of the solutions found.
- DistributedTrainer._start_iteration() None ¶
Start an iteration.
Prepare the iteration metrics (number of evaluations, execution time) before each iteration is run.
- DistributedTrainer._preprocess_iteration() None ¶
Preprocess before doing the iteration.
Subclasses should override this method to perform any preprocessing before each iteration.
- abstract DistributedTrainer._do_iteration() None ¶
Implement an iteration of the search process.
This abstract method should be implemented by subclasses in order to implement the desired behavior.
- DistributedTrainer._postprocess_iteration() None ¶
Postprocess after doing the iteration.
Subclasses should override this method to perform any postprocessing after each iteration.
- DistributedTrainer._finish_iteration() None ¶
Finish an iteration.
Finish the iteration metrics (number of evaluations, execution time) after each iteration is run.
- DistributedTrainer._do_iteration_stats() None ¶
Compute the iteration statistics.
This method should be implemented by subclasses in order to compute the adequate statistics.
- DistributedTrainer._default_termination_func() bool ¶
Check the default termination criterion.
Return True if max_num_iters iterations have been run.
- DistributedTrainer._termination_criterion() bool ¶
Return True if the search should terminate.
Return True if either the default termination criterion or a custom termination criterion is met. The default termination criterion is implemented by the _default_termination_func() method. A custom termination criterion can be set through the custom_termination_func property.
- DistributedTrainer._init_representatives() None ¶
Init the representatives of the other species.
Only used for cooperative approaches, which need representatives of all the species to form a complete solution for the problem. Cooperative subclasses of the Trainer class should override this method to get the representatives of the other species initialized.