`culebra.fitness_function.feature_selection.KappaIndex` class¶

class KappaIndex(training_data: Dataset, test_data: Dataset | None = None, test_prop: float | None = None, cv_folds: int | None = None, classifier: ClassifierMixin | None = None)¶

Construct a fitness function.

If test_data is provided, the whole of training_data is used for training and test_data is used for testing. Otherwise, if test_prop is provided, training_data is split (stratified) into training and test sets each time evaluate() is called, and Monte Carlo cross-validation is applied. Finally, if both test_data and test_prop are omitted, k-fold cross-validation is applied.

Parameters:

training_data (Dataset) – The training dataset
test_data (Dataset, optional) – The test dataset, defaults to None
test_prop (float, optional) – The proportion of training_data used for testing when test_data is omitted. A real value in (0, 1) or None. Defaults to None
cv_folds (int, optional) – The number of folds for k-fold cross-validation. If omitted, DEFAULT_CV_FOLDS is used. Defaults to None
classifier (ClassifierMixin, optional) – The classifier. If set to None, DEFAULT_CLASSIFIER will be used. Defaults to None
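
The precedence between test_data, test_prop and k-fold cross-validation described above can be summarized with a minimal sketch. The function name choose_validation_scheme and the returned labels are hypothetical and are not part of culebra; the sketch only mirrors the documented rules.

```python
from typing import Optional


def choose_validation_scheme(
    test_data: Optional[object] = None,
    test_prop: Optional[float] = None,
) -> str:
    """Return a label for the scheme selected by the documented rules.

    Hypothetical helper: it is not part of culebra.
    """
    if test_data is not None:
        # Explicit test data: train on all the training data, test on test_data
        return "train_test"
    if test_prop is not None:
        # No test data but a proportion: Monte Carlo cross-validation
        return "mccv"
    # Neither was provided: k-fold cross-validation
    return "kfcv"
```

Under these rules test_data takes priority, so passing both test_data and test_prop selects the plain train/test evaluation.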

Class attributes¶

class culebra.fitness_function.feature_selection.KappaIndex.Fitness¶

Handles the values returned by the evaluate() method within a Solution.

Fitness.weights = (1.0,)¶: Maximizes the validation Kappa index.

Fitness.names = ('Kappa',)¶: Name of the objective.

Fitness.thresholds = [0]¶: Similarity threshold for fitness comparisons.
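
As an illustration of what a similarity threshold may mean, the sketch below assumes that two values of an objective are treated as similar when their absolute difference does not exceed the threshold. This interpretation is an assumption, and the function similar is hypothetical rather than culebra's own comparison code.

```python
def similar(a: float, b: float, threshold: float = 0.0) -> bool:
    """Tell whether two objective values are similar.

    Assumption: values are similar when they differ by at most `threshold`.
    Hypothetical helper, not part of culebra.
    """
    return abs(a - b) <= threshold
```

Under this assumption, the default threshold of 0 makes only identical Kappa values similar, while a larger threshold makes small differences irrelevant when comparing solutions.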

Class methods¶

classmethod KappaIndex.load_pickle(filename: str) → Base¶

Load a pickled object from a file.

Parameters:

filename (str) – The file name.

Raises:

TypeError – If filename is not a valid file name
ValueError – If the filename extension is not PICKLE_FILE_EXTENSION

classmethod KappaIndex.set_fitness_thresholds(thresholds: float | Sequence[float]) → None¶

Set new fitness thresholds.

Modifies the thresholds of the Fitness objects generated by this fitness function.

Parameters:

thresholds (float or Sequence of float) – The new thresholds. If only a single value is provided, the same threshold will be used for all the objectives. Different thresholds can be provided in a Sequence

Raises:

TypeError – If thresholds is not a real number or a Sequence of real numbers
ValueError – If any threshold is negative
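
The checks listed above could look roughly like the following sketch. The helper normalize_thresholds and its num_obj parameter are hypothetical; the sketch only reproduces the documented behaviour (a single value is repeated for every objective, non-numeric input raises TypeError, negative values raise ValueError) and does not claim to match culebra's implementation.

```python
from collections.abc import Sequence
from numbers import Real
from typing import Tuple, Union


def normalize_thresholds(
    thresholds: Union[float, Sequence], num_obj: int
) -> Tuple[float, ...]:
    """Return one threshold per objective (hypothetical helper)."""
    if isinstance(thresholds, Real):
        # A single value is used for all the objectives
        values = [thresholds] * num_obj
    elif isinstance(thresholds, Sequence) and not isinstance(thresholds, str):
        values = list(thresholds)
        if not all(isinstance(value, Real) for value in values):
            raise TypeError("thresholds must contain only real numbers")
    else:
        raise TypeError("thresholds must be a real number or a sequence")

    if any(value < 0 for value in values):
        raise ValueError("thresholds must not be negative")

    return tuple(float(value) for value in values)
```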

classmethod KappaIndex.get_fitness_objective_threshold(obj_name: str) → float¶

Get the similarity threshold for the given objective.

Parameters:

obj_name (str) – Objective name whose threshold is returned

Raises:

TypeError – If obj_name isn’t a string
ValueError – If obj_name isn’t a valid objective name

classmethod KappaIndex.set_fitness_objective_threshold(obj_name: str, value: float) → None¶

Set a similarity threshold for the given fitness objective.

Parameters:

obj_name (str) – Objective name whose threshold is modified
value (float) – New value for the similarity threshold.

Raises:

TypeError – If obj_name isn’t a string or value isn’t a real number
ValueError – If obj_name isn’t a valid objective name or value is lower than 0

Properties¶

property KappaIndex.is_noisy: bool¶: Return True if the fitness function is noisy.

property KappaIndex.num_obj: int¶

Get the number of objectives.

Type:: int

property KappaIndex.num_nodes: int¶

Return the problem graph’s number of nodes for ACO-based trainers.

Returns:: The problem graph’s number of nodes
Return type:: int

property KappaIndex.training_data: Dataset¶

Get and set the training dataset.

Getter:: Return the training dataset
Setter:: Set a new training dataset
Type:: Dataset
Raises:: TypeError – If set to an invalid dataset

property KappaIndex.test_data: Dataset | None¶

Get and set the test dataset.

Getter:: Return the test dataset
Setter:: Set a new test dataset
Type:: Dataset
Raises:: TypeError – If set to an invalid dataset

property KappaIndex.test_prop: float | None¶

Get and set the proportion of data used to test.

Getter:

Return the test data proportion

Setter:

Set a new value for the test data proportion. A real value in (0, 1) or None is expected

Type:

float

Raises:

TypeError – If set to a value which is not a real number
ValueError – If set to a value which is not in (0, 1)

property KappaIndex.classifier: ClassifierMixin¶

Get and set the classifier applied within this fitness function.

Getter:: Return the classifier
Setter:: Set a new classifier
Type:: ClassifierMixin
Raises:: TypeError – If set to a value which is not a classifier

Private properties¶

property KappaIndex._worst_score: float¶

Worst achievable score.

Type:: float

Methods¶

KappaIndex.save_pickle(filename: str) → None¶

Pickle this object and save it to a file.

Parameters:

filename (str) – The file name.

Raises:

TypeError – If filename is not a valid file name
ValueError – If the filename extension is not PICKLE_FILE_EXTENSION

KappaIndex.heuristic(species: Species) → Sequence[ndarray, ...]¶

Get the heuristic matrices for ACO-based trainers.

Parameters:: species (Species) – Species constraining the problem solutions
Raises:: TypeError – If species is not an instance of Species
Returns:: A tuple with only one heuristic matrix. Arcs between selectable features have a heuristic value of 1, while arcs involving any non-selectable feature or arcs from a feature to itself have a heuristic value of 0.
Return type:: Sequence of ndarray
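
The structure of the returned heuristic matrix can be illustrated with a small sketch. It uses plain nested lists instead of an ndarray, and the function heuristic_matrix together with its parameters num_feats and selectable are hypothetical stand-ins for the information that a Species would provide.

```python
from typing import List, Set


def heuristic_matrix(num_feats: int, selectable: Set[int]) -> List[List[float]]:
    """Build a heuristic matrix following the documented rule.

    Hypothetical helper: arcs between two different selectable features
    get 1, any other arc (including self-loops) gets 0.
    """
    return [
        [
            1.0 if i != j and i in selectable and j in selectable else 0.0
            for j in range(num_feats)
        ]
        for i in range(num_feats)
    ]
```

For example, with four features where feature 3 is not selectable, every entry in row 3 and column 3 is 0, as is the whole diagonal.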

KappaIndex.evaluate(sol: Solution, index: int | None = None, representatives: Sequence[Solution] | None = None) → Tuple[float, ...]¶

Evaluate a solution.

Parameters:

sol (Solution) – Solution to be evaluated.
index (int, ignored) – Index where sol should be inserted in the representatives sequence to form a complete solution for the problem. Only used by cooperative problems
representatives (Sequence of Solution, ignored) – Representative solutions of each species being optimized. Only used by cooperative problems

Returns:

The fitness of sol

Return type:

tuple of float

Private methods¶

KappaIndex._score(y2, *, labels=None, weights=None, sample_weight=None)¶: Use cohen_kappa_score() to score.
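
For reference, the unweighted Cohen's kappa compares the observed agreement between two labelings with the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following pure-Python sketch illustrates that formula only. The function cohen_kappa is hypothetical, it ignores the labels, weights and sample_weight options, and returning NaN when p_e equals 1 is a choice made for this sketch.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(y1: Sequence[Hashable], y2: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa between two labelings (illustrative sketch)."""
    if len(y1) != len(y2) or len(y1) == 0:
        raise ValueError("y1 and y2 must be non-empty and of equal length")

    n = len(y1)
    # Observed agreement: fraction of samples with the same label
    observed = sum(a == b for a, b in zip(y1, y2)) / n

    # Chance agreement: product of the marginal label frequencies
    counts1, counts2 = Counter(y1), Counter(y2)
    expected = sum(counts1[label] * counts2[label] for label in counts1) / (n * n)

    if expected == 1.0:
        # Kappa is undefined here; this sketch returns NaN
        return float("nan")
    return (observed - expected) / (1.0 - expected)
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.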

KappaIndex._final_training_test_data(sol: Solution) → Tuple[Dataset, Dataset]¶

Get the final training and test data.

Parameters:: sol (Solution) – Solution to be evaluated. It is used to select the features from the datasets
Returns:: The final training and test datasets
Return type:: tuple of Dataset

KappaIndex._evaluate_train_test(training_data: Dataset, test_data: Dataset) → Tuple[float, ...]¶

Evaluate a solution.

This method must be overridden by subclasses to return a correct value.

Parameters:

training_data (Dataset) – The training dataset
test_data (Dataset) – The test dataset

Returns:

The fitness values for the solution being evaluated

Return type:

tuple of float

KappaIndex._evaluate_mccv(training_data: Dataset) → Tuple[float, ...]¶

Evaluate a solution.

The training_data is split (stratified) into training and test data according to test_prop each time the solution is evaluated, and Monte Carlo cross-validation is applied.

Parameters:: training_data (Dataset) – The training dataset
Returns:: The fitness values for the solution being evaluated
Return type:: tuple of float
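
A stratified split of the kind mentioned above keeps the class proportions roughly equal in the training and test parts. The sketch below shows one way to draw such a split over sample indices using only the standard library. The function stratified_split is hypothetical, and rounding the per-class test size is an assumption of this sketch, not a statement about how culebra performs the split.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[Hashable], test_prop: float, rng: random.Random
) -> Tuple[List[int], List[int]]:
    """Split sample indices into train and test, class by class (sketch)."""
    # Group the sample indices by class label
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # Take the same proportion of test samples from every class
        n_test = round(len(shuffled) * test_prop)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])

    return sorted(train), sorted(test)
```

Calling the function again with a differently seeded random.Random yields a new random split, which is what makes repeated evaluations differ from one another.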

KappaIndex._evaluate_kfcv(training_data: Dataset) → Tuple[float, ...]¶

Evaluate a solution.

A k-fold cross-validation is applied using the training_data with cv_folds folds.

Parameters:: training_data (Dataset) – The training dataset
Returns:: The fitness values for the solution being evaluated
Return type:: tuple of float
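
In k-fold cross-validation the samples are partitioned into cv_folds folds, and each fold is used once for testing while the remaining folds are used for training. The sketch below generates such index partitions. The function k_fold_indices is hypothetical, and it produces plain contiguous folds without shuffling or stratification, which may differ from what culebra does internally.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, n_folds: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train, test) index lists for each fold (illustrative sketch)."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be in [2, n_samples]")

    # The first `extra` folds get one additional sample
    base, extra = divmod(n_samples, n_folds)
    folds = []
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds
```

The fitness for a solution would then typically be obtained by averaging the score over the folds.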
