culebra.fitness_function.feature_selection.KappaIndex class

class KappaIndex(training_data: Dataset, test_data: Dataset | None = None, cv_folds: int | None = None, classifier: ClassifierMixin | None = None, index: int | None = None)

Bases: FSClassificationScorer, KappaIndex

Construct the fitness function.

If test_data is provided, the whole of training_data is used for training and test_data is used for testing. Otherwise, k-fold cross-validation is applied to training_data.

Parameters:
  • training_data (Dataset) – The training dataset

  • test_data (Dataset) – The test dataset, defaults to None

  • cv_folds (int) – The number of folds for k-fold cross-validation. If omitted, _default_cv_folds is used. Defaults to None

  • classifier (ClassifierMixin) – The classifier. If omitted, _default_classifier will be used. Defaults to None

  • index (int) – Index of this objective when it is used for multi-objective fitness functions, optional

Raises:
  • RuntimeError – If the number of objectives is not 1

  • TypeError – If training_data or test_data is an invalid dataset

  • TypeError – If cv_folds is not an integer value

  • ValueError – If cv_folds is not positive

  • TypeError – If classifier is not a valid classifier

  • TypeError – If index is not an integer number

  • ValueError – If index is not positive
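
The choice between the two evaluation modes described above can be sketched as follows. This is a minimal, library-independent illustration and not culebra's code: the function name choose_evaluation_mode and the fallback of 5 folds are assumptions made for the example.

```python
from typing import Optional, Sequence


def choose_evaluation_mode(
    test_data: Optional[Sequence],
    cv_folds: Optional[int] = None,
    default_cv_folds: int = 5,  # assumed placeholder for _default_cv_folds
) -> tuple:
    """Return the evaluation mode and the number of folds it would use."""
    if cv_folds is not None:
        # bool is a subclass of int, so reject it explicitly
        if isinstance(cv_folds, bool) or not isinstance(cv_folds, int):
            raise TypeError("cv_folds must be an integer value")
        if cv_folds <= 0:
            raise ValueError("cv_folds must be positive")

    if test_data is not None:
        # A test set is available: train on all the training data
        return ("train_test", None)

    # No test set: fall back to k-fold cross-validation
    return ("kfcv", default_cv_folds if cv_folds is None else cv_folds)
```

With no test data the sketch selects cross-validation with the default number of folds. Passing any test data switches it to a single train/test evaluation.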

Class methods

classmethod KappaIndex.load(filename: str) → Base

Load a serialized object from a file.

Parameters:

filename (str) – The file name.

Returns:

The loaded object

Raises:

Properties

property KappaIndex.classifier: ClassifierMixin

Classifier applied within this fitness function.

Return type:

ClassifierMixin

Setter:

Set a new classifier

Parameters:

value (ClassifierMixin) – The classifier. If set to None, _default_classifier is used

Raises:

TypeError – If value is not a valid classifier

property KappaIndex.cv_folds: int

Number of cross-validation folds.

Return type:

int

Setter:

Set a new value for the number of cross-validation folds

Parameters:

value (int) – A positive integer value. If set to None, _default_cv_folds is assumed

Raises:
  • TypeError – If value is not an integer value

  • ValueError – If value is not positive

property KappaIndex.fitness_cls: type[Fitness]

Fitness class.

Return type:

type[Fitness]

property KappaIndex.index: int

Objective index.

Return type:

int

Setter:

Set a new index

Parameters:

value (int) – The new index. If set to None, _default_index is chosen

Raises:
  • TypeError – If value is not an integer number

  • ValueError – If value is not positive

property KappaIndex.num_obj: int

Number of objectives.

Return type:

int

property KappaIndex.obj_names: tuple[str, ...]

Objective names.

Returns:

("Kappa",)

Return type:

tuple[str, ...]

property KappaIndex.obj_thresholds: list[float]

Objective similarity thresholds.

Return type:

list[float]

Setter:

Set new thresholds.

Parameters:

values (float | Sequence[float]) – The new values. If only a single value is provided, the same threshold will be used for all the objectives. Different thresholds can be provided in a Sequence. If set to None, all the thresholds are set to _default_similarity_threshold

Raises:
  • TypeError – If neither a real number nor a Sequence of real numbers is provided

  • ValueError – If any value is negative

  • ValueError – If the length of the thresholds sequence does not match the number of objectives
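
The setter rules above (a single value shared by all objectives, one value per objective in a Sequence, None for the default) can be illustrated with a small standalone function. This is only a sketch of the documented behaviour: normalize_thresholds is a hypothetical helper, and the default of 0.0 is a placeholder, not the actual value of _default_similarity_threshold.

```python
from collections.abc import Sequence
from numbers import Real


def _is_real(value):
    # bool is a subclass of int, so exclude it explicitly
    return isinstance(value, Real) and not isinstance(value, bool)


def normalize_thresholds(values, num_obj, default=0.0):
    """Return one non-negative threshold per objective as a list of floats.

    `default` stands in for _default_similarity_threshold; 0.0 is only a
    placeholder chosen for this example.
    """
    if values is None:
        return [float(default)] * num_obj

    if _is_real(values):
        # A single number is shared by all the objectives
        thresholds = [float(values)] * num_obj
    elif isinstance(values, Sequence) and not isinstance(values, str):
        if not all(_is_real(v) for v in values):
            raise TypeError("thresholds must be real numbers")
        if len(values) != num_obj:
            raise ValueError("one threshold per objective is required")
        thresholds = [float(v) for v in values]
    else:
        raise TypeError("expected a real number or a sequence of real numbers")

    if any(t < 0 for t in thresholds):
        raise ValueError("thresholds must not be negative")
    return thresholds
```

Since this fitness function has a single objective, both normalize_thresholds(0.1, 1) and normalize_thresholds([0.1], 1) give [0.1] in this sketch.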

property KappaIndex.obj_weights: tuple[int, ...]

Objective weights.

Maximize the validation Kappa index.

Returns:

(1,)

Return type:

tuple[int, ...]

property KappaIndex.test_data: Dataset | None

Test dataset.

If set to None, a k-fold cross-validation is applied.

Return type:

Dataset | None

Setter:

Set a new test dataset

Parameters:

value (Dataset) – The new test dataset

Raises:

TypeError – If set to an invalid dataset

property KappaIndex.training_data: Dataset

Training dataset.

Return type:

Dataset

Setter:

Set a new training dataset

Parameters:

value (Dataset) – The new training dataset

Raises:

TypeError – If set to an invalid dataset

Private properties

property KappaIndex._default_classifier: ClassifierMixin

Default classifier.

Returns:

A Gaussian Naive Bayes classifier

Return type:

ClassifierMixin

property KappaIndex._default_cv_folds: int

Default number of folds for cross-validation.

Returns:

DEFAULT_CV_FOLDS

Return type:

int

property KappaIndex._default_index: int

Default index.

Returns:

DEFAULT_INDEX

Return type:

int

property KappaIndex._default_similarity_threshold: float

Default similarity threshold for fitnesses.

Returns:

DEFAULT_SIMILARITY_THRESHOLD

Return type:

float

property KappaIndex._worst_score: float

Worst achievable score.

Returns:

-1

Return type:

float

Methods

KappaIndex.dump(filename: str) → None

Serialize this object and save it to a file.

Parameters:

filename (str) – The file name.

Raises:

KappaIndex.evaluate(sol: Solution, index: int | None = None, representatives: Sequence[Solution] | None = None) → Fitness

Evaluate a solution.

Parameters:
  • sol (Solution) – Solution to be evaluated.

  • index (int) – Index where sol should be inserted in the representatives sequence to form a complete solution for the problem. Only used by cooperative problems

  • representatives (Sequence[Solution]) – Representative solutions of each species being optimized. Only used by cooperative problems

Returns:

The fitness for sol

Return type:

Fitness

Raises:

ValueError – If sol is not evaluable

KappaIndex.is_evaluable(sol: Solution) → bool

Assess the evaluability of a solution.

Parameters:

sol (Solution) – Solution to be evaluated.

Returns:

True if the solution can be evaluated

Return type:

bool

Raises:

NotImplementedError – If the method has not been overridden

Private methods

KappaIndex._evaluate_kfcv(sol: Solution, training_data: Dataset) → Fitness

Evaluate a solution.

A k-fold cross-validation is applied using the training_data with cv_folds folds.

Parameters:
  • sol (Solution) – Solution to be evaluated.

  • training_data (Dataset) – The training dataset

Returns:

The fitness for sol

Return type:

Fitness
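
The general idea of k-fold cross-validation can be sketched with the standard library alone. This is a simplified illustration, not culebra's implementation: k_fold_indices and cross_validate are hypothetical helpers, the folds are contiguous (no shuffling or stratification), and averaging the per-fold scores is an assumption.

```python
from statistics import mean


def k_fold_indices(n_samples, cv_folds):
    """Split range(n_samples) into cv_folds contiguous, near-equal folds."""
    if not 0 < cv_folds <= n_samples:
        raise ValueError("cv_folds must be in [1, n_samples]")
    base, extra = divmod(n_samples, cv_folds)
    folds, start = [], 0
    for i in range(cv_folds):
        size = base + (1 if i < extra else 0)  # first folds absorb the remainder
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(n_samples, cv_folds, score_fold):
    """Average score_fold(train_idx, test_idx) over all the folds."""
    folds = k_fold_indices(n_samples, cv_folds)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        scores.append(score_fold(train_idx, test_idx))
    return mean(scores)
```

Here score_fold is any callable that trains on train_idx, predicts on test_idx and returns a score. For this fitness function that score would be the Kappa index.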

KappaIndex._evaluate_train_test(sol: Solution, training_data: Dataset, test_data: Dataset) → Fitness

Evaluate a solution.

Parameters:
  • sol (Solution) – Solution to be evaluated.

  • training_data (Dataset) – The training dataset

  • test_data (Dataset) – The test dataset

Returns:

The fitness for sol

Return type:

Fitness

KappaIndex._final_training_test_data(sol: Solution) → tuple[Dataset, Dataset]

Get the final training and test data.

Parameters:

sol (Solution) – Solution to be evaluated. It is used to select the features from the datasets

Returns:

The final training and test datasets

Return type:

tuple[Dataset, Dataset]
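
As a rough illustration of what selecting the features from the datasets means, the sketch below keeps only the chosen columns of both datasets. It is not culebra's code: representing datasets as lists of rows and the solution as a list of column indices is an assumption, and both helper names are hypothetical.

```python
def select_features(rows, feature_indices):
    """Keep only the columns listed in feature_indices, in that order."""
    return [[row[i] for i in feature_indices] for row in rows]


def final_training_test_data(training_rows, test_rows, feature_indices):
    """Apply the same feature subset to both datasets."""
    return (
        select_features(training_rows, feature_indices),
        select_features(test_rows, feature_indices),
    )
```

Applying the same indices to both datasets keeps training and test inputs aligned, so a classifier fitted on the reduced training data can be scored on the reduced test data.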

KappaIndex._score(y2, *, labels=None, weights=None, sample_weight=None)

Use cohen_kappa_score() to score.
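
For reference, the unweighted Cohen's kappa statistic is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement between two labelings and p_e is the agreement expected by chance. The standard-library sketch below computes it directly. It is an illustration of the statistic, not the code of cohen_kappa_score(): the function name cohen_kappa is hypothetical, labels and weights are not supported, and raising an error in the degenerate case is a choice made for this example.

```python
from collections import Counter


def cohen_kappa(y1, y2):
    """Unweighted Cohen's kappa between two equally long label sequences."""
    if len(y1) != len(y2) or not y1:
        raise ValueError("y1 and y2 must be non-empty and of equal length")
    n = len(y1)
    # Observed agreement: fraction of positions where both labels match
    p_o = sum(a == b for a, b in zip(y1, y2)) / n
    # Expected agreement under independence, from the marginal label counts
    c1, c2 = Counter(y1), Counter(y2)
    p_e = sum(c1[label] * c2[label] for label in c1) / (n * n)
    if p_e == 1.0:
        # Both sequences contain a single, identical label: kappa is undefined
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)
```

With balanced binary labels, perfect agreement gives 1.0 and complete disagreement gives -1.0, which matches the value documented for _worst_score.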