culebra.tools module

Tools to automate the execution of experiments.

Since many interesting problems are based on data processing, this module provides the Dataset class to hold and manage the data samples.

Besides, since automated experimentation is valuable when a Trainer has to be run many times, culebra provides this feature by means of the following classes:

  • The Evaluation class, a base class for the evaluation of trainers

  • The Experiment class, designed to run a single experiment with a Trainer

  • The Batch class, which runs a batch of experiments with the same configuration

  • The Results class, to manage the results provided by the evaluation of any Trainer

  • The ResultsAnalyzer class, to perform statistical analysis over the results of several experiment batches

  • The TestOutcome class, to keep the outcome of a statistical test

  • The ResultsComparison class, to keep the outcome of a comparison of the results of several batches

  • The EffectSize class, to keep the outcome of an effect size estimation over the results of several batches

Attributes

DEFAULT_SEP = '\\s+'

Default column separator used within dataset files.
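The value shown above is the repr of the regular expression \s+, which matches any run of whitespace. As a minimal sketch of what that implies for dataset files, the snippet below splits two hypothetical rows with the standard re module; how culebra itself applies the separator is not shown here, and the sample rows are made up for illustration.

```python
import re

# Assumed to mirror the documented default; the listing shows its repr, '\\s+'.
DEFAULT_SEP = r"\s+"

# Hypothetical dataset rows mixing spaces and tabs between columns.
rows = [
    "5.1  3.5\t1.4   0",
    "6.2\t2.9  4.3 1",
]

# Any run of whitespace counts as a single column boundary.
parsed = [re.split(DEFAULT_SEP, row.strip()) for row in rows]

for fields in parsed:
    print(fields)
```

Because the pattern matches runs of whitespace rather than a single character, files aligned with a variable number of spaces or tabs yield the same columns.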

EXCEL_FILE_EXTENSION = '.xlsx'

File extension for Excel spreadsheets.

DEFAULT_ALPHA = 0.05

Default significance level for statistical tests.

DEFAULT_NORMALITY_TEST = <function shapiro>

Default normality test.

DEFAULT_HOMOSCEDASTICITY_TEST = <function bartlett>

Default homoscedasticity test.
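The function names shown for the two defaults match scipy.stats.shapiro and scipy.stats.bartlett. The following sketch, which assumes SciPy and NumPy are installed, shows how such tests are commonly checked against a 0.05 significance level. The sample data and variable names are hypothetical, and this is not necessarily how ResultsAnalyzer applies them internally.

```python
import numpy as np
from scipy.stats import bartlett, shapiro

ALPHA = 0.05  # same value as the documented DEFAULT_ALPHA

# Hypothetical fitness values gathered from two batches of experiments.
rng = np.random.default_rng(0)
batch_a = rng.normal(loc=0.80, scale=0.02, size=30)
batch_b = rng.normal(loc=0.82, scale=0.02, size=30)

# Shapiro-Wilk: the null hypothesis is that the sample is normally distributed.
normality_pvalues = []
for sample in (batch_a, batch_b):
    _, pvalue = shapiro(sample)
    normality_pvalues.append(float(pvalue))

# Bartlett: the null hypothesis is that all samples have equal variances.
_, homoscedasticity_pvalue = bartlett(batch_a, batch_b)
homoscedasticity_pvalue = float(homoscedasticity_pvalue)

# A null hypothesis is kept when its p-value exceeds the significance level.
all_normal = all(p > ALPHA for p in normality_pvalues)
equal_variances = bool(homoscedasticity_pvalue > ALPHA)

print("normal:", all_normal, "equal variances:", equal_variances)
```

When both checks pass, parametric comparisons are usually considered appropriate; otherwise a non-parametric alternative is the common fallback.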

DEFAULT_P_ADJUST = 'fdr_tsbky'

Default method for adjusting the p-values in Dunn’s test.

DEFAULT_STATS_FUNCTIONS = {'Avg': <function mean>, 'Max': <function amax>, 'Min': <function amin>, 'Std': <function std>}

Default statistics calculated for the results.
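The function names in this mapping match NumPy's mean, amax, amin and std. As an illustration only, the sketch below builds an equivalent dictionary and applies it to a hypothetical list of fitness values. The variable names are assumptions, and the way culebra iterates over the mapping may differ.

```python
import numpy as np

# Assumed equivalent of the documented mapping; np.max and np.min are the
# functions that NumPy also exposes under the names amax and amin.
stats_functions = {
    "Avg": np.mean,
    "Max": np.max,
    "Min": np.min,
    "Std": np.std,
}

# Hypothetical fitness values obtained by one trainer run.
fitness = [0.5, 0.75, 1.0]

# Apply every statistic to the same data and label it by its key.
summary = {name: float(func(fitness)) for name, func in stats_functions.items()}

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Note that np.std computes the population standard deviation (ddof=0) unless told otherwise.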

DEFAULT_FEATURE_METRIC_FUNCTIONS = {'Rank': <function Metrics.rank>, 'Relevance': <function Metrics.relevance>}

Default metrics calculated for the features in the set of solutions.

DEFAULT_BATCH_STATS_FUNCTIONS = {'Avg': <function NDFrame._add_numeric_operations.<locals>.mean>, 'Max': <function NDFrame._add_numeric_operations.<locals>.max>, 'Min': <function NDFrame._add_numeric_operations.<locals>.min>, 'Std': <function NDFrame._add_numeric_operations.<locals>.std>}

Default statistics calculated for the results gathered from all the experiments.
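The reprs in this mapping point to pandas aggregation methods rather than NumPy functions. A minimal sketch of the idea, assuming pandas is installed and using the public DataFrame methods as stand-ins for the documented entries, could look as follows. The column names and values are hypothetical.

```python
import pandas as pd

# Assumed stand-ins for the documented entries, taken from the public
# pandas.DataFrame API.
batch_stats_functions = {
    "Avg": pd.DataFrame.mean,
    "Max": pd.DataFrame.max,
    "Min": pd.DataFrame.min,
    "Std": pd.DataFrame.std,
}

# Hypothetical results: one row per experiment in the batch.
results = pd.DataFrame(
    {
        "Fitness": [0.5, 0.75, 1.0],
        "Runtime": [10.0, 12.0, 14.0],
    }
)

# Each function returns one value per column; collect them side by side.
batch_summary = pd.DataFrame(
    {name: func(results) for name, func in batch_stats_functions.items()}
)

print(batch_summary)
```

Unlike numpy.std, the pandas std method uses the sample standard deviation (ddof=1) by default, so "Std" values computed this way are not directly comparable with population figures.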

DEFAULT_NUM_EXPERIMENTS = 1

Default number of experiments in the batch.

DEFAULT_RUN_SCRIPT_FILENAME = 'run.py'

Default file name for the script to run an evaluation.

DEFAULT_CONFIG_SCRIPT_FILENAME = 'config.py'

Default file name for configuration files.

DEFAULT_RESULTS_BASENAME = 'results'

Default base name for results files.