certifai.scanner.builder module

Certifai Scan object model builder

Contains classes representing a Certifai scan definition, together with the ability to programmatically manipulate, load, save and run such definitions.

class certifai.scanner.builder.CertifaiOutcomeValue(value: Any, name: Optional[str] = None, favorable: bool = False)
class certifai.scanner.builder.CertifaiTaskOutcomes

Union type representing the outcomes of a task by type (possible classes for classification, favorable direction for regression)

property task_type
property prediction_favorability
static regression(increased_favorable: Optional[bool], change_std_deviation: Optional[float] = None, absolute_threshold: Optional[float] = None, absolute_percentile: Optional[float] = None)

Construct a regression task type

Favorability may be specified in one of three ways (only one of which may be specified):

  1. As a relative increase [or decrease] by a multiple of the population global regressed value standard deviation

  2. As an absolute threshold, specified as an exact value of the regressor output

  3. As an absolute threshold, specified as a percentile of the population global regressed value empirical distribution

Parameters
  • increased_favorable (Optional[bool]) – True if the favorable direction of the prediction is increasing. Can be set to None if there is no favorable direction.

  • change_std_deviation (Optional[float]) – Number of standard deviations considered to be a significant change

  • absolute_threshold (Optional[float]) – Absolute regressed value threshold for favorability

  • absolute_percentile (Optional[float]) – Absolute regressed value threshold for favorability expressed as a population percentile

Returns

CertifaiTaskType instance representing the regression outcome definition
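The rule that only one of the three favorability modes may be given can be pictured with a small standalone sketch. It does not call Certifai; the helper name `pick_favorability_mode` and its return value are hypothetical and only illustrate the validation described above.

```python
from typing import Optional, Tuple


def pick_favorability_mode(
    change_std_deviation: Optional[float] = None,
    absolute_threshold: Optional[float] = None,
    absolute_percentile: Optional[float] = None,
) -> Optional[Tuple[str, float]]:
    """Hypothetical helper: return the single favorability mode that was set."""
    candidates = {
        "change_std_deviation": change_std_deviation,
        "absolute_threshold": absolute_threshold,
        "absolute_percentile": absolute_percentile,
    }
    # Keep only the modes the caller actually supplied.
    specified = [(name, value) for name, value in candidates.items() if value is not None]
    if len(specified) > 1:
        raise ValueError("specify at most one favorability mode")
    # None means no mode was given at all.
    return specified[0] if specified else None
```

How Certifai itself reports a conflicting combination is not stated in this reference; the ValueError here is an assumption of the sketch.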

static classification(prediction_values: Iterable[certifai.scanner.builder.CertifaiOutcomeValue], prediction_favorability: Optional[str] = 'explicit', last_favorable_prediction: Optional[Any] = None, favorable_outcome_group_name: Optional[str] = None, unfavorable_outcome_group_name: Optional[str] = None)

Construct a classification task type

Parameters
  • prediction_values (Iterable[CertifaiOutcomeValue]) – list of possible classes

  • prediction_favorability (Optional[str]) –

    Describes the favorability of the prediction_values. Must be one of:

    • ‘explicit’ – predictions should be explicitly marked as favorable (default)

    • ‘ordered’ – predictions are ordered from most to least favorable

    • ‘none’ – no prediction should be treated as favorable

  • last_favorable_prediction (Optional[Any]) – ignored unless the prediction_favorability is ‘ordered’, in which case this value should be the last label (in the ordering of the prediction_values) which is considered favorable

  • favorable_outcome_group_name (Optional[str]) – name of the favorable group of prediction values - reserved for multiclass-classification tasks with a prediction_favorability of ‘explicit’

  • unfavorable_outcome_group_name (Optional[str]) – name of the unfavorable group of prediction values - reserved for multiclass-classification tasks with a prediction_favorability of ‘explicit’

Returns

CertifaiTaskType instance representing the classification outcome definition
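The ‘ordered’ mode can be illustrated with a standalone sketch that does not use Certifai. Given labels listed from most to least favorable, everything up to and including last_favorable_prediction counts as favorable. The function name `favorable_labels` is hypothetical.

```python
from typing import Any, List


def favorable_labels(ordered_labels: List[Any], last_favorable_prediction: Any) -> List[Any]:
    """Return the labels treated as favorable under 'ordered' favorability.

    ordered_labels runs from most to least favorable, so the favorable
    labels are the prefix ending at last_favorable_prediction.
    """
    cut = ordered_labels.index(last_favorable_prediction)  # ValueError if the label is unknown
    return ordered_labels[: cut + 1]


grades = ["excellent", "good", "fair", "poor"]
favorable = favorable_labels(grades, "good")
```

Here `favorable` holds the first two grades, and the remaining labels are treated as unfavorable.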

class certifai.scanner.builder.CertifaiPredictionTask(outcomes: certifai.scanner.builder.CertifaiTaskOutcomes, prediction_description: Optional[str] = None)

Metadata about the prediction task - immutable once instantiated.

Parameters
  • outcomes (CertifaiTaskOutcomes) – One of the supported CertifaiTaskOutcomes types, constructed from the static methods on CertifaiTaskOutcomes.

  • prediction_description (Optional[str]) – Free text description of what is being predicted.

property task_type

The task type string (‘binary_classification’, ‘multiclass-classification’, ‘regression’)

Getter

Returns the task type.

Type

str

property prediction_description

Description of what the prediction represents.

Getter

Returns the description, if any.

Type

Optional[str]

property prediction_favorability

What format is used for specifying the favorable prediction value, if any, (‘none’, ‘ordered’, ‘explicit’).

Getter

Returns prediction favorability

Type

Optional[str]

property favorable_outcome

What the favorable outcome direction is for a regression task, None otherwise.

Getter

Returns the favorable label direction (regression) if set.

Type

Optional[Any]

property prediction_values
property last_favorable_prediction
property regression_standard_deviation
property regression_absolute_threshold
property regression_absolute_percentile
property favorable_outcome_group_name

The string name of the favorable group of prediction values - reserved for multiclass-classification with prediction_favorability of ‘explicit’ - None otherwise.

Getter

Returns the name of the favorable group of prediction values

Type

Optional[str]

property unfavorable_outcome_group_name

The string name of the unfavorable group of prediction values - reserved for multiclass-classification with prediction_favorability of ‘explicit’ - None otherwise.

Getter

Returns the name of the unfavorable group of prediction values

Type

Optional[str]

class certifai.scanner.builder.CertifaiModelMetric(name: str, certifai_metric: Optional[str] = None)

Metadata for a metric - immutable once instantiated

Parameters
  • name (str) – Free text descriptive name of the metric.

  • certifai_metric (Optional[str]) –

    If specified will allow Certifai to calculate the value. Supported values are:

    • ’accuracy’ (classification)

    • ’precision’ (classification)

    • ’recall’ (classification)

    • ’f1’ (classification)

    • ’r-squared’ (regression)

    Micro and macro variants are also supported for precision, recall and f1 e.g. ‘f1(micro)’
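To make the difference between the micro and macro variants concrete, the sketch below computes both forms of F1 using the standard textbook definitions. It is independent of Certifai, whose internal computation is not described here, and the function name `f1_macro_micro` is hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def f1_macro_micro(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> Tuple[float, float]:
    """Return (macro F1, micro F1) for single-label classification."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(t: int, p: int, n: int) -> float:
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    labels = set(y_true) | set(y_pred)
    # Macro: average the per-class F1 scores, each class weighted equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, then compute a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro
```

Macro averaging gives rare classes the same weight as common ones, while micro averaging is dominated by the frequent classes.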

property name

Descriptive name of the metric.

Getter

Returns the human-readable metric name.

Type

str

property certifai_metric

Certifai metric type name.

Getter

Returns the Certifai-evaluable metric type (if set).

Type

Optional[str]

class certifai.scanner.builder.CertifaiPredictorWrapper(predictor: certifai.common.hosted_model.IBaseModel, encoder: Optional[Callable[Sequence, Sequence]] = None, decoder: Optional[Callable[Sequence, Sequence]] = None, wrapped: Optional[certifai.common.hosted_model.IHostedModel] = None, soft_predictions: bool = False, label_ordering: Optional[List[Any]] = None, threshold: Optional[float] = None)

Wrapper class for in-process models

Parameters
  • predictor (IBaseModel) – Any predictor object that has a predict method which takes a sequence of data vectors as a numpy array and returns a sequence of corresponding predicted values.

  • encoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the predictor’s input (e.g. - to perform one-hot encoding and so on)

  • decoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the predictor’s output (e.g. - to binarize with a threshold).

  • wrapped (Optional[IHostedModel]) – if specified other parameters are ignored and the wrapper just proxies the (already wrapped) model provided here (mostly intended for internal usage).

  • soft_predictions (bool) – If True the model supports soft scoring for predictions (default False)

  • label_ordering (Optional[List[Any]]) – For soft scoring models the ordering of the classification labels in the scoring vector

  • threshold (Optional[float]) – For binary classifiers whose soft-scores are returned as a 1-dimensional array of scores, one for each input row, the threshold to apply. Scores greater than or equal to the threshold will receive the second label (or 1 rather than 0 if no labels provided)

Note - the underlying model and any encoder and decoders used must be picklable.
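The threshold rule described for binary soft scores can be sketched without Certifai as follows. The function name `apply_threshold` is hypothetical; the behavior mirrors the parameter description: scores greater than or equal to the threshold receive the second label, or 1 when no labels are provided.

```python
from typing import Any, List, Optional, Sequence


def apply_threshold(
    scores: Sequence[float],
    threshold: float,
    label_ordering: Optional[List[Any]] = None,
) -> List[Any]:
    """Convert a 1-dimensional array of soft scores into hard binary labels."""
    # Without explicit labels, fall back to 0 / 1.
    labels = label_ordering if label_ordering is not None else [0, 1]
    return [labels[1] if score >= threshold else labels[0] for score in scores]


hard = apply_threshold([0.2, 0.5, 0.9], threshold=0.5)
named = apply_threshold([0.2, 0.5, 0.9], threshold=0.5, label_ordering=["deny", "approve"])
```

Note that a score exactly equal to the threshold falls on the second-label side.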

property model

The wrapped model.

Getter

Returns the wrapped model suitable for use by Certifai.

Type

IHostedModel

class certifai.scanner.builder.CertifaiModelConnector(name: str, module_name: str, class_name: str, description: Optional[str] = None, model_args: Dict[str, str] = {}, model_secrets: Dict[str, str] = {})

Metadata for a model connector

Parameters
  • name (str) – Free text descriptive name of the connector.

  • module_name (str) – python module containing the external connector (e.g. ‘certifai.connectors’)

  • class_name (str) – name of python class of the connector

  • description (Optional[str]) – Optional description

  • model_args (Dict[str,str]) – arguments to pass to the model connector instances

  • model_secrets (Dict[str,str]) – secrets to pass to the model connector instances - substrings of the values of the form {<NAME>} will have the <NAME> replaced by the contents of the environment variable of that name
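The environment-variable substitution applied to secrets can be pictured with the standalone sketch below. It is not Certifai code: the function name `expand_secrets` is hypothetical, and two details are assumptions of the sketch, namely that the whole `{NAME}` placeholder (braces included) is replaced and that placeholders naming an unset variable are left unchanged.

```python
import os
import re
from typing import Dict, Mapping, Optional

# Matches placeholders such as {API_TOKEN}.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_secrets(
    model_secrets: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Replace {NAME} substrings in each secret value with environment contents."""
    env = os.environ if environ is None else environ

    def substitute(match: "re.Match[str]") -> str:
        # Assumption: an unset variable leaves the placeholder untouched.
        return env.get(match.group(1), match.group(0))

    return {key: _PLACEHOLDER.sub(substitute, value) for key, value in model_secrets.items()}


expanded = expand_secrets(
    {"Authorization": "Bearer {API_TOKEN}"},
    environ={"API_TOKEN": "abc123"},
)
```

Keeping only the placeholder in the scan definition means the secret value itself never has to be written to disk.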

property name

Descriptive name of the connector.

Getter

Returns the connector name.

Type

str

property module_name

Module containing the connector.

Getter

Returns the fully qualified module name.

Type

str

property class_name

Class name of the connector.

Getter

Returns the name of the python class of the connector.

Type

str

property description

Description of the connector.

Getter

Returns the optional description.

Type

Optional[str]

property model_args

Arguments provided to instantiated model connector instances.

Getter

Returns the arguments to be provided to connector instances.

Type

Dict[str,str]

property model_secrets

Secrets provided to instantiated model connector instances.

Getter

Returns the secrets to be provided to connector instances.

Type

Dict[str,str]

class certifai.scanner.builder.CertifaiModel(id: str, name: Optional[str] = None, author: Optional[str] = None, version: Optional[str] = None, performance_metric_values: Optional[List[Tuple]] = None, description: Optional[str] = None, predict_endpoint: Optional[str] = None, max_batch_size: Optional[int] = None, local_predictor: Optional[certifai.scanner.builder.CertifaiPredictorWrapper] = None, supports_soft_scoring: bool = False, prediction_value_order: Optional[List[Any]] = None, connector: Optional[certifai.scanner.builder.CertifaiModelConnector] = None, json_strict: bool = False)

Metadata describing a model, and allowing manipulation of this metadata.

Parameters
  • id (str) – Identifier for the model used to refer to it.

  • name (Optional[str]) – Descriptive name for the model. Defaults to the value provided for id

  • author (Optional[str]) – Optional author name.

  • version (Optional[str]) – Optional model version string.

  • performance_metric_values (Optional[List[Tuple]]) – Optional asserted list of (metric_name, value) pairs for metrics of the model - primarily intended to allow injection of externally measured values for metrics not directly supported by Certifai.

  • description (Optional[str]) – Optional free text description of the model.

  • predict_endpoint (Optional[str]) – URL of the prediction endpoint of the model (if non-process-local).

  • max_batch_size (Optional[int]) – Optional limit on prediction batch sizes to call the model with.

  • local_predictor (Optional[CertifaiPredictorWrapper]) – wrapped model object (if using a local in-process model).

  • supports_soft_scoring (bool) – If True model is expected to return soft scores as well as hard predictions

  • prediction_value_order (Optional[List[Any]]) – For soft scoring models the ordering of the class labels in the score vector

  • connector (Optional[CertifaiModelConnector]) – Optional connector to use to attach to the model

  • json_strict (bool) – If True data will be serialized to send to the model’s predict endpoint in strict JSON, encoding missing data as JSON nulls. If False then JavaScript extended JSON will be used which encodes missing values as NaN. Defaults to False
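The difference between the two JSON encodings can be seen with Python's standard json module, independently of Certifai. The function name `encode_row` is hypothetical; it only illustrates how a missing value appears on the wire in each mode.

```python
import json
import math
from typing import Any, List


def encode_row(row: List[Any], json_strict: bool = False) -> str:
    """Serialize one data row, encoding missing values (NaN) per the chosen mode."""
    if json_strict:
        # Strict JSON has no NaN literal, so missing values become null.
        cleaned = [None if isinstance(v, float) and math.isnan(v) else v for v in row]
        return json.dumps(cleaned, allow_nan=False)
    # Extended JSON: Python's encoder emits the bare token NaN.
    return json.dumps(row)


strict_payload = encode_row([1.5, float("nan")], json_strict=True)
extended_payload = encode_row([1.5, float("nan")], json_strict=False)
```

A model server built on a strict JSON parser will typically reject the extended form, which is the usual reason to enable json_strict.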

property name

Model name.

Getter

Returns the human-readable name of the model.

Type

str

property id

Model id.

Getter

Returns the identifier for the model by which it may be referenced.

Type

str

property author

Model author.

Getter

Returns the author string if provided.

Setter

Set author string for the model.

Type

Optional[str]

property version

Model version.

Getter

Returns the version string if provided.

Setter

Set version string for the model.

Type

Optional[str]

property description

Model description.

Getter

Returns the description string if provided.

Setter

Set description string for the model.

Type

Optional[str]

property predict_endpoint

Model predict endpoint URL

Getter

Returns the URL of the (remote) model prediction endpoint, if provided

Setter

Sets the prediction endpoint URL for the model

Type

Optional[str]

property max_batch_size

Model max batch size.

Getter

Returns the max batch size to send to the model.

Setter

Sets the provided restriction on max batch size (None => unlimited).

Type

Optional[int]

property supports_soft_scoring

Whether the model returns soft scores.

Getter

True if the model is expected to support soft scores.

Setter

Sets whether the model is expected to support soft scores.

Type

bool

property prediction_value_order

Ordering of class labels in the score vector returned by the model.

Getter

Returns the ordering.

Setter

Sets the ordering.

Type

List[Any]

property local_predictor

Wrapped local (in-process) model.

Getter

Returns the wrapped model being used.

Setter

Sets a local wrapped model (see CertifaiPredictorWrapper) to use.

Type

Optional[CertifaiPredictorWrapper]

property performance_metric_values

List of asserted metric values for this model.

Getter

Returns any asserted values as (metric name, value) tuples.

Type

List[Tuple[str,Any]]

add_performance_metric_value(metric_name: str, metric_value: Any)

Add an asserted performance metric value.

Parameters
  • metric_name (str) – Name of the metric to assert a value for.

  • metric_value (Any) – Value to assert.

remove_performance_metric_value(metric_name: str)

Remove an asserted metric value

Parameters

metric_name (str) – Name of the metric to remove the asserted value for.

property connector
property json_strict

Whether data sent to this model is encoded as strict JSON.

Getter

True if the model expects strict JSON (missing encoded as null as opposed to NaN).

Setter

Sets whether the model expects strict JSON

Type

bool

class certifai.scanner.builder.CertifaiFeatureDataType(args: dict)

Class describing feature datatypes supported by Certifai - immutable once instantiated. Static methods are provided for instantiating each supported type.

property value_dict

Data type details as a dictionary.

Getter

Returns the metadata dict for this datatype. In particular this will contain a key named data_type which will be one of

  • ‘numerical-int’

  • ‘numerical-float’

  • ‘categorical’

Other keys vary by datatype.

Type

dict

static int(min: Optional[int] = None, max: Optional[int] = None, spread: Optional[float] = None) → certifai.scanner.builder.CertifaiFeatureDataType

Constructor for an ‘int’ feature.

Parameters
  • min (Optional[int]) – optional floor value this feature can take.

  • max (Optional[int]) – optional ceiling value this feature can take.

  • spread (Optional[float]) – optional measure of spread (typically MAD or std. deviation).

Returns

instantiated CertifaiFeatureDataType

Return type

CertifaiFeatureDataType

static float(min: Optional[float] = None, max: Optional[float] = None, spread: Optional[float] = None) → certifai.scanner.builder.CertifaiFeatureDataType

Constructor for a ‘float’ feature.

Parameters
  • min (Optional[float]) – optional floor value this feature can take.

  • max (Optional[float]) – optional ceiling value this feature can take.

  • spread (Optional[float]) – optional measure of spread (typically MAD or std deviation).

Returns

instantiated CertifaiFeatureDataType

Return type

CertifaiFeatureDataType

static categorical(values: Optional[Iterable[Union[str, int]]] = None, value_columns: Optional[List[Tuple[str, Union[str, int]]]] = None, target_encodings: Optional[Iterable[float]] = None, categorical_type: Optional[str] = None) → certifai.scanner.builder.CertifaiFeatureDataType

Constructor for a ‘categorical’ feature.

Parameters
  • values (Optional[Iterable[Union[str,int]]]) – optional list of possible values this categorical field may take on. If omitted then Certifai will infer the value set from the available data.

  • value_columns (Optional[List[Tuple[str,Union[str,int]]]]) – optional list of categorical value -> column name mappings for one-hot encoded data. If both value_columns and values are specified then they must have exactly the same set of values. If only value_columns is specified then the values are inferred. If only values is specified then the feature is assumed to be value-encoded in a single column.

  • target_encodings (Optional[Iterable[float]]) – optional list of encodings for the values in values used to represent those values in the dataset

  • categorical_type (Optional[str]) – optional string specifying the data type the categorical feature is. Must be one of: ‘string’, ‘int’, or ‘auto’. For example, specifying ‘string’ would mean that the value 001 will be interpreted as the string ‘001’, instead of as the integer 1.

Returns

instantiated CertifaiFeatureDataType

Return type

CertifaiFeatureDataType
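The effect of categorical_type on a raw value such as 001 can be sketched in plain Python. This is not Certifai code and the function name `interpret_category` is hypothetical. The rule Certifai uses for ‘auto’ is not described in this reference, so the sketch simply returns the raw text in that case.

```python
from typing import Union


def interpret_category(raw: str, categorical_type: str = "auto") -> Union[str, int]:
    """Interpret a raw categorical value according to categorical_type."""
    if categorical_type == "string":
        return raw            # '001' stays the string '001'
    if categorical_type == "int":
        return int(raw)       # '001' becomes the integer 1
    if categorical_type == "auto":
        # Assumption: the inference rule is unspecified here, so leave the value as read.
        return raw
    raise ValueError("categorical_type must be 'string', 'int' or 'auto'")


as_string = interpret_category("001", "string")
as_int = interpret_category("001", "int")
```

The distinction matters for identifiers such as zip codes, where leading zeros carry meaning.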

class certifai.scanner.builder.CertifaiFeatureRestriction(args: dict)

Class describing feature change restrictions supported by Certifai - immutable once instantiated. Static methods are provided for instantiating each supported type.

property value_dict

Restriction details as a dictionary.

Getter

Returns the metadata dict for this restriction. In particular this will contain a key named constraint which will be one of

  • ‘constant’

  • ‘percentage’

  • ‘range’

Other keys vary by restriction type.

Type

dict

static range(min: Optional[int] = None, max: Optional[int] = None, direction: Optional[str] = None) → certifai.scanner.builder.CertifaiFeatureRestriction

Constructor for a range constraint on feature modifications in counterfactual production.

Parameters
  • min (Optional[int]) – optional floor value this feature can take.

  • max (Optional[int]) – optional ceiling value this feature can take.

  • direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static percentage(amount: float, direction: Optional[str] = None) → certifai.scanner.builder.CertifaiFeatureRestriction

Constructor for a percentage change constraint on feature modifications in counterfactual production.

Parameters
  • amount (float) – max percentage the feature may change by (can only be applied to numeric features).

  • direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction
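A percentage restriction with a direction can be read as a simple predicate on a proposed feature change. The sketch below is standalone and does not reflect Certifai's implementation. The function name `within_percentage` is hypothetical, and treating amount as percent units (10 meaning 10%) relative to the original value is an assumption.

```python
def within_percentage(original: float, candidate: float, amount: float, direction: str = "any") -> bool:
    """Check whether changing original to candidate respects a percentage restriction."""
    if direction not in ("any", "increase", "decrease"):
        raise ValueError("direction must be 'any', 'increase' or 'decrease'")
    delta = candidate - original
    if direction == "increase" and delta < 0:
        return False
    if direction == "decrease" and delta > 0:
        return False
    # Assumption: amount is expressed in percent of the original value.
    return abs(delta) <= abs(original) * amount / 100.0
```

For example, with an original value of 200 and an amount of 10, candidates between 180 and 220 pass under ‘any’, while only 180 to 200 pass under ‘decrease’.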

static constant() → certifai.scanner.builder.CertifaiFeatureRestriction

Constructor for a no-change constraint on feature modifications in counterfactual production.

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static standard_deviation(value: float, tolerance_value: Optional[float] = None, direction: Optional[str] = None) → certifai.scanner.builder.CertifaiFeatureRestriction

Constructor for a standard deviation constraint on feature modifications in counterfactual production.

Parameters
  • value (float) – number of standard deviations the feature may change by (can only be applied to numeric features).

  • tolerance_value (float) – additional number of standard deviations the feature may change by if no solutions could be found (not applicable to all scans)

  • direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’ (not applicable to all scans)

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static fixed_amount(value: float, tolerance_value: Optional[float] = None, direction: Optional[str] = None) → certifai.scanner.builder.CertifaiFeatureRestriction

Constructor for a fixed amount constraint on feature modifications in counterfactual production.

Parameters
  • value (float) – fixed amount that the feature may change by (can only be applied to numeric features).

  • tolerance_value (float) – additional amount the feature may change by if no solutions could be found (not applicable to all scans)

  • direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’ (not applicable to all scans)

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static value_set(values: List[str], tolerance_values: Optional[List[str]] = None) → certifai.scanner.builder.CertifaiFeatureRestriction

Constructor for an allowed value set constraint on feature modifications in counterfactual production.

Parameters
  • values (List[str]) – fixed set of values the feature may change to (can only be applied to categorical features).

  • tolerance_values (Optional[List[str]]) – additional values the feature may change to if no solutions could be found

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static value_map(values: Dict[str, List[str]], tolerance_values: Optional[Dict[str, List[str]]] = None)

Constructor for an allowed value mapping constraint on feature modifications in counterfactual production.

Parameters
  • values (Dict[str,List[str]]) – dictionary mapping of categorical values to values the feature may change to (can only be applied to categorical features).

  • tolerance_values (Optional[Dict[str,List[str]]]) – additional dictionary mapping of categorical values to values the feature may change to

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction
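The shape of a value map and its tolerance counterpart can be illustrated with a standalone lookup. This is a sketch only: the function name `allowed_targets` and the `relaxed` flag are hypothetical, and returning no targets for a value missing from the map is an assumption.

```python
from typing import Dict, List, Optional


def allowed_targets(
    current: str,
    values: Dict[str, List[str]],
    tolerance_values: Optional[Dict[str, List[str]]] = None,
    relaxed: bool = False,
) -> List[str]:
    """List the values a categorical feature may change to from its current value."""
    # Assumption: a value absent from the map has no permitted changes.
    targets = list(values.get(current, []))
    if relaxed and tolerance_values:
        # Tolerance values widen the set when no solution was found initially.
        for extra in tolerance_values.get(current, []):
            if extra not in targets:
                targets.append(extra)
    return targets


education_map = {"highschool": ["bachelors"], "bachelors": ["masters"]}
education_tolerance = {"highschool": ["masters"]}
```

With these mappings, a ‘highschool’ value may initially change only to ‘bachelors’, and additionally to ‘masters’ once the tolerance values are brought in.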

class certifai.scanner.builder.CertifaiFeatureSchema(name: str, data_type: Optional[certifai.scanner.builder.CertifaiFeatureDataType] = None)

Class describing a feature - immutable once instantiated.

Parameters
  • name (str) – The name of the feature (should match any column headers in the dataset if any).

  • data_type (CertifaiFeatureDataType) – Type of data the feature holds.

property name

Feature name.

Getter

Returns the name of the feature.

Type

str

property data_type

Feature data type

Getter

Returns the data type of the feature.

Type

CertifaiFeatureDataType

class certifai.scanner.builder.CertifaiDataSchema(features: Optional[List[certifai.scanner.builder.CertifaiFeatureSchema]] = None, outcome_feature_name: Optional[str] = None, predicted_outcome_feature_name: Optional[str] = None, hidden_feature_names: List[str] = [], defined_feature_order: bool = False)

Class describing a dataset’s feature schema, and allowing manipulation of this schema.

Parameters
  • features (Optional[List[CertifaiFeatureSchema]]) – features specified by the scan definition. This may be a subset of all the features present. Any that are omitted will be inferred from the available data.

  • outcome_feature_name (Optional[str]) – name of the feature holding the ground truth label/value (if present). Note: any outcome feature column will be removed before passing data to the model.

  • predicted_outcome_feature_name (Optional[str]) – name of the feature holding the predicted label/value (if present). Note: any predicted_outcome feature column will be removed before passing data to the model.

  • defined_feature_order (bool) – If present and True asserts that the list order of features in the schema matches the layout of columns in the dataset. If True then all columns must be present. Intended for use in cases where the dataset does not specify a column ordering itself.

property features

Features defined by the schema.

Getter

Returns the list of defined features.

Type

Optional[List[CertifaiFeatureSchema]]

property defined_feature_order

Whether the schema defines the column ordering of the data.

Getter

Returns True if the schema defines the column ordering.

Setter

Sets whether the schema defines the column ordering of the data.

Type

bool

add_feature(name: str, data_type: certifai.scanner.builder.CertifaiFeatureDataType)

Add a feature

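Parameters
  • name (str) – Name of feature to add.

  • data_type (CertifaiFeatureDataType) – data type of feature to add.

Note - the feature will be appended to the current list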

insert_feature(name: str, index: int, data_type: certifai.scanner.builder.CertifaiFeatureDataType)

Insert a feature.

Parameters
  • name (str) – Name of feature to add.

  • index (int) – Columnar position to insert the feature at (0-based).

  • data_type (CertifaiFeatureDataType) – data type of feature to add.

update_feature(name: str, data_type: certifai.scanner.builder.CertifaiFeatureDataType)

Update an existing feature by name - preserves its index in the feature list

Parameters
  • name (str) – Name of feature to update.

  • data_type (CertifaiFeatureDataType) – new data type of feature being updated.

remove_feature(name: str)

Remove a feature.

Parameters

name (str) – Name of feature to remove.

infer_features_from_data(dataset_source: certifai.scanner.builder.CertifaiDatasetSource)
property outcome_feature_name

Name of the (ground truth) outcome column (if any).

Getter

Returns the feature name of the outcome feature.

Setter

Sets the name of the (ground truth) outcome column.

Type

Optional[str]

property predicted_outcome_feature_name

Name of the predicted outcome column (if any).

Getter

Returns the feature name of the predicted outcome feature.

Setter

Sets the name of the predicted outcome column.

Type

Optional[str]

property hidden_feature_names

Names of hidden (from the model) features (if any).

Getter

Returns a list of the names of features which are not provided to the model.

Setter

Sets the list of names of features which are not provided to the model.

Type

List[str]

Note - any specified outcome_feature_name or predicted_outcome_feature_name will automatically be hidden from the model and need not occur in this list
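The combined effect of the outcome, predicted outcome and hidden feature settings on what a model receives can be sketched as a column filter. The function name `model_input_columns` is hypothetical and the code does not use Certifai.

```python
from typing import Iterable, List, Optional


def model_input_columns(
    all_columns: Iterable[str],
    outcome_feature_name: Optional[str] = None,
    predicted_outcome_feature_name: Optional[str] = None,
    hidden_feature_names: Iterable[str] = (),
) -> List[str]:
    """Return the columns that remain after removing everything hidden from the model."""
    hidden = set(hidden_feature_names)
    # Outcome columns are hidden automatically and need not be listed explicitly.
    for implicit in (outcome_feature_name, predicted_outcome_feature_name):
        if implicit is not None:
            hidden.add(implicit)
    return [column for column in all_columns if column not in hidden]


visible = model_input_columns(
    ["age", "income", "customer_id", "approved", "approved_pred"],
    outcome_feature_name="approved",
    predicted_outcome_feature_name="approved_pred",
    hidden_feature_names=["customer_id"],
)
```

In this example only the age and income columns would reach the model.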

class certifai.scanner.builder.CertifaiDatasetSource(args)

Class describing dataset storage formats supported by Certifai - immutable once instantiated. Static methods are provided for instantiating each supported format.

property value_dict

Data source details as a dictionary.

Getter

Returns the metadata dict for this data source. In particular this will contain a key named file_type which will be one of

  • ‘csv’

  • ‘json’

  • ‘loaded’

Other keys vary by source type.

Type

dict

static json(url: str, lines: bool = True, orient: str = 'records', encoding: Optional[str] = None)

Constructor for a ‘json’ source.

Parameters
  • url (str) – Location the data may be loaded from. If no protocol is specified then ‘file:’ is assumed.

  • lines (bool) – If True then JSON lines format (default is True), else JSON list expected.

  • orient (str) – One of ‘records’, ‘columns’, ‘values’ (matching Pandas usage). Default is ‘records’

  • encoding (Optional[str]) – string encoding used - default is ‘utf-8’.

Returns

instantiated DatasetSource

Return type

DatasetSource
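The lines flag distinguishes JSON lines input (one record per line) from a single JSON list. The standalone sketch below shows both layouts for the ‘records’ orientation using only the standard library. The function name `parse_records` is hypothetical and other orientations are not covered.

```python
import json
from typing import Any, Dict, List


def parse_records(text: str, lines: bool = True) -> List[Dict[str, Any]]:
    """Parse 'records'-oriented JSON given either as JSON lines or as one JSON list."""
    if lines:
        # JSON lines: each non-empty line is an independent JSON object.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    # Otherwise the whole text is a single JSON list of objects.
    return json.loads(text)


json_lines_text = '{"age": 31, "income": 52000}\n{"age": 45, "income": 61000}\n'
json_list_text = '[{"age": 31, "income": 52000}, {"age": 45, "income": 61000}]'
```

Both inputs describe the same two records, so either layout yields the same parsed result.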

static csv(url: str, delimiter: str = ', ', escape_character: Optional[str] = None, quote_character: str = '"', has_header: bool = True, encoding: Optional[str] = None)

Constructor for a ‘csv’ source.

Parameters
  • url (str) – Location the data may be loaded from. If no protocol is specified then ‘file:’ is assumed.

  • delimiter (str) – Field delimiter used. Default is ‘,’.

  • escape_character (Optional[str]) – Escape character if any. Default is None.

  • quote_character (str) – Quote delimiter. Default is ‘"’.

  • has_header (bool) – Whether the source CSV has a header row specifying column names. Default is True.

Returns

instantiated DatasetSource

Return type

DatasetSource
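The CSV parameters correspond closely to options of Python's standard csv module, which makes their meaning easy to demonstrate outside Certifai. In the sketch below the function name `read_csv_text` is hypothetical, and naming headerless columns by position is an assumption made only for this example.

```python
import csv
import io
from typing import List, Optional, Tuple


def read_csv_text(
    text: str,
    delimiter: str = ",",
    escape_character: Optional[str] = None,
    quote_character: str = '"',
    has_header: bool = True,
) -> Tuple[List[str], List[List[str]]]:
    """Parse CSV text and return (column names, data rows)."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=delimiter,
        escapechar=escape_character,
        quotechar=quote_character,
    )
    rows = list(reader)
    if has_header:
        return rows[0], rows[1:]
    # Assumption: without a header row, columns are named by their position.
    width = len(rows[0]) if rows else 0
    return [str(i) for i in range(width)], rows


header, data = read_csv_text('name;note\nann;"a;b"\n', delimiter=";")
```

The quoted field keeps its embedded delimiter, which is the situation quote_character exists to handle.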

static dataframe(df)

Constructor for a ‘dataframe’ source (an already loaded Pandas dataframe).

Parameters

df (DataFrame) – Dataframe containing the data.

Returns

instantiated DatasetSource.

Return type

DatasetSource

class certifai.scanner.builder.CertifaiDataset(id: str, source: certifai.scanner.builder.CertifaiDatasetSource, name: Optional[str] = None, description: Optional[str] = None)

Metadata describing a dataset.

Parameters
  • id (str) – identifier string by which the dataset may be referenced.

  • source (DatasetSource) – source for the actual data in the dataset.

  • name (Optional[str]) – Optional human readable name of the dataset.

  • description (Optional[str]) – Optional free text description of the dataset.

class certifai.scanner.builder.CertifaiGroupingBucket(description: str, max: Optional[float] = None, values: Optional[List[Union[str, int]]] = None)

Metadata describing a value grouping bucket for feature values - immutable once instantiated.

Parameters
  • description (str) – Descriptive name of the bucket.

  • max (Optional[float]) – Optional maximum numerical value in the bucket (may only be used with numeric features).

  • values (Optional[List[Union[str,int]]]) – Optional explicit list of values falling within the bucket (primarily intended for use with categorical features).

property description

Description of the bucket.

Getter

Returns the bucket description string.

Type

str

property max

Maximum numeric value falling within the bucket (if specified).

Getter

Returns the bucket ceiling value.

Note - the floor of a bucket is determined by the ceiling of the previous bucket. The entire bucket list will be sorted on max values and a sentinel bucket with no maximum value should be included in the list.

Type

Optional[float]
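The way bucket ceilings partition a numeric feature can be sketched as follows. This is a standalone illustration, not Certifai's implementation. The function name `assign_bucket` is hypothetical, buckets are represented as plain (description, max) pairs, and treating the ceiling as inclusive is an assumption.

```python
from typing import List, Optional, Tuple

Bucket = Tuple[str, Optional[float]]  # (description, max); max of None marks the sentinel


def assign_bucket(value: float, buckets: List[Bucket]) -> str:
    """Return the description of the bucket a numeric value falls into."""
    # Sort on max, placing the sentinel bucket (no maximum) last.
    ordered = sorted(buckets, key=lambda b: (b[1] is None, b[1] if b[1] is not None else 0.0))
    for description, ceiling in ordered:
        # Assumption: a value equal to the ceiling belongs to that bucket.
        if ceiling is None or value <= ceiling:
            return description
    raise ValueError("no sentinel bucket covers this value")


age_buckets = [("senior", None), ("young", 30.0), ("middle", 60.0)]
```

Because each floor is implied by the previous ceiling, the buckets never overlap and the sentinel catches everything above the largest max.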

property values

List of values falling within the bucket if defined.

Getter

Returns the list of values or None if not defined.

Type

Iterable[Union[str,int]]

class certifai.scanner.builder.CertifaiGroupingFeature(name: str, buckets: Optional[Iterable[certifai.scanner.builder.CertifaiGroupingBucket]] = None)

Metadata describing a fairness grouping feature - immutable once instantiated

Parameters
  • name (str) – Feature name of the feature which defines the grouping.

  • buckets (Optional[Iterable[CertifaiGroupingBucket]]) – Optional definition for bucketing the values of the feature. If not specified then every unique value occurring in the data will be treated as its own group.

property name

Name of the grouping feature

Getter

Returns the feature name used to define the groups

Type

str

property buckets

List of grouping buckets.

Getter

Returns the list of grouping buckets for the grouping feature, if defined.

Type

Iterable[CertifaiGroupingBucket]

class certifai.scanner.builder.CertifaiScanBuilder(base: certifai.scanner.schemas.ScanTemplate)

Builder class for scan templates, with static method for instantiation, and methods for manipulation, persistence, and running of the defined scan.

property model_headers

Returns model headers defined in the scan.

Getter

Returns model headers as dict

Type

Dict

add_model_header(header_name: str, header_value: str, model_id: Optional[str] = None)

Add or update a model header. If model_id is provided, the header is added to that specific model; otherwise it is set as a default for all models.

Parameters
  • header_name (str) – Name of the model header to inject.

  • header_value (str) – Value associated with the header.

  • model_id (Optional[str]) – id of the model to add or update the header for

remove_model_header(header_name: str, model_id: Optional[str] = None)

Remove a model header given header_name. If model_id is provided then the header is removed for that specific model, otherwise it is removed from all models (default case).

Parameters
  • header_name (str) – name of the header to remove.

  • model_id (Optional[str]) – id of the model to remove the header from
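The relationship between default headers and per-model headers can be pictured as a dictionary merge. The sketch below is independent of Certifai: the function name `effective_headers` and the two-dictionary layout are hypothetical, and letting a per-model header override a default of the same name is an assumption.

```python
from typing import Dict


def effective_headers(
    default_headers: Dict[str, str],
    per_model_headers: Dict[str, Dict[str, str]],
    model_id: str,
) -> Dict[str, str]:
    """Combine default headers with any headers set for one specific model."""
    merged = dict(default_headers)
    # Assumption: a model-specific header wins over a default with the same name.
    merged.update(per_model_headers.get(model_id, {}))
    return merged


defaults = {"Accept": "application/json"}
by_model = {"model_a": {"Authorization": "Bearer token-a"}}
```

A model without specific headers simply receives the defaults.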

property template

Retrieve a scan template which can be serialized to dictionary form for saving as JSON or YAML by calling its dump method.

Returns

a ScanTemplate instance

Return type

ScanTemplate

property author

Author of the template.

Getter

Returns the author of the scan template, if defined.

Setter

Sets the author of the scan template.

Type

Optional[str]

property use_case_name

Use case name - human readable name for a use case (i.e. a prediction task).

Getter

Returns the use case name.

Setter

Sets the use case name.

Type

str

property use_case_id

Use case id - id by which the use case may be referenced.

Getter

Returns the use case id.

Setter

Sets the use case id.

Type

str

property evaluation_name

Evaluation name - name of a particular evaluation (scan run).

Getter

Returns the evaluation name.

Setter

Sets the evaluation name.

Type

str

property evaluation_environment

Evaluation environment - free text string with evaluation environment details.

Getter

Returns the evaluation environment string.

Setter

Sets the evaluation environment string.

Type

str

property evaluation_description

Evaluation description - free text string describing the evaluation.

Getter

Returns the evaluation description string.

Setter

Sets the evaluation description string.

Type

str

property evaluation_dataset_id

Evaluation dataset id - specifies which dataset to use as the evaluation set.

Getter

Returns the evaluation dataset id.

Setter

Sets the evaluation dataset id.

Type

str

property explanation_dataset_id

Explanation dataset id - specifies which dataset to generate explanations for if the ‘explanation’ evaluation type is included in the scan.

Getter

Returns the explanation dataset id.

Setter

Sets the explanation dataset id.

Type

Optional[str]

property test_dataset_id

Test dataset id - specifies which dataset to measure metrics on if the ‘performance’ evaluation type is included in the scan.

Getter

Returns the test dataset id.

Setter

Sets the test dataset id.

Type

Optional[str]

property prediction_task

Metadata for the prediction task.

Getter

Returns the prediction task metadata.

Setter

Sets the prediction task metadata.

Type

CertifaiPredictionTask

property output_path

Output path to which reports will be written. If set to None, output goes to ‘./reports’ relative to the scan base path, unless explicitly overridden either by the run call or by the SCAN_RESULTS_DIRECTORY environment variable.

Getter

Returns the output path.

Setter

Sets the output path of the scan.

Type

Optional[str]

add_evaluation_type(value: str)

Add an evaluation type to the scan.

Parameters

value (str) – type of evaluation to add. Must be one of: ‘fairness’, ‘robustness’, ‘explanation’, ‘explainability’, ‘performance’.
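
Because only a fixed set of evaluation types is accepted, it can be convenient to validate a whole list before touching the builder. The sketch below does that; the helper name and the constant are hypothetical, scan is assumed to be a CertifaiScanBuilder, and how the library itself reacts to an unknown type is not covered here.

```python
from typing import Iterable

# The evaluation types listed in the add_evaluation_type documentation.
ALLOWED_EVALUATION_TYPES = frozenset(
    {"fairness", "robustness", "explanation", "explainability", "performance"}
)


def add_evaluation_types(scan, evaluation_types: Iterable[str]) -> None:
    """Validate every requested type first, then add them to the scan."""
    requested = list(evaluation_types)
    unknown = [t for t in requested if t not in ALLOWED_EVALUATION_TYPES]
    if unknown:
        # Fail before any type is added, so the scan is left unchanged.
        raise ValueError(f"Unsupported evaluation types: {unknown}")
    for evaluation_type in requested:
        scan.add_evaluation_type(evaluation_type)
```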

remove_evaluation_type(value: str)

Remove an evaluation type from the scan.

Parameters

value (str) – type of evaluation to remove.

property evaluation_types

Evaluation types included in the scan.

Getter

Returns the list of included evaluation types.

Type

Iterable[str]

property hyper_parameter_overrides

Hyper-parameter overrides to apply to the analysis.

Getter

Returns a dictionary of hyper-parameter overrides.

Setter

Specifies a dictionary of hyper-parameter overrides.

Type

dict

add_fairness_grouping_feature(feature: certifai.scanner.builder.CertifaiGroupingFeature)

Add a fairness grouping feature.

Parameters

feature (CertifaiGroupingFeature) – grouping feature definition.

remove_fairness_grouping_feature(name: str)

Remove a fairness grouping feature.

Parameters

name (str) – name of the grouping feature to remove.

property fairness_grouping_features

Fairness grouping features defined for the scan.

Getter

Returns a list of defined grouping features.

Type

List[CertifaiGroupingFeature]

property metrics

Performance metrics defined for the scan.

Getter

Returns a list of defined performance metrics.

Type

List[ModelMetric]

add_metric(metric: certifai.scanner.builder.CertifaiModelMetric)

Add a performance metric.

Parameters

metric (CertifaiModelMetric) – metric to add.

remove_metric(name: str)

Remove a performance metric

Parameters

name (str) – Name of the metric to remove

property explanation_types

Explanation types defined for the scan.

Getter

Returns a list of defined explanation types.

Type

List[str]

add_explanation_type(explanation: str)

Add an explanation type.

Parameters

explanation (str) – explanation type to add.

remove_explanation_type(name: str)

Remove an explanation type

Parameters

name (str) – Name of the explanation type to remove

property primary_explanation_type

Explanation type to select for the explainability axis of the ATX score.

Getter

Returns the name of the selected explanation type for use in ATX calculation.

Setter

Sets the name of the selected explanation type for use in ATX calculation.

Accepts a str specifying the explanation type on set, returns a str naming the explanation type on get.

property fairness_metrics

Fairness metrics defined for the scan.

Getter

Returns a list of defined fairness metrics.

Type

List[str]

add_fairness_metric(metric: str)

Add a fairness metric.

Parameters

metric (str) – metric to add.

remove_fairness_metric(name: str)

Remove a fairness metric

Parameters

name (str) – Name of the metric to remove

property primary_fairness_metric

The fairness metric to use as the Fairness aspect for calculating the ATX score.

Getter

Returns the name of the selected fairness metric for use in ATX calculation.

Setter

Sets the name of the selected fairness metric for use in ATX calculation.

Accepts an Optional[str] specifying the primary fairness metric (if any) on set, returns the name of the fairness metric as an Optional[str] on get.

property atx_performance_metric

Metric to select for the performance axis of the ATX score.

Getter

Returns the name of the selected performance metric for use in ATX calculation.

Setter

Sets the name of the selected performance metric for use in ATX calculation.

Accepts an Optional[str] specifying the performance metric name (if any) on set, returns Optional[CertifaiModelMetric] of the metric instance on get.

add_model(model: certifai.scanner.builder.CertifaiModel)

Add a model to the scan.

Parameters

model (CertifaiModel) – metadata of the model to add.

remove_model(id: str)

Remove a model from the scan.

Parameters

id (str) – Removes the model with the specified id.

property models

Models included in the scan.

Getter

Returns a list of included models.

Type

List[CertifaiModel]

add_dataset(dataset: certifai.scanner.builder.CertifaiDataset)

Add a dataset.

Parameters

dataset (CertifaiDataset) – the dataset to add.

remove_dataset(dataset_id: str)

Remove a dataset.

Parameters

dataset_id (str) – Dataset to remove (by id).

property datasets

Datasets defined by the scan.

Getter

Returns a list of defined datasets.

Type

List[CertifaiDataset]

property dataset_schema

Dataset schema used by the scan use case.

Getter

Returns the dataset schema.

Setter

Sets the dataset schema.

Type

CertifaiDataSchema

property feature_restrictions

Get restrictions on feature changes made during counterfactual production

Returns

dictionary of restrictions keyed on feature name

Return type

Dict[str,CertifaiFeatureRestriction]

add_feature_restriction(feature_name: str, restriction: certifai.scanner.builder.CertifaiFeatureRestriction)

Add a restriction on the changes that can be made to a feature during counterfactual production.

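Parameters
  • feature_name (str) – feature to restrict

  • restriction (CertifaiFeatureRestriction) – restriction to apply to the feature
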
remove_feature_restriction(feature_name: str)

Remove a restriction on the changes that can be made to a feature during counterfactual production

Parameters

feature_name (str) – feature to de-restrict

extract_yaml() → str

Extract the scan as a YAML definition.

Returns

string containing the scan template encoded as YAML.

Return type

str

save(file)

Save the scan template to a file.

Parameters

file – file object opened for write to which the definition is to be saved.
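
The two persistence methods might be wrapped as in the sketch below. The helper names are hypothetical; scan is assumed to be a CertifaiScanBuilder and builder_cls the CertifaiScanBuilder class itself. Opening the file in text mode is an assumption, since the documentation only asks for a file object opened for write.

```python
def save_scan(scan, path: str) -> None:
    """Write the scan definition to `path` using the builder's save method."""
    # save() expects a file object that is already open for writing
    # (text mode is assumed here).
    with open(path, "w") as f:
        scan.save(f)


def clone_scan(scan, builder_cls):
    """Copy a scan by round-tripping it through its YAML definition."""
    return builder_cls.from_yaml(scan.extract_yaml())
```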

run_preflight(model_id: Optional[str] = None, base_path: Optional[str] = None, callback: Optional[Callable[certifai.common.progress_task.ProgressUpdate, None]] = <certifai.common.progress_task.ProgressBarListener object>, refresh: bool = True)

Run the preflight scan (in-process).

Parameters
  • model_id (Optional[str]) – Optional specific model id to restrict the scan to

  • base_path (Optional[str]) – Optional base path to evaluate relative paths in the scan definition with respect to (if not specified then current working directory is assumed).

  • callback (Optional[Callable[[ProgressUpdate],None]]) – Optional callback function to receive progress updates as preflight checks are completed. If not specified, a default will be used that prints to stdout. Set to None to receive no progress updates. This is not applicable when refresh is False. A ProgressUpdate is a NamedTuple with fields units_complete, total_num_units, and summary.

  • refresh (bool) – If True all preflight checks will be run and the latest results will be returned. Otherwise, the results will be computed from existing preflight report data. Defaults to True.

Returns

a nested dictionary of messages produced during the preflight scan. The top level keys are model ids and the second level keys are message types, each of which maps to a list of strings.

Return type

dict
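
The nested result described above (model id, then message type, then a list of strings) can be flattened for logging as in this sketch. The helper name is hypothetical, and the model ids, message type names and message texts in the sample are made up for illustration.

```python
from typing import Dict, Iterator, List, Tuple

PreflightResult = Dict[str, Dict[str, List[str]]]


def iter_preflight_messages(result: PreflightResult) -> Iterator[Tuple[str, str, str]]:
    """Yield (model_id, message_type, message) from a run_preflight result."""
    for model_id, by_type in result.items():
        for message_type, messages in by_type.items():
            for message in messages:
                yield model_id, message_type, message


# Hypothetical result; ids, message types and texts are invented.
sample = {
    "model_a": {"warnings": ["slow predictions"], "errors": []},
    "model_b": {"warnings": [], "errors": ["endpoint unreachable"]},
}
flat = list(iter_preflight_messages(sample))
```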

run_explain(precalculate: bool = False, fast: bool = False, sampling: bool = False, model_id: Optional[str] = None, base_path: Optional[str] = None, explanation_format: str = 'csv', callback: Optional[Callable[certifai.common.progress_task.ProgressUpdate, None]] = <certifai.common.progress_task.ProgressBarListener object>, **kwargs)

Run an explanation scan (in-process).

Parameters
  • precalculate (bool) – If True then baselines for the model/use case will be precalculated and stored for use in fast explanations. Defaults to False.

  • fast (bool) – If True then fast explanations will be used, which is suitable for bulk-explanation of large datasets. Fast explanation requires the precalculate step to have been performed for the model and use case previously (or in the same call). Defaults to False.

  • sampling (bool) – If True then Counterfactual Sampling will be used. This is suitable for use cases that have a large representative evaluation dataset. Defaults to False.

  • model_id (Optional[str]) – Optional specific model id to restrict the scan to

  • base_path (Optional[str]) – Optional base path to evaluate relative paths in the scan definition with respect to (if not specified then current working directory is assumed).

  • explanation_format (str) – Format in which to write the explanations, must be one of: ‘csv’, ‘jsonlines’, ‘inline’. If either ‘csv’ or ‘jsonlines’, then explanations will be written in a separate file and the filename will be specified in the scan report. If ‘inline’ the explanations will be included in the scan report. This is not applicable when precalculate is True. Defaults to ‘csv’.

  • callback (Optional[Callable[[ProgressUpdate],None]]) – Optional callback function to receive progress updates as evaluations are completed. If not specified, a default will be used that prints to stdout. Set to None to receive no progress updates. A ProgressUpdate is a NamedTuple with fields units_complete, total_num_units, and summary.

Returns

a nested dictionary. If precalculate is True, the top level keys are the model ids and each value is a dictionary with a status, a possible error message, and the location of the persisted calculations. Otherwise, the top level keys are the evaluation types and the second level keys are the model ids, within which is the report JSON represented in dictionary format.

Return type

dict
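
A small helper can assemble the keyword arguments for run_explain and reject an unsupported explanation_format up front. This is only a sketch: the helper name and constant are hypothetical, and it checks just the constraints stated in the parameter descriptions above.

```python
from typing import Any, Dict

# Formats listed in the explanation_format documentation.
EXPLANATION_FORMATS = ("csv", "jsonlines", "inline")


def explain_kwargs(precalculate: bool = False, fast: bool = False,
                   explanation_format: str = "csv") -> Dict[str, Any]:
    """Build keyword arguments for run_explain, rejecting unknown formats."""
    if explanation_format not in EXPLANATION_FORMATS:
        raise ValueError(f"explanation_format must be one of {EXPLANATION_FORMATS}")
    kwargs: Dict[str, Any] = {"precalculate": precalculate, "fast": fast}
    # explanation_format is not applicable when precalculate is True, so omit it.
    if not precalculate:
        kwargs["explanation_format"] = explanation_format
    return kwargs
```

A typical sequence would be scan.run_explain(**explain_kwargs(precalculate=True)) once, followed by scan.run_explain(**explain_kwargs(fast=True)) for bulk explanations, where scan is a CertifaiScanBuilder.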

run(model_id: Optional[str] = None, report: Optional[str] = None, write_reports: bool = True, base_path: Optional[str] = None, callback: Optional[Callable[certifai.common.progress_task.ProgressUpdate, None]] = <certifai.common.progress_task.ProgressBarListener object>)

Run the scan (in-process).

Parameters
  • model_id (Optional[str]) – Optional specific model id to restrict the scan to.

  • report (Optional[str]) – Optional specific report (evaluation type) to restrict the scan to.

  • write_reports (bool) – Whether to write report files for each model evaluation to the scan’s output directory (by default ‘./reports’ relative to base_path). Default is True.

  • base_path (Optional[str]) – Optional base path to evaluate relative paths in the scan definition with respect to (if not specified then current working directory is assumed).

  • callback (Optional[Callable[[ProgressUpdate],None]]) – Optional callback function to receive progress updates as evaluations are completed. If not specified, a default will be used that prints to stdout. Set to None to receive no progress updates. A ProgressUpdate is a NamedTuple with fields units_complete, total_num_units, and summary.

Returns

nested dictionary of reports. Top level keys are the evaluation type, second level keys are the model ids, within which is the report JSON represented in dictionary format.

Return type

dict
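
The sketch below shows one way to write a progress callback and to walk the nested report dictionary that run returns. The ProgressUpdate class here is a local stand-in carrying the three documented fields (their types are assumed), and the helper names are hypothetical.

```python
from typing import Any, Dict, Iterator, List, NamedTuple, Tuple


class ProgressUpdate(NamedTuple):
    """Local stand-in with the three documented fields (types are assumed)."""
    units_complete: int
    total_num_units: int
    summary: str


def make_progress_logger(lines: List[str]):
    """Return a callback that appends one formatted line per progress update."""
    def on_progress(update) -> None:
        total = update.total_num_units
        percent = 100.0 * update.units_complete / total if total else 0.0
        lines.append(f"{percent:5.1f}% {update.summary}")
    return on_progress


def iter_reports(reports: Dict[str, Dict[str, Any]]) -> Iterator[Tuple[str, str, Any]]:
    """Yield (evaluation_type, model_id, report) from a run() result."""
    for evaluation_type, by_model in reports.items():
        for model_id, report in by_model.items():
            yield evaluation_type, model_id, report


log: List[str] = []
callback = make_progress_logger(log)
callback(ProgressUpdate(units_complete=1, total_num_units=4, summary="fairness"))
```

With a real builder, the callback would be passed as scan.run(callback=make_progress_logger(log)) and the returned dictionary handed to iter_reports.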

static from_file(filename: str)

Load a scan template from file.

Parameters

filename (str) – path to template file to read.

Returns

Instantiated ScanBuilder with metadata from the template that was read.

Return type

CertifaiScanBuilder

static from_yaml(as_yaml: str)

Load a scan template from a YAML string.

Parameters

as_yaml (str) – Definition to load as YAML string.

Returns

Instantiated ScanBuilder with metadata from the template that was read.

Return type

CertifaiScanBuilder

static create(use_case_name: str, use_case_id: Optional[str] = None, evaluation_name: Optional[str] = None, environment: Optional[str] = None, description: Optional[str] = None, prediction_task: certifai.scanner.builder.CertifaiPredictionTask = <certifai.scanner.builder.CertifaiPredictionTask object>, output_path: Optional[str] = None)

Create a new template builder.

Parameters
  • use_case_name (str) – Name of the prediction use case.

  • use_case_id (Optional[str]) – Id by which the use case will be referenced. Defaults to the name if omitted.

  • evaluation_name (Optional[str]) – Name of the evaluation. Defaults to the use case name if not provided.

  • environment (Optional[str]) – Optional opaque string recording scan environment information.

  • description (Optional[str]) – Optional human readable description of the use case.

  • prediction_task (CertifaiPredictionTask) – Prediction task metadata.

  • output_path (Optional[str]) – where to write report files. A relative path is evaluated with respect to the base path at evaluation time. If omitted, reports will be written to ‘./reports’.

Returns

Instantiated ScanBuilder with the specified metadata.

Return type

CertifaiScanBuilder
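
As a hedged sketch of how create and a few of the documented properties fit together, the helper below builds a scan from a handful of settings. The helper name and its arguments are hypothetical; builder_cls is assumed to be the CertifaiScanBuilder class, and a real scan would also need a prediction task, datasets and models before it can run.

```python
from typing import Iterable, Optional


def new_scan(builder_cls, use_case_name: str, evaluation_types: Iterable[str],
             evaluation_dataset_id: str, author: Optional[str] = None):
    """Create a builder and set a few of the documented properties on it."""
    scan = builder_cls.create(use_case_name)
    if author is not None:
        scan.author = author
    scan.evaluation_dataset_id = evaluation_dataset_id
    for evaluation_type in evaluation_types:
        scan.add_evaluation_type(evaluation_type)
    return scan
```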