certifai.scanner.builder module¶

Certifai Scan object model builder

Contains classes representing a Certifai scan definition, and the ability to programmatically manipulate, load, save and run them.

class certifai.scanner.builder.CertifaiOutcomeValue(value: Any, name: Optional[str] = None, favorable: bool = False)¶: Outcome value for a classification task type.

class certifai.scanner.builder.CertifaiTaskOutcomes¶

Union type representing the outcomes of a task by type (possible classes for classification, favorable direction for regression)

property task_type: certifai.engine.engine_api_types.CertifaiTaskType¶

property prediction_favorability: str¶

static regression(increased_favorable: Optional[bool], change_std_deviation: Optional[float] = None, absolute_threshold: Optional[float] = None, absolute_percentile: Optional[float] = None)¶

Construct a regression task type

Favorability may be specified in one of three ways (only one of which may be specified):

As a relative increase [or decrease] by a multiple of the standard deviation of the regressed values across the whole population

As an absolute threshold, specified as an exact value of the regressor output

As an absolute threshold, specified as a percentile of the empirical distribution of the regressed values across the whole population

Parameters

increased_favorable (Optional[bool]) – True if the favorable direction of the prediction is increasing. Can be set to None if there is no favorable direction.
change_std_deviation (Optional[float]) – Number of standard deviations considered to be a significant change
absolute_threshold (Optional[float]) – Absolute regressed value threshold for favorability
absolute_percentile (Optional[float]) – Absolute regressed value threshold for favorability expressed as a population percentile

Returns

CertifaiTaskOutcomes instance representing the regression outcome definition
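The rule that only one of the three favorability specifications may be given can be sketched in plain Python. This is an illustration only, not Certifai's implementation: the helper name favorability_mode is hypothetical, and returning None when nothing is supplied is an assumption.

```python
def favorability_mode(change_std_deviation=None,
                      absolute_threshold=None,
                      absolute_percentile=None):
    """Return the name of the single favorability option supplied.

    Hypothetical helper: mirrors the documented rule that the three options
    are mutually exclusive. Returning None when no option is given is an
    assumption made for this sketch.
    """
    candidates = {
        "change_std_deviation": change_std_deviation,
        "absolute_threshold": absolute_threshold,
        "absolute_percentile": absolute_percentile,
    }
    supplied = [name for name, value in candidates.items() if value is not None]
    if len(supplied) > 1:
        raise ValueError(f"only one of {supplied} may be specified")
    return supplied[0] if supplied else None


mode = favorability_mode(absolute_threshold=0.5)
```

Checking exclusivity up front, before any scan is run, keeps a contradictory definition from failing later in a less obvious place.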

static classification(prediction_values: Iterable[certifai.scanner.builder.CertifaiOutcomeValue], prediction_favorability: Optional[str] = 'explicit', last_favorable_prediction: Optional[Any] = None, favorable_outcome_group_name: Optional[str] = None, unfavorable_outcome_group_name: Optional[str] = None)¶

Construct a classification task type

Parameters

prediction_values (Iterable[CertifaiOutcomeValue]) – list of possible classes
prediction_favorability (Optional[str]) –
describes the favorability of the prediction_values, default ‘explicit’. Must be one of
1. ‘explicit’, predictions should be explicitly marked as favorable
2. ‘ordered’, predictions are ordered from most to least favorable
3. ‘none’, no prediction should be treated as favorable
last_favorable_prediction (Optional[Any]) – ignored unless prediction_favorability is ‘ordered’, in which case this value should be the last label (in the ordering of prediction_values) that is considered favorable
favorable_outcome_group_name (Optional[str]) – name of the favorable group of prediction values - reserved for multiclass-classification tasks with a prediction_favorability of ‘explicit’
unfavorable_outcome_group_name (Optional[str]) – name of the unfavorable group of prediction values - reserved for multiclass-classification tasks with a prediction_favorability of ‘explicit’

Returns

CertifaiTaskOutcomes instance representing the classification outcome definition
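For the ‘ordered’ favorability mode, the role of last_favorable_prediction can be sketched as a cut-off in the ordered list of labels. The helper favorable_labels and the example labels are hypothetical and do not use the Certifai API; the sketch only illustrates the documented ordering from most to least favorable.

```python
def favorable_labels(ordered_values, last_favorable_prediction):
    """Return the labels treated as favorable under 'ordered' favorability.

    ordered_values is assumed to run from most to least favorable, so every
    label up to and including last_favorable_prediction is favorable.
    """
    cutoff = ordered_values.index(last_favorable_prediction)  # ValueError if absent
    return ordered_values[: cutoff + 1]


# Hypothetical credit-rating labels, most favorable first.
ratings = ["excellent", "good", "fair", "poor"]
favorable = favorable_labels(ratings, "good")
```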

class certifai.scanner.builder.CertifaiPredictionTask(outcomes: certifai.scanner.builder.CertifaiTaskOutcomes, prediction_description: Optional[str] = None)¶

Metadata about the prediction task - immutable once instantiated.

Parameters

outcomes (CertifaiTaskOutcomes) – One of the supported CertifaiTaskOutcomes types, constructed from the static methods on CertifaiTaskOutcomes.
prediction_description (Optional[str]) – Free text description of what is being predicted.

property task_type: str¶

The task type string (‘binary_classification’, ‘multiclass-classification’, ‘regression’)

Getter: Returns the task type.
Type: str

property prediction_description: Optional[str]¶

Description of what the prediction represents.

Getter: Returns the description, if any.
Type: Optional[str]

property prediction_favorability: Optional[str]¶

The format used for specifying the favorable prediction value, if any (‘none’, ‘ordered’, ‘explicit’).

Getter: Returns prediction favorability
Type: Optional[str]

property favorable_outcome: Optional[str]¶

What the favorable outcome direction is for a regression task, None otherwise.

Getter: Returns the favorable label direction (regression) if set.
Type: Optional[Any]

property prediction_values: List[certifai.scanner.builder.CertifaiOutcomeValue]¶

property last_favorable_prediction: Optional[Any]¶

property regression_standard_deviation: Optional[float]¶

property regression_absolute_threshold: Optional[float]¶

property regression_absolute_percentile: Optional[float]¶

property favorable_outcome_group_name: Optional[str]¶

The string name of the favorable group of prediction values - reserved for multiclass-classification with prediction_favorability of ‘explicit’ - None otherwise.

Getter: Return name of favorable group of prediction values
Type: Optional[str]

property unfavorable_outcome_group_name: Optional[str]¶

The string name of the unfavorable group of prediction values - reserved for multiclass-classification with prediction_favorability of ‘explicit’ - None otherwise.

Getter: Return name of unfavorable group of prediction values
Type: Optional[str]

class certifai.scanner.builder.CertifaiModelMetric(name: str, certifai_metric: Optional[str] = None)¶

Metadata for a metric - immutable once instantiated

Parameters

name (str) – Free text descriptive name of the metric.
certifai_metric (Optional[str]) –
If specified, allows Certifai to calculate the value. Supported values are:
- ‘accuracy’ (classification)
- ‘precision’ (classification)
- ‘recall’ (classification)
- ‘f1’ (classification)
- ‘r-squared’ (regression)
Micro and macro variants are also supported for precision, recall and f1, e.g. ‘f1(micro)’
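The metric strings listed above follow a simple base(variant) shape. A minimal sketch of splitting and validating such a string is shown here; parse_metric is a hypothetical helper, not part of Certifai, and rejecting variants on accuracy and r-squared is an assumption drawn from the list above.

```python
import re

# Names taken from the parameter description above.
BASE_METRICS = {"accuracy", "precision", "recall", "f1", "r-squared"}
AVERAGED_METRICS = {"precision", "recall", "f1"}
VARIANTS = {"micro", "macro"}

_METRIC_PATTERN = re.compile(r"^([a-z0-9-]+)(?:\((\w+)\))?$")


def parse_metric(certifai_metric):
    """Split a metric string such as 'f1(micro)' into (base, variant)."""
    match = _METRIC_PATTERN.match(certifai_metric)
    if match is None:
        raise ValueError(f"unrecognized metric string: {certifai_metric!r}")
    base, variant = match.groups()
    if base not in BASE_METRICS:
        raise ValueError(f"unsupported metric: {base!r}")
    if variant is not None and (base not in AVERAGED_METRICS or variant not in VARIANTS):
        raise ValueError(f"unsupported variant for {base!r}: {variant!r}")
    return base, variant


parsed = parse_metric("f1(micro)")
```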

property name: str¶

Descriptive name of the metric.

Getter: Returns the human-readable metric name.
Type: str

property certifai_metric: Optional[str]¶

Certifai metric type name.

Getter: Returns the Certifai-evaluable metric type (if set).
Type: Optional[str]

class certifai.scanner.builder.CertifaiPredictorWrapper(predictor: certifai.common.hosted_model.IBaseModel, encoder: Optional[Callable[Sequence, Sequence]] = None, decoder: Optional[Callable[Sequence, Sequence]] = None, wrapped: Optional[certifai.common.hosted_model.IHostedModel] = None, soft_predictions: bool = False, label_ordering: Optional[List[Any]] = None, threshold: Optional[float] = None)¶

Wrapper class for in-process models

Note - the underlying model and any encoder and decoder used must be picklable.

Parameters

predictor (IBaseModel) – Any predictor object that has a predict method which takes a sequence of data vectors as a numpy array and returns a sequence of corresponding predicted values.
encoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the predictor’s input (e.g. to perform one-hot encoding)
decoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the predictor’s output (e.g. to binarize with a threshold).
wrapped (Optional[IHostedModel]) – if specified other parameters are ignored and the wrapper just proxies the (already wrapped) model provided here (mostly intended for internal usage).
soft_predictions (bool) – If True the model supports soft scoring for predictions (default False)
label_ordering (Optional[List[Any]]) – For soft scoring models the ordering of the classification labels in the scoring vector
threshold (Optional[float]) – For binary classifiers whose soft-scores are returned as a 1-dimensional array of scores, one for each input row, the threshold to apply. Scores greater than or equal to the threshold will receive the second label (or 1 rather than 0 if no labels provided)
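The thresholding rule described for the threshold parameter can be sketched as follows. This is plain Python for illustration, not the wrapper's actual code; apply_threshold and the example labels are hypothetical.

```python
def apply_threshold(scores, threshold, labels=None):
    """Turn a 1-dimensional sequence of soft scores into hard predictions.

    Scores greater than or equal to the threshold receive the second label;
    if no labels are given, the predictions are 1 and 0 respectively.
    """
    below, at_or_above = (labels[0], labels[1]) if labels else (0, 1)
    return [at_or_above if score >= threshold else below for score in scores]


hard = apply_threshold([0.2, 0.5, 0.9], threshold=0.5)
named = apply_threshold([0.2, 0.5, 0.9], 0.5, labels=["denied", "approved"])
```

Note that a score exactly equal to the threshold falls on the second-label side.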

property model: certifai.common.hosted_model.IHostedModel¶

Wrapped model.

Getter: Returns the wrapped model suitable for use by Certifai.
Type: IHostedModel

class certifai.scanner.builder.CertifaiModelConnector(name: str, module_name: str, class_name: str, description: Optional[str] = None, model_args: Dict[str, str] = {}, model_secrets: Dict[str, str] = {})¶

Metadata for a model connector

Parameters

name (str) – Free text descriptive name of the connector.
module_name (str) – python module containing the external connector (e.g. ‘certifai.connectors’)
class_name (str) – name of python class of the connector
description (Optional[str]) – Optional description
model_args (Dict[str,str]) – arguments to pass to the model connector instances
model_secrets (Dict[str,str]) – secrets to pass to the model connector instances - substrings of the values of the form {<NAME>} will have the <NAME> replaced by the contents of the environment variable of that name
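The environment-variable substitution described for secrets can be sketched with the standard library. This is not Certifai's code: resolve_secrets is a hypothetical helper, the sketch assumes the whole {NAME} placeholder (braces included) is replaced, and raising KeyError for an unset variable is likewise an assumption.

```python
import os
import re

_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_secrets(model_secrets, env=None):
    """Replace {NAME} placeholders in secret values with environment values.

    env defaults to os.environ; a plain dict can be passed for testing.
    An unset variable raises KeyError (an assumption of this sketch).
    """
    env = os.environ if env is None else env
    return {
        key: _PLACEHOLDER.sub(lambda match: env[match.group(1)], value)
        for key, value in model_secrets.items()
    }


# Hypothetical secret and variable names.
resolved = resolve_secrets(
    {"Authorization": "Bearer {API_TOKEN}"},
    env={"API_TOKEN": "abc123"},
)
```

Keeping only the placeholder in the scan definition means the secret value itself never has to be written to a saved template.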

property name: str¶

Descriptive name of the connector.

Getter: Returns the connector name.
Type: str

property module_name: str¶

Module containing the connector.

Getter: Returns the fully qualified module name.
Type: str

property class_name: str¶

Class name of the connector.

Getter: Returns the name of the python class of the connector.
Type: str

property description: str¶

Description of the connector.

Getter: Returns the optional description.
Type: str

property model_args: Dict[str, str]¶

Arguments provided to model connector instances.

Getter: Returns the arguments to be provided to connector instances.
Type: Dict[str,str]

property model_secrets: Dict[str, str]¶

Secrets provided to instantiated model connector instances.

Getter: Returns the secrets to be provided to connector instances.
Type: Dict[str,str]

class certifai.scanner.builder.CertifaiModel(id: str, name: Optional[str] = None, author: Optional[str] = None, version: Optional[str] = None, performance_metric_values: Optional[List[Tuple]] = None, description: Optional[str] = None, predict_endpoint: Optional[str] = None, max_batch_size: Optional[int] = None, local_predictor: Optional[certifai.scanner.builder.CertifaiPredictorWrapper] = None, supports_soft_scoring: bool = False, prediction_value_order: Optional[List[Any]] = None, connector: Optional[certifai.scanner.builder.CertifaiModelConnector] = None, json_strict: bool = False)¶

Metadata describing a model, and allowing manipulation of this metadata.

Parameters

id (str) – Identifier for the model used to refer to it.
name (Optional[str]) – Descriptive name for the model. Defaults to the value provided for id
author (Optional[str]) – Optional author name.
version (Optional[str]) – Optional model version string.
performance_metric_values (Optional[List[Tuple]]) – Optional asserted list of (metric_name, value) pairs for metrics of the model - primarily intended to allow injection of externally measured values for metrics not directly supported by Certifai.
description (Optional[str]) – Optional free text description of the model.
predict_endpoint (Optional[str]) – URL of the prediction endpoint of the model (if non-process-local).
max_batch_size (Optional[int]) – Optional limit on prediction batch sizes to call the model with.
local_predictor (Optional[CertifaiPredictorWrapper]) – wrapped model object (if using a local in-process model).
supports_soft_scoring (bool) – If True model is expected to return soft scores as well as hard predictions
prediction_value_order (Optional[List[Any]]) – For soft scoring models the ordering of the class labels in the score vector
connector (Optional[CertifaiModelConnector]) – Optional connector to use to attach to the model
json_strict (bool) – If True data will be serialized to send to the model’s predict endpoint in strict JSON, encoding missing data as JSON nulls. If False then JavaScript extended JSON will be used which encodes missing values as NaN. Defaults to False
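The difference between the two encodings controlled by json_strict can be seen with Python's json module. This sketch is illustrative only: encode_rows is a hypothetical helper and the row layout is not Certifai's actual request format.

```python
import json
import math


def encode_rows(rows, json_strict=False):
    """Serialize rows of feature values, encoding missing values two ways.

    With json_strict=True, NaN becomes a JSON null; otherwise the extended
    (non-standard) NaN token is emitted.
    """
    if json_strict:
        rows = [
            [None if isinstance(v, float) and math.isnan(v) else v for v in row]
            for row in rows
        ]
        # allow_nan=False guarantees the output is standards-compliant JSON.
        return json.dumps(rows, allow_nan=False)
    return json.dumps(rows)


rows = [[1.0, float("nan")]]
extended = encode_rows(rows)                   # '[[1.0, NaN]]'
strict = encode_rows(rows, json_strict=True)   # '[[1.0, null]]'
```

Strict mode matters when the model's endpoint uses a JSON parser that rejects the NaN token.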

property name: str¶

Model name.

Getter: Returns the human-readable name of the model.
Type: str

property id: str¶

Model id.

Getter: Returns the identifier for the model by which it may be referenced.
Type: str

property author: Optional[str]¶

Model author.

Getter: Returns the author string if provided.
Setter: Set author string for the model.
Type: Optional[str]

property version: Optional[str]¶

Model version.

Getter: Returns the version string if provided.
Setter: Set version string for the model.
Type: Optional[str]

property description: Optional[str]¶

Model description.

Getter: Returns the description string if provided.
Setter: Set description string for the model.
Type: Optional[str]

property predict_endpoint: Optional[str]¶

Model predict endpoint URL

Getter: Returns the URL of the (remote) model prediction endpoint, if provided
Setter: Sets the prediction endpoint URL for the model
Type: Optional[str]

property max_batch_size: Optional[int]¶

Model max batch size.

Getter: Returns the max batch size to send to the model.
Setter: Sets the provided restriction on max batch size (None => unlimited).
Type: Optional[int]

property supports_soft_scoring: bool¶

Whether the model returns soft scores.

Getter: True if the model is expected to support soft scores.
Setter: Sets whether the model is expected to support soft scores.
Type: bool

property prediction_value_order: bool¶

Ordering of class labels in the score vector returned by the model.

Getter: Returns the ordering.
Setter: Sets the ordering.
Type: List[Any]

property local_predictor: Optional[certifai.scanner.builder.CertifaiPredictorWrapper]¶

Wrapped local (in-process) model.

Getter: Returns the wrapped model being used.
Setter: Sets a local wrapped model (see CertifaiPredictorWrapper) to use.
Type: Optional[CertifaiPredictorWrapper]

property performance_metric_values: List[Tuple[str, Any]]¶

List of asserted metric values for this model.

Getter: Returns any asserted values as (metric name, value) tuples.
Type: List[Tuple[str,Any]]

add_performance_metric_value(metric_name: str, metric_value: Any)¶

Add an asserted performance metric value.

Parameters

metric_name (str) – Name of the metric to assert a value for.
metric_value (Any) – Value to assert.

remove_performance_metric_value(metric_name: str)¶

Remove an asserted metric value

Parameters: metric_name – name of the metric to remove the asserted value for.

property connector: Optional[certifai.scanner.builder.CertifaiModelConnector]¶

property json_strict: bool¶

Whether data sent to this model is encoded as strict JSON.

Getter: True if the model expects strict JSON (missing encoded as null as opposed to NaN).
Setter: Sets whether the model expects strict JSON
Type: bool

class certifai.scanner.builder.CertifaiFeatureDataType(args: dict)¶

Class describing feature datatypes supported by Certifai - immutable once instantiated. Static methods are provided for instantiating each supported type.

property value_dict: dict¶

Data type details as a dictionary.

Getter

Returns the metadata dict for this datatype. In particular this will contain a key named data_type which will be one of

‘numerical-int’

‘numerical-float’

‘categorical’

Other keys vary by datatype.

Type

dict

static int(min: Optional[int] = None, max: Optional[int] = None, spread: Optional[float] = None) → certifai.scanner.builder.CertifaiFeatureDataType¶

Constructor for an ‘int’ feature.

Parameters

min (Optional[int]) – optional floor value this feature can take.
max (Optional[int]) – optional ceiling value this feature can take.
spread (Optional[float]) – optional measure of spread (typically MAD or standard deviation).

Returns

instantiated CertifaiFeatureDataType

Return type

CertifaiFeatureDataType

static float(min: Optional[float] = None, max: Optional[float] = None, spread: Optional[float] = None) → certifai.scanner.builder.CertifaiFeatureDataType¶

Constructor for a ‘float’ feature.

Parameters

min (Optional[float]) – optional floor value this feature can take.
max (Optional[float]) – optional ceiling value this feature can take.
spread (Optional[float]) – optional measure of spread (typically MAD or standard deviation).

Returns

instantiated CertifaiFeatureDataType

Return type

CertifaiFeatureDataType

static categorical(values: Optional[Iterable[Union[str, int]]] = None, value_columns: Optional[List[Tuple[str, Union[str, int]]]] = None, target_encodings: Optional[Iterable[float]] = None, categorical_type: Optional[str] = None) → certifai.scanner.builder.CertifaiFeatureDataType¶

Constructor for a ‘categorical’ feature.

Parameters

values (Optional[Iterable[Union[str,int,bool]]]) – Optional list of possible values this categorical field may take on. If omitted then Certifai will infer the value set from the available data.
value_columns (Optional[List[Tuple[str, Union[str, builtins.int, bool]]]]) –
Optional list of column name and categorical value pairs, for one-hot encoded data. If both value_columns and values are specified then they must have exactly the same set of values. If only value_columns is specified then the values are inferred. If only values is specified then the feature is assumed to be value-encoded in a single column.
target_encodings (Optional[Iterable[float]]) – optional list of encodings for the values in values used to represent those values in the dataset
categorical_type (Optional[str]) – optional string specifying the data type of the categorical feature. Must be one of: ‘string’, ‘int’, or ‘auto’. For example, specifying ‘string’ would mean that the value 001 will be interpreted as the string ‘001’, instead of as the integer 1.

Returns

instantiated CertifaiFeatureDataType

Return type

CertifaiFeatureDataType
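To make the value_columns pairing concrete, the sketch below decodes a one-hot encoded row back to its categorical value using (column name, value) pairs. It is plain Python and does not call Certifai; decode_one_hot, the column names, and returning None when no column is set are all assumptions of the sketch.

```python
def decode_one_hot(row, value_columns):
    """Return the categorical value whose one-hot column is set in row.

    value_columns is a list of (column name, categorical value) pairs, in
    the same shape as the value_columns parameter described above.
    """
    for column, value in value_columns:
        if row.get(column) == 1:
            return value
    return None  # no column set: treated as missing in this sketch


# Hypothetical one-hot layout for a 'color' feature.
color_columns = [("color_red", "red"), ("color_blue", "blue")]
decoded = decode_one_hot({"color_red": 0, "color_blue": 1}, color_columns)

# The value set implied by value_columns alone.
inferred_values = [value for _, value in color_columns]
```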

class certifai.scanner.builder.CertifaiFeatureRestriction(args: dict)¶

Class describing feature change restrictions supported by Certifai - immutable once instantiated. Static methods are provided for instantiating each supported type.

property value_dict: dict¶

Restriction details as a dictionary.

Getter

Returns the metadata dict for this restriction. In particular this will contain a key named constraint which will be one of

‘constant’

‘percentage’

‘range’

Other keys vary by restriction type.

Type

dict

static range(min: Optional[int] = None, max: Optional[int] = None, direction: Optional[str] = None) → certifai.scanner.builder.CertifaiFeatureRestriction¶

Constructor for a range constraint on feature modifications in counterfactual production.

Parameters

min (Optional[int]) – optional floor value this feature can take.
max (Optional[int]) – optional ceiling value this feature can take.
direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static percentage(amount: float, direction: Optional[str] = None) → certifai.scanner.builder.CertifaiFeatureRestriction¶

Constructor for a percentage change constraint on feature modifications in counterfactual production.

Parameters

amount (float) – max percentage the feature may change by (can only be applied to numeric features).
direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction
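One way to read a percentage restriction with a direction is sketched below. It is an interpretation, not Certifai's implementation: within_percentage is hypothetical, amount is assumed to be in percent units relative to the original value, and a direction of None is assumed to mean ‘any’.

```python
def within_percentage(original, candidate, amount, direction=None):
    """Check whether changing original to candidate respects the restriction.

    amount is read as a maximum percentage of the original value, and
    direction as one of 'any', 'increase', 'decrease' (None means 'any').
    """
    direction = direction or "any"
    if direction not in {"any", "increase", "decrease"}:
        raise ValueError(f"unknown direction: {direction!r}")
    delta = candidate - original
    if direction == "increase" and delta < 0:
        return False
    if direction == "decrease" and delta > 0:
        return False
    return abs(delta) <= abs(original) * amount / 100.0


allowed = within_percentage(100.0, 109.0, amount=10.0)
too_far = within_percentage(100.0, 111.0, amount=10.0)
```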

static constant() → certifai.scanner.builder.CertifaiFeatureRestriction¶

Constructor for a no-change constraint on feature modifications in counterfactual production.

Returns: instantiated CertifaiFeatureRestriction
Return type: CertifaiFeatureRestriction

static standard_deviation(value: float, tolerance_value: Optional[float] = None, direction: Optional[str] = None) → certifai.scanner.builder.CertifaiFeatureRestriction¶

Constructor for a standard deviation constraint on feature modifications in counterfactual production.

Parameters

value (float) – number of standard deviations the feature may change by (can only be applied to numeric features).
tolerance_value (float) – additional number of standard deviations the feature may change by if no solutions could be found (not applicable to all scans)
direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’ (not applicable to all scans)

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static fixed_amount(value: float, tolerance_value: Optional[float] = None, direction: Optional[str] = None) → certifai.scanner.builder.CertifaiFeatureRestriction¶

Constructor for a fixed amount constraint on feature modifications in counterfactual production.

Parameters

value (float) – fixed amount that the feature may change by (can only be applied to numeric features).
tolerance_value (float) – additional amount the feature may change by if no solutions could be found (not applicable to all scans)
direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’ (not applicable to all scans)

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static value_set(values: List[Union[str, int]], tolerance_values: Optional[List[Union[str, int]]] = None) → certifai.scanner.builder.CertifaiFeatureRestriction¶

Constructor for an allowed value set constraint on feature modifications in counterfactual production.

Parameters

values (List[Union[str, builtins.int, bool]]) – fixed set of values the feature may change to (can only be applied to categorical features).
tolerance_values (Optional[List[Union[str, builtins.int, bool]]]) – additional values the feature may change to if no solutions could be found

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static value_map(values: Dict[Union[str, int], List[Union[str, int]]], tolerance_values: Optional[Dict[Union[str, int], List[Union[str, int]]]] = None)¶

Constructor for an allowed value mapping constraint on feature modifications in counterfactual production.

Parameters

values (Dict[Union[str, builtins.int, bool], List[Union[str, builtins.int, bool]]]) – Dictionary mapping of categorical values to values the feature may change to (can only be applied to categorical features).
tolerance_values (Optional[Dict[Union[str, builtins.int, bool], List[Union[str, builtins.int, bool]]]]) – Additional dictionary mapping of categorical values to values the feature may change to.

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

class certifai.scanner.builder.CertifaiFeatureSchema(name: str, data_type: Optional[certifai.scanner.builder.CertifaiFeatureDataType] = None)¶

Class describing a feature - immutable once instantiated.

Parameters

name (str) – The name of the feature (should match any column headers in the dataset if any).
data_type (CertifaiFeatureDataType) – Type of data the feature holds.

property name: str¶

Feature name.

Getter: Returns the name of the feature.
Type: str

property data_type: Optional[certifai.scanner.builder.CertifaiFeatureDataType]¶

Feature data type

Getter: Returns the data type of the feature.
Type: CertifaiFeatureDataType

class certifai.scanner.builder.CertifaiDataSchema(features: Optional[List[certifai.scanner.builder.CertifaiFeatureSchema]] = None, outcome_feature_name: Optional[str] = None, predicted_outcome_feature_name: Optional[str] = None, hidden_feature_names: Optional[List[str]] = None, defined_feature_order: bool = False)¶

Class describing a dataset’s feature schema, and allowing manipulation of this schema.

Parameters

features (Optional[List[CertifaiFeatureSchema]]) – features specified by the scan definition. This may be a subset of all the features present. Any that are omitted will be inferred from the available data.
outcome_feature_name (Optional[str]) – name of the feature holding the ground truth label/value (if present). Note: any outcome feature column will be removed before passing data to the model.
predicted_outcome_feature_name (Optional[str]) – name of the feature holding the predicted label/value (if present). Note: any predicted_outcome feature column will be removed before passing data to the model.
hidden_feature_names (Optional[List[str]]) – list of feature names that should be hidden from the model
defined_feature_order (bool) – If present and True asserts that the list order of features in the schema matches the layout of columns in the dataset. If True then all columns must be present. Intended for use in cases where the dataset does not specify a column ordering itself.

property features: Optional[List[certifai.scanner.builder.CertifaiFeatureSchema]]¶

features defined by the schema.

Getter: Returns the list of defined features.
Type: Optional[List[CertifaiFeatureSchema]]

property defined_feature_order: bool¶

Whether the schema defines the column ordering of the data.

Getter: Returns True if the schema defines the column ordering.
Setter: Sets whether the schema defines the column ordering of the data.
Type: bool

add_feature(name: str, data_type: certifai.scanner.builder.CertifaiFeatureDataType)¶

Add a feature

Parameters

name (str) – Name of feature to add.
data_type (CertifaiFeatureDataType) – data type of feature to add.

Note - the feature will be appended to the current list

insert_feature(name: str, index: int, data_type: certifai.scanner.builder.CertifaiFeatureDataType)¶

Insert a feature.

Parameters

name (str) – Name of feature to add.
index (int) – Columnar position to insert the feature at (0-based).
data_type (CertifaiFeatureDataType) – data type of feature to add.

update_feature(name: str, data_type: certifai.scanner.builder.CertifaiFeatureDataType)¶

Update an existing feature by name - preserves its index in the feature list

Parameters

name (str) – Name of feature to update.
data_type (CertifaiFeatureDataType) – new data type of feature being updated.

remove_feature(name: str)¶

Remove a feature.

Parameters: name (str) – Name of feature to remove.

infer_features_from_data(dataset_source: certifai.scanner.builder.CertifaiDatasetSource)¶

property outcome_feature_name: Optional[str]¶

Name of the (ground truth) outcome column (if any).

Getter: Returns the feature name of the outcome feature.
Setter: Sets the name of the (ground truth) outcome column.
Type: Optional[str]

property predicted_outcome_feature_name: Optional[str]¶

Name of the predicted outcome column (if any).

Getter: Returns the feature name of the predicted outcome feature.
Setter: Sets the name of the predicted outcome column.
Type: Optional[str]

property hidden_feature_names: List[str]¶

Names of hidden (from the model) features (if any).

Getter: Returns a list of the names of features which are not provided to the model.
Setter: Sets the list of names of features which are not provided to the model.
Type: List[str]

Note: any specified outcome_feature_name or predicted_outcome_feature_name will automatically be hidden from the model and need not occur in this list.

class certifai.scanner.builder.CertifaiDatasetSource(args)¶

Class describing dataset storage formats supported by Certifai - immutable once instantiated. Static methods are provided for instantiating each supported format.

property value_dict¶

Data source details as a dictionary.

Getter

Returns the metadata dict for this data source. In particular this will contain a key named file_type which will be one of

‘csv’

‘json’

‘loaded’

Other keys vary by source type.

Type

dict

static json(url: str, lines: bool = True, orient: str = 'records', encoding: Optional[str] = None)¶

Constructor for a ‘json’ source.

Parameters

url (str) – Location the data may be loaded from. If no protocol is specified then ‘file:’ is assumed.
lines (bool) – If True then JSON lines format (default is True), else JSON list expected.
orient (str) – One of ‘records’, ‘columns’, ‘values’ (matching Pandas usage). Default is ‘records’
encoding (Optional[str]) – string encoding used - default is ‘utf-8’.

Returns

instantiated CertifaiDatasetSource

Return type

CertifaiDatasetSource

static csv(url: str, delimiter: str = ',', escape_character: Optional[str] = None, quote_character: str = '"', has_header: bool = True, encoding: Optional[str] = None)¶

Constructor for a ‘csv’ source.

Parameters

url (str) – Location the data may be loaded from. If no protocol is specified then ‘file:’ is assumed.
delimiter (str) – Field delimiter used. Default is ‘,’.
escape_character (Optional[str]) – Escape character if any. Default is None.
quote_character (str) – Quote character. Default is ‘"’.
has_header (bool) – Whether the source CSV has a header row specifying column names. Default is True.

Returns

instantiated CertifaiDatasetSource

Return type

CertifaiDatasetSource

static dataframe(df)¶

Constructor for a ‘dataframe’ source (an already loaded Pandas dataframe).

Parameters: df (DataFrame) – Dataframe containing the data.
Returns: instantiated CertifaiDatasetSource.
Return type: CertifaiDatasetSource

class certifai.scanner.builder.CertifaiDataset(id: str, source: certifai.scanner.builder.CertifaiDatasetSource, name: Optional[str] = None, description: Optional[str] = None)¶

Metadata describing a dataset.

Parameters

id (str) – identifier string by which the dataset may be referenced.
source (CertifaiDatasetSource) – source for the actual data in the dataset.
name (Optional[str]) – Optional human readable name of the dataset.
description (Optional[str]) – Optional free text description of the dataset.

class certifai.scanner.builder.CertifaiGroupingBucket(description: str, max: Optional[float] = None, values: Optional[List[Union[str, int]]] = None)¶

Metadata describing a value grouping bucket for feature values - immutable once instantiated.

Parameters

description (str) – Descriptive name of the bucket.
max (Optional[float]) – Optional maximum numerical value in the bucket (may only be used with numeric features).
values (Optional[List[Union[str,int,bool]]]) – Optional explicit list of values falling within the bucket (intended for use with categorical features).

property description: str¶

Description of the bucket.

Getter: Returns the bucket description string.
Type: str

property max: Optional[float]¶

Maximum numeric value falling within the bucket (if specified).

Note - the floor of a bucket is determined by the ceiling of the previous bucket. The entire bucket list will be sorted on max values and a sentinel bucket with no maximum value should be included in the list.

Getter: Returns the bucket ceiling value.
Type: Optional[float]
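The note on bucket floors and the sentinel bucket can be sketched as a lookup over buckets sorted by max. This is an illustration in plain Python rather than Certifai's grouping code; bucket_for and the age buckets are hypothetical, and buckets are represented here as simple (description, max) pairs.

```python
import math


def bucket_for(value, buckets):
    """Return the description of the bucket a numeric value falls into.

    buckets is a list of (description, max) pairs; max=None marks the
    sentinel bucket with no ceiling. Buckets are sorted on max, so each
    bucket's floor is the ceiling of the one before it.
    """
    ordered = sorted(buckets, key=lambda b: math.inf if b[1] is None else b[1])
    for description, ceiling in ordered:
        if ceiling is None or value <= ceiling:
            return description
    raise ValueError("no sentinel bucket (max=None) to catch large values")


# Hypothetical age buckets, deliberately listed out of order.
age_buckets = [("over 60", None), ("25 and under", 25), ("26 to 60", 60)]
young = bucket_for(25, age_buckets)
```

Without the sentinel bucket, any value above the largest max would belong to no group, which is why the note asks for one.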

property values: Optional[Iterable[Union[str, int]]]¶

List of values falling within the bucket if defined.

Getter: Returns the list of values or None if not defined.
Type: Iterable[Union[str,int,bool]]

class certifai.scanner.builder.CertifaiGroupingFeature(name: str, buckets: Optional[Iterable[certifai.scanner.builder.CertifaiGroupingBucket]] = None)¶

Metadata describing a fairness grouping feature - immutable once instantiated

Parameters

name (str) – Feature name of the feature which defines the grouping.
buckets (Optional[Iterable[CertifaiGroupingBucket]]) – Optional definition for bucketing the values of the feature. If not specified then every unique value occurring in the data will be treated as its own group.

property name: str¶

Name of the grouping feature

Getter: Returns the feature name used to define the groups
Type: str

property buckets: Optional[Iterable[certifai.scanner.builder.CertifaiGroupingBucket]]¶

List of grouping buckets.

Getter: Returns the list of grouping buckets for the grouping feature, if defined.
Type: Iterable[CertifaiGroupingBucket]

class certifai.scanner.builder.CertifaiScanBuilder(base: certifai.scanner.schemas.ScanTemplate)¶

Builder class for scan templates, with static method for instantiation, and methods for manipulation, persistence, and running of the defined scan.

property model_headers: Dict¶

Returns model headers defined in the scan.

Getter: Returns model headers as dict
Type: Dict

add_model_header(header_name: str, header_value: str, model_id: Optional[str] = None)¶

Add or update a model header. If model_id is provided, the header is added to that specific model; otherwise it is set as a default for all models.

Parameters

header_name (str) – Name of the model header to inject.
header_value (str) – Value associated with the header.
model_id (Optional[str]) – Id of the model whose headers are to be added or updated.

remove_model_header(header_name: str, model_id: Optional[str] = None)¶

Remove a model header given header_name. If model_id is provided then the header is removed for that specific model, otherwise it is removed from all models (default case).

Parameters

header_name (str) – Name of the header to remove.
model_id (Optional[str]) – Id of the model to remove the header from.

property template: certifai.scanner.schemas.ScanTemplate¶

Retrieve a scan template which can be serialized to dictionary form for saving as JSON or YAML by calling its dump method.

Returns: a ScanTemplate instance
Return type: ScanTemplate

property author: str¶

Author of the template.

Getter: Returns the author of the scan template, if defined.
Setter: Sets the author of the scan template.
Type: Optional[str]

property use_case_name: str¶

Use case name - human-readable name for a use case (i.e. a prediction task).

Getter: Returns the use case name.
Setter: Sets the use case name.
Type: str

property use_case_id: str¶

Use case id - id by which the use case may be referenced.

Getter: Returns the use case id.
Setter: Sets the use case id.
Type: str

property no_model_access: bool¶

Whether the scan will have access to the model. If it does not, all datasets should include a predicted_outcome_column with the model’s predictions, and only evaluations that support no model access will be run.

Getter: Returns whether the evaluation has access to its model.
Setter: Sets whether the scan will have access to the listed model.
Type: bool

property evaluation_name: str¶

Evaluation name - name of a particular evaluation (scan run).

Getter: Returns the evaluation name.
Setter: Sets the evaluation name.
Type: str

property evaluation_environment: str¶

Evaluation environment - free text string with evaluation environment details.

Getter: Returns the evaluation environment string.
Setter: Sets the evaluation environment string.
Type: str

property evaluation_description: str¶

Evaluation description - free text string describing the evaluation.

Getter: Returns the evaluation description string.
Setter: Sets the evaluation description string.
Type: str

property evaluation_dataset_id: str¶

Evaluation dataset id - specifies which dataset to use as the evaluation set.

Getter: Returns the evaluation dataset id.
Setter: Sets the evaluation dataset id.
Type: str

property explanation_dataset_id: Optional[str]¶

Explanation dataset id - specifies the dataset for which explanations are generated if the ‘explanation’ evaluation type is included in the scan.

Getter: Returns the explanation dataset id.
Setter: Sets the explanation dataset id.
Type: Optional[str]

property test_dataset_id: Optional[str]¶

Test dataset id - specifies which dataset to measure metrics on if the ‘performance’ evaluation type is included in the scan.

Getter: Returns the test dataset id.
Setter: Sets the test dataset id.
Type: Optional[str]

property reference_dataset_id: Optional[str]¶

Reference dataset id - specifies which dataset to use as the reference for computing data quality metrics and drift metrics if the ‘data_statistics’ evaluation type is included in the scan.

Getter: Returns the reference dataset id.
Setter: Sets the reference dataset id.
Type: Optional[str]

property prediction_task: certifai.scanner.builder.CertifaiPredictionTask¶

Metadata for the prediction task.

Getter: Returns the prediction task metadata.
Setter: Sets the prediction task metadata.
Type: CertifaiPredictionTask

property output_path: Optional[str]¶

Output path to which reports will be written. If set to None, output goes to ‘./reports’ relative to the scan base path, unless explicitly overridden either by the run call or by the SCAN_RESULTS_DIRECTORY environment variable.

Getter: Returns the output path.
Setter: Sets the output path of the scan.
Type: Optional[str]

add_evaluation_type(value: str)¶

Add an evaluation type to the scan.

Parameters: value (str) – type of evaluation to add. Must be one of: ‘fairness’, ‘robustness’, ‘explanation’, ‘explainability’, ‘performance’, ‘data_statistics’.

remove_evaluation_type(value: str)¶

Remove an evaluation type from the scan.

Parameters: value (str) – type of evaluation to remove.

property evaluation_types: Iterable[str]¶

Evaluation types included in the scan.

Getter: Returns the list of included evaluation types.
Type: Iterable[str]

property hyper_parameter_overrides: dict¶

Hyper-parameter overrides to apply to the analysis.

Getter: Returns a dictionary of hyper-parameter overrides.
Setter: Specifies a dictionary of hyper-parameter overrides.
Type: dict

add_fairness_grouping_feature(feature: certifai.scanner.builder.CertifaiGroupingFeature)¶

Add a fairness grouping feature.

Parameters: feature (CertifaiGroupingFeature) – grouping feature definition.

remove_fairness_grouping_feature(name: str)¶

Remove a fairness grouping feature.

Parameters: name (str) – name of the grouping feature to remove.

property fairness_grouping_features: List[certifai.scanner.builder.CertifaiGroupingFeature]¶

Fairness grouping features defined for the scan.

Getter: Returns a list of defined grouping features.
Type: List[CertifaiGroupingFeature]

property metrics: List[certifai.scanner.builder.CertifaiModelMetric]¶

Performance metrics defined for the scan.

Getter: Returns a list of defined performance metrics.
Type: List[CertifaiModelMetric]

add_metric(metric: certifai.scanner.builder.CertifaiModelMetric)¶

Add a performance metric.

Parameters: metric (CertifaiModelMetric) – metric to add.

remove_metric(name: str)¶

Remove a performance metric

Parameters: name (str) – Name of the metric to remove

property explanation_types: List[str]¶

Explanation types defined for the scan.

Getter: Returns a list of defined explanation types.
Type: List[str]

add_explanation_type(explanation: str)¶

Add an explanation type.

Parameters: explanation (str) – explanation type to add.

remove_explanation_type(name: str)¶

Remove an explanation type

Parameters: name (str) – Name of the explanation type to remove

property primary_explanation_type: str¶

Explanation type to select for the explainability axis of the ATX score.

Getter: Returns the name of the selected explanation type for use in ATX calculation.
Setter: Sets the name of the selected explanation type for use in ATX calculation.

Accepts a str specifying the explanation type on set, returns the explanation type as a str on get.

property fairness_metrics: List[str]¶

Fairness metrics defined for the scan.

Getter: Returns a list of defined fairness metrics.
Type: List[str]

add_fairness_metric(metric: str)¶

Add a fairness metric.

Parameters: metric (str) – metric to add.

remove_fairness_metric(name: str)¶

Remove a fairness metric

Parameters: name (str) – Name of the metric to remove

property primary_fairness_metric: Optional[str]¶

The fairness metric to use as the Fairness aspect for calculating the ATX score.

Getter: Returns the name of the selected fairness metric for use in ATX calculation.
Setter: Sets the name of the selected fairness metric for use in ATX calculation.

Accepts an Optional[str] specifying the primary fairness metric (if any) on set, returns the fairness metric name as an Optional[str] on get.

property atx_performance_metric: Optional[certifai.scanner.builder.CertifaiModelMetric]¶

Metric to select for the performance axis of the ATX score.

Getter: Returns the selected performance metric for use in ATX calculation (as a metric instance, if one is selected).
Setter: Sets the name of the selected performance metric for use in ATX calculation.

Accepts an Optional[str] specifying the performance metric name (if any) on set, returns Optional[CertifaiModelMetric] of the metric instance on get.

add_model(model: certifai.scanner.builder.CertifaiModel)¶

Add a model to the scan.

Parameters: model (CertifaiModel) – metadata of the model to add.

remove_model(id: str)¶

Remove a model from the scan.

Parameters: id (str) – Id of the model to remove.

property models: List[certifai.scanner.builder.CertifaiModel]¶

Models included in the scan.

Getter: Returns a list of included models.
Type: List[CertifaiModel]

add_dataset(dataset: certifai.scanner.builder.CertifaiDataset)¶

Add a dataset.

Parameters: dataset (CertifaiDataset) – the dataset to add.

remove_dataset(dataset_id: str)¶

Remove a dataset.

Parameters: dataset_id (str) – Dataset to remove (by id).

property datasets: List[certifai.scanner.builder.CertifaiDataset]¶

Datasets defined by the scan.

Getter: Returns a list of defined datasets.
Type: List[CertifaiDataset]

property dataset_schema: certifai.scanner.builder.CertifaiDataSchema¶

Dataset schema used by the scan use case.

Getter: Returns the dataset schema.
Setter: Sets the dataset schema.
Type: CertifaiDataSchema

property feature_restrictions: Dict[str, certifai.scanner.builder.CertifaiFeatureRestriction]¶

Get restrictions on feature changes made during counterfactual production

Returns: dictionary of restrictions keyed on feature name
Return type: Dict[str,CertifaiFeatureRestriction]

add_feature_restriction(feature_name: str, restriction: certifai.scanner.builder.CertifaiFeatureRestriction)¶

Add a restriction on the changes that can be made to a feature during counterfactual production.

Parameters

feature_name (str) – feature to restrict
restriction (CertifaiFeatureRestriction) – restriction to apply

remove_feature_restriction(feature_name: str)¶

Remove a restriction on the changes that can be made to a feature during counterfactual production

Parameters: feature_name (str) – feature to de-restrict

property monitored_features: List[Union[str, int]]¶

Monitored features defined for the scan.

Getter: Returns a list of monitored features.
Type: List[Union[str, int]]

add_monitored_feature(feature: Union[str, int])¶

Adds a monitored feature.

Parameters: feature (Union[str, int]) – feature name or index

remove_monitored_feature(feature: Union[str, int])¶

Removes a monitored feature.

Parameters: feature (Union[str, int]) – feature name or index

extract_yaml() → str¶

Extract the scan as a YAML definition.

Returns: string containing the scan template encoded as YAML.
Return type: str

save(file)¶

Save the scan template to a file.

Parameters: file – file object opened for write to which the definition is to be saved.

run_preflight(model_id: Optional[str] = None, base_path: Optional[str] = None, callback: Optional[Callable[certifai.common.progress_task.ProgressUpdate, None]] = <certifai.common.progress_task.ProgressBarListener object>, refresh: bool = True)¶

Run the preflight scan (in-process).

Parameters

model_id (Optional[str]) – Optional specific model id to restrict the scan to
base_path (Optional[str]) – Optional base path to evaluate relative paths in the scan definition with respect to (if not specified then current working directory is assumed).
callback (Optional[Callable[[ProgressUpdate],None]]) – Optional callback function to receive progress updates as preflight checks are completed. If not specified, a default will be used that prints to stdout. Set to None to receive no progress updates. This is not applicable when refresh is False. A ProgressUpdate is a NamedTuple with fields units_complete, total_num_units, and summary.
refresh (bool) – If True all preflight checks will be run and the latest results will be returned. Otherwise, the results will be computed from existing preflight report data. Defaults to True.

Returns

a nested dictionary of messages produced during the preflight scan. The top level keys are model ids and the second level keys are message types, each of which maps to a list of strings.

Return type

dict
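
A result of this shape (model id, then message type, then a list of strings) can be flattened for logging or display. The sketch below is a hedged illustration: flatten_messages is a hypothetical helper, and the sample model ids, message type names, and message texts are invented, since the actual message types are not listed here.

```python
from typing import Dict, List, Tuple

# Shape described above: model id -> message type -> list of message strings.
PreflightResult = Dict[str, Dict[str, List[str]]]


def flatten_messages(result: PreflightResult) -> List[Tuple[str, str, str]]:
    """Flatten the nested preflight dictionary into (model_id, type, text) rows."""
    rows: List[Tuple[str, str, str]] = []
    for model_id, by_type in result.items():
        for message_type, messages in by_type.items():
            for text in messages:
                rows.append((model_id, message_type, text))
    return rows


# Invented sample; real model ids, message types and texts will differ.
sample: PreflightResult = {
    "model_a": {"warnings": ["example warning"], "errors": []},
    "model_b": {"warnings": [], "errors": ["example error"]},
}
rows = flatten_messages(sample)
```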

run_explain(precalculate: bool = False, fast: bool = False, sampling: bool = False, model_id: Optional[str] = None, base_path: Optional[str] = None, explanation_format: str = 'csv', callback: Optional[Callable[certifai.common.progress_task.ProgressUpdate, None]] = <certifai.common.progress_task.ProgressBarListener object>, write_reports: bool = True, **kwargs)¶

Run an explanation scan (in-process).

Parameters

precalculate (bool) – If True then baselines for the model/use case will be precalculated and stored for use in fast explanations. Defaults to False.
fast (bool) – If True then fast explanations will be used, which is suitable for bulk-explanation of large datasets. Fast explanation requires the precalculate step to have been performed for the model and use case previously (or in the same call). Defaults to False.
sampling (bool) – If True then Counterfactual Sampling will be used. This is suitable for use cases that have a large representative evaluation dataset. Defaults to False.
model_id (Optional[str]) – Optional specific model id to restrict the scan to
base_path (Optional[str]) – Optional base path to evaluate relative paths in the scan definition with respect to (if not specified then current working directory is assumed).
explanation_format (str) – Format in which to write the explanations, must be one of: ‘csv’, ‘jsonlines’, ‘inline’. If either ‘csv’ or ‘jsonlines’, then explanations will be written in a separate file and the filename will be specified in the scan report. If ‘inline’ the explanations will be included in the scan report. This is not applicable when precalculate is True. Defaults to ‘csv’.
callback (Optional[Callable[[ProgressUpdate],None]]) – Optional callback function to receive progress updates as evaluations are completed. If not specified, a default will be used that prints to stdout. Set to None to receive no progress updates. A ProgressUpdate is a NamedTuple with fields units_complete, total_num_units, and summary.
write_reports (bool) – Whether to write scan report files or not. Defaults to True. This argument takes precedence over explanation_format.

Returns

a nested dictionary. If precalculate is True, the top level keys are the model ids and each value is a dictionary with a status, a possible error message, and the location of the persisted calculations. Otherwise, the top level keys are the evaluation types and the second level keys are the model ids, within which is the report JSON represented in dictionary format.

Return type

dict
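
A custom progress callback might look like the following sketch. To keep it runnable without Certifai installed, it defines a local stand-in for ProgressUpdate with the three documented field names; the field types (int, int, str) and the make_collecting_callback helper are assumptions for illustration.

```python
from typing import Callable, List, NamedTuple


class ProgressUpdate(NamedTuple):
    # Local stand-in for certifai's ProgressUpdate; field names follow the
    # documentation above, the field types are assumed.
    units_complete: int
    total_num_units: int
    summary: str


def make_collecting_callback(log: List[str]) -> Callable[[ProgressUpdate], None]:
    """Build a callback that appends one formatted line per progress update."""

    def on_progress(update: ProgressUpdate) -> None:
        total = update.total_num_units
        # Guard against a zero total so the callback never raises.
        percent = 100.0 * update.units_complete / total if total else 0.0
        log.append(f"{percent:.0f}% {update.summary}")

    return on_progress


log: List[str] = []
callback = make_collecting_callback(log)
callback(ProgressUpdate(1, 4, "first unit"))
callback(ProgressUpdate(4, 4, "done"))
```

A function like on_progress would be passed as the callback argument; passing None instead suppresses progress updates, as described above.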

run(model_id: Optional[str] = None, report: Optional[str] = None, write_reports: bool = True, base_path: Optional[str] = None, callback: Optional[Callable[certifai.common.progress_task.ProgressUpdate, None]] = <certifai.common.progress_task.ProgressBarListener object>)¶

Run the scan (in-process).

Parameters

model_id (Optional[str]) – Optional specific model id to restrict the scan to.
report (Optional[str]) – Optional specific report (evaluation type) to restrict the scan to.
write_reports (bool) – Whether to write report files for each model evaluation to the scan’s output directory (by default ‘./reports’ relative to base_path). Default is True.
base_path (Optional[str]) – Optional base path to evaluate relative paths in the scan definition with respect to (if not specified then current working directory is assumed).
callback (Optional[Callable[[ProgressUpdate],None]]) – Optional callback function to receive progress updates as evaluations are completed. If not specified, a default will be used that prints to stdout. Set to None to receive no progress updates. A ProgressUpdate is a NamedTuple with fields units_complete, total_num_units, and summary.

Returns

nested dictionary of reports. Top level keys are the evaluation type, second level keys are the model ids, within which is the report JSON represented in dictionary format.

Return type

dict
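
Because the returned dictionary is keyed by evaluation type first and model id second, collecting everything for one model requires a small pivot. The helper below is a hypothetical sketch over that shape; the report contents in the sample are placeholders, not real report fields.

```python
from typing import Any, Dict

# Shape described above: evaluation type -> model id -> report dictionary.
Reports = Dict[str, Dict[str, Dict[str, Any]]]


def reports_for_model(reports: Reports, model_id: str) -> Dict[str, Dict[str, Any]]:
    """Collect the reports for one model, keyed by evaluation type."""
    return {
        evaluation_type: by_model[model_id]
        for evaluation_type, by_model in reports.items()
        if model_id in by_model
    }


# Placeholder reports; real reports contain the full report JSON as a dict.
sample_reports: Reports = {
    "fairness": {"model_a": {"placeholder": 1}, "model_b": {"placeholder": 2}},
    "robustness": {"model_a": {"placeholder": 3}},
}
model_b_reports = reports_for_model(sample_reports, "model_b")
```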

static from_file(filename: str) → certifai.scanner.builder.CertifaiScanBuilder¶

Load a scan template from file.

Parameters: filename (str) – path to template file to read.
Returns: Instantiated ScanBuilder with metadata from the template that was read.
Return type: CertifaiScanBuilder

static from_yaml(as_yaml: str) → certifai.scanner.builder.CertifaiScanBuilder¶

Load a scan template from a YAML string.

Parameters: as_yaml (str) – Definition to load as YAML string.
Returns: Instantiated ScanBuilder with metadata from the template that was read.
Return type: CertifaiScanBuilder

static create(use_case_name: str, use_case_id: Optional[str] = None, evaluation_name: Optional[str] = None, environment: Optional[str] = None, description: Optional[str] = None, prediction_task: certifai.scanner.builder.CertifaiPredictionTask = <certifai.scanner.builder.CertifaiPredictionTask object>, output_path: Optional[str] = None) → certifai.scanner.builder.CertifaiScanBuilder¶

Create a new template builder.

Parameters

use_case_name (str) – Name of the prediction use case.
use_case_id (Optional[str]) – Id by which the use case will be referenced. Defaults to the name if omitted.
evaluation_name (Optional[str]) – Name of the evaluation. Defaults to the use case name if not provided.
environment (Optional[str]) – Optional opaque string recording scan environment information.
description (Optional[str]) – Optional human readable description of the use case.
prediction_task (CertifaiPredictionTask) – Prediction task metadata.
output_path (Optional[str]) – Where to write report files to. If a relative path, it is evaluated with respect to the base path at evaluation time. If omitted, reports will be written to ‘./reports’.

Returns

Instantiated ScanBuilder populated with the supplied metadata.

Return type

CertifaiScanBuilder
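
A minimal usage sketch, using only calls documented in this module (create, add_evaluation_type, extract_yaml). The use case name, evaluation name, and the build_minimal_scan_yaml helper are hypothetical, and the import is guarded so the block still runs where Certifai is not installed. A real scan would also need models and datasets added before it could be run.

```python
from typing import Optional

try:
    from certifai.scanner.builder import CertifaiScanBuilder
except ImportError:  # Certifai is not installed in this environment
    CertifaiScanBuilder = None


def build_minimal_scan_yaml(use_case_name: str = "loan_approval") -> Optional[str]:
    """Create a bare scan definition and return it as YAML, or None if unavailable."""
    if CertifaiScanBuilder is None:
        return None
    # prediction_task is left at its documented default.
    scan = CertifaiScanBuilder.create(
        use_case_name,
        evaluation_name="baseline",
        output_path="./reports",
    )
    # 'robustness' is one of the documented evaluation types.
    scan.add_evaluation_type("robustness")
    return scan.extract_yaml()


yaml_text = build_minimal_scan_yaml()
```

The resulting YAML string could be passed back to from_yaml, or written out via save, to persist the definition.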