certifai.scanner.builder module

Certifai Scan object model builder

Contains classes that represent a Certifai scan definition and that allow scan definitions to be manipulated, loaded, saved, and run programmatically.

class certifai.scanner.builder.CertifaiOutcomeValue(value: Any, name: Optional[str] = None, favorable: bool = False)

Outcome value for a classification task type.

class certifai.scanner.builder.CertifaiTaskOutcomes

Union type representing the outcomes of a task by type (possible classes for classification, favorable direction for regression)

property task_type: certifai.engine.engine_api_types.CertifaiTaskType
property prediction_favorability: str
static regression(increased_favorable: Optional[bool], change_std_deviation: Optional[float] = None, absolute_threshold: Optional[float] = None, absolute_percentile: Optional[float] = None)

Construct a regression task type

Favorability may be specified in one of three ways (only one of which may be specified):

  1. As a relative increase [or decrease] by a multiple of the population global regressed value standard deviation

  2. As an absolute threshold, specified as an exact value of the regressor output

  3. As an absolute threshold, specified as a percentile of the population global regressed value empirical distribution

Parameters
  • increased_favorable (Optional[bool]) – True if the favorable direction of the prediction is increasing. Can be set to None if there is no favorable direction.

  • change_std_deviation (Optional[float]) – Number of standard deviations considered to be a significant change

  • absolute_threshold (Optional[float]) – Absolute regressed value threshold for favorability

  • absolute_percentile (Optional[float]) – Absolute regressed value threshold for favorability expressed as a population percentile

Returns

CertifaiTaskType instance representing the regression outcome definition
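The rule that only one of the three favorability options may be given can be pictured with a small, library-independent check. This is an illustrative sketch and not Certifai's own validation code: the helper name check_regression_favorability is hypothetical, and raising ValueError on a conflict is an assumption.

```python
from typing import Optional


def check_regression_favorability(
    change_std_deviation: Optional[float] = None,
    absolute_threshold: Optional[float] = None,
    absolute_percentile: Optional[float] = None,
) -> Optional[str]:
    """Return the name of the single option that is set, or None if none is.

    Hypothetical helper: raises ValueError when more than one option is given.
    """
    options = {
        "change_std_deviation": change_std_deviation,
        "absolute_threshold": absolute_threshold,
        "absolute_percentile": absolute_percentile,
    }
    # Collect the names of every option the caller actually supplied.
    chosen = [name for name, value in options.items() if value is not None]
    if len(chosen) > 1:
        raise ValueError("specify only one of: " + ", ".join(chosen))
    return chosen[0] if chosen else None
```

Calling the helper with a single option returns that option's name; calling it with two or more raises an error before any scan would be built.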

static classification(prediction_values: Iterable[certifai.scanner.builder.CertifaiOutcomeValue], prediction_favorability: Optional[str] = 'explicit', last_favorable_prediction: Optional[Any] = None, favorable_outcome_group_name: Optional[str] = None, unfavorable_outcome_group_name: Optional[str] = None)

Construct a classification task type

Parameters
  • prediction_values (Iterable[CertifaiOutcomeValue]) – list of possible classes

  • prediction_favorability (Optional[str]) –

    describes the favorability of the prediction_values, default ‘explicit’. Must be one of

    1. ‘explicit’, predictions should be explicitly marked as favorable

    2. ‘ordered’, predictions are ordered from most to least favorable

    3. ‘none’, no prediction should be treated as favorable

  • last_favorable_prediction (Optional[Any]) – ignored unless the prediction_favorability is ‘ordered’, in which case this value should be the last label (in the ordering of the prediction_values) which is considered favorable

  • favorable_outcome_group_name (Optional[str]) – name of the favorable group of prediction values - reserved for multiclass-classification tasks with a prediction_favorability of ‘explicit’

  • unfavorable_outcome_group_name (Optional[str]) – name of the unfavorable group of prediction values - reserved for multiclass-classification tasks with a prediction_favorability of ‘explicit’

Returns

CertifaiTaskType instance representing the classification outcome definition
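For the ‘ordered’ case, the role of last_favorable_prediction can be sketched in plain Python. The function below is a hypothetical illustration of the documented meaning (labels ordered from most to least favorable, with everything up to and including the last favorable label counted as favorable); it is not part of the Certifai API, and the example labels are invented.

```python
from typing import Any, List, Sequence


def favorable_labels(ordered_labels: Sequence[Any], last_favorable: Any) -> List[Any]:
    """Return the labels from the start of the ordering up to and
    including last_favorable (hypothetical helper)."""
    labels = list(ordered_labels)
    # list.index raises ValueError if the label is not in the ordering.
    cutoff = labels.index(last_favorable)
    return labels[: cutoff + 1]


# Invented labels, listed from most to least favorable.
grades = ["approved", "referred", "declined"]
print(favorable_labels(grades, "referred"))  # ['approved', 'referred']
```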

class certifai.scanner.builder.CertifaiPredictionTask(outcomes: certifai.scanner.builder.CertifaiTaskOutcomes, prediction_description: Optional[str] = None)

Metadata about the prediction task - immutable once instantiated.

Parameters
  • outcomes (CertifaiTaskOutcomes) – One of the supported CertifaiTaskOutcomes types, constructed from the static methods on CertifaiTaskOutcomes.

  • prediction_description (Optional[str]) – Free text description of what is being predicted.

property task_type: str

The task type string (‘binary_classification’, ‘multiclass-classification’, ‘regression’)

Getter

Returns the task type.

Type

str

property prediction_description: Optional[str]

Description of what the prediction represents.

Getter

Returns the description, if any.

Type

Optional[str]

property prediction_favorability: Optional[str]

What format is used for specifying the favorable prediction value, if any, (‘none’, ‘ordered’, ‘explicit’).

Getter

Returns prediction favorability

Type

Optional[str]

property favorable_outcome: Optional[str]

What the favorable outcome direction is for a regression task, None otherwise.

Getter

Returns the favorable label direction (regression) if set.

Type

Optional[Any]

property prediction_values: List[certifai.scanner.builder.CertifaiOutcomeValue]
property last_favorable_prediction: Optional[Any]
property regression_standard_deviation: Optional[float]
property regression_absolute_threshold: Optional[float]
property regression_absolute_percentile: Optional[float]
property favorable_outcome_group_name: Optional[str]

The string name of the favorable group of prediction values - reserved for multiclass-classification with prediction_favorability of ‘explicit’ - None otherwise.

Getter

Returns the name of the favorable group of prediction values

Type

Optional[str]

property unfavorable_outcome_group_name: Optional[str]

The string name of the unfavorable group of prediction values - reserved for multiclass-classification with prediction_favorability of ‘explicit’ - None otherwise.

Getter

Returns the name of the unfavorable group of prediction values

Type

Optional[str]

class certifai.scanner.builder.CertifaiModelMetric(name: str, certifai_metric: Optional[str] = None)

Metadata for a metric - immutable once instantiated

Parameters
  • name (str) – Free text descriptive name of the metric.

  • certifai_metric (Optional[str]) –

    If specified will allow Certifai to calculate the value. Supported values are:

    • ‘accuracy’ (classification)

    • ‘precision’ (classification)

    • ‘recall’ (classification)

    • ‘f1’ (classification)

    • ‘r-squared’ (regression)

    Micro and macro variants are also supported for precision, recall and f1 e.g. ‘f1(micro)’

property name: str

Descriptive name of the metric.

Getter

Returns the human-readable metric name.

Type

str

property certifai_metric: Optional[str]

Certifai metric type name.

Getter

Returns the Certifai-evaluable metric type (if set).

Type

Optional[str]

class certifai.scanner.builder.CertifaiPredictorWrapper(predictor: certifai.common.hosted_model.IBaseModel, encoder: Optional[Callable[Sequence, Sequence]] = None, decoder: Optional[Callable[Sequence, Sequence]] = None, wrapped: Optional[certifai.common.hosted_model.IHostedModel] = None, soft_predictions: bool = False, label_ordering: Optional[List[Any]] = None, threshold: Optional[float] = None)

Wrapper class for in-process models

Note - the underlying model and any encoder and decoders used must be picklable.

Parameters
  • predictor (IBaseModel) – Any predictor object that has a predict method which takes a sequence of data vectors as a numpy array and returns a sequence of corresponding predicted values.

  • encoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the predictor’s input (e.g. - to perform one-hot encoding and so on)

  • decoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the predictor’s output (e.g. - to binarize with a threshold).

  • wrapped (Optional[IHostedModel]) – if specified other parameters are ignored and the wrapper just proxies the (already wrapped) model provided here (mostly intended for internal usage).

  • soft_predictions (bool) – If True the model supports soft scoring for predictions (default False)

  • label_ordering (Optional[List[Any]]) – For soft scoring models the ordering of the classification labels in the scoring vector

  • threshold (Optional[float]) – For binary classifiers whose soft-scores are returned as a 1-dimensional array of scores, one for each input row, the threshold to apply. Scores greater than or equal to the threshold will receive the second label (or 1 rather than 0 if no labels provided)
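The threshold rule for binary classifiers with 1-dimensional soft scores can be sketched as follows. This is a stand-alone illustration of the documented behavior (scores greater than or equal to the threshold receive the second label, or 1 rather than 0 when no labels are provided), not the wrapper's actual code; apply_threshold is a hypothetical name.

```python
from typing import Any, List, Optional, Sequence


def apply_threshold(
    scores: Sequence[float],
    threshold: float,
    label_ordering: Optional[Sequence[Any]] = None,
) -> List[Any]:
    """Map one soft score per input row to a hard label (hypothetical helper)."""
    # With no labels provided, fall back to 0 and 1.
    labels = list(label_ordering) if label_ordering is not None else [0, 1]
    if len(labels) != 2:
        raise ValueError("a binary classifier needs exactly two labels")
    # Scores at or above the threshold receive the second label.
    return [labels[1] if score >= threshold else labels[0] for score in scores]


print(apply_threshold([0.2, 0.5, 0.9], threshold=0.5))  # [0, 1, 1]
```

Note that a score exactly equal to the threshold falls on the second-label side.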

property model: certifai.common.hosted_model.IHostedModel

The wrapped model.

Getter

Returns the wrapped model suitable for use by Certifai.

Type

IHostedModel

class certifai.scanner.builder.CertifaiModelConnector(name: str, module_name: str, class_name: str, description: Optional[str] = None, model_args: Dict[str, str] = {}, model_secrets: Dict[str, str] = {})

Metadata for a model connector

Parameters
  • name (str) – Free text descriptive name of the connector.

  • module_name (str) – python module containing the external connector (e.g. ‘certifai.connectors’)

  • class_name (str) – name of python class of the connector

  • description (Optional[str]) – Optional description

  • model_args (Dict[str,str]) – arguments to pass to the model connector instances

  • model_secrets (Dict[str,str]) – secrets to pass to the model connector instances - substrings of the values of the form {<NAME>} will have the <NAME> replaced by the contents of the environment variable of that name
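The environment-variable substitution described for secrets can be approximated with the standard library. The sketch below rests on several assumptions that the text above does not confirm: that the whole {NAME} placeholder (braces included) is replaced, that names follow the usual identifier pattern, and that a placeholder whose variable is unset is left untouched. The function resolve_secrets and the variable API_TOKEN are hypothetical.

```python
import os
import re
from typing import Dict, Mapping, Optional

# Assumed placeholder syntax: {NAME}, where NAME looks like an identifier.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_secrets(
    secrets: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Replace {NAME} placeholders in each value with env[NAME] (hypothetical helper)."""
    source = os.environ if env is None else env

    def substitute(match: "re.Match[str]") -> str:
        # Assumption: unknown variables leave the placeholder as written.
        return source.get(match.group(1), match.group(0))

    return {key: _PLACEHOLDER.sub(substitute, value) for key, value in secrets.items()}


# The env argument stands in for os.environ so the example is repeatable.
print(resolve_secrets({"auth": "Bearer {API_TOKEN}"}, env={"API_TOKEN": "abc123"}))
# {'auth': 'Bearer abc123'}
```

Keeping only the placeholder in the scan definition means the secret value itself never needs to be written to a saved template.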

property name: str

Descriptive name of the connector.

Getter

Returns the connector name.

Type

str

property module_name: str

Module containing the connector.

Getter

Returns the fully qualified module name.

Type

str

property class_name: str

Class name of the connector.

Getter

Returns the name of the python class of the connector.

Type

str

property description: str

Description of the connector.

Getter

Returns the optional description.

Type

str

property model_args: Dict[str, str]

Arguments to instantiated model connector instances.

Getter

Returns the arguments to be provided to connector instances.

Type

Dict[str,str]

property model_secrets: Dict[str, str]

Secrets provided to instantiated model connector instances.

Getter

Returns the secrets to be provided to connector instances.

Type

Dict[str,str]

class certifai.scanner.builder.CertifaiModel(id: str, name: Optional[str] = None, author: Optional[str] = None, version: Optional[str] = None, performance_metric_values: Optional[List[Tuple]] = None, description: Optional[str] = None, predict_endpoint: Optional[str] = None, max_batch_size: Optional[int] = None, local_predictor: Optional[certifai.scanner.builder.CertifaiPredictorWrapper] = None, supports_soft_scoring: bool = False, prediction_value_order: Optional[List[Any]] = None, connector: Optional[certifai.scanner.builder.CertifaiModelConnector] = None, json_strict: bool = False)

Metadata describing a model, and allowing manipulation of this metadata.

Parameters
  • id (str) – Identifier for the model used to refer to it.

  • name (Optional[str]) – Descriptive name for the model. Defaults to the value provided for id

  • author (Optional[str]) – Optional author name.

  • version (Optional[str]) – Optional model version string.

  • performance_metric_values (Optional[List[Tuple]]) – Optional asserted list of (metric_name, value) pairs for metrics of the model - primarily intended to allow injection of externally measured values for metrics not directly supported by Certifai.

  • description (Optional[str]) – Optional free text description of the model.

  • predict_endpoint (Optional[str]) – URL of the prediction endpoint of the model (if non-process-local).

  • max_batch_size (Optional[int]) – Optional limit on prediction batch sizes to call the model with.

  • local_predictor (Optional[CertifaiPredictorWrapper]) – wrapped model object (if using a local in-process model).

  • supports_soft_scoring (bool) – If True model is expected to return soft scores as well as hard predictions

  • prediction_value_order (Optional[List[Any]]) – For soft scoring models the ordering of the class labels in the score vector

  • connector (Optional[CertifaiModelConnector]) – Optional connector to use to attach to the model

  • json_strict (bool) – If True data will be serialized to send to the model’s predict endpoint in strict JSON, encoding missing data as JSON nulls. If False then JavaScript extended JSON will be used which encodes missing values as NaN. Defaults to False
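The difference between the two encodings selected by json_strict can be reproduced with Python's json module. This is only an illustration of strict JSON versus the NaN-extended form; the payload shape (a list of rows) and the helper name encode_rows are assumptions, not the format Certifai actually sends.

```python
import json
import math
from typing import Any, List


def encode_rows(rows: List[List[Any]], json_strict: bool) -> str:
    """Serialize rows, encoding missing values as null (strict) or NaN."""
    if json_strict:
        # Strict JSON has no NaN literal, so missing values become null.
        cleaned = [
            [None if isinstance(v, float) and math.isnan(v) else v for v in row]
            for row in rows
        ]
        return json.dumps(cleaned, allow_nan=False)
    # json.dumps emits the non-standard NaN token by default.
    return json.dumps(rows)


rows = [[1.5, float("nan"), "a"]]
print(encode_rows(rows, json_strict=True))   # [[1.5, null, "a"]]
print(encode_rows(rows, json_strict=False))  # [[1.5, NaN, "a"]]
```

A model endpoint built on a strict JSON parser will typically reject the second form, which is the situation json_strict=True is meant for.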

property name: str

Model name.

Getter

Returns the human-readable name of the model.

Type

str

property id: str

Model id.

Getter

Returns the identifier for the model by which it may be referenced.

Type

str

property author: Optional[str]

Model author.

Getter

Returns the author string if provided.

Setter

Set author string for the model.

Type

Optional[str]

property version: Optional[str]

Model version.

Getter

Returns the version string if provided.

Setter

Set version string for the model.

Type

Optional[str]

property description: Optional[str]

Model description.

Getter

Returns the description string if provided.

Setter

Set description string for the model.

Type

Optional[str]

property predict_endpoint: Optional[str]

Model predict endpoint URL

Getter

Returns the URL of the (remote) model prediction endpoint, if provided

Setter

Sets the prediction endpoint URL for the model

Type

Optional[str]

property max_batch_size: Optional[int]

Model max batch size.

Getter

Returns the max batch size to send to the model.

Setter

Sets the provided restriction on max batch size (None => unlimited).

Type

Optional[int]

property supports_soft_scoring: bool

Whether the model returns soft scores.

Getter

True if the model is expected to support soft scores.

Setter

Sets whether the model is expected to support soft scores.

Type

bool

property prediction_value_order: bool

Ordering of class labels in the score vector returned by the model.

Getter

Returns the ordering.

Setter

Sets the ordering.

Type

List[Any]

property local_predictor: Optional[certifai.scanner.builder.CertifaiPredictorWrapper]

Wrapped local (in-process) model.

Getter

Returns the wrapped model being used.

Setter

Sets a local wrapped model (see CertifaiPredictorWrapper) to use.

Type

Optional[CertifaiPredictorWrapper]

property performance_metric_values: List[Tuple[str, Any]]

List of asserted metric values for this model.

Getter

Returns any asserted values as (metric name, value) tuples.

Type

List[Tuple[str,Any]]

add_performance_metric_value(metric_name: str, metric_value: Any)

Add an asserted performance metric value.

Parameters
  • metric_name (str) – Name of the metric to assert a value for.

  • metric_value (Any) – Value to assert.

remove_performance_metric_value(metric_name: str)

Remove an asserted metric value

Parameters

metric_name (str) – Name of the metric to remove the asserted value for.

property connector: Optional[certifai.scanner.builder.CertifaiModelConnector]
property json_strict: bool

Whether data sent to this model is encoded as strict JSON.

Getter

True if the model expects strict JSON (missing encoded as null as opposed to NaN).

Setter

Sets whether the model expects strict JSON

Type

bool

class certifai.scanner.builder.CertifaiFeatureDataType(args: dict)

Class describing feature datatypes supported by Certifai - immutable once instantiated. Static methods are provided for instantiating each supported type.

property value_dict: dict

Data type details as a dictionary.

Getter

Returns the metadata dict for this datatype. In particular this will contain a key named data_type which will be one of

  • ‘numerical-int’

  • ‘numerical-float’

  • ‘categorical’

Other keys vary by datatype.

Type

dict

static int(min: Optional[int] = None, max: Optional[int] = None, spread: Optional[float] = None) certifai.scanner.builder.CertifaiFeatureDataType

Constructor for an ‘int’ feature.

Parameters
  • min (Optional[int]) – optional floor value this feature can take.

  • max (Optional[int]) – optional ceiling value this feature can take.

  • spread (Optional[float]) – optional measure of spread (typically MAD or std. deviation).

Returns

instantiated CertifaiFeatureDataType

Return type

CertifaiFeatureDataType

static float(min: Optional[float] = None, max: Optional[float] = None, spread: Optional[float] = None) certifai.scanner.builder.CertifaiFeatureDataType

Constructor for a ‘float’ feature.

Parameters
  • min (Optional[float]) – optional floor value this feature can take.

  • max (Optional[float]) – optional ceiling value this feature can take.

  • spread (Optional[float]) – optional measure of spread (typically MAD or std deviation).

Returns

instantiated CertifaiFeatureDataType

Return type

CertifaiFeatureDataType

static categorical(values: Optional[Iterable[Union[str, int]]] = None, value_columns: Optional[List[Tuple[str, Union[str, int]]]] = None, target_encodings: Optional[Iterable[float]] = None, categorical_type: Optional[str] = None) certifai.scanner.builder.CertifaiFeatureDataType

Constructor for a ‘categorical’ feature.

Parameters
  • values (Optional[Iterable[Union[str,int,bool]]]) – Optional list of possible values this categorical field may take on. If omitted then Certifai will infer the value set from the available data.

  • value_columns (Optional[List[Tuple[str, Union[str, builtins.int, bool]]]]) –

    Optional list of column name and categorical value pairs, for one-hot encoded data. If both value_columns and values are specified then they must have exactly the same set of values. If only value_columns is specified then the values are inferred. If only values is specified then the feature is assumed to be value-encoded in a single column.

  • target_encodings (Optional[Iterable[float]]) – optional list of encodings for the values in values used to represent those values in the dataset

  • categorical_type (Optional[str]) – optional string specifying the data type the categorical feature is. Must be one of: ‘string’, ‘int’, or ‘auto’. For example, specifying ‘string’ would mean that the value 001 will be interpreted as the string ‘001’, instead of as the integer 1.

Returns

instantiated CertifaiFeatureDataType

Return type

CertifaiFeatureDataType
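The relationship between value_columns and a one-hot encoded row can be sketched without the library. The helpers below are hypothetical and the column names are invented; they show how a list of (column name, categorical value) pairs identifies the value of a row and how the value set can be inferred from those pairs. The requirement that exactly one column is set per row is an assumption.

```python
from typing import Dict, List, Tuple, Union

Value = Union[str, int]


def infer_values(value_columns: List[Tuple[str, Value]]) -> List[Value]:
    """Derive the categorical value set from the (column, value) pairs."""
    return [value for _, value in value_columns]


def decode_one_hot(row: Dict[str, int], value_columns: List[Tuple[str, Value]]) -> Value:
    """Return the categorical value whose one-hot column is set in row."""
    hot = [value for column, value in value_columns if row.get(column) == 1]
    if len(hot) != 1:
        # Assumption: a valid one-hot row sets exactly one of the columns.
        raise ValueError("expected exactly one hot column, found %d" % len(hot))
    return hot[0]


# Invented column names for a 'color' feature spread over two columns.
value_columns = [("color_red", "red"), ("color_blue", "blue")]
print(infer_values(value_columns))                                     # ['red', 'blue']
print(decode_one_hot({"color_red": 0, "color_blue": 1}, value_columns))  # blue
```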

class certifai.scanner.builder.CertifaiFeatureRestriction(args: dict)

Class describing feature change restrictions supported by Certifai - immutable once instantiated. Static methods are provided for instantiating each supported type.

property value_dict: dict

Restriction details as a dictionary.

Getter

Returns the metadata dict for this restriction. In particular this will contain a key named constraint which will be one of

  • ‘constant’

  • ‘percentage’

  • ‘range’

Other keys vary by constraint type.

Type

dict

static range(min: Optional[int] = None, max: Optional[int] = None, direction: Optional[str] = None) certifai.scanner.builder.CertifaiFeatureRestriction

Constructor for a range constraint on feature modifications in counterfactual production.

Parameters
  • min (Optional[int]) – optional floor value this feature can take.

  • max (Optional[int]) – optional ceiling value this feature can take.

  • direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static percentage(amount: float, direction: Optional[str] = None) certifai.scanner.builder.CertifaiFeatureRestriction

Constructor for a percentage change constraint on feature modifications in counterfactual production.

Parameters
  • amount (float) – max percentage the feature may change by (can only be applied to numeric features).

  • direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static constant() certifai.scanner.builder.CertifaiFeatureRestriction

Constructor for a no-change constraint on feature modifications in counterfactual production.

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static standard_deviation(value: float, tolerance_value: Optional[float] = None, direction: Optional[str] = None) certifai.scanner.builder.CertifaiFeatureRestriction

Constructor for a standard deviation constraint on feature modifications in counterfactual production.

Parameters
  • value (float) – number of standard deviations the feature may change by (can only be applied to numeric features).

  • tolerance_value (Optional[float]) – additional number of standard deviations the feature may change by if no solutions could be found (not applicable to all scans)

  • direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’ (not applicable to all scans)

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static fixed_amount(value: float, tolerance_value: Optional[float] = None, direction: Optional[str] = None) certifai.scanner.builder.CertifaiFeatureRestriction

Constructor for a fixed amount constraint on feature modifications in counterfactual production.

Parameters
  • value (float) – fixed amount that the feature may change by (can only be applied to numeric features).

  • tolerance_value (Optional[float]) – additional amount the feature may change by if no solutions could be found (not applicable to all scans)

  • direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’ (not applicable to all scans)

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static value_set(values: List[Union[str, int]], tolerance_values: Optional[List[Union[str, int]]] = None) certifai.scanner.builder.CertifaiFeatureRestriction

Constructor for an allowed value set constraint on feature modifications in counterfactual production.

Parameters
  • values (List[Union[str, builtins.int, bool]]) – fixed set of values the feature may change to (can only be applied to categorical features).

  • tolerance_values (Optional[List[Union[str, builtins.int, bool]]]) – additional values the feature may change to if no solutions could be found

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

static value_map(values: Dict[Union[str, int], List[Union[str, int]]], tolerance_values: Optional[Dict[Union[str, int], List[Union[str, int]]]] = None)

Constructor for an allowed value mapping constraint on feature modifications in counterfactual production.

Parameters
  • values (Dict[Union[str, builtins.int, bool], List[Union[str, builtins.int, bool]]]) – Dictionary mapping of categorical values to values the feature may change to (can only be applied to categorical features).

  • tolerance_values (Optional[Dict[Union[str, builtins.int, bool], List[Union[str, builtins.int, bool]]]]) – Additional dictionary mapping of categorical values to values the feature may change to.

Returns

instantiated CertifaiFeatureRestriction

Return type

CertifaiFeatureRestriction

class certifai.scanner.builder.CertifaiFeatureSchema(name: str, data_type: Optional[certifai.scanner.builder.CertifaiFeatureDataType] = None)

Class describing a feature - immutable once instantiated.

Parameters
  • name (str) – The name of the feature (should match any column headers in the dataset if any).

  • data_type (CertifaiFeatureDataType) – Type of data the feature holds.

property name: str

Feature name.

Getter

Returns the name of the feature.

Type

str

property data_type: Optional[certifai.scanner.builder.CertifaiFeatureDataType]

Feature data type

Getter

Returns the data type of the feature.

Type

CertifaiFeatureDataType

class certifai.scanner.builder.CertifaiDataSchema(features: Optional[List[certifai.scanner.builder.CertifaiFeatureSchema]] = None, outcome_feature_name: Optional[str] = None, predicted_outcome_feature_name: Optional[str] = None, hidden_feature_names: Optional[List[str]] = None, defined_feature_order: bool = False)

Class describing a dataset’s feature schema, and allowing manipulation of this schema.

Parameters
  • features (Optional[List[CertifaiFeatureSchema]]) – features specified by the scan definition. This may be a subset of all the features present. Any that are omitted will be inferred from the available data.

  • outcome_feature_name (Optional[str]) – name of the feature holding the ground truth label/value (if present). Note: any outcome feature column will be removed before passing data to the model.

  • predicted_outcome_feature_name (Optional[str]) – name of the feature holding the predicted label/value (if present). Note: any predicted_outcome feature column will be removed before passing data to the model.

  • hidden_feature_names (Optional[List[str]]) – list of feature names that should be hidden from the model

  • defined_feature_order (bool) – If present and True asserts that the list order of features in the schema matches the layout of columns in the dataset. If True then all columns must be present. Intended for use in cases where the dataset does not specify a column ordering itself.

property features: Optional[List[certifai.scanner.builder.CertifaiFeatureSchema]]

features defined by the schema.

Getter

Returns the list of defined features.

Type

Optional[List[CertifaiFeatureSchema]]

property defined_feature_order: bool

Whether the schema defines the column ordering of the data.

Getter

Returns True if the schema defines the column ordering.

Setter

Sets whether the schema defines the column ordering of the data.

Type

bool

add_feature(name: str, data_type: certifai.scanner.builder.CertifaiFeatureDataType)

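Add a feature.

Note - the feature will be appended to the current list.

Parameters
  • name (str) – Name of feature to add.

  • data_type (CertifaiFeatureDataType) – data type of feature to add.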

insert_feature(name: str, index: int, data_type: certifai.scanner.builder.CertifaiFeatureDataType)

Insert a feature.

Parameters
  • name (str) – Name of feature to add.

  • index (int) – Columnar position to insert the feature at (0-based).

  • data_type (CertifaiFeatureDataType) – data type of feature to add.

update_feature(name: str, data_type: certifai.scanner.builder.CertifaiFeatureDataType)

Update an existing feature by name - preserves its index in the feature list

Parameters
  • name (str) – Name of feature to update.

  • data_type (CertifaiFeatureDataType) – new data type of feature being updated.

remove_feature(name: str)

Remove a feature.

Parameters

name (str) – Name of feature to remove.

infer_features_from_data(dataset_source: certifai.scanner.builder.CertifaiDatasetSource)
property outcome_feature_name: Optional[str]

Name of the (ground truth) outcome column (if any).

Getter

Returns the feature name of the outcome feature.

Setter

Sets the name of the (ground truth) outcome column.

Type

Optional[str]

property predicted_outcome_feature_name: Optional[str]

Name of the predicted outcome column (if any).

Getter

Returns the feature name of the predicted outcome feature.

Setter

Sets the name of the predicted outcome column.

Type

Optional[str]

property hidden_feature_names: List[str]

Names of hidden (from the model) features (if any).

Getter

Returns a list of the names of features which are not provided to the model.

Setter

Sets the list of names of features which are not provided to the model.

Type

List[str]

Note: any specified outcome_feature_name or predicted_outcome_feature_name will automatically be hidden from the model and need not occur in this list.

class certifai.scanner.builder.CertifaiDatasetSource(args)

Class describing dataset storage formats supported by Certifai - immutable once instantiated. Static methods are provided for instantiating each supported format.

property value_dict

Data source details as a dictionary.

Getter

Returns the metadata dict for this data source. In particular this will contain a key named file_type which will be one of

  • ‘csv’

  • ‘json’

  • ‘loaded’

Other keys vary by source type.

Type

dict

static json(url: str, lines: bool = True, orient: str = 'records', encoding: Optional[str] = None)

Constructor for a ‘json’ source.

Parameters
  • url (str) – Location the data may be loaded from. If no protocol is specified then ‘file:’ is assumed.

  • lines (bool) – If True then JSON lines format (default is True), else JSON list expected.

  • orient (str) – One of ‘records’, ‘columns’, ‘values’ (matching Pandas usage). Default is ‘records’

  • encoding (Optional[str]) – string encoding used - default is ‘utf-8’.

Returns

instantiated DatasetSource

Return type

DatasetSource

static csv(url: str, delimiter: str = ',', escape_character: Optional[str] = None, quote_character: str = '"', has_header: bool = True, encoding: Optional[str] = None)

Constructor for a ‘csv’ source.

Parameters
  • url (str) – Location the data may be loaded from. If no protocol is specified then ‘file:’ is assumed.

  • delimiter (str) – Record separator used. Default is ‘,’.

  • escape_character (Optional[str]) – Escape character if any. Default is None.

  • quote_character (str) – Quote delimiter. Default is ‘"’.

  • has_header (bool) – Whether the source CSV has a header row specifying column names. Default is True.

Returns

instantiated DatasetSource

Return type

DatasetSource

static dataframe(df)

Constructor for a ‘dataframe’ source (an already loaded Pandas dataframe).

Parameters

df (DataFrame) – Dataframe containing the data.

Returns

instantiated DatasetSource.

Return type

DatasetSource

class certifai.scanner.builder.CertifaiDataset(id: str, source: certifai.scanner.builder.CertifaiDatasetSource, name: Optional[str] = None, description: Optional[str] = None)

Metadata describing a dataset.

Parameters
  • id (str) – identifier string by which the dataset may be referenced.

  • source (DatasetSource) – source for the actual data in the dataset.

  • name (Optional[str]) – Optional human readable name of the dataset.

  • description (Optional[str]) – Optional free text description of the dataset.

class certifai.scanner.builder.CertifaiGroupingBucket(description: str, max: Optional[float] = None, values: Optional[List[Union[str, int]]] = None)

Metadata describing a value grouping bucket for feature values - immutable once instantiated.

Parameters
  • description (str) – Descriptive name of the bucket.

  • max (Optional[float]) – Optional maximum numerical value in the bucket (may only be used with numeric features).

  • values (Optional[List[Union[str,int,bool]]]) – Optional explicit list of values falling within the bucket (intended for use with categorical features).

property description: str

Description of the bucket.

Getter

Returns the bucket description string.

Type

str

property max: Optional[float]

Maximum numeric value falling within the bucket (if specified).

Note - the floor of a bucket is determined by the ceiling of the previous bucket. The entire bucket list will be sorted on max values and a sentinel bucket with no maximum value should be included in the list.

Getter

Returns the bucket ceiling value.

Type

Optional[float]
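The way bucket floors follow from the sorted ceilings can be sketched with the bisect module. This is an illustration only: buckets are represented as hypothetical (description, max) tuples rather than CertifaiGroupingBucket instances, and treating each ceiling as inclusive is an assumption the text above does not settle.

```python
import math
from bisect import bisect_left
from typing import List, Optional, Tuple


def assign_bucket(value: float, buckets: List[Tuple[str, Optional[float]]]) -> str:
    """Return the description of the bucket that value falls into."""
    # Sort on max; the sentinel bucket (max=None) is treated as unbounded.
    ordered = sorted(buckets, key=lambda b: math.inf if b[1] is None else b[1])
    ceilings = [math.inf if ceiling is None else ceiling for _, ceiling in ordered]
    # First bucket whose ceiling is >= value; its floor is the previous ceiling.
    return ordered[bisect_left(ceilings, value)][0]


# Invented age buckets, deliberately given out of order.
age_buckets = [("senior", None), ("young", 25), ("adult", 60)]
print(assign_bucket(25, age_buckets))  # young
print(assign_bucket(40, age_buckets))  # adult
print(assign_bucket(70, age_buckets))  # senior
```

Without the sentinel bucket, a value above the largest ceiling would match no bucket (the sketch would raise IndexError), which is why the bucket list should include one bucket with no maximum.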

property values: Optional[Iterable[Union[str, int]]]

List of values falling within the bucket if defined.

Getter

Returns the list of values or None if not defined.

Type

Iterable[Union[str,int,bool]]

class certifai.scanner.builder.CertifaiGroupingFeature(name: str, buckets: Optional[Iterable[certifai.scanner.builder.CertifaiGroupingBucket]] = None)

Metadata describing a fairness grouping feature - immutable once instantiated

Parameters
  • name (str) – Feature name of the feature which defines the grouping.

  • buckets (Optional[Iterable[CertifaiGroupingBucket]]) – Optional definition for bucketing the values of the feature. If not specified then every unique value occurring in the data will be treated as its own group.

property name: str

Name of the grouping feature

Getter

Returns the feature name used to define the groups

Type

str

property buckets: Optional[Iterable[certifai.scanner.builder.CertifaiGroupingBucket]]

List of grouping buckets.

Getter

Returns the list of grouping buckets for the grouping feature, if defined.

Type

Iterable[CertifaiGroupingBucket]

class certifai.scanner.builder.CertifaiScanBuilder(base: certifai.scanner.schemas.ScanTemplate)

Builder class for scan templates, with static method for instantiation, and methods for manipulation, persistence, and running of the defined scan.

property model_headers: Dict

Returns model headers defined in the scan.

Getter

Returns model headers as dict

Type

Dict

add_model_header(header_name: str, header_value: str, model_id: Optional[str] = None)

Add or update a model header. If model_id is provided, the header is added to that specific model; otherwise the header is set as a default for all models.

Parameters
  • header_name (str) – Name of the model header to inject.

  • header_value (str) – Value associated with the header.

  • model_id (Optional[str]) – Id of the model whose header is added or updated. If omitted, the header is set as a default for all models.

remove_model_header(header_name: str, model_id: Optional[str] = None)

Remove a model header given header_name. If model_id is provided then the header is removed for that specific model, otherwise it is removed from all models (default case).

Parameters
  • header_name (str) – name of the header to remove.

  • model_id (Optional[str]) – model to remove headers from
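
A minimal sketch of how the default and per-model forms of add_model_header might be used together. The helper name, the "Authorization" header, and the bearer-token format are illustrative assumptions; only add_model_header and its parameters come from this reference.

```python
from typing import Dict, Optional


def apply_auth_headers(
    builder,
    default_token: str,
    per_model_tokens: Optional[Dict[str, str]] = None,
) -> None:
    """Set a default auth header, then model-specific ones (hypothetical helper)."""
    # No model_id: the header is set as a default for all models.
    builder.add_model_header("Authorization", f"Bearer {default_token}")
    # With model_id: the header is added to that specific model only.
    for model_id, token in (per_model_tokens or {}).items():
        builder.add_model_header(
            "Authorization", f"Bearer {token}", model_id=model_id
        )
```

Here builder is expected to be a CertifaiScanBuilder; the resulting headers can be inspected afterwards through the model_headers property.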

property template: certifai.scanner.schemas.ScanTemplate

Retrieve a scan template which can be serialized to dictionary form for saving as JSON or YAML by calling its dump method.

Return ScanTemplate

a ScanTemplate instance

property author: str

Author of the template.

Getter

Returns the author of the scan template, if defined.

Setter

Sets the author of the scan template.

Type

Optional[str]

property use_case_name: str

Use case name - human readable name for a use case (i.e. - a prediction task).

Getter

Returns the use case name.

Setter

Sets the use case name.

Type

str

property use_case_id: str

Use case id - id by which the use case may be referenced.

Getter

Returns the use case id.

Setter

Sets the use case id.

Type

str

property no_model_access: bool

Whether the scan runs without access to the model. If True, then all datasets should include a predicted_outcome_column with the model’s predictions and only evaluations that support no model access will be run.

Getter

Returns whether the scan runs without access to its model.

Setter

Sets whether the scan runs without access to the listed model.

Type

bool

property evaluation_name: str

Evaluation name - name of a particular evaluation (scan run).

Getter

Returns the evaluation name.

Setter

Sets the evaluation name.

Type

str

property evaluation_environment: str

Evaluation environment - free text string with evaluation environment details.

Getter

Returns the evaluation environment string.

Setter

Sets the evaluation environment string.

Type

str

property evaluation_description: str

Evaluation description - free text string describing the evaluation.

Getter

Returns the evaluation description string.

Setter

Sets the evaluation description string.

Type

str

property evaluation_dataset_id: str

Evaluation dataset id - specifies which dataset to use as the evaluation set.

Getter

Returns the evaluation dataset id.

Setter

Sets the evaluation dataset id.

Type

str

property explanation_dataset_id: Optional[str]

Explanation dataset id - specifies which dataset to generate explanations of if the ‘explanation’ evaluation type is included in the scan.

Getter

Returns the explanation dataset id.

Setter

Sets the explanation dataset id.

Type

Optional[str]

property test_dataset_id: Optional[str]

Test dataset id - specifies which dataset to measure metrics on if the ‘performance’ evaluation type is included in the scan.

Getter

Returns the test dataset id.

Setter

Sets the test dataset id.

Type

Optional[str]

property reference_dataset_id: Optional[str]

Reference dataset id - specifies which dataset to use as the reference for computing data quality metrics and drift metrics if the ‘data_statistics’ evaluation type is included in the scan.

Getter

Returns the reference dataset id.

Setter

Sets the reference dataset id.

Type

Optional[str]

property prediction_task: certifai.scanner.builder.CertifaiPredictionTask

Metadata for the prediction task.

Getter

Returns the prediction task metadata.

Setter

Sets the prediction task metadata.

Type

CertifaiPredictionTask

property output_path: Optional[str]

Output path to which reports will be written. If set to None, output will be written to ‘./reports’ relative to the scan base path, unless explicitly overridden either by the run call or by the SCAN_RESULTS_DIRECTORY environment variable.

Getter

Returns the output path.

Setter

Sets the output path of the scan.

Type

Optional[str]

add_evaluation_type(value: str)

Add an evaluation type to the scan.

Parameters

value (str) – type of evaluation to add. Must be one of: ‘fairness’, ‘robustness’, ‘explanation’, ‘explainability’, ‘performance’, ‘data_statistics’.

remove_evaluation_type(value: str)

Remove an evaluation type from the scan.

Parameters

value (str) – type of evaluation to remove.

property evaluation_types: Iterable[str]

Evaluation types included in the scan.

Getter

Returns the list of included evaluation types.

Type

Iterable[str]
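
The evaluation-type methods and property above can be combined to bring a scan to a desired set of evaluation types. The sketch below is a hypothetical helper, not part of the library; the allowed set is copied from the add_evaluation_type description, and validating client-side before calling the builder is an illustrative choice.

```python
from typing import Iterable

# Values listed in the add_evaluation_type description.
ALLOWED_EVALUATION_TYPES = frozenset({
    "fairness", "robustness", "explanation",
    "explainability", "performance", "data_statistics",
})


def set_evaluation_types(builder, wanted: Iterable[str]) -> None:
    """Make the scan's evaluation types match `wanted` (hypothetical helper)."""
    wanted = list(wanted)
    unknown = [t for t in wanted if t not in ALLOWED_EVALUATION_TYPES]
    if unknown:
        raise ValueError(f"unsupported evaluation types: {unknown}")
    # Snapshot the current types before mutating the builder.
    current = list(builder.evaluation_types)
    for evaluation_type in current:
        if evaluation_type not in wanted:
            builder.remove_evaluation_type(evaluation_type)
    for evaluation_type in wanted:
        if evaluation_type not in current:
            builder.add_evaluation_type(evaluation_type)
```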

property hyper_parameter_overrides: dict

Hyper-parameter overrides to apply to the analysis.

Getter

Returns a dictionary of hyper-parameter overrides.

Setter

Specifies a dictionary of hyper-parameter overrides.

Type

dict

add_fairness_grouping_feature(feature: certifai.scanner.builder.CertifaiGroupingFeature)

Add a fairness grouping feature.

Parameters

feature (CertifaiGroupingFeature) – grouping feature definition.

remove_fairness_grouping_feature(name: str)

Remove a fairness grouping feature.

Parameters

name (str) – name of the grouping feature to remove.

property fairness_grouping_features: List[certifai.scanner.builder.CertifaiGroupingFeature]

Fairness grouping features defined for the scan.

Getter

Returns a list of defined grouping features.

Type

List[CertifaiGroupingFeature]
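
Because grouping features are immutable once instantiated, changing one means replacing it. A minimal sketch of such a replacement follows; the helper name is hypothetical, and how the builder itself reacts to adding a feature whose name is already present is not specified in this reference, so the sketch removes the old definition first.

```python
def upsert_grouping_feature(builder, feature) -> None:
    """Add `feature`, replacing any existing grouping feature of the same name."""
    existing_names = {f.name for f in builder.fairness_grouping_features}
    if feature.name in existing_names:
        # Remove the old definition by name before adding the new one.
        builder.remove_fairness_grouping_feature(feature.name)
    builder.add_fairness_grouping_feature(feature)
```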

property metrics: List[certifai.scanner.builder.CertifaiModelMetric]

Performance metrics defined for the scan.

Getter

Returns a list of defined performance metrics.

Type

List[CertifaiModelMetric]

add_metric(metric: certifai.scanner.builder.CertifaiModelMetric)

Add a performance metric.

Parameters

metric (CertifaiModelMetric) – metric to add.

remove_metric(name: str)

Remove a performance metric

Parameters

name (str) – Name of the metric to remove

property explanation_types: List[str]

Explanation types defined for the scan.

Getter

Returns a list of defined explanation types.

Type

List[str]

add_explanation_type(explanation: str)

Add an explanation type.

Parameters

explanation (str) – explanation type to add.

remove_explanation_type(name: str)

Remove an explanation type

Parameters

name (str) – Name of the explanation type to remove

property primary_explanation_type: str

Explanation type to select for the explainability axis of the ATX score.

Getter

Returns the name of the selected explanation type for use in ATX calculation.

Setter

Sets the name of the selected explanation type for use in ATX calculation.

Accepts a str specifying the explanation type on set, returns a str of the explanation type on get.

property fairness_metrics: List[str]

Fairness metrics defined for the scan.

Getter

Returns a list of defined fairness metrics.

Type

List[str]

add_fairness_metric(metric: str)

Add a fairness metric.

Parameters

metric (str) – metric to add.

remove_fairness_metric(name: str)

Remove a fairness metric

Parameters

name (str) – Name of the metric to remove

property primary_fairness_metric: Optional[str]

The fairness metric to use as the Fairness aspect for calculating the ATX score.

Getter

Returns the name of the selected fairness metric for use in ATX calculation.

Setter

Sets the name of the selected fairness metric for use in ATX calculation.

Accepts an Optional[str] specifying the primary fairness metric (if any) on set, returns the fairness metric name as an Optional[str] on get.

property atx_performance_metric: Optional[certifai.scanner.builder.CertifaiModelMetric]

Metric to select for the performance axis of the ATX score.

Getter

Returns the selected performance metric for use in ATX calculation, as a CertifaiModelMetric instance if one is set.

Setter

Sets the name of the selected performance metric for use in ATX calculation.

Accepts an Optional[str] specifying the performance metric name (if any) on set, returns Optional[CertifaiModelMetric] of the metric instance on get.

add_model(model: certifai.scanner.builder.CertifaiModel)

Add a model to the scan.

Parameters

model (CertifaiModel) – metadata of the model to add.

remove_model(id: str)

Remove a model from the scan.

Parameters

id (str) – Removes the model with the specified id.

property models: List[certifai.scanner.builder.CertifaiModel]

Models included in the scan.

Getter

Returns a list of included models.

Type

List[CertifaiModel]

add_dataset(dataset: certifai.scanner.builder.CertifaiDataset)

Add a dataset.

Parameters

dataset (CertifaiDataset) – the dataset to add.

remove_dataset(dataset_id: str)

Remove a dataset.

Parameters

dataset_id (str) – Dataset to remove (by id).

property datasets: List[certifai.scanner.builder.CertifaiDataset]

Datasets defined by the scan.

Getter

Returns a list of defined datasets.

Type

List[CertifaiDataset]

property dataset_schema: certifai.scanner.builder.CertifaiDataSchema

Dataset schema used by the scan use case.

Getter

Returns the dataset schema.

Setter

Sets the dataset schema.

Type

CertifaiDataSchema

property feature_restrictions: Dict[str, certifai.scanner.builder.CertifaiFeatureRestriction]

Get restrictions on feature changes made during counterfactual production

Returns

dictionary of restrictions keyed on feature name

Return type

Dict[str,CertifaiFeatureRestriction]

add_feature_restriction(feature_name: str, restriction: certifai.scanner.builder.CertifaiFeatureRestriction)

Add a restriction on the changes that can be made to a feature during counterfactual production.

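Parameters
  • feature_name (str) – name of the feature to restrict.

  • restriction (CertifaiFeatureRestriction) – restriction to apply to the feature.
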
remove_feature_restriction(feature_name: str)

Remove a restriction on the changes that can be made to a feature during counterfactual production

Parameters

feature_name (str) – feature to de-restrict

property monitored_features: List[Union[str, int]]

Monitored features defined for the scan.

Getter

Returns a list of monitored features.

Type

List[Union[str, int]]

add_monitored_feature(feature: Union[str, int])

Adds a monitored feature.

Parameters

feature (Union[str, int]) – feature name or index

remove_monitored_feature(feature: Union[str, int])

Removes a monitored feature.

Parameters

feature (Union[str, int]) – feature name or index

extract_yaml() → str

Extract the scan as a YAML definition.

Returns

string containing the scan template encoded as YAML.

Return type

str

save(file)

Save the scan template to a file.

Parameters

file – file object opened for write to which the definition is to be saved.

run_preflight(model_id: Optional[str] = None, base_path: Optional[str] = None, callback: Optional[Callable[certifai.common.progress_task.ProgressUpdate, None]] = <certifai.common.progress_task.ProgressBarListener object>, refresh: bool = True)

Run the preflight scan (in-process).

Parameters
  • model_id (Optional[str]) – Optional specific model id to restrict the scan to

  • base_path (Optional[str]) – Optional base path to evaluate relative paths in the scan definition with respect to (if not specified then current working directory is assumed).

  • callback (Optional[Callable[[ProgressUpdate],None]]) – Optional callback function to receive progress updates as preflight checks are completed. If not specified, a default will be used that prints to stdout. Set to None to receive no progress updates. This is not applicable when refresh is False. A ProgressUpdate is a NamedTuple with fields units_complete, total_num_units, and summary.

  • refresh (bool) – If True all preflight checks will be run and the latest results will be returned. Otherwise, the results will be computed from existing preflight report data. Defaults to True.

Returns

a nested dictionary of messages produced during the preflight scan. The top level keys are model ids and the second level keys are message types, each of which maps to a list of strings.

Return type

dict
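
The nested return value can be walked with ordinary dictionary iteration. The sketch below counts messages per model and message type; the function name is hypothetical, and the message type names in the sample ("errors", "warnings") are placeholders, since this reference does not list the actual keys.

```python
from typing import Dict, List


def count_preflight_messages(
    result: Dict[str, Dict[str, List[str]]]
) -> Dict[str, Dict[str, int]]:
    """Map model id -> message type -> number of messages."""
    return {
        model_id: {msg_type: len(msgs) for msg_type, msgs in by_type.items()}
        for model_id, by_type in result.items()
    }


# Hypothetical sample shaped like the documented return value.
sample_result = {
    "model_a": {"errors": [], "warnings": ["slow response"]},
    "model_b": {"errors": ["unreachable"], "warnings": []},
}
```

In real use the input would come from a call such as builder.run_preflight().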

run_explain(precalculate: bool = False, fast: bool = False, sampling: bool = False, model_id: Optional[str] = None, base_path: Optional[str] = None, explanation_format: str = 'csv', callback: Optional[Callable[certifai.common.progress_task.ProgressUpdate, None]] = <certifai.common.progress_task.ProgressBarListener object>, write_reports: bool = True, **kwargs)

Run an explanation scan (in-process).

Parameters
  • precalculate (bool) – If True then baselines for the model/use case will be precalculated and stored for use in fast explanations. Defaults to False.

  • fast (bool) – If True then fast explanations will be used, which is suitable for bulk-explanation of large datasets. Fast explanation requires the precalculate step to have been performed for the model and use case previously (or in the same call). Defaults to False.

  • sampling (bool) – If True then Counterfactual Sampling will be used. This is suitable for use cases that have a large representative evaluation dataset. Defaults to False.

  • model_id (Optional[str]) – Optional specific model id to restrict the scan to

  • base_path (Optional[str]) – Optional base path to evaluate relative paths in the scan definition with respect to (if not specified then current working directory is assumed).

  • explanation_format (str) – Format in which to write the explanations, must be one of: ‘csv’, ‘jsonlines’, ‘inline’. If either ‘csv’ or ‘jsonlines’, then explanations will be written in a separate file and the filename will be specified in the scan report. If ‘inline’ the explanations will be included in the scan report. This is not applicable when precalculate is True. Defaults to ‘csv’.

  • callback (Optional[Callable[[ProgressUpdate],None]]) – Optional callback function to receive progress updates as evaluations are completed. If not specified, a default will be used that prints to stdout. Set to None to receive no progress updates. A ProgressUpdate is a NamedTuple with fields units_complete, total_num_units, and summary.

  • write_reports (bool) – Whether to write scan report files or not. Defaults to True. This argument takes precedence over explanation_format.

Returns

a nested dictionary. If precalculate is True, the top level keys are the model ids and each value is a dictionary with a status, a possible error message, and the location of the persisted calculations. Otherwise, the top level keys are the evaluation types and the second level keys are the model ids, within which is the report JSON represented in dictionary format.

Return type

dict
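
Since fast explanations require a prior precalculate step, one possible workflow is two separate calls. The sketch below is a hypothetical helper that only uses the run_explain parameters documented above; choosing 'jsonlines' as the explanation format is an arbitrary illustration.

```python
from typing import Optional, Tuple


def precalculate_then_explain(
    builder,
    model_id: Optional[str] = None,
    base_path: Optional[str] = None,
) -> Tuple[dict, dict]:
    """Run the precalculate step, then a fast explanation scan."""
    # Step 1: store baselines. Result is keyed by model id.
    precalc_status = builder.run_explain(
        precalculate=True, model_id=model_id, base_path=base_path
    )
    # Step 2: fast explanations. Result is keyed by evaluation type,
    # then by model id.
    reports = builder.run_explain(
        fast=True,
        explanation_format="jsonlines",
        model_id=model_id,
        base_path=base_path,
    )
    return precalc_status, reports
```

The documentation also allows both steps in the same call; the split form makes it easier to inspect the precalculation status before requesting explanations.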

run(model_id: Optional[str] = None, report: Optional[str] = None, write_reports: bool = True, base_path: Optional[str] = None, callback: Optional[Callable[certifai.common.progress_task.ProgressUpdate, None]] = <certifai.common.progress_task.ProgressBarListener object>)

Run the scan (in-process).

Parameters
  • model_id (Optional[str]) – Optional specific model id to restrict the scan to.

  • report (Optional[str]) – Optional specific report (evaluation type) to restrict the scan to.

  • write_reports (bool) – Whether to write report files for each model evaluation to the scan’s output directory (by default ‘./reports’ relative to base_path). Default is True.

  • base_path (Optional[str]) – Optional base path to evaluate relative paths in the scan definition with respect to (if not specified then current working directory is assumed).

  • callback (Optional[Callable[[ProgressUpdate],None]]) – Optional callback function to receive progress updates as evaluations are completed. If not specified, a default will be used that prints to stdout. Set to None to receive no progress updates. A ProgressUpdate is a NamedTuple with fields units_complete, total_num_units, and summary.

Returns

nested dictionary of reports. Top level keys are the evaluation type, second level keys are the model ids, within which is the report JSON represented in dictionary format.

Return type

dict
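
A sketch of a custom progress callback and of walking the nested report dictionary. ProgressUpdateStandIn is a hypothetical stand-in that mirrors the documented field names so the example runs without Certifai installed; the field types and the output format are assumptions.

```python
from typing import Any, Dict, Iterator, NamedTuple, Tuple


class ProgressUpdateStandIn(NamedTuple):
    """Stand-in with the documented ProgressUpdate fields (types assumed)."""
    units_complete: int
    total_num_units: int
    summary: str


def format_progress(update) -> str:
    """Render a progress update as a percentage plus its summary."""
    if update.total_num_units:
        pct = 100.0 * update.units_complete / update.total_num_units
    else:
        pct = 0.0  # avoid dividing by zero before the total is known
    return f"[{pct:5.1f}%] {update.summary}"


def print_progress(update) -> None:
    """Callback suitable for the `callback` parameter."""
    print(format_progress(update))


def iter_reports(
    results: Dict[str, Dict[str, Any]]
) -> Iterator[Tuple[str, str, Any]]:
    """Yield (evaluation_type, model_id, report) from the nested result."""
    for evaluation_type, by_model in results.items():
        for model_id, report in by_model.items():
            yield evaluation_type, model_id, report
```

With a real builder this might be used as results = builder.run(write_reports=False, callback=print_progress), followed by a loop over iter_reports(results).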

static from_file(filename: str) → certifai.scanner.builder.CertifaiScanBuilder

Load a scan template from file.

Parameters

filename (str) – path to template file to read.

Returns

Instantiated ScanBuilder with metadata from the template that was read.

Return type

CertifaiScanBuilder

static from_yaml(as_yaml: str) → certifai.scanner.builder.CertifaiScanBuilder

Load a scan template from a YAML string.

Parameters

as_yaml (str) – Definition to load as YAML string.

Returns

Instantiated ScanBuilder with metadata from the template that was read.

Return type

CertifaiScanBuilder
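
extract_yaml and from_yaml can be paired to copy a scan definition. The sketch below assumes that the YAML produced by extract_yaml is accepted unchanged by from_yaml, which this reference implies but does not state; the helper name is hypothetical.

```python
def clone_via_yaml(builder):
    """Return a new builder loaded from the YAML form of `builder`."""
    # from_yaml is a static method, so it can be reached through the
    # builder's own class without importing it here.
    builder_cls = type(builder)
    return builder_cls.from_yaml(builder.extract_yaml())
```

A copy made this way can be modified, for example with a different evaluation_name, without touching the original builder.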

static create(use_case_name: str, use_case_id: Optional[str] = None, evaluation_name: Optional[str] = None, environment: Optional[str] = None, description: Optional[str] = None, prediction_task: certifai.scanner.builder.CertifaiPredictionTask = <certifai.scanner.builder.CertifaiPredictionTask object>, output_path: Optional[str] = None) → certifai.scanner.builder.CertifaiScanBuilder

Create a new template builder.

Parameters
  • use_case_name (str) – Name of the prediction use case.

  • use_case_id (Optional[str]) – Id by which the use case will be referenced. Defaults to the name if omitted.

  • evaluation_name (Optional[str]) – Name of the evaluation. Defaults to the use case name if not provided.

  • environment (Optional[str]) – Optional opaque string recording scan environment information.

  • description (Optional[str]) – Optional human readable description of the use case.

  • prediction_task (CertifaiPredictionTask) – Prediction task metadata.

  • output_path (Optional[str]) – where to write report files to. If this is a relative path, it is evaluated with respect to the base path at evaluation time. If omitted, reports will be written to ‘./reports’.

Returns

Instantiated ScanBuilder with the specified metadata.

Return type

CertifaiScanBuilder
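
A minimal sketch of starting a scan definition with create and the grouping classes documented above. The use case name, feature names, and bucket boundaries are illustrative assumptions. The import is placed inside the function so the snippet can be loaded without Certifai installed, and the default prediction_task from the signature is used. Models, datasets, and a dataset schema would still need to be added before running; their constructors are outside this section.

```python
def build_example_scan():
    """Create a builder with two evaluation types and two grouping features."""
    from certifai.scanner.builder import (
        CertifaiGroupingBucket,
        CertifaiGroupingFeature,
        CertifaiScanBuilder,
    )

    scan = CertifaiScanBuilder.create(
        "loan_approval",              # hypothetical use case name
        evaluation_name="baseline",
        output_path="./reports",
    )
    scan.add_evaluation_type("fairness")
    scan.add_evaluation_type("robustness")

    # Numeric feature grouped into ranges; the last bucket is the sentinel
    # with no maximum value.
    age = CertifaiGroupingFeature(
        "age",
        buckets=[
            CertifaiGroupingBucket("25 and under", max=25),
            CertifaiGroupingBucket("26 to 60", max=60),
            CertifaiGroupingBucket("over 60"),
        ],
    )
    scan.add_fairness_grouping_feature(age)

    # No buckets: every unique value in the data is treated as its own group.
    scan.add_fairness_grouping_feature(CertifaiGroupingFeature("gender"))
    return scan
```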