certifai.scanner.builder module¶
Certifai Scan object model builder
Contains classes representing a Certifai scan definition, and the ability to programmatically manipulate, load, save and run them.
- class certifai.scanner.builder.CertifaiOutcomeValue(value: Any, name: Optional[str] = None, favorable: bool = False)¶
Outcome value for a classification task type.
- class certifai.scanner.builder.CertifaiTaskOutcomes¶
Union type representing the outcomes of a task by type (possible classes for classification, favorable direction for regression)
- property task_type: certifai.engine.engine_api_types.CertifaiTaskType¶
- property prediction_favorability: str¶
- static regression(increased_favorable: Optional[bool], change_std_deviation: Optional[float] = None, absolute_threshold: Optional[float] = None, absolute_percentile: Optional[float] = None)¶
Construct a regression task type
Favorability may be specified in one of three ways (only one of which may be specified):
As a relative increase [or decrease] by a multiple of the population global regressed value standard deviation
As an absolute threshold, specified as an exact value of the regressor output
As an absolute threshold, specified as a percentile of the population global regressed value empirical distribution
- Parameters
increased_favorable (Optional[bool]) – True if the favorable direction of the prediction is increasing. Can be set to None if there is no favorable direction.
change_std_deviation (Optional[float]) – Number of standard deviations considered to be a significant change
absolute_threshold (Optional[float]) – Absolute regressed value threshold for favorability
absolute_percentile (Optional[float]) – Absolute regressed value threshold for favorability expressed as a population percentile
- Returns
CertifaiTaskOutcomes instance representing the regression outcome definition
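The rule that only one of the three favorability specifications may be given can be illustrated with a small standalone check. This is a sketch only: the helper name pick_regression_favorability is hypothetical, it does not import the certifai package, and how the library itself validates or defaults these arguments is an assumption.

```python
from typing import Optional


def pick_regression_favorability(
    change_std_deviation: Optional[float] = None,
    absolute_threshold: Optional[float] = None,
    absolute_percentile: Optional[float] = None,
) -> str:
    """Return the name of the single favorability mode that was supplied.

    Hypothetical helper mirroring the documented "one of three" rule;
    it is not part of the certifai package.
    """
    supplied = {
        "change_std_deviation": change_std_deviation,
        "absolute_threshold": absolute_threshold,
        "absolute_percentile": absolute_percentile,
    }
    chosen = [name for name, value in supplied.items() if value is not None]
    if len(chosen) > 1:
        raise ValueError(f"only one favorability mode may be given, got {chosen}")
    # Assumption: when nothing is supplied the library applies its own default;
    # this sketch only reports that no explicit mode was chosen.
    return chosen[0] if chosen else "unspecified"
```

Calling the helper with absolute_threshold=10.0 reports that mode, while passing both change_std_deviation and absolute_percentile raises ValueError.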
- static classification(prediction_values: Iterable[certifai.scanner.builder.CertifaiOutcomeValue], prediction_favorability: Optional[str] = 'explicit', last_favorable_prediction: Optional[Any] = None, favorable_outcome_group_name: Optional[str] = None, unfavorable_outcome_group_name: Optional[str] = None)¶
Construct a classification task type
- Parameters
prediction_values (Iterable[CertifaiOutcomeValue]) – list of possible classes
prediction_favorability (Optional[str]) –
describes the favorability of the prediction_values, default ‘explicit’. Must be one of
’explicit’, predictions should be explicitly marked as favorable
’ordered’, predictions are ordered from most to least favorable
’none’, no prediction should be treated as favorable
last_favorable_prediction (Optional[Any]) – ignored unless the prediction_favorability is ‘ordered’, in which case this value should be the last label (in the ordering of the prediction_values) which is considered favorable
favorable_outcome_group_name (Optional[str]) – name of the favorable group of prediction values - reserved for multiclass-classification tasks with a prediction_favorability of ‘explicit’
unfavorable_outcome_group_name (Optional[str]) – name of the unfavorable group of prediction values - reserved for multiclass-classification tasks with a prediction_favorability of ‘explicit’
- Returns
CertifaiTaskOutcomes instance representing the classification outcome definition
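For the ‘ordered’ favorability mode, last_favorable_prediction marks where the favorable labels end in the ordering. The following standalone sketch shows that split on a plain list of labels; the function name favorable_labels and the example labels are hypothetical and nothing here calls the certifai package.

```python
from typing import Any, List, Sequence


def favorable_labels(ordered_values: Sequence[Any], last_favorable: Any) -> List[Any]:
    """Return the labels up to and including the last favorable one."""
    # list.index raises ValueError if last_favorable is not among the labels.
    cut = list(ordered_values).index(last_favorable)
    return list(ordered_values[: cut + 1])


# Labels ordered from most to least favorable (illustrative values only).
grades = ["approved", "approved with conditions", "referred", "declined"]
print(favorable_labels(grades, "approved with conditions"))
```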
- class certifai.scanner.builder.CertifaiPredictionTask(outcomes: certifai.scanner.builder.CertifaiTaskOutcomes, prediction_description: Optional[str] = None)¶
Metadata about the prediction task - immutable once instantiated.
- Parameters
outcomes (CertifaiTaskOutcomes) – One of the supported CertifaiTaskOutcomes types, constructed from the static methods on CertifaiTaskOutcomes.
prediction_description (Optional[str]) – Free text description of what is being predicted.
- property task_type: str¶
The task type string (‘binary_classification’, ‘multiclass-classification’, ‘regression’)
- Getter
Returns the task type.
- Type
str
- property prediction_description: Optional[str]¶
Description of what the prediction represents.
- Getter
Returns the description, if any.
- Type
Optional[str]
- property prediction_favorability: Optional[str]¶
What format is used for specifying the favorable prediction value, if any, (‘none’, ‘ordered’, ‘explicit’).
- Getter
Returns prediction favorability
- Type
Optional[str]
- property favorable_outcome: Optional[str]¶
What the favorable outcome direction is for a regression task, None otherwise.
- Getter
Returns the favorable label direction (regression) if set.
- Type
Optional[Any]
- property prediction_values: List[certifai.scanner.builder.CertifaiOutcomeValue]¶
- property last_favorable_prediction: Optional[Any]¶
- property regression_standard_deviation: Optional[float]¶
- property regression_absolute_threshold: Optional[float]¶
- property regression_absolute_percentile: Optional[float]¶
- property favorable_outcome_group_name: Optional[str]¶
The string name of the favorable group of prediction values - reserved for multiclass-classification with prediction_favorability of ‘explicit’ - None otherwise.
- Getter
Return name of favorable group of prediction values
- Type
Optional[str]
- property unfavorable_outcome_group_name: Optional[str]¶
The string name of the unfavorable group of prediction values - reserved for multiclass-classification with prediction_favorability of ‘explicit’ - None otherwise.
- Getter
Return name of unfavorable group of prediction values
- Type
Optional[str]
- class certifai.scanner.builder.CertifaiModelMetric(name: str, certifai_metric: Optional[str] = None)¶
Metadata for a metric - immutable once instantiated
- Parameters
name (str) – Free text descriptive name of the metric.
certifai_metric (Optional[str]) –
If specified will allow Certifai to calculate the value. Supported values are:
’accuracy’ (classification)
’precision’ (classification)
’recall’ (classification)
’f1’ (classification)
’r-squared’ (regression)
Micro and macro variants are also supported for precision, recall and f1 e.g. ‘f1(micro)’
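The naming convention for metric variants, such as ‘f1(micro)’, can be illustrated by splitting a metric string into its base name and optional averaging variant. This is a sketch under assumptions: split_metric_name is a hypothetical helper, and how Certifai parses these strings internally is not documented here.

```python
import re
from typing import Optional, Tuple

# Base names listed in the documentation above.
_BASE_METRICS = {"accuracy", "precision", "recall", "f1", "r-squared"}
# Only these are documented as having micro and macro variants.
_AVERAGED_METRICS = {"precision", "recall", "f1"}
_METRIC_PATTERN = re.compile(r"^(?P<base>[a-z0-9-]+)(?:\((?P<variant>micro|macro)\))?$")


def split_metric_name(certifai_metric: str) -> Tuple[str, Optional[str]]:
    """Split e.g. 'f1(micro)' into ('f1', 'micro'); hypothetical helper."""
    match = _METRIC_PATTERN.match(certifai_metric)
    if match is None or match.group("base") not in _BASE_METRICS:
        raise ValueError(f"unrecognized metric name: {certifai_metric!r}")
    base, variant = match.group("base"), match.group("variant")
    if variant is not None and base not in _AVERAGED_METRICS:
        raise ValueError(f"{base!r} has no micro/macro variant")
    return base, variant
```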
- property name: str¶
Descriptive name of the metric.
- Getter
Returns the human-readable metric name.
- Type
str
- property certifai_metric: Optional[str]¶
Certifai metric type name.
- Getter
Returns the Certifai-evaluable metric type (if set).
- Type
Optional[str]
- class certifai.scanner.builder.CertifaiPredictorWrapper(predictor: certifai.common.hosted_model.IBaseModel, encoder: Optional[Callable[Sequence, Sequence]] = None, decoder: Optional[Callable[Sequence, Sequence]] = None, wrapped: Optional[certifai.common.hosted_model.IHostedModel] = None, soft_predictions: bool = False, label_ordering: Optional[List[Any]] = None, threshold: Optional[float] = None)¶
Wrapper class for in-process models
Note - the underlying model and any encoder and decoders used must be picklable.
- Parameters
predictor (IBaseModel) – Any predictor object that has a predict method which takes a sequence of data vectors as a numpy array and returns a sequence of corresponding predicted values.
encoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the predictor’s input (e.g. - to perform one-hot encoding and so on)
decoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the predictor’s output (e.g. - to binarize with a threshold).
wrapped (Optional[IHostedModel]) – if specified other parameters are ignored and the wrapper just proxies the (already wrapped) model provided here (mostly intended for internal usage).
soft_predictions (bool) – If True the model supports soft scoring for predictions (default False)
label_ordering (Optional[List[Any]]) – For soft scoring models the ordering of the classification labels in the scoring vector
threshold (Optional[float]) – For binary classifiers whose soft-scores are returned as a 1-dimensional array of scores, one for each input row, the threshold to apply. Scores greater than or equal to the threshold will receive the second label (or 1 rather than 0 if no labels provided)
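The threshold rule for one-dimensional soft scores can be sketched without the library. The helper apply_threshold below is hypothetical; it only mirrors the stated rule that scores greater than or equal to the threshold receive the second label (or 1 rather than 0 when no labels are provided).

```python
from typing import Any, List, Optional, Sequence


def apply_threshold(
    scores: Sequence[float],
    threshold: float,
    labels: Optional[Sequence[Any]] = None,
) -> List[Any]:
    """Turn per-row soft scores into hard binary predictions."""
    # Without labels the predictions are 0 and 1.
    negative, positive = (labels[0], labels[1]) if labels else (0, 1)
    return [positive if score >= threshold else negative for score in scores]
```

With scores [0.2, 0.5, 0.9] and a threshold of 0.5, the second and third rows receive the second label because the comparison is inclusive.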
- property model: certifai.common.hosted_model.IHostedModel¶
The wrapped model.
- Getter
Returns the wrapped model suitable for use by Certifai.
- Type
IHostedModel
- class certifai.scanner.builder.CertifaiModelConnector(name: str, module_name: str, class_name: str, description: Optional[str] = None, model_args: Dict[str, str] = {}, model_secrets: Dict[str, str] = {})¶
Metadata for a model connector
- Parameters
name (str) – Free text descriptive name of the connector.
module_name (str) – python module containing the external connector (e.g. ‘certifai.connectors’)
class_name (str) – name of python class of the connector
description (Optional[str]) – Optional description
model_args (Dict[str,str]) – arguments to pass to the model connector instances
model_secrets (Dict[str,str]) – secrets to pass to the model connector instances - substrings of the values of the form {<NAME>} will have the <NAME> replaced by the contents of the environment variable of that name
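The environment-variable substitution described for secrets can be sketched as follows. The helper resolve_secrets is hypothetical and not part of certifai. It assumes the whole {NAME} placeholder, braces included, is replaced, and that placeholders naming an unset variable are left untouched; the library's actual behavior in both cases is not confirmed by this documentation.

```python
import os
import re
from typing import Dict, Mapping

# Matches placeholders such as {API_TOKEN}.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_secrets(
    model_secrets: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Replace {NAME} substrings with the value of environment variable NAME."""

    def substitute(match: "re.Match[str]") -> str:
        # Assumption: unknown variables leave the placeholder as written.
        return environ.get(match.group(1), match.group(0))

    return {key: _PLACEHOLDER.sub(substitute, value) for key, value in model_secrets.items()}
```

Keeping only the placeholder in the scan definition means the secret value itself never has to be written into a saved file.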
- property name: str¶
Descriptive name of the connector.
- Getter
Returns the connector name.
- Type
str
- property module_name: str¶
Module containing the connector.
- Getter
Returns the fully qualified module name.
- Type
str
- property class_name: str¶
Class name of the connector.
- Getter
Returns the name of the python class of the connector.
- Type
str
- property description: str¶
Description of the connector.
- Getter
Returns the optional description.
- Type
str
- property model_args: Dict[str, str]¶
Arguments provided to instantiated model connector instances.
- Getter
Returns the arguments to be provided to connector instances.
- Type
Dict[str,str]
- property model_secrets: Dict[str, str]¶
Secrets provided to instantiated model connector instances.
- Getter
Returns the secrets to be provided to connector instances.
- Type
Dict[str,str]
- class certifai.scanner.builder.CertifaiModel(id: str, name: Optional[str] = None, author: Optional[str] = None, version: Optional[str] = None, performance_metric_values: Optional[List[Tuple]] = None, description: Optional[str] = None, predict_endpoint: Optional[str] = None, max_batch_size: Optional[int] = None, local_predictor: Optional[certifai.scanner.builder.CertifaiPredictorWrapper] = None, supports_soft_scoring: bool = False, prediction_value_order: Optional[List[Any]] = None, connector: Optional[certifai.scanner.builder.CertifaiModelConnector] = None, json_strict: bool = False)¶
Metadata describing a model, and allowing manipulation of this metadata.
- Parameters
id (str) – Identifier for the model used to refer to it.
name (Optional[str]) – Descriptive name for the model. Defaults to the value provided for id
author (Optional[str]) – Optional author name.
version (Optional[str]) – Optional model version string.
performance_metric_values (Optional[List[Tuple]]) – Optional asserted list of (metric_name, value) pairs for metrics of the model - primarily intended to allow injection of externally measured values for metrics not directly supported by Certifai.
description (Optional[str]) – Optional free text description of the model.
predict_endpoint (Optional[str]) – URL of the prediction endpoint of the model (if non-process-local).
max_batch_size (Optional[int]) – Optional limit on prediction batch sizes to call the model with.
local_predictor (Optional[CertifaiPredictorWrapper]) – wrapped model object (if using a local in-process model).
supports_soft_scoring (bool) – If True model is expected to return soft scores as well as hard predictions
prediction_value_order (List[Any]) – For soft scoring models the ordering of the class labels in the score vector
connector (Optional[CertifaiModelConnector]) – Optional connector to use to attach to the model
json_strict (bool) – If True data will be serialized to send to the model’s predict endpoint in strict JSON, encoding missing data as JSON nulls. If False then JavaScript extended JSON will be used which encodes missing values as NaN. Defaults to False
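The difference between the two encodings selected by json_strict can be seen with Python's standard json module. This sketch illustrates the two wire formats only; encode_rows is a hypothetical helper, not the serializer Certifai uses.

```python
import json
import math
from typing import Any, List


def encode_rows(rows: List[List[Any]], json_strict: bool) -> str:
    """Serialize rows, encoding missing values as null (strict) or NaN."""
    if not json_strict:
        # json.dumps allows NaN by default, which is not valid strict JSON.
        return json.dumps(rows)
    cleaned = [
        [None if isinstance(value, float) and math.isnan(value) else value for value in row]
        for row in rows
    ]
    # allow_nan=False makes json.dumps raise if any NaN slipped through.
    return json.dumps(cleaned, allow_nan=False)
```

A model server with a strict JSON parser will typically reject the NaN form, which is the situation json_strict=True is intended for.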
- property name: str¶
Model name.
- Getter
Returns the human-readable name of the model.
- Type
str
- property id: str¶
Model id.
- Getter
Returns the identifier for the model by which it may be referenced.
- Type
str
- property author: Optional[str]¶
Model author.
- Getter
Returns the author string if provided.
- Setter
Set author string for the model.
- Type
Optional[str]
- property version: Optional[str]¶
Model version.
- Getter
Returns the version string if provided.
- Setter
Set version string for the model.
- Type
Optional[str]
- property description: Optional[str]¶
Model description.
- Getter
Returns the description string if provided.
- Setter
Set description string for the model.
- Type
Optional[str]
- property predict_endpoint: Optional[str]¶
Model predict endpoint URL
- Getter
Returns the URL of the (remote) model prediction endpoint, if provided
- Setter
Sets the prediction endpoint URL for the model
- Type
Optional[str]
- property max_batch_size: Optional[int]¶
Model max batch size.
- Getter
Returns the max batch size to send to the model.
- Setter
Sets the provided restriction on max batch size (None => unlimited).
- Type
Optional[int]
- property supports_soft_scoring: bool¶
Whether the model returns soft scores.
- Getter
True if the model is expected to support soft scores.
- Setter
Sets whether the model is expected to support soft scores.
- Type
bool
- property prediction_value_order: bool¶
Ordering of class labels in the score vector returned by the model.
- Getter
Returns the ordering.
- Setter
Sets the ordering.
- Type
List[Any]
- property local_predictor: Optional[certifai.scanner.builder.CertifaiPredictorWrapper]¶
Wrapped local (in-process) model.
- Getter
Returns the wrapped model being used.
- Setter
Sets a local wrapped model (see CertifaiPredictorWrapper) to use.
- Type
Optional[CertifaiPredictorWrapper]
- property performance_metric_values: List[Tuple[str, Any]]¶
List of asserted metric values for this model.
- Getter
Returns any asserted values as (metric name, value) tuples.
- Type
List[Tuple[str,Any]]
- add_performance_metric_value(metric_name: str, metric_value: Any)¶
Add an asserted performance metric value.
- Parameters
metric_name (str) – Name of the metric to assert a value for.
metric_value (Any) – Value to assert.
- remove_performance_metric_value(metric_name: str)¶
Remove an asserted metric value
- Parameters
metric_name (str) – Name of the metric to remove the asserted value for.
- property connector: Optional[certifai.scanner.builder.CertifaiModelConnector]¶
- property json_strict: bool¶
Whether data sent to this model is encoded as strict JSON.
- Getter
True if the model expects strict JSON (missing encoded as null as opposed to NaN).
- Setter
Sets whether the model expects strict JSON
- Type
bool
- class certifai.scanner.builder.CertifaiFeatureDataType(args: dict)¶
Class describing feature datatypes supported by Certifai - immutable once instantiated. Static methods are provided for instantiating each supported type.
- property value_dict: dict¶
Data type details as a dictionary.
- Getter
Returns the metadata dict for this datatype. In particular this will contain a key named data_type which will be one of
‘numerical-int’
‘numerical-float’
‘categorical’
Other keys vary by datatype.
- Type
dict
- static int(min: Optional[int] = None, max: Optional[int] = None, spread: Optional[float] = None) certifai.scanner.builder.CertifaiFeatureDataType ¶
Constructor for an ‘int’ feature.
- Parameters
min (Optional[int]) – optional floor value this feature can take.
max (Optional[int]) – optional ceiling value this feature can take.
spread (Optional[float]) – optional measure of spread (typically MAD or std. deviation).
- Returns
instantiated CertifaiFeatureDataType
- Return type
CertifaiFeatureDataType
- static float(min: Optional[float] = None, max: Optional[float] = None, spread: Optional[float] = None) certifai.scanner.builder.CertifaiFeatureDataType ¶
Constructor for a ‘float’ feature.
- Parameters
min (Optional[float]) – optional floor value this feature can take.
max (Optional[float]) – optional ceiling value this feature can take.
spread (Optional[float]) – optional measure of spread (typically MAD or std deviation).
- Returns
instantiated CertifaiFeatureDataType
- Return type
CertifaiFeatureDataType
- static categorical(values: Optional[Iterable[Union[str, int]]] = None, value_columns: Optional[List[Tuple[str, Union[str, int]]]] = None, target_encodings: Optional[Iterable[float]] = None, categorical_type: Optional[str] = None) certifai.scanner.builder.CertifaiFeatureDataType ¶
Constructor for a ‘categorical’ feature.
- Parameters
values (Optional[Iterable[Union[str,int,bool]]]) – Optional list of possible values this categorical field may take on. If omitted then Certifai will infer the value set from the available data.
value_columns (Optional[List[Tuple[str, Union[str, builtins.int, bool]]]]) –
Optional list of column name and categorical value pairs, for one-hot encoded data. If both value_columns and values are specified then they must have exactly the same set of values. If only value_columns is specified then the values are inferred. If only values is specified then the feature is assumed to be value-encoded in a single column.
target_encodings (Optional[Iterable[float]]) – optional list of encodings for the values in values used to represent those values in the dataset
categorical_type (Optional[str]) – optional string specifying the data type the categorical feature is. Must be one of: ‘string’, ‘int’, or ‘auto’. For example, specifying ‘string’ would mean that the value 001 will be interpreted as the string ‘001’, instead of as the integer 1.
- Returns
instantiated CertifaiFeatureDataType
- Return type
CertifaiFeatureDataType
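The relationship between value_columns and values for one-hot encoded data can be sketched on a plain dictionary row. The column names and the decode_one_hot helper are hypothetical, and nothing here calls the certifai package.

```python
from typing import Dict, List, Tuple, Union

Value = Union[str, int]


def decode_one_hot(row: Dict[str, int], value_columns: List[Tuple[str, Value]]) -> Value:
    """Return the categorical value whose one-hot column is set in the row."""
    active = [value for column, value in value_columns if row.get(column) == 1]
    if len(active) != 1:
        raise ValueError(f"expected exactly one active column, found {len(active)}")
    return active[0]


# (column name, categorical value) pairs, as in the value_columns parameter.
value_columns = [("color_red", "red"), ("color_green", "green"), ("color_blue", "blue")]
# When only value_columns is given, the value set follows from the pairs.
values = [value for _, value in value_columns]
```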
- class certifai.scanner.builder.CertifaiFeatureRestriction(args: dict)¶
Class describing feature change restrictions supported by Certifai - immutable once instantiated. Static methods are provided for instantiating each supported type.
- property value_dict: dict¶
Restriction details as a dictionary.
- Getter
Returns the metadata dict for this restriction. In particular this will contain a key named constraint which will be one of
‘constant’
‘percentage’
‘range’
Other keys vary by restriction type.
- Type
dict
- static range(min: Optional[int] = None, max: Optional[int] = None, direction: Optional[str] = None) certifai.scanner.builder.CertifaiFeatureRestriction ¶
Constructor for a range constraint on feature modifications in counterfactual production.
- Parameters
min (Optional[int]) – optional floor value this feature can take.
max (Optional[int]) – optional ceiling value this feature can take.
direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’
- Returns
instantiated CertifaiFeatureRestriction
- Return type
CertifaiFeatureRestriction
- static percentage(amount: float, direction: Optional[str] = None) certifai.scanner.builder.CertifaiFeatureRestriction ¶
Constructor for a percentage change constraint on feature modifications in counterfactual production.
- Parameters
amount (float) – max percentage the feature may change by (can only be applied to numeric features).
direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’
- Returns
instantiated CertifaiFeatureRestriction
- Return type
CertifaiFeatureRestriction
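The intent of a percentage restriction with a direction can be sketched as a check on a proposed change to a numeric feature. This is an illustration under stated assumptions, not Certifai's counterfactual logic: within_percentage is a hypothetical helper, amount is taken to be in percent units (10 means 10%), and the percentage is taken relative to the original value.

```python
def within_percentage(
    original: float,
    candidate: float,
    amount: float,
    direction: str = "any",
) -> bool:
    """Check a proposed change against a percentage limit and a direction."""
    if direction not in {"any", "increase", "decrease"}:
        raise ValueError(f"unknown direction: {direction!r}")
    delta = candidate - original
    if direction == "increase" and delta < 0:
        return False
    if direction == "decrease" and delta > 0:
        return False
    # Assumption: the limit is a percentage of the original value's magnitude.
    return abs(delta) <= abs(original) * amount / 100.0
```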
- static constant() certifai.scanner.builder.CertifaiFeatureRestriction ¶
Constructor for a no-change constraint on feature modifications in counterfactual production.
- Returns
instantiated CertifaiFeatureRestriction
- Return type
CertifaiFeatureRestriction
- static standard_deviation(value: float, tolerance_value: Optional[float] = None, direction: Optional[str] = None) certifai.scanner.builder.CertifaiFeatureRestriction ¶
Constructor for a standard deviation constraint on feature modifications in counterfactual production.
- Parameters
value (float) – number of standard deviations the feature may change by (can only be applied to numeric features).
tolerance_value (float) – additional number of standard deviations the feature may change by if no solutions could be found (not applicable to all scans)
direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’ (not applicable to all scans)
- Returns
instantiated CertifaiFeatureRestriction
- Return type
CertifaiFeatureRestriction
- static fixed_amount(value: float, tolerance_value: Optional[float] = None, direction: Optional[str] = None) certifai.scanner.builder.CertifaiFeatureRestriction ¶
Constructor for a fixed amount constraint on feature modifications in counterfactual production.
- Parameters
value (float) – fixed amount that the feature may change by (can only be applied to numeric features).
tolerance_value (float) – additional amount the feature may change by if no solutions could be found (not applicable to all scans)
direction (Optional[str]) – optional direction that features can be allowed to change. Must be one of: ‘any’, ‘increase’, ‘decrease’ (not applicable to all scans)
- Returns
instantiated CertifaiFeatureRestriction
- Return type
CertifaiFeatureRestriction
- static value_set(values: List[Union[str, int]], tolerance_values: Optional[List[Union[str, int]]] = None) certifai.scanner.builder.CertifaiFeatureRestriction ¶
Constructor for an allowed value set constraint on feature modifications in counterfactual production.
- Parameters
values (List[Union[str, builtins.int, bool]]) – fixed set of values the feature may change to (can only be applied to categorical features).
tolerance_values (Optional[List[Union[str, builtins.int, bool]]]) – additional values the feature may change to if no solutions could be found
- Returns
instantiated CertifaiFeatureRestriction
- Return type
CertifaiFeatureRestriction
- static value_map(values: Dict[Union[str, int], List[Union[str, int]]], tolerance_values: Optional[Dict[Union[str, int], List[Union[str, int]]]] = None)¶
Constructor for an allowed value mapping constraint on feature modifications in counterfactual production.
- Parameters
values (Dict[Union[str, builtins.int, bool], List[Union[str, builtins.int, bool]]]) – Dictionary mapping of categorical values to values the feature may change to (can only be applied to categorical features).
tolerance_values (Optional[Dict[Union[str, builtins.int, bool], List[Union[str, builtins.int, bool]]]]) – Additional dictionary mapping of categorical values to values the feature may change to.
- Returns
instantiated CertifaiFeatureRestriction
- Return type
CertifaiFeatureRestriction
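A value map can be read as a lookup from a feature's current value to the values it may change to, with tolerance_values widening that set. The sketch below illustrates this reading on plain dictionaries; allowed_targets is hypothetical, and treating a value missing from the map as having no allowed changes is an assumption.

```python
from typing import Dict, List, Optional, Union

Value = Union[str, int]
ValueMap = Dict[Value, List[Value]]


def allowed_targets(
    current: Value,
    values: ValueMap,
    tolerance_values: Optional[ValueMap] = None,
    use_tolerance: bool = False,
) -> List[Value]:
    """List the values a categorical feature may change to from `current`."""
    # Assumption: a value absent from the map has no allowed changes.
    targets = list(values.get(current, []))
    if use_tolerance and tolerance_values:
        extra = tolerance_values.get(current, [])
        targets += [value for value in extra if value not in targets]
    return targets
```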
- class certifai.scanner.builder.CertifaiFeatureSchema(name: str, data_type: Optional[certifai.scanner.builder.CertifaiFeatureDataType] = None)¶
Class describing a feature - immutable once instantiated.
- Parameters
name (str) – The name of the feature (should match any column headers in the dataset if any).
data_type (CertifaiFeatureDataType) – Type of data the feature holds.
- property name: str¶
Feature name.
- Getter
Returns the name of the feature.
- Type
str
- property data_type: Optional[certifai.scanner.builder.CertifaiFeatureDataType]¶
Feature data type
- Getter
Returns the data type of the feature.
- Type
Optional[CertifaiFeatureDataType]
- class certifai.scanner.builder.CertifaiDataSchema(features: Optional[List[certifai.scanner.builder.CertifaiFeatureSchema]] = None, outcome_feature_name: Optional[str] = None, predicted_outcome_feature_name: Optional[str] = None, hidden_feature_names: Optional[List[str]] = None, defined_feature_order: bool = False)¶
Class describing a dataset’s feature schema, and allowing manipulation of this schema.
- Parameters
features (Optional[List[CertifaiFeatureSchema]]) – features specified by the scan definition. This may be a subset of all the features present. Any that are omitted will be inferred from the available data.
outcome_feature_name (Optional[str]) – name of the feature holding the ground truth label/value (if present). Note: any outcome feature column will be removed before passing data to the model.
predicted_outcome_feature_name (Optional[str]) – name of the feature holding the predicted label/value (if present). Note: any predicted_outcome feature column will be removed before passing data to the model.
hidden_feature_names (Optional[List[str]]) – list of feature names that should be hidden from the model
defined_feature_order (bool) – If present and True asserts that the list order of features in the schema matches the layout of columns in the dataset. If True then all columns must be present. Intended for use in cases where the dataset does not specify a column ordering itself.
- property features: Optional[List[certifai.scanner.builder.CertifaiFeatureSchema]]¶
features defined by the schema.
- Getter
Returns the list of defined features.
- Type
Optional[List[CertifaiFeatureSchema]]
- property defined_feature_order: bool¶
Whether the schema defines the column ordering of the data.
- Getter
Returns True if the schema defines the column ordering.
- Setter
Sets whether the schema defines the column ordering of the data.
- Type
bool
- add_feature(name: str, data_type: certifai.scanner.builder.CertifaiFeatureDataType)¶
Add a feature
- Parameters
name (str) – Name of feature to add.
data_type (CertifaiFeatureDataType) – data type of feature to add.
Note - the feature will be appended to the current list
- insert_feature(name: str, index: int, data_type: certifai.scanner.builder.CertifaiFeatureDataType)¶
Insert a feature.
- Parameters
name (str) – Name of feature to add.
index (int) – Columnar position to insert the feature at (0-based).
data_type (CertifaiFeatureDataType) – data type of feature to add.
- update_feature(name: str, data_type: certifai.scanner.builder.CertifaiFeatureDataType)¶
Update an existing feature by name - preserves its index in the feature list
- Parameters
name (str) – Name of feature to update.
data_type (CertifaiFeatureDataType) – new data type of feature being updated.
- remove_feature(name: str)¶
Remove a feature.
- Parameters
name (str) – Name of feature to remove.
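The positional behavior of the four feature methods above (append, insert at a 0-based index, update in place, remove) can be sketched on a plain list of (name, data type) pairs. This is a stand-in for illustration only; the tuple representation and the update_feature function below are hypothetical and do not use the CertifaiDataSchema class.

```python
from typing import List, Tuple

Feature = Tuple[str, str]  # (feature name, data type label)


def update_feature(features: List[Feature], name: str, data_type: str) -> None:
    """Replace a feature's data type while keeping its position in the list."""
    for position, (existing_name, _) in enumerate(features):
        if existing_name == name:
            features[position] = (name, data_type)
            return
    raise KeyError(name)


features: List[Feature] = [("age", "numerical-int"), ("income", "numerical-float")]
features.append(("state", "categorical"))           # like add_feature: appended at the end
features.insert(1, ("zip", "categorical"))          # like insert_feature: 0-based index
update_feature(features, "age", "numerical-float")  # like update_feature: index preserved
features = [f for f in features if f[0] != "zip"]   # like remove_feature
```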
- infer_features_from_data(dataset_source: certifai.scanner.builder.CertifaiDatasetSource)¶
- property outcome_feature_name: Optional[str]¶
Name of the (ground truth) outcome column (if any).
- Getter
Returns the feature name of the outcome feature.
- Setter
Sets the name of the (ground truth) outcome column.
- Type
Optional[str]
- property predicted_outcome_feature_name: Optional[str]¶
Name of the predicted outcome column (if any).
- Getter
Returns the feature name of the predicted outcome feature.
- Setter
Sets the name of the predicted outcome column.
- Type
Optional[str]
- property hidden_feature_names: Optional[List[str]]¶
Names of hidden (from the model) features (if any).
- Getter
Returns a list of the names of features which are not provided to the model.
- Setter
Sets the list of names of features which are not provided to the model.
- Type
Optional[List[str]]
Note: any specified outcome_feature_name or predicted_outcome_feature_name will automatically be hidden from the model and need not occur in this list.
- class certifai.scanner.builder.CertifaiDatasetSource(args)¶
Class describing dataset storage formats supported by Certifai - immutable once instantiated. Static methods are provided for instantiating each supported format.
- property value_dict¶
Data source details as a dictionary.
- Getter
Returns the metadata dict for this data source. In particular this will contain a key named file_type which will be one of
‘csv’
‘json’
‘loaded’
Other keys vary by source type.
- Type
dict
- static json(url: str, lines: bool = True, orient: str = 'records', encoding: Optional[str] = None)¶
Constructor for a ‘json’ source.
- Parameters
url (str) – Location the data may be loaded from. If no protocol is specified then ‘file:’ is assumed.
lines (bool) – If True then JSON lines format (default is True), else JSON list expected.
orient (str) – One of ‘records’, ‘columns’, ‘values’ (matching Pandas usage). Default is ‘records’
encoding (Optional[str]) – string encoding used - default is ‘utf-8’.
- Returns
instantiated CertifaiDatasetSource
- Return type
CertifaiDatasetSource
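The difference between the lines=True and lines=False layouts for ‘records’ oriented data can be shown with the standard json module. This sketch covers only the ‘records’ orientation; load_records is a hypothetical helper and is not how Certifai loads data.

```python
import json
from typing import Any, Dict, List


def load_records(text: str, lines: bool = True) -> List[Dict[str, Any]]:
    """Parse 'records' data from JSON lines or from a single JSON list."""
    if lines:
        # JSON lines: one JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    # Otherwise the whole text is one JSON list of objects.
    return json.loads(text)


json_lines_text = '{"age": 31, "state": "TX"}\n{"age": 45, "state": "NY"}\n'
json_list_text = '[{"age": 31, "state": "TX"}, {"age": 45, "state": "NY"}]'
```

Both texts above decode to the same two records; only the file layout differs.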
- static csv(url: str, delimiter: str = ',', escape_character: Optional[str] = None, quote_character: str = '"', has_header: bool = True, encoding: Optional[str] = None)¶
Constructor for a ‘csv’ source.
- Parameters
url (str) – Location the data may be loaded from. If no protocol is specified then ‘file:’ is assumed.
delimiter (str) – Field delimiter used. Default is ‘,’.
escape_character (Optional[str]) – Escape character if any. Default is None.
quote_character (str) – Quote delimiter. Default is ‘”’.
has_header (bool) – Whether the source CSV has a header row specifying column names. Default is True.
- Returns
instantiated CertifaiDatasetSource
- Return type
CertifaiDatasetSource
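The CSV options above correspond closely to the dialect parameters of Python's standard csv module, which makes it easy to check how a file will parse before writing a scan definition. The helper read_csv_text is hypothetical and is not the loader Certifai uses; the generated column names for headerless data are an assumption.

```python
import csv
import io
from typing import List, Optional, Tuple


def read_csv_text(
    text: str,
    delimiter: str = ",",
    escape_character: Optional[str] = None,
    quote_character: str = '"',
    has_header: bool = True,
) -> Tuple[List[str], List[List[str]]]:
    """Parse CSV text into (column names, data rows)."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=delimiter,
        escapechar=escape_character,
        quotechar=quote_character,
    )
    rows = list(reader)
    if not rows:
        return [], []
    if has_header:
        return rows[0], rows[1:]
    # Assumption: without a header row, columns are named by position.
    return [str(index) for index in range(len(rows[0]))], rows
```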
- static dataframe(df)¶
Constructor for a ‘dataframe’ source (an already loaded Pandas dataframe).
- Parameters
df (DataFrame) – Dataframe containing the data.
- Returns
instantiated CertifaiDatasetSource.
- Return type
CertifaiDatasetSource
- class certifai.scanner.builder.CertifaiDataset(id: str, source: certifai.scanner.builder.CertifaiDatasetSource, name: Optional[str] = None, description: Optional[str] = None)¶
Metadata describing a dataset.
- Parameters
id (str) – identifier string by which the dataset may be referenced.
source (CertifaiDatasetSource) – source for the actual data in the dataset.
name (Optional[str]) – Optional human readable name of the dataset.
description (Optional[str]) – Optional free text description of the dataset.
- class certifai.scanner.builder.CertifaiGroupingBucket(description: str, max: Optional[float] = None, values: Optional[List[Union[str, int]]] = None)¶
Metadata describing a value grouping bucket for feature values - immutable once instantiated.
- Parameters
description (str) – Descriptive name of the bucket.
max (Optional[float]) – Optional maximum numerical value in the bucket (may only be used with numeric features).
values (Optional[List[Union[str,int,bool]]]) – Optional explicit list of values falling within the bucket (intended for use with categorical features).
- property description: str¶
Description of the bucket.
- Getter
Returns the bucket description string.
- Type
str
- property max: Optional[float]¶
Maximum numeric value falling within the bucket (if specified).
Note - the floor of a bucket is determined by the ceiling of the previous bucket. The entire bucket list will be sorted on max values and a sentinel bucket with no maximum value should be included in the list.
- Getter
Returns the bucket ceiling value.
- Type
Optional[float]
- property values: Optional[Iterable[Union[str, int]]]¶
List of values falling within the bucket if defined.
- Getter
Returns the list of values or None if not defined.
- Type
Iterable[Union[str,int,bool]]
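The note on numeric buckets (sorted on max, with a sentinel bucket that has no maximum) can be sketched as a simple lookup. The helper bucket_for and the example buckets are hypothetical, and treating max as an inclusive upper bound is an assumption based on the phrase "maximum numeric value falling within the bucket".

```python
from typing import List, Optional, Tuple

Bucket = Tuple[str, Optional[float]]  # (description, max)


def bucket_for(value: float, buckets: List[Bucket]) -> str:
    """Return the description of the bucket a numeric value falls into."""
    # Buckets with a ceiling are checked in ascending order of max.
    bounded = sorted((b for b in buckets if b[1] is not None), key=lambda b: b[1])
    for description, ceiling in bounded:
        if value <= ceiling:  # assumption: the ceiling is inclusive
            return description
    # Anything above every ceiling lands in the sentinel bucket (max is None).
    sentinel = [b for b in buckets if b[1] is None]
    if not sentinel:
        raise ValueError("no sentinel bucket without a max was provided")
    return sentinel[0][0]


age_buckets: List[Bucket] = [("over 60", None), ("under 30", 30.0), ("30 to 60", 60.0)]
```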
- class certifai.scanner.builder.CertifaiGroupingFeature(name: str, buckets: Optional[Iterable[certifai.scanner.builder.CertifaiGroupingBucket]] = None)¶
Metadata describing a fairness grouping feature - immutable once instantiated
- Parameters
name (str) – Feature name of the feature which defines the grouping.
buckets (Optional[Iterable[CertifaiGroupingBucket]]) – Optional definition for bucketing the values of the feature. If not specified then every unique value occurring in the data will be treated as its own group.
- property name: str¶
Name of the grouping feature
- Getter
Returns the feature name used to define the groups
- Type
str
- property buckets: Optional[Iterable[certifai.scanner.builder.CertifaiGroupingBucket]]¶
List of grouping buckets.
- Getter
Returns the list of grouping buckets for the grouping feature, if defined.
- Type
Iterable[CertifaiGroupingBucket]
- class certifai.scanner.builder.CertifaiScanBuilder(base: certifai.scanner.schemas.ScanTemplate)¶
Builder class for scan templates, with static method for instantiation, and methods for manipulation, persistence, and running of the defined scan.
- property model_headers: Dict¶
Returns model headers defined in the scan.
- Getter
Returns model headers as dict
- Type
Dict
- add_model_header(header_name: str, header_value: str, model_id: Optional[str] = None)¶
Add or update a model header. If model_id is provided, the header is added to that specific model; otherwise the header is set as a default for all models.
- Parameters
header_name (str) – Name of the model header to inject.
header_value (str) – Value associated with the header.
model_id (Optional[str]) – Id of the model whose headers are added or updated. If omitted, the header applies to all models.
- remove_model_header(header_name: str, model_id: Optional[str] = None)¶
Remove a model header given header_name. If model_id is provided then the header is removed for that specific model, otherwise it is removed from all models (default case).
- Parameters
header_name (str) – Name of the header to remove.
model_id (Optional[str]) – model to remove headers from
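One way to picture the default versus per-model behaviour of add_model_header is the small stand-in below. It is a sketch only: `HeaderStore` and its methods are hypothetical names, and the rule that a model-specific header overrides a default of the same name is an assumption, since the documentation does not state how the two levels are combined.

```python
from typing import Dict, Optional


class HeaderStore:
    """Hypothetical stand-in for header bookkeeping; not the library's implementation."""

    def __init__(self) -> None:
        self.defaults: Dict[str, str] = {}
        self.per_model: Dict[str, Dict[str, str]] = {}

    def add(self, name: str, value: str, model_id: Optional[str] = None) -> None:
        # No model_id: the header becomes a default for all models.
        if model_id is None:
            self.defaults[name] = value
        else:
            self.per_model.setdefault(model_id, {})[name] = value

    def effective(self, model_id: str) -> Dict[str, str]:
        # Assumption: a model-specific header wins over a default with the same name.
        return {**self.defaults, **self.per_model.get(model_id, {})}


store = HeaderStore()
store.add("Authorization", "Bearer default-token")
store.add("Authorization", "Bearer model-b-token", model_id="model_b")
print(store.effective("model_a"))
print(store.effective("model_b"))
```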
- property template: certifai.scanner.schemas.ScanTemplate¶
Retrieve a scan template which can be serialized to dictionary form for saving as JSON or YAML by calling its dump method.
- Returns
a ScanTemplate instance
- Return type
ScanTemplate
- property author: str¶
Author of the template.
- Getter
Returns the author of the scan template, if defined.
- Setter
Sets the author of the scan template.
- Type
Optional[str]
- property use_case_name: str¶
Use case name - human-readable name for a use case (i.e., a prediction task).
- Getter
Returns the use case name.
- Setter
Sets the use case name.
- Type
str
- property use_case_id: str¶
Use case id - id by which the use case may be referenced.
- Getter
Returns the use case id.
- Setter
Sets the use case id.
- Type
str
- property no_model_access: bool¶
Whether the scan runs without access to the model. If True, all datasets should include a predicted_outcome_column with the model’s predictions, and only evaluations that support no model access will be run.
- Getter
Returns whether the scan runs without access to its model.
- Setter
Sets whether the scan runs without access to the listed models.
- Return type
bool
- property evaluation_name: str¶
Evaluation name - name of a particular evaluation (scan run).
- Getter
Returns the evaluation name.
- Setter
Sets the evaluation name.
- Type
str
- property evaluation_environment: str¶
Evaluation environment - free text string with evaluation environment details.
- Getter
Returns the evaluation environment string.
- Setter
Sets the evaluation environment string.
- Type
str
- property evaluation_description: str¶
Evaluation description - free text string describing the evaluation.
- Getter
Returns the evaluation description string.
- Setter
Sets the evaluation description string.
- Type
str
- property evaluation_dataset_id: str¶
Evaluation dataset id - specifies which dataset to use as the evaluation set.
- Getter
Returns the evaluation dataset id.
- Setter
Sets the evaluation dataset id.
- Type
str
- property explanation_dataset_id: Optional[str]¶
Explanation dataset id - specifies the dataset for which explanations are generated if the ‘explanation’ evaluation type is included in the scan.
- Getter
Returns the explanation dataset id.
- Setter
Sets the explanation dataset id.
- Type
Optional[str]
- property test_dataset_id: Optional[str]¶
Test dataset id - specifies which dataset to measure metrics on if the ‘performance’ evaluation type is included in the scan.
- Getter
Returns the test dataset id.
- Setter
Sets the test dataset id.
- Type
Optional[str]
- property reference_dataset_id: Optional[str]¶
Reference dataset id - specifies which dataset to use as the reference for computing data quality metrics and drift metrics if the ‘data_statistics’ evaluation type is included in the scan.
- Getter
Returns the reference dataset id.
- Setter
Sets the reference dataset id.
- Return type
Optional[str]
- property prediction_task: certifai.scanner.builder.CertifaiPredictionTask¶
Metadata for the prediction task.
- Getter
Returns the prediction task metadata.
- Setter
Sets the prediction task metadata.
- Type
CertifaiPredictionTask
- property output_path: Optional[str]¶
Output path to which reports will be written. If set to None, output will be written to ‘./reports’ relative to the scan base path unless explicitly overridden either by the run call or by the SCAN_RESULTS_DIRECTORY environment variable.
- Getter
Returns the output path.
- Setter
Sets the output path of the scan.
- Type
Optional[str]
- add_evaluation_type(value: str)¶
Add an evaluation type to the scan.
- Parameters
value (str) – Type of evaluation to add. Must be one of: ‘fairness’, ‘robustness’, ‘explanation’, ‘explainability’, ‘performance’, ‘data_statistics’.
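Because the evaluation type is passed as a free string, a caller may want to catch typos before handing the value to add_evaluation_type. The helper below is a minimal sketch of such a pre-check; `check_evaluation_type` is a hypothetical name, and whether the library itself raises on an unknown value is not stated here.

```python
# The six values listed in the parameter description above.
ALLOWED_EVALUATION_TYPES = frozenset({
    "fairness",
    "robustness",
    "explanation",
    "explainability",
    "performance",
    "data_statistics",
})


def check_evaluation_type(value: str) -> str:
    """Return `value` unchanged if it is a documented evaluation type, else raise."""
    if value not in ALLOWED_EVALUATION_TYPES:
        allowed = ", ".join(sorted(ALLOWED_EVALUATION_TYPES))
        raise ValueError(f"unknown evaluation type {value!r}; expected one of: {allowed}")
    return value


print(check_evaluation_type("robustness"))
```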
- remove_evaluation_type(value: str)¶
Remove an evaluation type from the scan.
- Parameters
value (str) – type of evaluation to remove.
- property evaluation_types: Iterable[str]¶
Evaluation types included in the scan.
- Getter
Returns the list of included evaluation types.
- Type
Iterable[str]
- property hyper_parameter_overrides: dict¶
Hyper-parameter overrides to apply to the analysis.
- Getter
Returns a dictionary of hyper-parameter overrides.
- Setter
Specifies a dictionary of hyper-parameter overrides.
- Type
dict
- add_fairness_grouping_feature(feature: certifai.scanner.builder.CertifaiGroupingFeature)¶
Add a fairness grouping feature.
- Parameters
feature (CertifaiGroupingFeature) – grouping feature definition.
- remove_fairness_grouping_feature(name: str)¶
Remove a fairness grouping feature.
- Parameters
name (str) – name of the grouping feature to remove.
- property fairness_grouping_features: List[certifai.scanner.builder.CertifaiGroupingFeature]¶
Fairness grouping features defined for the scan.
- Getter
Returns a list of defined grouping features.
- Type
List[CertifaiGroupingFeature]
- property metrics: List[certifai.scanner.builder.CertifaiModelMetric]¶
Performance metrics defined for the scan.
- Getter
Returns a list of defined performance metrics.
- Type
List[CertifaiModelMetric]
- add_metric(metric: certifai.scanner.builder.CertifaiModelMetric)¶
Add a performance metric.
- Parameters
metric (CertifaiModelMetric) – metric to add.
- remove_metric(name: str)¶
Remove a performance metric
- Parameters
name (str) – Name of the metric to remove
- property explanation_types: List[str]¶
Explanation types defined for the scan.
- Getter
Returns a list of defined explanation types.
- Type
List[str]
- add_explanation_type(explanation: str)¶
Add an explanation type.
- Parameters
explanation (str) – explanation type to add.
- remove_explanation_type(name: str)¶
Remove an explanation type
- Parameters
name (str) – Name of the explanation type to remove
- property primary_explanation_type: str¶
Explanation type to select for the explainability axis of the ATX score.
- Getter
Returns the name of the selected explanation type for use in ATX calculation.
- Setter
Sets the name of the selected explanation type for use in ATX calculation.
Accepts a str specifying the explanation type on set; returns a str naming the explanation type on get.
- property fairness_metrics: List[str]¶
Fairness metrics defined for the scan.
- Getter
Returns a list of defined fairness metrics.
- Type
List[str]
- add_fairness_metric(metric: str)¶
Add a fairness metric.
- Parameters
metric (str) – metric to add.
- remove_fairness_metric(name: str)¶
Remove a fairness metric
- Parameters
name (str) – Name of the metric to remove
- property primary_fairness_metric: Optional[str]¶
The fairness metric to use as the Fairness aspect for calculating the ATX score.
- Getter
Returns the name of the selected fairness metric for use in ATX calculation.
- Setter
Sets the name of the selected fairness metric for use in ATX calculation.
Accepts an Optional[str] specifying the primary fairness metric (if any) on set; returns an Optional[str] naming the fairness metric on get.
- property atx_performance_metric: Optional[certifai.scanner.builder.CertifaiModelMetric]¶
Metric to select for the performance axis of the ATX score.
- Getter
Returns the selected performance metric for use in ATX calculation.
- Setter
Sets the name of the selected performance metric for use in ATX calculation.
Accepts an Optional[str] specifying the performance metric name (if any) on set, returns Optional[CertifaiModelMetric] of the metric instance on get.
- add_model(model: certifai.scanner.builder.CertifaiModel)¶
Add a model to the scan.
- Parameters
model (CertifaiModel) – metadata of the model to add.
- remove_model(id: str)¶
Remove a model from the scan.
- Parameters
id (str) – Removes the model with the specified id.
- property models: List[certifai.scanner.builder.CertifaiModel]¶
Models included in the scan.
- Getter
Returns a list of included models.
- Type
List[CertifaiModel]
- add_dataset(dataset: certifai.scanner.builder.CertifaiDataset)¶
Add a dataset.
- Parameters
dataset (CertifaiDataset) – the dataset to add.
- remove_dataset(dataset_id: str)¶
Remove a dataset.
- Parameters
dataset_id (str) – Dataset to remove (by id).
- property datasets: List[certifai.scanner.builder.CertifaiDataset]¶
Datasets defined by the scan.
- Getter
Returns a list of defined datasets.
- Type
List[CertifaiDataset]
- property dataset_schema: certifai.scanner.builder.CertifaiDataSchema¶
Dataset schema used by the scan use case.
- Getter
Returns the dataset schema.
- Setter
Sets the dataset schema.
- Type
CertifaiDataSchema
- property feature_restrictions: Dict[str, certifai.scanner.builder.CertifaiFeatureRestriction]¶
Get restrictions on feature changes made during counterfactual production
- Returns
dictionary of restrictions keyed on feature name
- Return type
Dict[str,CertifaiFeatureRestriction]
- add_feature_restriction(feature_name: str, restriction: certifai.scanner.builder.CertifaiFeatureRestriction)¶
Add a restriction on the changes that can be made to a feature during counterfactual production.
- Parameters
feature_name (str) – feature to restrict
restriction (CertifaiFeatureRestriction) – restriction to apply
- remove_feature_restriction(feature_name: str)¶
Remove a restriction on the changes that can be made to a feature during counterfactual production
- Parameters
feature_name (str) – feature to de-restrict
- property monitored_features: List[Union[str, int]]¶
Monitored features defined for the scan.
- Getter
Returns a list of monitored features.
- Type
List[Union[str, int]]
- add_monitored_feature(feature: Union[str, int])¶
Adds a monitored feature.
- Parameters
feature (Union[str, int]) – feature name or index
- remove_monitored_feature(feature: Union[str, int])¶
Removes a monitored feature.
- Parameters
feature (Union[str, int]) – feature name or index
- extract_yaml() str ¶
Extract the scan as a YAML definition.
- Returns
string containing the scan template encoded as YAML.
- Return type
str
- save(file)¶
Save the scan template to a file.
- Parameters
file – file object opened for write to which the definition is to be saved.
- run_preflight(model_id: Optional[str] = None, base_path: Optional[str] = None, callback: Optional[Callable[certifai.common.progress_task.ProgressUpdate, None]] = <certifai.common.progress_task.ProgressBarListener object>, refresh: bool = True)¶
Run the preflight scan (in-process).
- Parameters
model_id (Optional[str]) – Optional specific model id to restrict the scan to
base_path (Optional[str]) – Optional base path to evaluate relative paths in the scan definition with respect to (if not specified then current working directory is assumed).
callback (Optional[Callable[[ProgressUpdate],None]]) – Optional callback function to receive progress updates as preflight checks are completed. If not specified, a default will be used that prints to stdout. Set to None to receive no progress updates. This is not applicable when refresh is False. A ProgressUpdate is a NamedTuple with fields units_complete, total_num_units, and summary.
refresh (bool) – If True all preflight checks will be run and the latest results will be returned. Otherwise, the results will be computed from existing preflight report data. Defaults to True.
- Returns
a nested dictionary of messages produced during the preflight scan. The top-level keys are model ids and the second-level keys are message types, each of which maps to a list of strings.
- Return type
dict
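The nested return shape (model id, then message type, then a list of strings) can be flattened for logging as in the sketch below. The sample dictionary is made up for illustration: the model id and the message type names `warnings` and `errors` are hypothetical, not values confirmed by the library.

```python
from typing import Dict, List


def summarize_preflight(result: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Flatten model id -> message type -> messages into printable lines."""
    lines: List[str] = []
    for model_id, by_type in result.items():
        for message_type, messages in by_type.items():
            for message in messages:
                lines.append(f"[{model_id}] {message_type}: {message}")
    return lines


# Hypothetical sample shaped like the documented return value.
sample_result = {
    "model_a": {
        "warnings": ["prediction latency is high"],
        "errors": [],
    }
}
for line in summarize_preflight(sample_result):
    print(line)
```

In real use the argument would be the dictionary returned by run_preflight rather than a hand-written sample.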
- run_explain(precalculate: bool = False, fast: bool = False, sampling: bool = False, model_id: Optional[str] = None, base_path: Optional[str] = None, explanation_format: str = 'csv', callback: Optional[Callable[certifai.common.progress_task.ProgressUpdate, None]] = <certifai.common.progress_task.ProgressBarListener object>, write_reports: bool = True, **kwargs)¶
Run an explanation scan (in-process).
- Parameters
precalculate (bool) – If True then baselines for the model/use case will be precalculated and stored for use in fast explanations. Defaults to False.
fast (bool) – If True then fast explanations will be used, which is suitable for bulk-explanation of large datasets. Fast explanation requires the precalculate step to have been performed for the model and use case previously (or in the same call). Defaults to False.
sampling (bool) – If true then Counterfactual Sampling will be used. This is suitable for use-cases that have a large representative evaluation dataset. Defaults to False.
model_id (Optional[str]) – Optional specific model id to restrict the scan to
base_path (Optional[str]) – Optional base path to evaluate relative paths in the scan definition with respect to (if not specified then current working directory is assumed).
explanation_format (str) – Format in which to write the explanations, must be one of: ‘csv’, ‘jsonlines’, ‘inline’. If either ‘csv’ or ‘jsonlines’, then explanations will be written in a separate file and the filename will be specified in the scan report. If ‘inline’ the explanations will be included in the scan report. This is not applicable when precalculate is True. Defaults to ‘csv’.
callback (Optional[Callable[[ProgressUpdate],None]]) – Optional callback function to receive progress updates as evaluations are completed. If not specified, a default will be used that prints to stdout. Set to None to receive no progress updates. A ProgressUpdate is a NamedTuple with fields units_complete, total_num_units, and summary.
write_reports (bool) – Whether to write scan report files or not. Defaults to True. This argument takes precedence over explanation_format.
- Returns
a nested dictionary. If precalculate is True, the top-level keys are the model ids and each value is a dictionary with a status, a possible error message, and the location of the persisted calculations. Otherwise, the top-level keys are the evaluation types and the second-level keys are the model ids, within which is the report JSON represented in dictionary format.
- Return type
dict
- run(model_id: Optional[str] = None, report: Optional[str] = None, write_reports: bool = True, base_path: Optional[str] = None, callback: Optional[Callable[certifai.common.progress_task.ProgressUpdate, None]] = <certifai.common.progress_task.ProgressBarListener object>)¶
Run the scan (in-process).
- Parameters
model_id (Optional[str]) – Optional specific model id to restrict the scan to.
report (Optional[str]) – Optional specific report (evaluation type) to restrict the scan to.
write_reports (bool) – Whether to write report files for each model evaluation to the scan’s output directory (by default ‘./reports’ relative to base_path). Default is True.
base_path (Optional[str]) – Optional base path to evaluate relative paths in the scan definition with respect to (if not specified then current working directory is assumed).
callback (Optional[Callable[[ProgressUpdate],None]]) – Optional callback function to receive progress updates as evaluations are completed. If not specified, a default will be used that prints to stdout. Set to None to receive no progress updates. A ProgressUpdate is a NamedTuple with fields units_complete, total_num_units, and summary.
- Returns
nested dictionary of reports. Top level keys are the evaluation type, second level keys are the model ids, within which is the report JSON represented in dictionary format.
- Return type
dict
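A custom progress callback only needs to accept one ProgressUpdate. The sketch below collects updates into a list instead of printing to stdout. The `ProgressUpdate` class defined here is a local stand-in that mirrors the three documented field names so the example runs without the library; the field types are assumptions.

```python
from typing import Callable, List, NamedTuple


class ProgressUpdate(NamedTuple):
    # Local stand-in with the documented field names; the types are assumptions.
    units_complete: int
    total_num_units: int
    summary: str


def make_collecting_callback(log: List[str]) -> Callable[[ProgressUpdate], None]:
    """Build a callback that appends a one-line progress message to `log`."""
    def on_progress(update: ProgressUpdate) -> None:
        total = update.total_num_units
        # Guard against a zero total so the callback never raises mid-scan.
        percent = 100.0 * update.units_complete / total if total else 0.0
        log.append(f"{percent:.0f}% {update.summary}")
    return on_progress


progress_log: List[str] = []
callback = make_collecting_callback(progress_log)
callback(ProgressUpdate(1, 4, "robustness"))
callback(ProgressUpdate(4, 4, "done"))
print("\n".join(progress_log))
```

Such a function would be supplied through the documented callback parameter of run, run_explain, or run_preflight; passing None instead suppresses progress updates entirely.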
- static from_file(filename: str) certifai.scanner.builder.CertifaiScanBuilder ¶
Load a scan template from file.
- Parameters
filename (str) – path to template file to read.
- Returns
Instantiated ScanBuilder with metadata from the template that was read.
- Return type
CertifaiScanBuilder
- static from_yaml(as_yaml: str) certifai.scanner.builder.CertifaiScanBuilder ¶
Load a scan template from a YAML string.
- Parameters
as_yaml (str) – Definition to load as YAML string.
- Returns
Instantiated ScanBuilder with metadata from the template that was read.
- Return type
CertifaiScanBuilder
- static create(use_case_name: str, use_case_id: Optional[str] = None, evaluation_name: Optional[str] = None, environment: Optional[str] = None, description: Optional[str] = None, prediction_task: certifai.scanner.builder.CertifaiPredictionTask = <certifai.scanner.builder.CertifaiPredictionTask object>, output_path: Optional[str] = None) certifai.scanner.builder.CertifaiScanBuilder ¶
Create a new template builder.
- Parameters
use_case_name (str) – Name of the prediction use case.
use_case_id (Optional[str]) – Id by which the use case will be referenced. Defaults to the name if omitted.
evaluation_name (Optional[str]) – Name of the evaluation. Defaults to the use case name if not provided.
environment (Optional[str]) – Optional opaque string recording scan environment information.
description (Optional[str]) – Optional human readable description of the use case.
prediction_task (CertifaiPredictionTask) – Prediction task metadata.
output_path (Optional[str]) – Where to write report files. A relative path is evaluated with respect to the base path at evaluation time. If omitted, reports will be written to ‘./reports’.
- Returns
Instantiated ScanBuilder initialized with the supplied metadata.
- Return type
CertifaiScanBuilder