What is Certifai?
Cortex Certifai evaluates AI models for robustness, fairness, and explainability, and allows users to compare different models or model versions for these qualities.
Certifai is available in two editions:
|Certifai Toolkit|Certifai Enterprise|
|---|---|
|Everything you need to create, run, and view scans locally|Multi-user server deployment running in Kubernetes, plus the Certifai AI Risk Assessment Questionnaire and Policy Select compliance toolset|
How does Certifai work?
Data scientists create scan definitions, which consist of:
- One or more trained models that they want to evaluate
- A single curated dataset (NOTE: For explanations, a subset of the dataset may be used.)
Models are scored on one or more of the following:
Performance metric: a measurement (e.g., accuracy) for which the data scientist has either provided test data, so that it can be calculated, or provided pre-calculated scores in the model definition.
Robustness: measures how well a model retains its outcome when the feature values in the data change. The more robust a model is, the greater the changes required to alter the outcome.
Fairness by group: measures how the amount of change required to alter the outcome differs between the groups implicit in a feature, given the same model and dataset. For example, the groups male, female, and nonbinary are implicit in the feature "gender". A fair model requires a similar amount of change to alter the outcome for all three groups.
Explainability: measures the average simplicity of counterfactual explanations provided for each model. An explanation that requires a single changed feature will score 100%. Explanations that require more changed features will score lower.
Explanations: show, through generated counterfactuals, what change must occur in an observation, within given restrictions, to obtain a different outcome. To alter an outcome, some feature values must change while others remain constant. Each observation (row) of the dataset is displayed in a table that lists the changed features together with their original and counterfactual values. Users can explore the entire dataset one observation at a time to understand which features changed, and by how much, to obtain a different result.
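The ideas behind robustness, fairness by group, and explanations can be illustrated with a small sketch. It is not Certifai's algorithm or API: the toy model, the feature names, the step ranges, and the brute-force search are all assumptions made for illustration. The sketch searches for the smallest change that flips a prediction while holding one feature ("gender") constant, lists the changed features with their original and counterfactual values, and compares the average change required per group.

```python
from itertools import product
from statistics import mean

# Hypothetical toy model (an assumption, not a Certifai model):
# integer scoring keeps the arithmetic exact.
def predict(row):
    score = 5 * row["income"] + 20 * row["years_employed"] - 3 * row["debt"]
    return "granted" if score >= 300 else "denied"

# Restrictions: only these features may change, by these amounts.
# "gender" is absent, so it stays constant in every counterfactual.
STEPS = {
    "income": range(-20, 21, 5),
    "years_employed": range(-5, 6, 1),
    "debt": range(-20, 21, 5),
}

def counterfactual(row):
    """Brute-force search for the closest row with a different outcome."""
    original = predict(row)
    names = list(STEPS)
    best, best_cost = None, None
    for deltas in product(*(STEPS[name] for name in names)):
        candidate = dict(row)
        for name, delta in zip(names, deltas):
            candidate[name] = row[name] + delta
        if any(candidate[name] < 0 for name in names):
            continue  # restriction: values must stay non-negative
        if predict(candidate) == original:
            continue  # outcome did not change
        cost = sum(abs(d) for d in deltas)  # crude distance: total absolute change
        if best_cost is None or cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost

def explain(row):
    """Return {feature: (original, counterfactual)} and the size of the change."""
    cf, cost = counterfactual(row)
    if cf is None:
        return {}, None
    return {k: (row[k], cf[k]) for k in row if row[k] != cf[k]}, cost

# Hypothetical dataset rows.
DATASET = [
    {"gender": "female", "income": 40, "years_employed": 3, "debt": 10},
    {"gender": "male", "income": 50, "years_employed": 4, "debt": 5},
    {"gender": "female", "income": 30, "years_employed": 2, "debt": 20},
    {"gender": "male", "income": 70, "years_employed": 1, "debt": 30},
]

def change_by_group(rows, feature):
    """Average change needed to flip the outcome, per group of `feature`."""
    costs = {}
    for row in rows:
        _, cost = explain(row)
        if cost is not None:
            costs.setdefault(row[feature], []).append(cost)
    return {group: mean(values) for group, values in costs.items()}

if __name__ == "__main__":
    for row in DATASET:
        changes, cost = explain(row)
        print(predict(row), changes, "change required:", cost)
    print(change_by_group(DATASET, "gender"))
```

In this toy setup, a larger required change means the prediction is harder to flip (more robust), and a large gap between the per-group averages is the kind of disparity a fairness-by-group measure is meant to surface. The distance used here is deliberately crude; it ignores feature scales.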
Business decision makers and compliance officers can view the evaluation comparison visualizations and scores to select the best models for business goals and to identify whether models meet thresholds for robustness, fairness, and explainability.
Data Scientists can use the evaluation results to improve models and model training to provide more trustworthy AI models.
Task types supported in Certifai
Certifai can scan most classification and regression models that use tabular data. Certifai analyzes the model by making batches of prediction requests for different inputs, so the model needs to be able to return prediction results on demand.
- Binary classification: A model that, given a set of data elements, predicts which of two groups each element in the set belongs to (e.g., Loan granted or Loan denied).
- Regression: A model that is used to estimate the relationships between a dependent variable (the outcome variable) and one or more independent variables (or features). Regression allows models to estimate the conditional expectation (or population average value) of the dependent variable when the independent variables take on a given set of values.
- Multiclass classification: A model that predicts which of a specified set of classes each data point belongs to. Certifai is able to provide insights for three types of multiclass classification use cases:
  - Where outcomes are neither favorable nor unfavorable
  - Where some outcomes are designated as favorable and others are designated as unfavorable
  - Where outcomes are on a scale from most favorable through neutral to most unfavorable
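Because Certifai analyzes a model by sending batches of prediction requests, the model has to sit behind something that can answer on demand. The following is a minimal sketch of that idea only. The JSON shape (`instances` in, `predictions` out), the function names, and the toy classifier are assumptions for illustration and are not Certifai's documented request format.

```python
import json

def toy_model(instance):
    """Hypothetical binary classifier over [income, debt]."""
    income, debt = instance
    return "Loan granted" if income - debt >= 20 else "Loan denied"

def handle_batch(request_body: str) -> str:
    """Answer one batch request: one prediction per input row, in order."""
    payload = json.loads(request_body)
    instances = payload["instances"]  # assumed key name
    predictions = [toy_model(instance) for instance in instances]
    return json.dumps({"predictions": predictions})  # assumed key name

if __name__ == "__main__":
    body = json.dumps({"instances": [[50, 10], [30, 25]]})
    print(handle_batch(body))
```

A function like `handle_batch` could be wrapped in any web framework. The important property is that each request returns exactly one prediction per input row, in the same order.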
Certifai Toolkit - Local installation with everything you need to create, run, and view scans locally:
- Use Case: Used for local scans run by data scientists with smaller datasets.
- Storage: Datasets must be saved locally (as .csv files with headers).
- Scan definition: Scans may be defined locally using the Certifai CLI or in a Jupyter notebook.
- Run scan: Scans are run from the CLI or a notebook using the API.
- Scan output: Scan results are saved to a local file and displayed in a local instance of the Console.
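For the storage requirement above (local .csv files with headers), a dataset can be written with Python's standard `csv` module. This is a minimal sketch; the column names and the file name are hypothetical.

```python
import csv
import tempfile
from pathlib import Path

def save_dataset(rows, path):
    """Write a list of dicts to a .csv file whose first line is the header."""
    fieldnames = list(rows[0].keys())
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()  # the header row the scan expects
        writer.writerows(rows)

def read_header(path):
    """Return the column names found on the first line of the file."""
    with open(path, newline="") as handle:
        return next(csv.reader(handle))

if __name__ == "__main__":
    # Hypothetical rows and file name.
    rows = [
        {"income": 40, "debt": 10, "outcome": "Loan denied"},
        {"income": 50, "debt": 5, "outcome": "Loan granted"},
    ]
    target = Path(tempfile.mkdtemp()) / "loans.csv"
    save_dataset(rows, target)
    print(read_header(target))
```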
Certifai Enterprise - Multi-user server deployment running as an Operator in Kubernetes and used in conjunction with Certifai Toolkit:
- Use Case: For multiple users running scans that require more resources because of large datasets or long runtimes.
- Storage: The datasets and scan reports are stored in cloud storage (such as S3 or Ceph) that is configured when the Certifai Operator is installed.
- Scan definition: Created using the Toolkit, as above.
- Run scan: Remote scans are run in the Kubernetes cluster by a user with Kubernetes access permissions. The Certifai CLI may be used to invoke a Kubernetes job that runs the scan. Datasets must be available in cloud storage.
- Scan output: The scan results are delivered to cloud storage and viewed using the Console running in Kubernetes.
Certifai runtime components
- Reference Model server app - Flask app that serves the reference models to help you get started with Certifai
- Certifai CLI - Command-line interface for creating and running scans locally or remotely
- Client libraries/API - Python API to run scans against models running in notebooks
- Console app - Flask app that provides a UI for viewing scan results
- Certifai Operator - Installs, configures and manages the lifecycle of the other Certifai components
- Console Service - Kubernetes Service that provides a UI for viewing scan results
- Scanner - Docker image used to run a scan as a Kubernetes job
- Reference Model Service - Kubernetes Service that serves the reference models to help you get started with Certifai
- AI Risk Assessment Questionnaire and Policy Select tool
Supported Object Stores
- S3-compatible object store (NooBaa, Ceph)
- Azure Blob Store