Version: 1.3.14.1

Scan Manager User's Guide

Learn how to use the Certifai Scan Manager to configure use cases, upload datasets and models, and run scans in the prediction service.

Prerequisites

Scan List

When you log in to Scan Manager, the landing page displays a list of scans that you have configured and run.

Scan List Details

The Scan List table provides the following details:

  • ScanId: A system-assigned identifier for the scan
  • UseCaseId: The human-readable title entered during Use Case setup
  • Status: Status of the scan reports
    • COMPLETE: Scan has run successfully, and all requested reports were created.
    • FAILED: One or more of the requested reports failed. The numbers of reports that succeeded and failed are indicated.
    • SCANNING: The prediction service is running, and scan reports are pending.
  • Total: The number of reports requested
  • Created: (timestamp) The date/time the scan was created/initiated

Scan options

Click the vertical ellipsis menu in a scan's row to interact with that scan. Menu options are:

  • View Logs: A log file of the prediction service run is displayed
  • Abort Scan: Stops and removes a scan that is in progress (in a status of "SCANNING"), or removes a completed or failed scan from the list.
  • Scan Results: Opens the Certifai Scan Reports Console where you review the reports (You may be prompted to log in.)
  • Scan Details: Opens the Certifai Scan Reports Console where you can view a summary of the scan configuration

Scan Actions

At the top of the Scan List you can take the following actions:

  • Select an existing use case to filter the scan list or to set as the context for configuring a new scan.
  • Click Use Case Setup to configure a new use case definition
  • Click New Scan to configure a new scan definition from an existing use case.

Configure a Use Case

  1. Click Use Case Setup at the top of the Scan List.

  2. On the Use Case Setup Page click + Add Use Case under the Use Case selection field.

    Alternately, you may select an existing Use Case, click Continue, and begin configuring the scan definition.

  3. Select the option Answering questions for the Create a Use Case by ... query.

    Alternately, you may select Uploading a scan definition. In this case follow the instructions here.

  4. Click Continue.

  5. Select an answer to the Do scans have access to a model? query.

    • Select Yes if you are configuring a use case with scans that call one or more models during the scan.

    • Select No (use prediction data only) if you are configuring a use case that references a single model's prediction data (rather than running models at scan time).

      The "No Model Access" scan type requires the following configuration:

      • Model metadata (name, model_id, and model_type) so Certifai can store scan results.
      • Explanation datasets must have a predicted outcome column from which to derive alternative counterfactuals
      • Explanation method must be set to Counterfactual. (Shap is not supported.)
  6. Enter the following Use Case details into the Use Case form:

    Use Case Form

    • Use Case Name: A short, descriptive, human-readable name for the use case.
    • Use Case ID: Must be unique across use cases
    • Use Case Description: A longer human-readable description for the use case
    • Evaluation Dataset: Upload a dataset (in .csv format with headers) to be used by the prediction service.
    • One-Hot Encoded Features: Select one of the following if your dataset has features that follow one of these naming patterns (a pandas naming sketch follows this list):
      • feature_value(pandas)
      • feature.value
      • None (select if no features follow this pattern)
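
The two patterns correspond to common one-hot encoding naming conventions. A minimal sketch of the pandas-style pattern, using made-up column names:

```python
import pandas as pd

# Hypothetical dataset with a categorical "employment" feature.
df = pd.DataFrame({"age": [34, 51], "employment": ["salaried", "self-employed"]})

# pandas-style encoding produces columns named feature_value, e.g. "employment_salaried".
encoded = pd.get_dummies(df, columns=["employment"])
print(list(encoded.columns))
# ['age', 'employment_salaried', 'employment_self-employed']

# Other encoders emit a feature.value pattern instead, e.g. "employment.salaried".
# Select whichever pattern matches your dataset headers, or None if neither applies.
```
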
  7. Click Continue.

  8. Select the Outcome Column from a list of features (derived from your Evaluation Dataset) or select "No outcome column" if your dataset has none.

  9. Click Continue.

  10. Configure the Learning Task Type details.

    Learning Task Type form

    • The Learning Task Type is auto-generated and cannot be edited. It is derived from the Evaluation Dataset and the selected Outcome column. (A sketch of this kind of inference follows this list.)
    • Enter the Prediction Description. (Ask: What are the possible outcomes?)
    • For binary classification: Enter values for the two possible Prediction (outcome) Values (the default is 1,2) and identify the favorable outcome (if there is one) by setting "Is Favorable" to "true" for that value.
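
The exact derivation is internal to Certifai; the sketch below only illustrates the kind of inference involved, assuming a pandas outcome column (the rules and threshold are hypothetical, not the product's actual logic):

```python
import pandas as pd

def guess_task_type(outcome: pd.Series) -> str:
    """Illustrative only: infer a learning task type from an outcome column."""
    values = outcome.dropna().unique()
    if len(values) == 2:
        return "binary-classification"      # e.g. outcome values 1 and 2
    if outcome.dtype.kind in "if" and len(values) > 20:
        return "regression"                 # many distinct numeric values
    return "multiclass-classification"

print(guess_task_type(pd.Series([1, 2, 2, 1, 2])))  # binary-classification
```
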
  11. Click SAVE.

  12. On the Use Case Summary page you can review and edit the following details for the use case.

  • Use Case

      1. Click the open-form icon to open the Use Case form from step 6 above.
      2. (Optionally) Edit the fields.
      3. Click Continue to return to the summary page.
      4. Click the refresh icon to reset a use case/scan definition; this resets evaluation information, including features, prediction values, and trust factors.
  • Scan Definition

    • (Optional) Click the upload icon to upload a scan_definition.yaml file from your local drive if you have one prepared.

    • (Optional) Click the download button to download a scan_definition.yaml template with some fields defined if you want to complete the configuration by preparing the file and uploading it (rather than completing the use case definition wizard in the UI).

  • Models

    You must configure the metadata in the dialog AND upload a model file.

    A use case can have multiple models.

    • a. Click the open-form icon to open a dialog where you define the model details for scans using this use case.

    • b. Click the upload icon to upload your model file.

    • (Optional) Click the trash can icon to delete the model listed below the heading

    • (Optional) Click + Add Model to open the model definition dialog to configure a new model for the use case.

    Model form

    • c. On the form enter:

      • Model Name: A short, descriptive human-readable name for the model.

      • Model ID: Autogenerated to be unique.

      • Model Type: Select from a list of supported model types in Certifai.

        Model type selection drives the defaults for the next 2 fields.

      • Deployment Template: Select from a list of deployment templates that were configured during Scan Manager Setup

      • Prediction Service Image: select from a list of images that were configured during Scan Manager Setup

      • Select Soft Outputs: Select "yes" if your model supports soft scoring. Required for Shap explanations.

        In Certifai, classification models may provide soft scores for each row in one of two forms, encoded via the optional fields scores and threshold. Both predictions and scores are optional, but at least one must be present. (A schematic sketch follows this field list.)

      • Model Description: A longer human-readable description for the model.
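
The two soft-scoring forms can be pictured roughly as follows; this is only a schematic sketch using the field names mentioned above (predictions, scores, threshold), not the prediction service's exact response format.

```python
# Schematic examples only -- the real wire format may differ.

# Form 1: a score per possible outcome, alongside optional hard predictions.
row_with_class_scores = {
    "predictions": ["granted"],
    "scores": [[0.82, 0.18]],   # one score per outcome value
}

# Form 2: a single score per row plus a decision threshold.
row_with_threshold = {
    "scores": [0.82],
    "threshold": 0.5,           # scores at or above the threshold map to the positive outcome
}

# Both predictions and scores are optional, but at least one must be present.
```
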

    • d. Click Continue to save the Model definition and return to the Use Case Summary page.

    • e. Review/Edit (for advanced users) Prediction Metadata

      • columns: from the uploaded evaluation dataset
      • outcomes: configured in step 10 above
      • task-type: configured in step 10 above
      • supports_soft_scoring: must match what is configured in the model form in step c above.
  • Datasets

    • Evaluation Dataset: The dataset you uploaded in step 6 is displayed. Click the upload icon to select a different dataset from your local drive to replace that dataset.

    • Test Dataset: (Optional) Upload a test dataset file (in .csv format with headers) for use in performance analyses. This dataset must have an outcome column.

    • Explanation Dataset: (Optional) Upload a smaller dataset (in .csv format with headers) to be used for explanations reports.

      NOTE: For no Model access use cases, Explanation Datasets must contain a predicted outcome column from which to derive alternative counterfactuals.

  • Trust Factors

    • Select the Trust Factor reports you want the prediction service to generate.

      NOTE: For no Model access use cases, only Fairness (non-burden metrics), Performance, and Explanations may be selected. The predicted outcome column must be present in the datasets, and for Fairness and Performance, ground truth must also be supplied.

    • If you select Fairness, configure Fairness Settings.

    Fairness Features form

      1. Click + Add Fairness Feature to open the Fairness Feature form.
      2. Select from a list of features derived from the Evaluation Dataset.
      3. Open the ellipsis menu and click Edit. This tutorial will help you understand fairness bucketing. (A minimal bucketing sketch follows this list.)
      • Select a Feature Value Type
      • Enter the number of buckets you need (for categorical type)
      • Click Reset to create the buckets in the table below.
      • Drag values between buckets.
      • Click Continue to save Fairness Features and return to the Use Case Summary page.
      4. Repeat for all features for which you want fairness measured.

      You can delete a Fairness Feature by opening the ellipsis menu and clicking Delete.
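
Bucketing groups a feature's values so that fairness metrics are compared between groups rather than between individual values. A minimal, hypothetical sketch using pandas (the feature and cut points are made up):

```python
import pandas as pd

ages = pd.Series([22, 35, 41, 58, 67, 29])

# Group a numeric feature into named buckets; fairness is then measured
# by comparing outcomes across these groups.
buckets = pd.cut(ages, bins=[0, 30, 50, 120], labels=["under 30", "30-50", "over 50"])
print(buckets.value_counts())
```
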

    • Performance Settings

      • a. Click + Add Metric to open the selection field.

      • b. Select the Performance metric type that you want measured for predictions.

      • c. Open the ellipsis menu to flag the Performance metric as Primary (or to Delete it).

      • d. Repeat to add other Performance metric types.

        Only ONE performance metric can be marked as Primary.

    • Explanation Settings

      • Select the method for generating the Explanations that you want the prediction service to run.

        • Counterfactuals: This method uses alternative predictive data points to infer an imaginary line representing the boundary between the possible outcomes produced by the model. Distance from the counterfactual to the line is used to calculate report values. (A toy sketch of this idea follows this list.)

          NOTE: For no Model access use cases, select the Counterfactual method for running the prediction service. SHAP is not supported for no Model access.

        • Shap: This method uses a game-theoretic approach to explain the output of the model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions.
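
As rough intuition for the counterfactual method, the toy sketch below nudges one feature of a data point until a stand-in model's prediction flips, then reports how far the point had to move. The model and features are invented; Certifai's genetic-algorithm search is far more sophisticated.

```python
import numpy as np

def toy_model(x: np.ndarray) -> int:
    """Stand-in classifier: approve (1) when income minus half the debt exceeds 30."""
    return int(x[0] - 0.5 * x[1] > 30)

original = np.array([40.0, 30.0])            # [income, debt] -> prediction 0 (denied)
base_prediction = toy_model(original)

# Increase income until the outcome flips; the distance travelled is a crude
# analogue of the distance-to-boundary used in explanation reports.
counterfactual = original.copy()
while toy_model(counterfactual) == base_prediction:
    counterfactual[0] += 1.0

print("counterfactual:", counterfactual,
      "distance:", np.linalg.norm(counterfactual - original))
```
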

  • Features

    • a. Click the open-form icon to configure the Feature from your dataset.

    • b. For each feature you may edit the following selections (scroll to view additional Features):

      • Fixed: Select for features whose values you do NOT want changed in counterfactual explanations (e.g., gender, which is typically not changeable).
      • Hidden: Select for features that you do NOT want to expose to models. Hidden columns are removed from the dataset entries prior to being applied to the scan. (A small sketch follows this list.)
    • c. Click Continue.
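
The effect of these flags can be pictured as follows: Hidden columns are dropped before the data reaches the model, while Fixed columns stay in the data but are never perturbed during the counterfactual search. A small sketch with made-up column names:

```python
import pandas as pd

# Hypothetical evaluation rows; "customer_id" is marked Hidden, "gender" is marked Fixed.
df = pd.DataFrame({
    "customer_id": [101, 102],
    "gender": ["F", "M"],
    "income": [52000, 48000],
})

hidden = ["customer_id"]               # never exposed to the model
fixed = ["gender"]                     # exposed, but held constant in counterfactuals

model_input = df.drop(columns=hidden)  # hidden columns removed before the scan runs
print(list(model_input.columns))       # ['gender', 'income']
```
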

    Completed Use Case Summary

  13. Click Save & Test.

  14. Any issues with the Use Case are exposed during the test. Resolve them by going back through the configuration steps above.

  15. When the Use Case passes testing, the Scan List page for the Use Case is displayed with the new use case selected in the filter field at the top.

  16. Click New Scan to run a scan for the use case.

Configure a Scan

Scan Configuration Page

  1. Click New Scan at the top of the Scan List page.

  2. The Use Case that was selected in the filter field is selected by default. (Optionally) Select a different Use Case.

  3. (Optional) Click Use Case Setup to edit the configuration of the Use Case from the Use Case Summary page. (Follow the instructions in "Configure a Use Case" - step 12 - Use Case above.)

  4. Select the models you want evaluated by the prediction service. The models listed are derived from the use case configuration.

  5. Under Settings select the Trust Factor reports you want the service to generate.

NOTE: For scans running with no Model access, Certifai cannot run the standard counterfactual-based evaluations that use the genetic algorithm, including Fast explanations. Alternative counterfactual sampling is generated using predicted outcomes and ground truth. Allowed evaluations include: Fairness (only non-burden metrics), Performance, and Explanations. All trust factor reports require datasets with a predicted outcome column, and for Fairness and Performance reports a ground truth column is also required. SHAP explanations are not allowed.

  6. Under Explanation Settings, select the Explanation Type (or method) you want the prediction service to use.

    • Counterfactuals: This method uses alternative predictive data points to infer an imaginary line representing the boundary between the possible outcomes produced by the model. Distance from the counterfactual to the line is used to calculate report values.

    • Shap: This method uses a game-theoretic approach to explain the output of the model. It connects optimal credit allocation with local explanations using the classic Shapley values from game theory and their related extensions.

  7. Under Datasets, review the datasets that have been configured for the Use Case. You may upload new datasets for the Scan. (Follow the instructions in "Configure a Use Case" - step 12 - Datasets above.)

  8. Click Continue to kick off the scan.

It may take some time for the scan to complete depending upon your Kubernetes configuration and the size of your dataset.

The numbers of active scans and scans that have succeeded are displayed below the SCANNING status.

While the scan is running you can open the ellipsis menu and select from the following options:

  • View Logs: A log file of the prediction service run is displayed as it runs.
  • Abort Scan: Click to stop a scan that is in progress (in a status of "SCANNING").

The status changes to COMPLETE or FAILED when the scan is done running.

View Scan Results

From the Scan List Page:

  1. Find the scan you want to view.

  2. In the scan row on the far right open the vertical ellipsis menu.

    If the scan status is FAILED, you can only click View Logs. A log traceback of the scan is displayed.

    If the scan status is COMPLETE, you can select any of the following options:

    • View Logs
    • Scan Results: Opens the Certifai Scan Results Console where you review the reports (you are prompted to log in)
    • Scan Details: Opens the Certifai Scan Results Console where you can view the details of the scan configuration.

Manually editing a Scan Definition

To keep the UI simple, a number of advanced actions require manually editing the scan definition.

If you need to manually edit a scan definition, you can either:

  • Create the scan definition outside of Scan Manager, and upload it when you create the use case
  • Create the use case in Scan Manager, download the scan definition from the UI, make your edits, and then upload it

Manually edit the scan definition in the following circumstances (a minimal editing sketch follows this list):

  • You do not have an outcome column in the evaluation dataset that you upload to create the use case, and the use case is not binary classification with outcome values of 0 and 1 (see task_type in the Model Use Case section, and prediction_values in the Evaluation section)
  • You want to add alternative fairness metrics such as equalized odds and demographic parity (see fairness_metrics in the Evaluation section)
  • Some of the columns in your dataset are target-encoded (see target_encodings in the Feature Schemas section)
  • Your use case is multi-class with ordered prediction favorability (see prediction_favorability in the Evaluation section)
  • You need to add model headers or model secrets
  • You want to use 'shap' rather than 'counterfactual' explanations for the explainability score (see primary_explanation_type in the Evaluation section)
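
One way to apply these edits is to download the scan_definition.yaml template from the Use Case Summary page, change the relevant keys, and upload the result. A minimal sketch using PyYAML; the nesting shown is a guess based on the section names above, so use the downloaded template as the source of truth for the actual structure.

```python
import yaml  # PyYAML

with open("scan_definition.yaml") as f:
    definition = yaml.safe_load(f)

# Hypothetical path: the guide places primary_explanation_type in the
# "Evaluation section" -- check your downloaded template for the exact nesting.
definition["evaluation"]["primary_explanation_type"] = "shap"

with open("scan_definition.yaml", "w") as f:
    yaml.safe_dump(definition, f, sort_keys=False)
```
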

Next steps

To interpret scan reports read: