Version: 1.3.14

Azure: Run remote scans

Follow the steps below to run a scan job in Certifai Pro on Azure.

Prerequisites

You have downloaded the certifai-kubeconfig.json file.
You have imported the configuration into your Certifai toolkit.
You have created a folder named certifai_assets on your local drive, where you store scan definition files and datasets for easy access.
You have the following datasets from the certifai_toolkit/examples folder on your local drive. (These are created when you download and install the Toolkit.)
- german_credit_explan.csv
- german_credit_eval.csv
You have downloaded a copy of the scan definition example file, german_credit_scanner_definition.yaml, to use as a template.

Copy and paste the file contents below into a text editor, where you can make changes and save the file to your local drive.

model_use_case:
  atx_performance_metric_name: Accuracy
  author: info@cognitivescale.com
  description: 'In this use case, each entry in the dataset represents a person who takes a credit loan from a bank. The learning task is to classify each person as either a good or bad credit risk according to the set of attributes.
  This dataset was sourced from Kaggle: https://www.kaggle.com/uciml/german-credit. The original source is: https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29.'
  model_use_case_id: c12e/datasciencelab/german_credit
  name: 'Banking: Loan Approval'
  performance_metrics:
  - metric: Accuracy
    name: Accuracy
  task_type: binary-classification
evaluation:
  description: This evaluation compares the robustness, accuracy, fairness and explanations for 4 candidate models.
  evaluation_dataset_id: eval
  evaluation_types:
  - fairness
  explanation_dataset_id: explan
  test_dataset_id: eval
  fairness_grouping_features:
    - name: age
    - name: status
    - name: foreign
  feature_restrictions:
  - feature_name: age
    restriction_string: no changes
  - feature_name: status
    restriction_string: no changes
  - feature_name: foreign
    restriction_string: no changes
  name: Baseline evaluation of 4 models
  prediction_description: Will a loan be granted?
  prediction_values:
  - favorable: true
    name: Loan Granted
    value: 1
  - favorable: false
    name: Loan Denied
    value: 2
models:
  - author: ''
    description: Scikit-learn LogisticRegression classifier using lbfgs solver
    model_id: svm
    name: Logistic Regression
    predict_endpoint: http://certifai-ref-models.certifai.svc.cluster.local:5111/german_credit_logit/predict
datasets:
  - dataset_id: eval
    description: 1000 row representative sample of the full dataset
    file_type: csv
    has_header: true
    name: Evaluation dataset
    url: abfs://<scan-directory-name>/datasets/german_credit_eval.csv
  - dataset_id: explan
    description: ''
    file_type: csv
    has_header: true
    name: 100 row explanation dataset
    url: abfs://<scan-directory-name>/datasets/german_credit_explan.csv
dataset_schema:
  feature_schemas:
  - feature_name: age
  - feature_name: status
  - feature_name: foreign
  outcome_column: outcome

Define scan config files and move to blob storage

Save this file to a folder named definitions that you must create inside your certifai_assets folder (created as a prerequisite).
Open the scan definition example file, german_credit_scanner_definition.yaml, in a text editor and edit the following field:
- datasets: url: Change <scan-directory-name> in the example URLs below to match the Scan Directory Name that was created during Console Configuration. (NOTE: There are 2 instances of this field that must be modified in the file.)

url: abfs://<scan-directory-name>/datasets/german_credit_explan.csv

and

url: abfs://<scan-directory-name>/datasets/german_credit_eval.csv
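If you prefer to script this edit rather than make it by hand, the substitution could look something like the sketch below. The function name set_scan_directory is hypothetical and not part of the Certifai toolkit; it only assumes that the placeholder appears in the file literally as <scan-directory-name>.

```shell
# Replace the <scan-directory-name> placeholder in a scan definition file.
# set_scan_directory is a hypothetical helper, not a Certifai command.
# Usage: set_scan_directory <definition-file> <scan-directory-name>
set_scan_directory() {
  definition="$1"
  scan_dir="$2"
  # Write to a temporary file first so the edit behaves the same
  # with both GNU sed and BSD (macOS) sed.
  sed "s|<scan-directory-name>|${scan_dir}|g" "$definition" > "${definition}.tmp" \
    && mv "${definition}.tmp" "$definition"
}
```

For example, `set_scan_directory german_credit_scanner_definition.yaml scans-rc2` would rewrite both dataset URLs to start with abfs://scans-rc2/. Running `grep 'url:'` on the file afterwards is a quick way to confirm that both instances were changed.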

Save this file in the certifai_assets/definitions folder with the job definition file.
Copy the datasets listed below from certifai_toolkit/examples/datasets to a folder named datasets that you must create inside your certifai_assets folder.
Move the datasets to your Azure blob storage container (Scan Directory). There are several ways to move these files; the following is one of them. (NOTE: Perform this operation twice, once for each of the required files.)
- dataset: german_credit_explan.csv that was included with the toolkit (certifai_toolkit/examples/datasets)
- dataset: german_credit_eval.csv that was included with the toolkit (certifai_toolkit/examples/datasets)
- a. Change your terminal or PowerShell context to the folder where the file is located.
- b. Copy and paste the command below into a text editor.
- c. Replace the variables with the details of the file you are moving to blob storage.
- d. Copy and paste the command from the text editor into the terminal or PowerShell window and run it.
- e. In your Azure portal, go to the blob storage container and verify that the file has been moved.

az storage blob upload \
 --account-name <storage-account> \
 --container-name <scan-directory-name> \
 --name <folder/scan-definition-file-name.yaml> \
 --file <scan-definition-file-name.yaml> \
 --auth-mode key \
 --account-key <access-key>

Example:

az storage blob upload \
--account-name mscottblob \
--container-name scans-rc2 \
--name definitions/diabetes_scanner_definition.yaml \
--file diabetes_scanner_definition.yaml \
--auth-mode key \
--account-key abcd
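Because the upload is repeated for each dataset, it can be wrapped in a loop. The following is a minimal sketch, assuming a POSIX shell (not PowerShell) and that both CSV files are in the current folder. The function names upload_datasets and list_uploaded_datasets are hypothetical. The blob names use the datasets/ prefix so that they match the url values in the scan definition.

```shell
# Upload both German Credit datasets to the datasets/ folder of the Scan Directory.
# upload_datasets is a hypothetical helper that wraps the az command shown above.
# Usage: upload_datasets <storage-account> <scan-directory-name> <access-key>
upload_datasets() {
  storage_account="$1"
  scan_dir="$2"
  access_key="$3"
  for dataset in german_credit_eval.csv german_credit_explan.csv; do
    az storage blob upload \
      --account-name "$storage_account" \
      --container-name "$scan_dir" \
      --name "datasets/${dataset}" \
      --file "$dataset" \
      --auth-mode key \
      --account-key "$access_key" || return 1
  done
}

# Print the names of the blobs under datasets/ so the upload can be checked
# from the command line instead of the Azure portal.
# Usage: list_uploaded_datasets <storage-account> <scan-directory-name> <access-key>
list_uploaded_datasets() {
  az storage blob list \
    --account-name "$1" \
    --container-name "$2" \
    --prefix "datasets/" \
    --auth-mode key \
    --account-key "$3" \
    --query "[].name" \
    --output tsv
}
```

With the values from the example above, the calls would be `upload_datasets mscottblob scans-rc2 abcd` followed by `list_uploaded_datasets mscottblob scans-rc2 abcd`.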

Run the remote scan job

In a new terminal or PowerShell window, run the following command to start your job:

certifai remote scan -m svm -o abfs://<scan-directory-name> -f abfs://certifai_assets/definitions/german_credit_scanner_definition.yaml
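If you run scans against more than one Scan Directory or model, the command can be parameterized. This is only a sketch: run_remote_scan is a hypothetical wrapper, and it passes through only the -m, -o, and -f options shown in the command above.

```shell
# Start a remote scan for one model of a scan definition.
# run_remote_scan is a hypothetical wrapper around the command shown above.
# Usage: run_remote_scan <model-id> <scan-directory-name> <definition-url>
run_remote_scan() {
  model_id="$1"
  scan_dir="$2"
  definition_url="$3"
  certifai remote scan \
    -m "$model_id" \
    -o "abfs://${scan_dir}" \
    -f "$definition_url"
}
```

The model ID must match a model_id value in the scan definition (svm in the example file).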

Optionally, you can manage the remote job through the CLI.
Verify reports have been added to the Use Case in the remote Console.
- a. In a browser window (Chrome is recommended), enter https://<Public IP address of your Certifai VM>. (A warning message may be displayed telling you that the connection is not private. Click the link that exposes the Advanced settings, then click the link at the bottom that says "Proceed to <IP address>".)
- b. Log in using the password that was created during Console configuration. (NOTE: Do NOT change the user name from certifai.)
- c. In the row of the Use Case (Banking: Loan Approval) click the menu icon on the far right and select SCAN DETAILS.
- d. A scan with the name and date of this process is listed when the scan report is complete.
- e. Click VIEW to see the report visualizations.
