Version: 1.3.16

Use bucketing for fairness grouping features

Follow the instructions below to learn how to perform a fairness evaluation and use bucketing to define custom groups within a fairness grouping feature.

Prerequisites

  1. You have accepted the license agreement and downloaded the Toolkit .zip file to your local system.
  2. You have installed the reference models package to your local system.

Tutorial Instructions

To learn more about how Certifai calculates fairness, refer to the Certifai fairness documentation.

  1. Set your working directory to the folder where your Certifai Toolkit was unzipped.

    cd <toolkit-location>
  2. Activate the virtual environment you created for Certifai when installing the Certifai CLI.

    conda activate certifai
  3. Copy and save the following starter scan definition into a text file named fairness_bucketing_tutorial_scan_definition.yaml in the examples/definitions directory.

    scan:
      output:
        path: ../fairness_bucketing_tutorial_reports
    model_use_case:
      author: info@cognitivescale.com
      description: 'In this use case, each entry in the dataset represents an auto insurance
        claim. The learning task is to predict the final settled claim amount.
        This dataset was originally sourced from Emcien: https://www.sixtusdakurah.com/resources/The_Application_of_Regularization_in_Modelling_Insurance_Claims.pdf
        '
      model_use_case_id: c12e/datasciencelab/auto_insurance
      name: 'Insurance: Auto Insurance Claims'
      task_type: regression
    evaluation:
      description: Example evaluation against a single model.
      evaluation_dataset_id: eval
      evaluation_types:
      - robustness
      name: Example Auto Insurance evaluation
      prediction_description: Amount of Settled Claim
      prediction_favorability: ordered
      favorable_outcome_value: increased
      regression_standard_deviation: 0.5
    models:
    - model_id: linl1
      description: Scikit-learn linear regression using Lasso with L1 regularization
      name: L1 Linear Regression
      predict_endpoint: http://127.0.0.1:5111/auto_insurance_linl1/predict
    datasets:
    - dataset_id: eval
      description: 1000 row representative sample of the full dataset
      file_type: csv
      has_header: true
      name: Evaluation dataset
      url: file:../datasets/auto_insurance_eval.csv
    dataset_schema:
      outcome_column: Total Claim Amount

    The above scan definition is a simplified version of the auto_insurance_scanner_definition.yaml file provided in the toolkit. It contains only the minimal information needed to perform a robustness evaluation on a single model (made available by the reference model server).

    Note: The dataset path in the above YAML is relative to the location of the scan definition. The path assumes your YAML file is located at <toolkit-location>/examples/definitions. You may have to adjust the dataset path if you saved your scan definition in a different location.
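
    The relative-path rule in the note above can be sketched in Python. This is a minimal illustration, not Certifai's own resolver: the helper name resolve_dataset_path is hypothetical, and /opt/certifai-toolkit is an assumed stand-in for <toolkit-location>.

```python
import posixpath
from pathlib import PurePosixPath


def resolve_dataset_path(definition_file: str, dataset_url: str) -> str:
    """Resolve a 'file:' dataset URL relative to the scan definition's folder."""
    prefix = "file:"
    relative = dataset_url[len(prefix):] if dataset_url.startswith(prefix) else dataset_url
    base_dir = PurePosixPath(definition_file).parent
    # normpath collapses the '..' segment without touching the filesystem
    return posixpath.normpath(str(base_dir / relative))


# '/opt/certifai-toolkit' is an assumed stand-in for <toolkit-location>
definition = (
    "/opt/certifai-toolkit/examples/definitions/"
    "fairness_bucketing_tutorial_scan_definition.yaml"
)
resolved = resolve_dataset_path(definition, "file:../datasets/auto_insurance_eval.csv")
print(resolved)
```

    If the printed path does not point at the evaluation CSV on your system, adjust the url value in the datasets section accordingly.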

  4. Add fairness_grouping_features names to the example scan definition to perform a fairness evaluation.

    Fairness grouping features are specified at the evaluation level of a scan definition, under the fairness_grouping_features field. Each grouping feature must specify a name that corresponds to a column in the evaluation dataset. Additionally, each grouping feature accepts a buckets field where you can define custom groups within that feature. The syntax for defining a bucket depends on whether the grouping feature is numerical or categorical.

    The scan definition in this tutorial analyzes the Income and EmploymentStatus features in its evaluation. Update the evaluation section of the scan definition by adding fairness to the evaluation_types list and listing Income and EmploymentStatus under fairness_grouping_features.

    Note: Fairness grouping features are case-sensitive and must match the column names in the specified dataset.
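
    A quick pre-check of the case-sensitive match described in the note above could look like the following Python sketch. The helper name missing_grouping_features is hypothetical, and the header line lists only the three columns this tutorial mentions; the real dataset may contain additional columns.

```python
import csv
import io


def missing_grouping_features(csv_text, feature_names):
    """Return the feature names with no exact (case-sensitive) match in the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)))
    columns = set(header)
    return [name for name in feature_names if name not in columns]


# Assumed header excerpt: only the column names referenced in this tutorial.
sample_header = "Income,EmploymentStatus,Total Claim Amount\n"

ok = missing_grouping_features(sample_header, ["Income", "EmploymentStatus"])
bad = missing_grouping_features(sample_header, ["income", "Employment Status"])
print(ok)   # no names are missing when the case and spelling match
print(bad)  # both names are reported because neither matches a column exactly
```

    To check your own file, pass the text of auto_insurance_eval.csv instead of sample_header.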

    The evaluation section should look like the following before proceeding to the next step:

    evaluation:
      description: Example evaluation against a single model.
      evaluation_dataset_id: eval
      evaluation_types:
      - robustness
      - fairness
      name: Example Auto Insurance evaluation
      prediction_description: Amount of Settled Claim
      prediction_favorability: ordered
      favorable_outcome_value: increased
      regression_standard_deviation: 0.5
      fairness_grouping_features:
      - name: Income
      - name: EmploymentStatus
  5. Define buckets for the numeric feature: Income.

    By default, Certifai treats each distinct value as a separate class within a grouping feature. For Income, this default behavior would result in many classes containing only a handful of samples. For example, only one person in the evaluation dataset has an income of $50,333. A fairness evaluation on a feature whose classes have such small sample sizes would likely produce an unreliable fairness score.

    Using bucketing to define classes based on income ranges provides larger sample sizes. The result of this evaluation illuminates any bias the model has towards individuals in lower or higher income ranges.

    Below is a table with the three classes to define for the Income feature and the number of samples in each class.

    Income Range         Number of instances
    $0 - $25,000         376
    $25,000 - $60,000    320
    $60,000+             304

    Buckets for numerical features are defined by upper bounds, given in an optional max field. A value belongs to the bucket with the lowest upper bound greater than or equal to that value. Exactly one bucket must omit max to act as a "catch-all" bucket. Additionally, each bucket must have a description field, which is used as the group name in the fairness report.

    The YAML equivalent for the classes defined above is shown below. Replace the Income item in the fairness_grouping_features list of your scan definition with the YAML snippet below.

    - name: Income
      buckets:
      - description: "$0 - $25,000"
        max: 25000
      - description: "$25,000 - $60,000"
        max: 60000
      - description: "$60,000+"

    Note: The maximum is inclusive. For example, an income of exactly 25000 would belong to the class "$0 - $25,000".

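    Note: The class "$60,000+" does not specify a maximum value and therefore includes all samples with an income greater than $60,000. Because the maximum is inclusive, an income of exactly 60000 belongs to the class "$25,000 - $60,000".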
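
    The assignment rule for numeric buckets can be sketched in Python as follows. This is an illustration of the rule as described in this step, not Certifai's implementation; the function name assign_numeric_bucket and the dictionary representation of buckets are assumptions.

```python
def assign_numeric_bucket(value, buckets):
    """Return the description of the bucket with the lowest max >= value,
    falling back to the single bucket that omits max."""
    catch_all = [b for b in buckets if "max" not in b]
    if len(catch_all) != 1:
        raise ValueError("exactly one bucket must omit 'max'")
    bounded = sorted((b for b in buckets if "max" in b), key=lambda b: b["max"])
    for bucket in bounded:
        if value <= bucket["max"]:  # the upper bound is inclusive
            return bucket["description"]
    return catch_all[0]["description"]


# Same three buckets as the YAML snippet in this step
income_buckets = [
    {"description": "$0 - $25,000", "max": 25000},
    {"description": "$25,000 - $60,000", "max": 60000},
    {"description": "$60,000+"},
]

for income in (25000, 50333, 60000, 60001):
    print(income, "->", assign_numeric_bucket(income, income_buckets))
```

    The boundary values 25000 and 60000 fall into the lower of the two adjacent buckets, and only 60001 reaches the catch-all bucket.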

    The fairness_grouping_features section should look like the following before proceeding to the next step:

    fairness_grouping_features:
    - name: Income
      buckets:
      - description: "$0 - $25,000"
        max: 25000
      - description: "$25,000 - $60,000"
        max: 60000
      - description: "$60,000+"
    - name: EmploymentStatus
  6. Define buckets for the categorical feature EmploymentStatus. Below is the distribution of values for the EmploymentStatus feature in the evaluation dataset:

    Group            Number of instances
    Employed         635
    Unemployed       233
    Retired          36
    Disabled         44
    Medical Leave    52

    The goal of this evaluation is to illuminate the model's bias towards individuals who are actively working versus those who are not, regardless of the reason they are not working (unemployed, retired, disabled, or on medical leave).

    Define the following buckets for the EmploymentStatus grouping feature:

    • Actively Working - Includes: Employed
    • Not Actively Working - Includes: Unemployed, Retired, Disabled, and Medical Leave

    Buckets for categorical features are defined by lists of labels comprising the bucket. Each bucket must specify a values field with the list of labels in the bucket and a description field that will be used as the group name in the fairness report.
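
    As a rough check of what this grouping does to sample sizes, the Python sketch below maps each label to its bucket and sums the counts from the distribution table in this step. The function name bucket_sizes and the dictionary layout are assumptions for illustration. The sketch raises a KeyError for a label that is not listed in any bucket; how Certifai handles such labels is not covered here.

```python
# Same two buckets as defined in this step
employment_buckets = [
    {"description": "Actively Working", "values": ["Employed"]},
    {
        "description": "Not Actively Working",
        "values": ["Unemployed", "Retired", "Disabled", "Medical Leave"],
    },
]

# Counts taken from the distribution table in this step
label_counts = {
    "Employed": 635,
    "Unemployed": 233,
    "Retired": 36,
    "Disabled": 44,
    "Medical Leave": 52,
}


def bucket_sizes(buckets, counts):
    """Sum per-label counts into per-bucket counts."""
    label_to_bucket = {
        label: bucket["description"] for bucket in buckets for label in bucket["values"]
    }
    sizes = {bucket["description"]: 0 for bucket in buckets}
    for label, n in counts.items():
        sizes[label_to_bucket[label]] += n
    return sizes


sizes = bucket_sizes(employment_buckets, label_counts)
print(sizes)  # {'Actively Working': 635, 'Not Actively Working': 365}
```

    Combining the four smaller categories yields a group of 365 samples, rather than four groups of between 36 and 233 samples each.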

    The YAML equivalent for the classes defined above is shown below. Replace the EmploymentStatus item in the fairness_grouping_features list of your scan definition with the YAML snippet below.

    - name: EmploymentStatus
      buckets:
      - description: "Actively Working"
        values:
        - Employed
      - description: "Not Actively Working"
        values:
        - Unemployed
        - Retired
        - Disabled
        - Medical Leave

    The fairness_grouping_features field should look like the following before proceeding to the next step:

    fairness_grouping_features:
    - name: Income
      buckets:
      - description: "$0 - $25,000"
        max: 25000
      - description: "$25,000 - $60,000"
        max: 60000
      - description: "$60,000+"
    - name: EmploymentStatus
      buckets:
      - description: "Actively Working"
        values:
        - Employed
      - description: "Not Actively Working"
        values:
        - Unemployed
        - Retired
        - Disabled
        - Medical Leave
  7. Open a new terminal and activate the virtual environment where you installed the reference model server.

    conda activate certifai-reference-models

    Start the reference model server.

    startCertifaiModelServer
  8. (Optional) Validate and test your scan definition before running the scan. Make sure to switch to the original terminal you were using for this tutorial and save the scan definition you have been working on.

    Validate that the scan definition is syntactically correct. If you encounter any errors, verify that you have correctly followed the steps above and updated your scan definition. If the validation is successful, continue to testing your definition.

    certifai definition-validate -f examples/definitions/fairness_bucketing_tutorial_scan_definition.yaml

    Test that the scan definition correctly connects to the model hosted by the reference model server. If you encounter any errors, verify that the reference model server is running and that the model definition matches the result of the previous steps. If the test is successful, continue to the next step.

    certifai definition-test -f examples/definitions/fairness_bucketing_tutorial_scan_definition.yaml
  9. Run the scan:

    certifai scan -f examples/definitions/fairness_bucketing_tutorial_scan_definition.yaml

    Note: The scan may take a few minutes to finish.

    After the scan completes, you should see output similar to the following:

    ...
    Scan Completed
    ====== Report Summary ======
    Total number of evaluations performed: 3
    Number of successful reports: 3
    Number of failed reports: 0
  10. Start the Certifai Console and navigate to the fairness results for this scan.

    certifai console examples/fairness_bucketing_tutorial_reports
  11. The Console is available at: http://localhost:8000. Click the URL or copy it into a browser to view your scan result visualizations.

  12. The Console opens on the Use Case list page. Click the menu icon on the far right of the row with the model use case id c12e_datasciencelab_auto_insurance. Then click the Scan List button to view the list of scans for the model use case.

    [Image: Use Case List]

  13. From the Scan List page, find the row with the Scan ID of the scan you ran in step 9. Then click the menu icon on the far right of the row and click the Results button.

    [Image: Scan List]

  14. At the top right of the page, you can toggle between the Model and Evaluation views. View by Model is the default.

    [Image: Results Page by Model]

  15. Last, scroll down to the Fairness Breakdown by Grouping Feature section of the results page.

    [Image: Income Feature Fairness Results]

    For more information on navigating the Console, refer to the Console documentation.

  16. Interpret the fairness results per grouping feature in the console.

    Note: The results in your Console view may differ slightly from the images provided. The explanations below correspond to the results of the scan at the time of writing this tutorial.

    Select Income from the "Grouping Feature" dropdown menu to view the group burdens for the Income feature. The fairness breakdown graph displays the burden for each group we defined in step 5.

    [Image: Income Feature Fairness Results]

    According to the results, the groups with income ranges of "$0 - $25,000" and "$25,000 - $60,000" have a slightly higher burden than the group of individuals who earn more than $60,000. A higher burden value in this context means that more change is generally required to receive an increased settled claim amount.

    The fairness score for the Income grouping feature is 95.21 out of 100 because the burden across the three groups is relatively close. This may be interpreted to mean that the model is generally fair across the three income groups.

    Select EmploymentStatus from the Grouping Feature dropdown menu to view the group burdens for the EmploymentStatus feature. The fairness breakdown graph displays the burden for each group defined in step 6.

    [Image: EmploymentStatus Feature Fairness Results]

    According to the results, the "Not Actively Working" group has a higher burden than the "Actively Working" group. Again, a higher burden value in this context means that more change is generally required to receive an increased settled claim amount.

    The fairness score for the EmploymentStatus grouping feature is 94.08 out of 100 because the burden across the two groups is relatively close. This may be interpreted to mean that the model is generally fair to both groups.