Scan Manager Setup
Scan Manager provides Certifai users with an easy-to-use web interface for configuring use cases and scans.
This guide is for system administrators who are configuring Scan Manager to work with Certifai Enterprise on a Kubernetes cluster.
This document walks you through:
- Obtaining the setup artifacts (templates, etc.).
- Creating and pushing different model-type base images to your private registry.
- Adding the base images to templates.
Prerequisites
- An instance of Cortex Certifai Enterprise has been deployed and is running on Kubernetes with access to the container registry configured for use with that cluster.
- You have configured object storage that is dedicated to Certifai Scan Manager (e.g. `s3://bucket/certifai/data` is dedicated for Scan Manager and `s3://bucket/certifai/results` for scan results).
Templates and files
This section describes the list of available editable Scan Manager templates and .yaml files that you use to set up Certifai Scan Manager.
These artifacts are stored in the setup_artifacts directory of the Certifai examples repository on GitHub.
The `deployment` folder contains:
- Deployment templates: `.yaml` templates that provide the configuration for deploying each of the specified model types on Kubernetes:
  - `scikit_0.23`: uses a `python3.6` base image with `scikit-learn v0.23` pre-installed
  - `h2o_mojo`: uses a `python3.6` base image with the `daimojo-2.4.8-cp36` whl pre-installed
  - `r_model`: uses a `rocker/r-apt:bionic` base image with `r-cran-randomforest` pre-installed
  - `hosted_model`: uses a `python3.6` base image
- `config.yml`: the configuration specification for defining the template/base image relationship
If necessary, create a folder named `files` to contain additional files required to run Certifai Scan Manager, such as the `license.txt` file for h2o-mojo.

NOTE: When you are working with h2o-mojo models, you must update this file.
The `k8s_definitions` folder contains:

- `scan-manager-configmap.yaml`: used to configure Kubernetes parameters such as scan concurrency and CPU and memory resource requests.
Add Templates
You can create and upload additional templates to your dedicated object storage any time.
IMPORTANT
Containerized models must be compatible with the configured prediction service image in two ways:
The model artifact must work with the installed library versions for the selected model type and image.
For example, the default scikit prediction service image includes a specific version of scikit-learn, pandas, and numpy. However, each installation may have its own model type and images. Your MLOps team can provide this information.
The way the model has been saved to file must be compatible with the way the prediction service loads it.
For example, the default python prediction service images provided with Certifai expect the model to have been saved as a dictionary using code similar to:
```python
import pickle

model_obj = {
    'model': dtree,
    # If the model does its own encoding, you can omit the encoder below
    'encoder': encoder,
}
with open('dtree_model.pkl', 'wb') as file:
    pickle.dump(model_obj, file)
```
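As a quick sanity check that a saved artifact matches this format, here is a minimal round-trip sketch; the placeholder dictionary below stands in for a real trained estimator:

```python
import pickle

# Placeholder standing in for a trained estimator (e.g. a fitted
# DecisionTreeClassifier); any picklable object works for this sketch.
dtree = {'kind': 'placeholder-model'}
encoder = None  # or an encoding function, if the model needs one

# Save in the dictionary format the default prediction service expects.
model_obj = {'model': dtree, 'encoder': encoder}
with open('dtree_model.pkl', 'wb') as f:
    pickle.dump(model_obj, f)

# The prediction service loads the same dictionary back at startup.
with open('dtree_model.pkl', 'rb') as f:
    loaded = pickle.load(f)

print(loaded['model'])
```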
To create a new template to correspond with a new model_type:
Refer to the template creation instructions.
To use an existing template for a new model_type:
- Copy a similar template from the setup_artifacts directory
- Rename it to reflect the model type.
- Add or update environment variables that the model service deployment requires. Environment variables are injected when a service is deployed.
- Update the `<model_type>.deployment` value in the `config.yml` file (described [here](/cortex-certifai/docs/1.3.13/enterprise/scan-manager/scan-manager-setup#add-base-images)) to the filename you created above (config.yml example).
- Save the file to your local version of the templates and files. Keep the deployment file and template file in the same directory.
- Use the shell script `bash upload_artifact.sh <END_POINT> <ACCESS_KEY> <SECRET_KEY> <BUCKET_NAME>` to upload the new template to your Scan Manager registry (`<AWS_BUCKET>`).
- Add the corresponding base images to the registry. (Follow the instructions below.)
Refer to the guide on Model Secrets.
Generate base images
Base images are containers with pre-installed dependencies required to run model predictions as a service.
To create base images for the existing templates in cortex-certifai-examples:
Create a template directory structure for the given model type (e.g. python scikit)
To generate the code template (`generated-container-model`) in your current directory with the generated code for containerizing your model, run the command for your model type:

| Model type | Generate command |
| --- | --- |
| H2O Mojo | `./generate.sh -i certifai-model-container:latest -m h2o_mojo` |
| Python | `./generate.sh -i certifai-model-container:latest -m python` |
| Proxy | `./generate.sh -d generated-container-proxy -i certifai-proxy-container:latest -m proxy` |
| R | `./generate.sh -i certifai-model-container:latest -m r_model -b rocker/r-apt:bionic` |
The template generated from the above commands is designed to work with standard scikit-learn models, XGBClassifier or XGBRegressor models, H2O MOJO models, and R-based models.
For an xgboost model using DMatrix, replace `-m python` with `-m python_xgboost_dmatrix`.

For additional `generate` options, run `./generate.sh --help`.

(For H2O Mojo and Python templates only) Update and test the prediction service.
The prediction service works out of the box with a standard scikit-learn model and with an XGBClassifier or XGBRegressor model.
For other models, you may need to update the `set_global_imports` method in `generated-container-model/src/prediction_service.py` to import any required dependencies, and the `predict` and/or `soft_predict` methods to predict using the model and return results in the expected format.

If you are using `soft_predict` (e.g. for SHAP), make sure `supports_soft_scoring: true` is specified for the model in `generated-container-model/model/metadata.yml` and in your scan definition.

To test the prediction service:
a. Copy the model into the generated container folder by running: `cp mymodel.pkl generated-container-model/model/model.pkl`

b. Start the service by running: `python generated-container-model/src/prediction_service.py`

c. Test the service by making a request to `http://127.0.0.1:8551/predict` with the respective parameters (see e.g. `app_test.py` in the iris example), or use Certifai to test the endpoint against your scan definition by running: `certifai definition-test -f scan_def.yaml`
(For R templates only) Configure the prediction service.

a. To load model dependencies, add `library(packageName)` to the file `src/prediction_service.R` (e.g. `library(randomForest)` to load the random forest package for prediction).

b. The R model is assumed to be an `.rds` file. However, a persisted model may contain additional functions and artifacts for data transformation at runtime. The supported list includes:

- `encoder`: a function to encode (scale, etc.) incoming data, accessed using `model$encoder`
- `artifacts`: an optional object that may be passed to the encoder along with new data, accessed using `model$artifacts`

For example: `predict(model$model, newdata=model$encoder(test_data, model$artifacts))`. Refer to the concrete example in models/r-models.
To copy the Cortex Certifai `packages` folder from inside the toolkit into the `generated-container-proxy` directory from step 1, run: `cp -r <certifai-toolkit-path>/packages generated-container-proxy/packages`

NOTE: The entire `packages` folder of the Certifai Toolkit is copied for convenience, but only the `cortex-certifai-common` and `cortex-model-sdk` packages are built into the Docker image.

(Optional) To install other dependencies, run the following model-type-specific commands.
| Model type | Dependency description | Action/Command |
| --- | --- | --- |
| H2O Mojo | Download and copy the daimojo MOJO Python runtime Linux dependency (`.whl` file) to the `ext_packages` folder | `cp <path-to-linux-daimojo-file>.whl generated-container-model/ext_packages/` |
| Python | If you are using a scikit-learn version other than 0.23.2, or other dependencies | Update `generated-container-model/requirements.txt` with the corresponding versions or additional dependencies and install them with `pip install` |
| R | For binary dependencies | Add to `requirements_bin.txt` (e.g. `r-cran-randomforest`) |
| R | For dependencies other than binary | Add to `requirements_src.txt` (e.g. `install.packages('custom-non-binary-package')`) |

(For Proxy template only) Configure a hosted model URL.
a. Add the `HOSTED_MODEL_URL` env variable and any required auth/secret header tokens to the `generated-container-proxy/environment.yml` file, which is used at runtime.

b. Add the same env variables to the `src/prediction_service.py` file.
(For Proxy template only) Update the request/response transformer methods in the `src/prediction_service.py` file.

a. Update the `transform_request_to_hosted_model_schema` method to apply a custom transformation to the hosted model service request (`/POST`).

b. Update the `transform_response_to_certifai_predict_schema` method to apply a custom transformation to the hosted model service response so that it matches the Certifai predict schema. Refer to src/prediction_service.py for more information.
To build the base image used by the prediction service Docker image, run:

`./generated-container-model/container_util.sh build`

A Docker image with the name specified in step 1 with the `-i` parameter (`certifai-model-container:latest` in this case) is created.
Push the image to your private Docker registry so the k8s cluster can pull it when you run a scan.

a. Make sure you have write access to the registry and that you are authenticated to it.

b. Tag the Docker image: `docker tag <docker-image> <your-private-registry-url>:<docker-image-with-tag>`

c. To push the image to your private Docker registry, run: `docker push <your-private-registry-url>:<docker-image-with-tag>`

Your private registry must be accessible to the deployed instance of Certifai Scan Manager.
NOTE: To deploy the prediction service outside of Scan Manager, set cloud storage credentials, `MODEL_PATH`, and (if needed) `H2O_LICENSE_PATH` in the `generated-container-model/environment.yml` file.
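As a sketch of such a standalone configuration, assuming `environment.yml` is a flat key/value map and that the credential variable names below match your storage provider (both are assumptions; keep the keys your generated template actually defines):

```yaml
# Illustrative values only.
MODEL_PATH: s3://bucket/certifai/models/model.pkl
H2O_LICENSE_PATH: s3://bucket/certifai/license/license.txt   # h2o-mojo only
AWS_ACCESS_KEY_ID: SOMEACCESSKEY          # cloud storage credentials; names
AWS_SECRET_ACCESS_KEY: SOMESECRETKEY      # depend on your provider/template
```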
Add base images
For a new model-type
To add a base image for a new model type:

- Create the new base image following the instructions above.
- Update `config.yml` as follows:
  - a. Add a new model type item (e.g. `pytorch_1.8`).
  - b. Add the model type's `deployment` key with the corresponding deployment template name (e.g. `pytorch_1.8_deployment.yml`).
  - c. Create the `pytorch_1.8_deployment.yml` file in the same directory.
  - d. Add `model_type.default_base_image.name` with a human-readable name (`pytorch_1.8`).
  - e. Add `model_type.default_base_image.value` with the fully qualified name of the image pushed above.
  - f. Update the list of `model_type.available_base_images` with the same image value.
```yaml
pytorch_1.8:
  deployment: pytorch_1.8_deployment.yml
  default_base_image:
    name: pytorch_1.8
    value: gcr.io/certifai-dev/certifai-pytorch-container:tag
  available_base_images:
    - name: pytorch_1.8
      value: gcr.io/certifai-dev/certifai-pytorch-container:tag
```
For an existing model-type
To add a base image to an existing model type, update the list of `model_type.available_base_images` with the newly added base image.
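For example, registering a second scikit image against an existing entry (the image names here are illustrative) only appends to `available_base_images`:

```yaml
scikit_0.23:
  deployment: scikit_0.23_deployment.yml
  default_base_image:
    name: scikit_0.23
    value: gcr.io/certifai-dev/certifai-model-container:tag
  available_base_images:
    - name: scikit_0.23
      value: gcr.io/certifai-dev/certifai-model-container:tag
    - name: scikit_0.23_custom                 # newly added base image
      value: gcr.io/certifai-dev/certifai-model-container:custom-tag
```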
Updating base images
To update the base image of a given model type:

- Create a new base image. (Follow the instructions above.)
- Update the values for `<model_type>` in `config.yml`, including:
  - `default_base_image.value`: the fully qualified name of the image pushed above
  - the corresponding name-value item in the `available_base_images` list: the fully qualified name of the image pushed above
Upload setup artifacts
Run the following script from the setup_artifacts directory to load the Certifai Scan Manager setup artifacts into the object storage bucket dedicated to Scan Manager:
bash upload_artifact.sh <END_POINT> <ACCESS_KEY> <SECRET_KEY> <BUCKET_NAME>
Where `<BUCKET_NAME>` is the URL of the S3 path that you configured when you installed Scan Manager.
Example
bash upload_artifact.sh https://s3.amazonaws.com SOMEACCESSKEY SOMESECRETKEY BUCKETNAME
The setup artifacts and templates are now available for selection and use in the Certifai Scan Manager application.
Deploy the Prediction Service to a Different Namespace
Using a different namespace for your model deployment is recommended.
To deploy the prediction services to a different namespace (for admins only):
Update or add the `deployment-namespace` field in the `certifai-scan-manager` ConfigMap in the current namespace as follows: `deployment-namespace: <deployment-namespace>`.

Set the S3 bucket credentials (access and secret keys) as kube secrets in the deployment namespace.
Create the secrets using the following commands:

```
kubectl create secret generic s3-bucket-access-key --from-literal=accesskey=<BUCKET_ACCESS_KEY> -n <deployment-namespace>
kubectl create secret generic s3-bucket-secret-key --from-literal=secretkey=<BUCKET_SECRET_KEY> -n <deployment-namespace>
```

Create a role and role binding in the new namespace by saving the following snippet to a file named `deployment-namespace-roles.yaml`:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: certifai-deployment
  namespace: <deployment-namespace>
rules:
  - apiGroups: [ "apps" ]
    resources: [ "deployments" ]
    verbs: [ "list", "get", "patch", "create" ]
  - apiGroups: [ "" ]
    resources: [ "services" ]
    verbs: [ "list", "get", "patch", "create" ]
  - apiGroups: [ "" ]
    resources: [ "pods", "pods/log" ]
    verbs: [ "list", "get" ]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: certifai-scan-manager
  namespace: <deployment-namespace>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: certifai-deployment
subjects:
  - kind: ServiceAccount
    name: certifai-scan-manager
    namespace: <current-namespace>
```

Edit and save the `deployment-namespace-roles.yaml` file you just created as follows:
- Edit the `namespace` fields in the above snippet as needed:
  - `<current-namespace>`: the name of the current namespace where Certifai is installed
  - `<deployment-namespace>`: the name of the deployment namespace
- Apply the file by running: `kubectl apply -f deployment-namespace-roles.yaml -n <deployment-namespace>`

When you create a new use case using Scan Manager, the prediction services now run in the `<deployment-namespace>` namespace as recommended.
Kube Setup
To configure Scan Manager:
Create a kube ConfigMap using the `setup_artifacts/k8s_definitions/scan-manager-configmap.yaml` file as a template. In the file, set the following:

- `scan-config`: These settings provide configuration for parallel scanning (scan concurrency), CPU, and memory. You may also specify use-case-level configurations for these properties.
- `deployment-namespace`: This is the namespace used to deploy models. Scan Manager verifies whether or not deployment-namespace is set. If it is not set (or not present), the deployment namespace falls back to the `NAMESPACE` env variable on startup.

```yaml
scan-config:
  default:
    parallel: 1
    cpu-req: "1000m"
    mem-req: "500Mi"
  usecase:
    <usecase_id>:
      parallel: 2
      cpu-req: "1000m"
      mem-req: "500Mi"
deployment-namespace: certifai
```

To apply the default Scan Manager ConfigMap, run:
kubectl apply -f setup_artifacts/k8s_definitions/scan-manager-configmap.yaml -n <NAMESPACE>
Secret Configuration
Save the S3 bucket access credentials (`BUCKET_ACCESS_KEY` and `BUCKET_SECRET_KEY`) as Kubernetes secrets using the following commands:
kubectl create secret generic s3-bucket-access-key --from-literal=accesskey=<BUCKET_ACCESS_KEY> -n <NAMESPACE>
and
kubectl create secret generic s3-bucket-secret-key --from-literal=secretkey=<BUCKET_SECRET_KEY> -n <NAMESPACE>
Kubernetes secrets are injected as `valueFrom` environment variables when the service is deployed.
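Conceptually, the injected variables end up looking like the following in the generated deployment. The env variable names are illustrative; the secret names and keys match the commands above:

```yaml
env:
  - name: AWS_ACCESS_KEY_ID        # illustrative variable name
    valueFrom:
      secretKeyRef:
        name: s3-bucket-access-key
        key: accesskey
  - name: AWS_SECRET_ACCESS_KEY    # illustrative variable name
    valueFrom:
      secretKeyRef:
        name: s3-bucket-secret-key
        key: secretkey
```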
NOTE
You can store both access key and secret key in the same k8s secret.
Troubleshooting
Issue: `RequestEntityTooLarge`
When you get this message, you have likely tried to upload a file that exceeds Scan Manager's upload limit.
Scan Manager supports upload of files up to 1GB.
Solution:
Add an annotation that increases the maximum request body size to the Ingress in the Certifai custom resource.
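Assuming an NGINX ingress controller, the annotation would look like the following (where exactly it goes inside the Certifai custom resource depends on your installation; the value matches the 1GB upload limit above):

```yaml
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "1g"
```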