certifai.model.sdk package

Module contents

Abstractions for exposing machine learning models as prediction services, whose API supports scanning with Certifai.

For example usage, refer to the cortex-certifai-examples GitHub repo.

Refer to the Cortex Certifai documentation for the details of the Predict API.

class certifai.model.sdk.ComposedModelWrapper(port: int = 8551, host: str = '127.0.0.1')

ComposedModelWrapper provides a Flask app that dispatches to multiple SimpleModelWrapper instances.

Parameters
  • model (IBaseModel) – any predictor object with a predict method that takes a sequence of data vectors as a numpy array and returns a sequence of corresponding predicted values. To override the default predict behaviour, see SimpleModelWrapper.predict().

  • endpoint_url (Optional[str]) – valid URL route string used to create the POST endpoint for invoking the model, e.g. /api/model/predict. Defaults to /predict.

  • port (Optional[int]) – the port of the webserver. Defaults to 8551.

  • host (Optional[str]) – the hostname to listen on. Set this to ‘0.0.0.0’ to make the server available externally as well. Defaults to ‘127.0.0.1’.

  • encoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the model’s input (e.g. to perform one-hot encoding).

  • decoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the model’s output (e.g. to binarize with some threshold).

  • supports_soft_scores (Optional[bool]) – True if the model supports soft scores. Defaults to False.

  • score_labels (Optional[list]) – ordered list of class labels corresponding to each predicted score array, in the case of a soft-scoring model.

  • threshold (Optional[float]) – value at which a prediction is considered positive; only used in binary classification when the model returns a simple list of scores for the positive class.

  • model_type (Optional[ModelTypesEnum]) – type of third-party model to import. Currently supported: ‘h2o_mojo’.

  • model_path (Optional[str]) – disk path of the third-party model to import. Currently supported: ‘h2o_mojo’.

add_wrapped_model(mount_prefix: str, wrapped_model: certifai.model.sdk.simple_wrapper.SimpleModelWrapper) -> None

Adds a wrapped simple model, so that the ComposedModelWrapper exposes multiple dispatch route endpoints. For example, a mount_prefix of /models/svm and an endpoint_url (from SimpleModelWrapper.endpoint_url) of /predict create the POST endpoint /models/svm/predict.

Parameters
  • wrapped_model (SimpleModelWrapper) – wrapped simple model to add

  • mount_prefix (str) – prefix to be prepended to the wrapped simple model’s route (POST endpoint)

Returns

None
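
How a mount prefix and a wrapped model's endpoint_url combine can be sketched in plain Python. The helper compose_route below is hypothetical and not part of the SDK; it only assumes that the two parts are joined with a single slash, as in the /models/svm/predict example above. The second prefix, /models/logit, is made up for illustration.

```python
def compose_route(mount_prefix: str, endpoint_url: str) -> str:
    """Hypothetical helper: place mount_prefix in front of endpoint_url.

    Not part of the SDK; it only reproduces the documented example.
    """
    # Strip surrounding slashes from both parts, then join with single slashes.
    return "/" + mount_prefix.strip("/") + "/" + endpoint_url.strip("/")


# Two wrapped models that both keep the default /predict endpoint_url.
routes = {
    name: compose_route(prefix, "/predict")
    for name, prefix in {"svm": "/models/svm", "logit": "/models/logit"}.items()
}
print(routes["svm"])
```

Giving each wrapped model its own prefix lets several models keep the default /predict endpoint_url without their routes colliding.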

run(production: Optional[bool] = False, worker_class: Optional[certifai.model.utils.gunicorn_conf.WorkerTypeEnum] = WorkerTypeEnum.gevent, log_level: Optional[certifai.model.utils.gunicorn_conf.LogLevelEnum] = LogLevelEnum.info, num_workers: Optional[int] = 3, timeout_secs: Optional[int] = 20)

Start the prediction service.

Parameters
  • production (Optional[bool]) – start a gunicorn server if True; otherwise run the native Flask app. Defaults to False.

  • worker_class (Optional[str]) – type of gunicorn worker. Supported types: gthread, gevent, sync. Defaults to gevent.

  • log_level (Optional[str]) – logging level. Defaults to info.

  • num_workers (Optional[int]) – number of gunicorn worker processes to start. Defaults to 3.

  • timeout_secs (Optional[int]) – gunicorn worker timeout in seconds. Defaults to 20.

Returns

None

predict(npinstances: numpy.ndarray) -> numpy.ndarray

Override this method to change the way the model is called. The default implementation calls model.predict(npinstances).

Parameters

npinstances (np.ndarray) – numpy array of shape (n_samples, n_features) to predict on

Returns

numpy array of model predictions of shape (n_samples,)

Return type

np.ndarray

class certifai.model.sdk.PandasModelWrapper(pandas_kwargs: Optional[dict] = None, **kwargs)

Provides a Flask app that runs a single model. It is optimized for models that accept a pandas.DataFrame of instances from the dataset as input and return an array-like object of predictions. The model's output can be any type of Iterable, such as a list, numpy array, pandas DataFrame, or pandas Series.

If an encoder is set, it also receives a pandas.DataFrame as input.

Parameters for creating the pandas.DataFrame can be specified in the pandas_kwargs dictionary. Refer to the pandas documentation for the available keyword arguments. For example:

m = PandasModelWrapper(model=model, pandas_kwargs={'columns': ['a', 'b', 'c', 'd']})

Parameters
  • pandas_kwargs – dictionary of keyword arguments to pass to the pandas.DataFrame constructor, such as columns, dtype, copy, or index.

  • kwargs – Keyword arguments for configuring the prediction service. Refer to the parameters of the SimpleModelWrapper.
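
A minimal sketch of the effect of pandas_kwargs, assuming the wrapper builds its frame roughly as pandas.DataFrame(instances, **pandas_kwargs). The exact call inside the SDK is not shown in this reference, and the instance values below are made up.

```python
import pandas as pd

# Instances as they might arrive in a predict request: (n_samples, n_features).
instances = [[25, 1, 0, 3.5], [40, 0, 1, 1.2]]
pandas_kwargs = {"columns": ["a", "b", "c", "d"]}

# Assumed construction step: the keyword arguments go straight to pandas.
df = pd.DataFrame(instances, **pandas_kwargs)

print(list(df.columns), df.shape)
```

Naming the columns this way matters for models and encoders that select features by column name rather than by position.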

predict_raw(instances: List) -> certifai.model.sdk.simple_wrapper.PredictResponse

Override this method if the model doesn’t use pandas DataFrames for prediction input.

Parameters

instances (List) – {array-like, list} of data instances of shape (n_samples, n_features)

Returns

NamedTuple (PredictResponse) of model predictions, scores, labels and threshold

Return type

PredictResponse

NamedTuple(predictions: np.ndarray
           scores:      Optional[np.ndarray]
           labels:      Optional[list]
           threshold:   Optional[float]
           )

predict(df: pandas.core.frame.DataFrame) -> numpy.ndarray

Override this method to change the way the model is called. The default implementation calls model.predict(df).

Parameters

df – DataFrame of shape (n_samples, n_features) to predict on

Returns

array-like collection of model predictions of shape (n_samples,).

Return type

np.ndarray

soft_predict(df: pandas.core.frame.DataFrame) -> numpy.ndarray

Computes soft scores, along with an ordered list of score labels, if supports_soft_scores is enabled. Override this method to change how soft scores are computed. The default implementation calls model.predict_proba(df).

Parameters

df – DataFrame of shape (n_samples, n_features) to predict on

Returns

model predict scores in an array-like collection of shape (n_samples,n_classes)

Return type

np.ndarray

class certifai.model.sdk.SimpleModelWrapper(endpoint_url: str = '/predict', port: int = 8551, host: str = '127.0.0.1', model: Optional[certifai.common.hosted_model.IBaseModel] = None, encoder: Optional[Callable[[Sequence], Sequence]] = None, decoder: Optional[Callable[[Sequence], Sequence]] = None, supports_soft_scores: bool = False, score_labels: Optional[list] = None, threshold: Optional[float] = None, model_type: Optional[str] = None, model_path: Optional[str] = None)

Provides a Flask app that runs a single model. It is optimized for models that accept a numpy array of instances from the dataset and return a numpy array of predictions. For a model matching that pattern, invoke the /predict endpoint with an array of instances to get a JSON-encoded ordered list of predictions in response.

Parameters
  • model (IBaseModel) – any predictor object with a predict method that takes a sequence of data vectors as a numpy array and returns a sequence of corresponding predicted values. To override the default predict behaviour, see SimpleModelWrapper.predict().

  • endpoint_url (Optional[str]) – valid URL route string used to create the POST endpoint for invoking the model, e.g. /api/model/predict. Defaults to /predict.

  • port (Optional[int]) – the port of the webserver. Defaults to 8551.

  • host (Optional[str]) – the hostname to listen on. Set this to ‘0.0.0.0’ to make the server available externally as well. Defaults to ‘127.0.0.1’.

  • encoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the model’s input (e.g. to perform one-hot encoding).

  • decoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the model’s output (e.g. to binarize with some threshold).

  • supports_soft_scores (Optional[bool]) – True if the model supports soft scores. Defaults to False.

  • score_labels (Optional[list]) – ordered list of class labels corresponding to each predicted score array, in the case of a soft-scoring model.

  • threshold (Optional[float]) – value at which a prediction is considered positive; only used in binary classification when the model returns a simple list of scores for the positive class.

  • model_type (Optional[ModelTypesEnum]) – type of third-party model to import. Currently supported: ‘h2o_mojo’.

  • model_path (Optional[str]) – disk path of the third-party model to import. Currently supported: ‘h2o_mojo’.
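
The encoder and decoder parameters are plain callables over sequences. Below is a minimal sketch of the two uses mentioned above, one-hot encoding on input and thresholding on output, written with plain lists. The category vocabulary, the column position, and the 0.5 cut-off are assumptions made for illustration, and the >= comparison is a choice of this sketch rather than documented SDK behaviour.

```python
from typing import List, Sequence

COLOURS = ["red", "green", "blue"]  # assumed vocabulary for column 0


def one_hot_encoder(instances: Sequence[Sequence]) -> List[List[float]]:
    """Replace the categorical first column with one-hot indicator columns."""
    encoded = []
    for row in instances:
        indicators = [1.0 if row[0] == c else 0.0 for c in COLOURS]
        encoded.append(indicators + [float(v) for v in row[1:]])
    return encoded


def threshold_decoder(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Binarize positive-class scores; >= is this sketch's own convention."""
    return [1 if s >= threshold else 0 for s in scores]


print(one_hot_encoder([["green", 3], ["blue", 7]]))
print(threshold_decoder([0.2, 0.5, 0.9]))
```

Callables of this shape could then be passed as encoder= and decoder= when constructing a SimpleModelWrapper.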

predict_raw(instances: List) -> certifai.model.sdk.simple_wrapper.PredictResponse

Override this method if the model doesn’t use numpy arrays for predict input/output.

Parameters

instances (List) – {array-like, list} of data instances of shape (n_samples, n_features)

Returns

NamedTuple (PredictResponse) of model predictions, scores, labels and threshold

Return type

PredictResponse

NamedTuple(predictions: np.ndarray
           scores:      Optional[np.ndarray]
           labels:      Optional[list]
           threshold:   Optional[float]
           )

set_global_imports()

Override this method to perform external imports in case prediction requires additional dependencies. It sets the third-party helper modules to be used throughout. Note: imports must be marked global, for example:

global dt
import datatable as dt

Returns

None
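
A minimal sketch of the global-import pattern, using the standard-library statistics module as a stand-in for a third-party dependency such as datatable. The functions are written at module level here for brevity; in the SDK the import would live in an overridden set_global_imports() method. The helper row_means is hypothetical.

```python
def set_global_imports():
    # Declaring the alias global binds it at module level, so it is
    # visible to every function in this module, not only in this one.
    global st
    import statistics as st


def row_means(rows):
    # Relies on the alias bound by set_global_imports(); calling this
    # before set_global_imports() would raise NameError.
    return [st.mean(row) for row in rows]


set_global_imports()
print(row_means([[1, 2, 3], [10, 20]]))
```

Without the global declaration, the import would bind a name local to the method and the module would not be reachable from the prediction code.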

predict(npinstances: numpy.ndarray) -> numpy.ndarray

Override this method to change the way the model is called. The default implementation calls model.predict(npinstances).

Parameters

npinstances (np.ndarray) – numpy array of shape (n_samples, n_features) to predict on

Returns

numpy array of model predictions of shape (n_samples,)

Return type

np.ndarray

soft_predict(npinstances: numpy.ndarray) -> numpy.ndarray

Computes soft scores, along with an ordered list of score labels, if supports_soft_scores is enabled. Override this method to change how soft scores are computed. The default implementation calls model.predict_proba(npinstances).

Parameters

npinstances (np.ndarray) – numpy array of shape (n_samples, n_features) to predict on

Returns

numpy array of model prediction scores of shape (n_samples, n_classes)

Return type

np.ndarray(n_samples,n_classes)
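
The array shapes expected from predict and soft_predict can be illustrated with a small stand-in model. NearestCentroidStub is hypothetical and written only for this sketch; it exposes the predict and predict_proba methods that the default implementations are documented to call. The centroids and input values are made up.

```python
import numpy as np


class NearestCentroidStub:
    """Hypothetical two-class model used only to show array shapes."""

    def __init__(self):
        # One centroid per class, each with n_features = 2.
        self.centroids = np.array([[0.0, 0.0], [1.0, 1.0]])
        self.score_labels = [0, 1]

    def predict_proba(self, npinstances: np.ndarray) -> np.ndarray:
        # Distance from every instance to every centroid: (n_samples, n_classes).
        dists = np.linalg.norm(
            npinstances[:, None, :] - self.centroids[None, :, :], axis=2
        )
        # Softmax over negative distances: the closer centroid scores higher.
        weights = np.exp(-dists)
        return weights / weights.sum(axis=1, keepdims=True)

    def predict(self, npinstances: np.ndarray) -> np.ndarray:
        # Label of the highest-scoring class per instance: (n_samples,).
        scores = self.predict_proba(npinstances)
        return np.array(self.score_labels)[scores.argmax(axis=1)]


model = NearestCentroidStub()
X = np.array([[0.1, 0.0], [0.9, 1.2], [0.4, 0.3]])
print(model.predict(X).shape, model.predict_proba(X).shape)
```

An object like this could be passed as model= to SimpleModelWrapper, with supports_soft_scores=True and score_labels=[0, 1] so that each score column is matched to its class label.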

run(production: Optional[bool] = False, worker_class: Optional[certifai.model.utils.gunicorn_conf.WorkerTypeEnum] = WorkerTypeEnum.gevent, log_level: Optional[str] = 'info', num_workers: Optional[int] = 3, timeout_secs: Optional[int] = 20)

Start the prediction service.

Parameters
  • production (Optional[bool]) – start a gunicorn server if True; otherwise run the native Flask app. Defaults to False.

  • worker_class (Optional[str]) – type of gunicorn worker. Supported types: gthread, gevent, sync. Defaults to gevent.

  • log_level (Optional[str]) – logging level. Defaults to info.

  • num_workers (Optional[int]) – number of gunicorn worker processes to start. Defaults to 3.

  • timeout_secs (Optional[int]) – gunicorn worker timeout in seconds. Defaults to 20.

Returns

None
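
As a rough picture of what the production options control, the sketch below collects them into a dictionary keyed by gunicorn's own setting names (bind, workers, worker_class, timeout, loglevel). The function gunicorn_options is hypothetical; how the SDK actually hands these values to gunicorn is not documented here and is assumed.

```python
from typing import Dict, Union


def gunicorn_options(
    host: str = "127.0.0.1",
    port: int = 8551,
    worker_class: str = "gevent",
    log_level: str = "info",
    num_workers: int = 3,
    timeout_secs: int = 20,
) -> Dict[str, Union[str, int]]:
    """Hypothetical mapping from run() arguments to gunicorn settings."""
    # Only the worker types listed in this reference are accepted.
    if worker_class not in {"gthread", "gevent", "sync"}:
        raise ValueError(f"unsupported worker_class: {worker_class!r}")
    return {
        "bind": f"{host}:{port}",
        "workers": num_workers,
        "worker_class": worker_class,
        "timeout": timeout_secs,
        "loglevel": log_level,
    }


print(gunicorn_options(num_workers=4, timeout_secs=60))
```

A slow model may need a timeout_secs larger than the default of 20, since gunicorn restarts workers that exceed the timeout.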

class certifai.model.sdk.PredictResponse(predictions: numpy.ndarray, scores: Optional[numpy.ndarray], labels: Optional[list], threshold: Optional[float])

Representation of model prediction response, allowing for optional soft scoring information.

Create new instance of PredictResponse(predictions, scores, labels, threshold)

predictions: numpy.ndarray

Alias for field number 0

scores: Optional[numpy.ndarray]

Alias for field number 1

labels: Optional[list]

Alias for field number 2

threshold: Optional[float]

Alias for field number 3
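
The field aliases above are what a typing.NamedTuple provides. The stand-in below mirrors the documented fields so that it runs without the SDK installed; it uses plain lists where the real class is annotated with numpy arrays, and the name PredictResponseSketch is hypothetical.

```python
from typing import List, NamedTuple, Optional


class PredictResponseSketch(NamedTuple):
    """Local stand-in mirroring the documented PredictResponse fields."""

    predictions: List[int]
    scores: Optional[List[List[float]]]
    labels: Optional[list]
    threshold: Optional[float]


# Hard predictions only: the optional soft-scoring fields are None.
hard = PredictResponseSketch([1, 0], None, None, None)

# Soft scoring: one score row per instance, one column per label.
soft = PredictResponseSketch(
    predictions=[1, 0],
    scores=[[0.2, 0.8], [0.7, 0.3]],
    labels=[0, 1],
    threshold=None,
)

# Fields are reachable by name or by position (field numbers 0 to 3).
print(soft.predictions is soft[0], soft.labels)
```

An overridden predict_raw would return a tuple of this shape, filling scores and labels only when the model supports soft scoring.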