certifai.model.sdk package

Module contents

Abstractions for exposing machine learning models as prediction services, whose API supports scanning with Certifai.

Refer to the example usage in the cortex-certifai-examples GitHub repo.

Refer to the Cortex Certifai documentation for the details of the Predict API.

class certifai.model.sdk.ComposedModelWrapper(port: int = 8551, host: str = '127.0.0.1')

ComposedModelWrapper provides a Flask app that dispatches to multiple SimpleModelWrapper instances.

Parameters

model (IBaseModel) – any predictor object that has a predict method which takes a sequence of data vectors as a numpy array and returns a sequence of corresponding predicted values. To override default predict behaviour see SimpleModelWrapper.predict().
endpoint_url (Optional[str]) – valid url route string to create POST endpoint for model invoke e.g. /api/model/predict. defaults to /predict.
port (Optional[int]) – the port of the webserver. Defaults to 8551
host (Optional[str]) – the hostname to listen on. Set this to ‘0.0.0.0’ to have the server available externally as well. Defaults to ‘127.0.0.1’.
encoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the model’s input (e.g. - to perform one-hot encoding and so on).
decoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the model’s output (e.g. - to binarize with some threshold).
supports_soft_scores (Optional[bool]) – True, if model supports soft scores. default is False
score_labels (Optional[list]) – ordered list of class labels corresponding to each predicted score array, for a soft-scoring model
threshold (Optional[float]) – value at which a prediction is considered positive; only used in binary classification when the model returns a simple list of scores for the positive class
model_type (Optional[ModelTypesEnum]) – type of third-party model to import. Currently supported: ‘h2o_mojo’
model_path (Optional[str]) – disk path of the third-party model to import. Currently supported: ‘h2o_mojo’

add_wrapped_model(mount_prefix: str, wrapped_model: certifai.model.sdk.simple_wrapper.SimpleModelWrapper) → None

Adds a wrapped simple model to the ComposedModelWrapper, which exposes multiple dispatch route endpoints. For example, a mount_prefix of /models/svm and an endpoint_url (from SimpleModelWrapper.endpoint_url) of /predict create the POST endpoint /models/svm/predict.

Parameters

wrapped_model (SimpleModelWrapper) – wrapped simple model to add
mount_prefix (str) – prefix to be prepended to the wrapped simple model’s route (POST endpoint)

Returns

None
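
The route composition described for add_wrapped_model can be sketched as follows. This is only an illustration, not the SDK's implementation: join_route is a hypothetical helper, and the slash normalization it performs is an assumption.

```python
def join_route(mount_prefix: str, endpoint_url: str) -> str:
    """Hypothetical helper: combine a mount prefix with a wrapped model's endpoint_url."""
    # Assumption: exactly one slash separates the prefix from the endpoint.
    return "/" + mount_prefix.strip("/") + "/" + endpoint_url.strip("/")


# Two wrapped models that both use the default endpoint_url of /predict.
routes = {name: join_route(f"/models/{name}", "/predict") for name in ("svm", "logit")}
print(routes["svm"])  # /models/svm/predict
```

Each wrapped model keeps its own endpoint_url, so distinct mount prefixes are what keep the composed routes from colliding.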

run(production: Optional[bool] = False, worker_class: Optional[certifai.model.utils.gunicorn_conf.WorkerTypeEnum] = WorkerTypeEnum.gevent, log_level: Optional[certifai.model.utils.gunicorn_conf.LogLevelEnum] = LogLevelEnum.info, num_workers: Optional[int] = 3, timeout_secs: Optional[int] = 20)

Start the prediction service.

Parameters

production (Optional[bool]) – start gunicorn server if True else run native Flask app. default is False
worker_class (Optional[str]) – type of gunicorn worker. default is gevent. supported type (gthread,gevent,sync)
log_level (Optional[str]) – logging level. default is info.
num_workers (Optional[int]) – number of gunicorn worker processes to start. default is 3
timeout_secs (Optional[int]) – gunicorn worker timeout in secs. default is 20

Returns

None

predict(npinstances: numpy.ndarray) → numpy.ndarray

Override this method to change the way the model is called. The default implementation calls model.predict(npinstances).

Parameters: npinstances (np.ndarray) – numpy array of shape (n_samples, n_features) to predict on
Returns: numpy array of model predictions of shape (n_samples,)
Return type: np.ndarray

class certifai.model.sdk.PandasModelWrapper(pandas_kwargs: Optional[dict] = None, **kwargs)

Provides a Flask app that runs a single model. It is optimized for models that accept a pandas.DataFrame of instances from the dataset as input and return an array-like object of predictions. The expected output of the model can be any type of Iterable, such as a list, numpy array, pandas DataFrame, or pandas Series.

If an encoder is set, then it will also receive as input a pandas.DataFrame.

Parameters for creating the pandas.DataFrame can be specified in the pandas_kwargs dictionary. Refer to the pandas documentation for available keyword arguments. For example:

m = PandasModelWrapper(model=model, pandas_kwargs={'columns': ['a', 'b', 'c', 'd']})

Parameters

pandas_kwargs – Dictionary with keyword arguments to be provided to the pandas.DataFrame constructor, such as: columns, dtype, copy, or index.
kwargs – Keyword arguments for configuring the prediction service. Refer to the parameters of the SimpleModelWrapper.

predict_raw(instances: List) → certifai.model.sdk.simple_wrapper.PredictResponse

Override this method if the model doesn’t use pandas DataFrames for prediction input.

Parameters

instances (List) – {array-like, list} of data instances of shape (n_samples, n_features)

Returns

NamedTuple (PredictResponse) of model predictions, scores, labels and threshold

Return type

PredictResponse

NamedTuple(predictions: np.ndarray
           scores:      Optional[np.ndarray]
           labels:      Optional[list]
           threshold:   Optional[float]
           )

predict(df: pandas.core.frame.DataFrame) → numpy.ndarray

Override this method to change the way the model is called. The default implementation calls model.predict(df).

Parameters: df – DataFrame of shape (n_samples, n_features) to predict on
Returns: array-like collection of model predictions of shape (n_samples,).
Return type: np.ndarray

soft_predict(df: pandas.core.frame.DataFrame) → numpy.ndarray

Computes soft scores along with an ordered list of score labels if supports_soft_scores is enabled. Override this method to change how soft scores are computed. The default implementation calls model.predict_proba(df).

Parameters: df – DataFrame of shape (n_samples, n_features) to predict on.
Returns: model predict scores in an array-like collection of shape (n_samples,n_classes)
Return type: np.ndarray

class certifai.model.sdk.SimpleModelWrapper(endpoint_url: str = '/predict', port: int = 8551, host: str = '127.0.0.1', model: Optional[certifai.common.hosted_model.IBaseModel] = None, encoder: Optional[Callable[[Sequence], Sequence]] = None, decoder: Optional[Callable[[Sequence], Sequence]] = None, supports_soft_scores: bool = False, score_labels: Optional[list] = None, threshold: Optional[float] = None, model_type: Optional[str] = None, model_path: Optional[str] = None)

Provides a Flask app that runs a single model. It is optimized for models that accept a numpy array of instances from the dataset and return a numpy array of predictions. For a model matching that pattern, invoke the /predict endpoint with a numpy array of instances to get a JSON-encoded ordered list of predictions as the response.

Parameters

model (IBaseModel) – any predictor object that has a predict method which takes a sequence of data vectors as a numpy array and returns a sequence of corresponding predicted values. To override default predict behaviour see SimpleModelWrapper.predict().
endpoint_url (Optional[str]) – valid url route string to create POST endpoint for model invoke e.g. /api/model/predict. defaults to /predict.
port (Optional[int]) – the port of the webserver. Defaults to 8551
host (Optional[str]) – the hostname to listen on. Set this to ‘0.0.0.0’ to have the server available externally as well. Defaults to ‘127.0.0.1’.
encoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the model’s input (e.g. - to perform one-hot encoding and so on).
decoder (Optional[Callable[[Sequence],Sequence]]) – optional function used to transform the model’s output (e.g. - to binarize with some threshold).
supports_soft_scores (Optional[bool]) – True, if model supports soft scores. default is False
score_labels (Optional[list]) – ordered list of class labels corresponding to each predicted score array, for a soft-scoring model
threshold (Optional[float]) – value at which a prediction is considered positive; only used in binary classification when the model returns a simple list of scores for the positive class
model_type (Optional[ModelTypesEnum]) – type of third-party model to import. Currently supported: ‘h2o_mojo’
model_path (Optional[str]) – disk path of the third-party model to import. Currently supported: ‘h2o_mojo’
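
The encoder and decoder parameters are plain callables from a sequence to a sequence. The sketch below shows one possible pair that matches the examples given above (one-hot encoding and binarizing with a threshold). The function names, the CATEGORIES list, and the choice of `>=` at the cutoff are all hypothetical. Plain lists are used to keep the sketch dependency-free, whereas the wrapper itself works with numpy arrays.

```python
from typing import List, Sequence

CATEGORIES = ["red", "green", "blue"]  # hypothetical values of a categorical first column


def one_hot_encoder(instances: Sequence[Sequence]) -> List[List[float]]:
    """Encoder sketch: replace the categorical first column with one-hot columns."""
    encoded = []
    for row in instances:
        one_hot = [1.0 if row[0] == category else 0.0 for category in CATEGORIES]
        encoded.append(one_hot + [float(value) for value in row[1:]])
    return encoded


def threshold_decoder(outputs: Sequence[float], cutoff: float = 0.5) -> List[int]:
    """Decoder sketch: binarize raw model outputs (>= at the cutoff is an assumption)."""
    return [1 if score >= cutoff else 0 for score in outputs]
```

Functions of this shape would be passed as the encoder and decoder arguments when constructing the wrapper.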

predict_raw(instances: List) → certifai.model.sdk.simple_wrapper.PredictResponse

Override this method if the model doesn’t use numpy arrays for predict input/output.

Parameters

instances (List) – {array-like, list} of data instances of shape (n_samples, n_features)

Returns

NamedTuple (PredictResponse) of model predictions, scores, labels and threshold

Return type

PredictResponse

NamedTuple(predictions: np.ndarray
           scores:      Optional[np.ndarray]
           labels:      Optional[list]
           threshold:   Optional[float]
           )

set_global_imports()

Override this method to perform external imports in case prediction requires additional dependencies. It sets the third-party helper modules to be used throughout. Note: imports must be marked global, for example:

global dt
import datatable as dt

Returns: None
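
The global-import pattern can be tried in isolation with the sketch below. GlobalImportSketch is a hypothetical stand-in rather than the SDK class, and the standard-library statistics module stands in for a third-party dependency. When the SDK calls set_global_imports is not specified here, so the sketch calls it explicitly.

```python
class GlobalImportSketch:
    """Hypothetical stand-in for a wrapper subclass; not the SDK class."""

    def set_global_imports(self) -> None:
        # Declaring the name global binds the imported module at module scope,
        # so other methods can use it after this method has run.
        global statistics
        import statistics

    def predict(self, rows):
        # Relies on the module that set_global_imports() made global.
        return [statistics.mean(row) for row in rows]


wrapper = GlobalImportSketch()
wrapper.set_global_imports()
result = wrapper.predict([[1, 2, 3], [4, 6]])
```

Without the global declaration, the import would bind a local name inside set_global_imports and predict would fail with a NameError.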

predict(npinstances: numpy.ndarray) → numpy.ndarray

Override this method to change the way the model is called. The default implementation calls model.predict(npinstances).

Parameters: npinstances (np.ndarray) – numpy array of shape (n_samples, n_features) to predict on
Returns: numpy array of model predictions of shape (n_samples,)
Return type: np.ndarray

soft_predict(npinstances: numpy.ndarray) → numpy.ndarray

Computes soft scores along with an ordered list of score labels if supports_soft_scores is enabled. Override this method to change how soft scores are computed. The default implementation calls model.predict_proba(npinstances).

Parameters: npinstances (np.ndarray) – numpy array of shape (n_samples, n_features) to predict on
Returns: model predict scores np.ndarray of shape (n_samples,n_classes)
Return type: np.ndarray
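
To illustrate how score_labels lines up with the columns of a (n_samples, n_classes) score array, the sketch below picks the highest-scoring label for each row. labels_from_scores is a hypothetical helper and the label values are invented. Whether Certifai derives predictions from soft scores in exactly this way is an assumption.

```python
from typing import List, Sequence


def labels_from_scores(scores: Sequence[Sequence[float]], score_labels: List) -> List:
    """Hypothetical helper: map each row of scores to its highest-scoring label."""
    predictions = []
    for row in scores:
        # Column i of each row is assumed to correspond to score_labels[i].
        best_column = max(range(len(row)), key=row.__getitem__)
        predictions.append(score_labels[best_column])
    return predictions


predicted = labels_from_scores([[0.1, 0.9], [0.7, 0.3]], ["deny", "approve"])
```

The order of score_labels matters: swapping the two labels would swap every prediction.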

run(production: Optional[bool] = False, worker_class: Optional[certifai.model.utils.gunicorn_conf.WorkerTypeEnum] = WorkerTypeEnum.gevent, log_level: Optional[str] = 'info', num_workers: Optional[int] = 3, timeout_secs: Optional[int] = 20)

Start the prediction service.

Parameters

production (Optional[bool]) – start gunicorn server if True else run native Flask app. default is False
worker_class (Optional[str]) – type of gunicorn worker. default is gevent. supported type (gthread,gevent,sync)
log_level (Optional[str]) – logging level. default is info.
num_workers (Optional[int]) – number of gunicorn worker processes to start. default is 3
timeout_secs (Optional[int]) – gunicorn worker timeout in secs. default is 20

Returns

None

class certifai.model.sdk.PredictResponse(predictions: numpy.ndarray, scores: Optional[numpy.ndarray], labels: Optional[list], threshold: Optional[float])

Representation of model prediction response, allowing for optional soft scoring information.

Create new instance of PredictResponse(predictions, scores, labels, threshold)

predictions: numpy.ndarray – Alias for field number 0

scores: Optional[numpy.ndarray] – Alias for field number 1

labels: Optional[list] – Alias for field number 2

threshold: Optional[float] – Alias for field number 3
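
The field layout can be mirrored with a standard-library NamedTuple to show what "alias for field number" means in practice. PredictResponseSketch is a hypothetical stand-in, not the SDK class, and it uses plain lists where the real class takes numpy arrays. As in the signature above, none of the fields has a default, so all four are passed explicitly.

```python
from typing import List, NamedTuple, Optional


class PredictResponseSketch(NamedTuple):
    """Hypothetical stand-in mirroring the documented field order."""

    predictions: List
    scores: Optional[List]
    labels: Optional[list]
    threshold: Optional[float]


# A hard-prediction response: the optional soft-scoring fields are None.
resp = PredictResponseSketch(predictions=[1, 0], scores=None, labels=None, threshold=None)

# Fields are reachable by name, by position, or by unpacking.
predictions, scores, labels, threshold = resp
```

A custom predict_raw override would build and return a value of this shape.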