# ServeConfig

`ServeConfig` is a Kale object that you can use to configure an InferenceService. Within a `ServeConfig` object you can define the backend you want to use to serve your model, limit its resources, and set the service account for your predictor and transformer Pods.
## Import

The object lives in the `kale.serve` module. Import it as follows:
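```python
from kale.serve import ServeConfig
```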
## Attributes

| Name | Type | Default | Description |
|---|---|---|---|
| `env` | `List[V1EnvVar]` | `[]` | Extends the `env` field of the container |
| `env_from` | `List[V1EnvFromSource]` | `[]` | Extends the `envFrom` field of the container |
| `requests` | `Dict[str, str]` | `{}` | Sets `resources.requests` for the container |
| `limits` | `Dict[str, str]` | `{}` | Sets `resources.limits` for the container |
| `annotations` | `Dict[str, str]` | `{}` | Sets annotations for the Pod |
| `predictor` | `Dict[str, Any]` | `{}` | Sets the predictor's spec, and the predictor's Pod `affinity`, `tolerations`, and `node_selector` fields |
| `transformer` | `Dict[str, Any]` | `{}` | Sets the transformer's spec, and the transformer's Pod `affinity`, `tolerations`, and `node_selector` fields |
| `labels` | `Dict[str, str]` | `{}` | Sets labels for the Pod |
| `node_selector` | `Dict[str, str]` | `{}` | Sets the `node_selector` for the Pod |
| `affinity` | `V1Affinity` | `None` | Sets the affinity of the Pod |
| `tolerations` | `List[V1Toleration]` | `[]` | Sets tolerations for the Pod |
| `protocol_version` | `str` | `None` | The protocol version of the predictor |
**Important**

If you set any of the `env`, `env_from`, `requests`, `limits`, `affinity`, or `tolerations` fields of the `ServeConfig` object, they populate the corresponding `predictor` and `transformer` fields. This functionality allows you to define values for both the predictor and transformer Pods and containers at the same time. For example, if you want the `limits` field to equal `{"memory": "4Gi"}` for both the predictor and transformer containers, the `ServeConfig` object can be the following:
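A minimal sketch, assuming the attribute names in the table above double as constructor keyword arguments:

```python
from kale.serve import ServeConfig

# A single generic `limits` value; Kale populates it into both the
# predictor and transformer containers.
serve_config = ServeConfig(limits={"memory": "4Gi"})
```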
Otherwise, you can set specific values for each Pod and container. If you want the `limits` field to differ between the predictor and transformer containers, the `ServeConfig` object should be the following:
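A minimal sketch, again assuming the attribute names double as constructor keyword arguments; the memory values are illustrative:

```python
from kale.serve import ServeConfig

# Component-specific limits: each Pod gets its own value through the
# `predictor` and `transformer` dictionaries.
serve_config = ServeConfig(
    predictor={"limits": {"memory": "4Gi"}},
    transformer={"limits": {"memory": "2Gi"}},
)
```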
The way each generic field gets populated is the following:

- If a generic value is defined and a specific one is not, then the specific value gets populated with the generic one.
- For the `env`, `env_from`, and `tolerations` fields, if both the generic and specific fields are defined, then the two fields get merged.
- For the `affinity`, `requests`, `limits`, and `node_selector` fields, if the specific field is defined, the generic one is ignored.
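As an illustration, the rules above behave like the following sketch. This is illustrative Python, not Kale's actual implementation; the function and set names are ours:

```python
# Fields whose generic and specific values get merged (list-like fields).
MERGE_FIELDS = {"env", "env_from", "tolerations"}
# Fields where a specific value makes Kale ignore the generic one.
OVERRIDE_FIELDS = {"affinity", "requests", "limits", "node_selector"}


def populate(generic: dict, specific: dict) -> dict:
    """Sketch of how generic ServeConfig fields flow into a component."""
    result = dict(specific)
    for field, value in generic.items():
        if field not in specific:
            # Rule 1: no specific value, so the generic one is used.
            result[field] = value
        elif field in MERGE_FIELDS:
            # Rule 2: list-like fields are merged.
            result[field] = value + specific[field]
        elif field in OVERRIDE_FIELDS:
            # Rule 3: the specific value wins; the generic one is ignored.
            result[field] = specific[field]
    return result
```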
**See also**

In the table above, we also mention objects that are part of the Kubernetes Python client library, as well as the KServe Python client library. For details on the structure of the Kubernetes and KServe objects, refer to the Kubernetes Python client and KServe Python client API references.
## Initialization

You may initialize a `ServeConfig` similarly to any other Python object:
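A minimal sketch; the field values are illustrative and the keyword arguments assume the attribute names in the table above:

```python
from kale.serve import ServeConfig

serve_config = ServeConfig(
    requests={"cpu": "100m"},
    limits={"memory": "4Gi"},
    labels={"app": "my-model"},  # hypothetical label
)
```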
However, you can also initialize a field that expects Kubernetes objects by passing a dictionary, which Kale will then deserialize into the corresponding Kubernetes object. For example:
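A sketch of this, where the environment variable name and value are hypothetical:

```python
from kale.serve import ServeConfig

# Passing a plain dictionary for `env`; Kale deserializes it into a
# Kubernetes `V1EnvVar` object.
serve_config = ServeConfig(
    env=[{"name": "LOG_LEVEL", "value": "DEBUG"}],
)
```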
To configure an InferenceService using a `ServeConfig` object, pass it to the `serve()` function, which lives in the same package:
To learn more about the frequent uses of the `ServeConfig` object, you can follow the user guides for the supported ML frameworks. For example:

- Use the `ServeConfig` object to retrieve a model stored in an external object storage service, like S3, by following the PyTorch and Triton user guides.
- Use the `ServeConfig` object to serve custom predictors and transformers by following the user guides in the custom inference services section.
- Use the `ServeConfig` object to configure common parameters for the predictor and transformer Pods by following the InferenceService configuration user guide.