Labels and Taints
To make scheduling more efficient and compatible with Kubernetes, Ocean supports the Kubernetes constraint mechanisms for scheduling pods:
- Node Selector: Constrains pods to nodes with particular labels.
- Node Affinity: Constrains the nodes a pod is eligible to be scheduled on, based on node labels. Spot supports hard and soft affinity (requiredDuringSchedulingIgnoredDuringExecution, preferredDuringSchedulingIgnoredDuringExecution).
- Pod Affinity and Pod Anti-Affinity: Schedules a pod based on which other pods are running on a node.
- Pod Port Restrictions: Validates that each pod has its required ports available on the machine.
- Well-Known Labels.
Spot Labels
Spot labels let you adjust Ocean's default scaling behavior. Add them to your pods to control the node termination process or lifecycle.
spotinst.io/azure-premium-storage
The AKS scheduler does not guarantee that pods requiring premium storage will schedule on nodes that support premium storage disks.
The Spot Ocean label spotinst.io/azure-premium-storage is injected into every node in a node pool that supports premium storage.
We recommend using the spotinst.io/azure-premium-storage label on pods that require premium storage disks.
This ensures those pods are scheduled on nodes that meet their storage requirements.
For more information, see Azure premium storage.
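For example, you can steer a pod to premium-storage-capable nodes with a nodeSelector on this label. The sketch below assumes the label value on qualifying nodes is "true" (verify the exact value on your own nodes); the pod and container names are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: premium-storage-pod
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:2.0
  nodeSelector:
    spotinst.io/azure-premium-storage: "true" # assumed value; check your node labels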
spotinst.io/restrict-scale-down
Some workloads tolerate spot instance replacements less than others. To minimize node replacements for these workloads while maintaining spot instance cost savings, use the spotinst.io/restrict-scale-down label (set to true) to block proactive scaling down of instances for bin packing purposes. This will leave the instance running as long as possible. The instance will be replaced only if it becomes unhealthy or if forced by a cloud provider interruption.
spotinst.io/node-lifecycle
Ocean uses the spotinst.io/node-lifecycle label key to indicate a node's lifecycle. The label is applied to all Ocean-managed nodes, with a value of od (on-demand) or spot.
This label is useful for workloads that are not resilient to spot instance interruptions and must run on on-demand instances at all times.
By applying node affinity to the spotinst.io/node-lifecycle label with the value od, you can make sure that these workloads are scheduled only on on-demand instances.
spotinst.io/node-lifecycle:spot affinity is not supported. Unless spotinst.io/node-lifecycle:od affinity is applied, Ocean continues to try to provide excess compute capacity (spot instances) for all workloads in the cluster.
spotinst.io/gpu-type
This label creates direct affinity to specific types of GPU hardware, freeing you from the need to explicitly set and manage a list of VMs that contain the required hardware. Ocean automatically matches the relevant VMs (AWS and GCP) for workloads whose affinity rules use this label. Valid label values are:
- nvidia-tesla-v100
- nvidia-tesla-p100
- nvidia-tesla-k80
- nvidia-tesla-p4
- nvidia-tesla-t4
- nvidia-tesla-a100 (AWS only)
- nvidia-tesla-m60
- amd-radeon-v520
- nvidia-tesla-t4g
- nvidia-tesla-a10
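As an illustration, a pod that requires a T4 GPU could express that requirement with node affinity on this label. The pod and container names below are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: spotinst.io/gpu-type
            operator: In
            values:
            - nvidia-tesla-t4
  containers:
  - name: gpu-workload
    image: registry.k8s.io/pause:2.0
    resources:
      limits:
        nvidia.com/gpu: 1 # request one GPU device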
Don't add Spot labels under the virtual node group (launch specification) node labels section. Add these labels to the pod configuration only.
Instance Types Labels
Format: aws.spot.io/instance-<object>, for example, aws.spot.io/instance-category
Apply these labels in a workload's constraints (nodeSelector, node affinity, etc.) to target instance type properties. For example, you can constrain workloads to run on any instance from the M6, M7, or R7 families without manually listing every instance type in each family.
The instance labels are as follows:
- aws.spot.io/instance-category: Reflects the category of the instance (for example, c).
- aws.spot.io/instance-family: Reflects the family of the instance (for example, c5a).
- aws.spot.io/instance-generation: Reflects the generation of the instance (for example, 5).
- aws.spot.io/instance-hypervisor: Reflects the hypervisor the instance uses (for example, nitro).
- aws.spot.io/instance-cpu: Reflects the instance's vCPU count (for example, 2).
- aws.spot.io/instance-memory: Reflects the instance's memory in MiB (for example, 4096).
With these constraints in place, Ocean launches only nodes whose instance types match the labels required by the pending pods.
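For example, the following sketch (placeholder pod and container names) constrains a pod to 5th- or 6th-generation, C-category instances using node affinity:

apiVersion: v1
kind: Pod
metadata:
  name: instance-constrained
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: aws.spot.io/instance-category
            operator: In
            values:
            - c
          - key: aws.spot.io/instance-generation
            operator: In
            values:
            - "5" # label values are strings, so quote numeric values
            - "6"
  containers:
  - name: instance-constrained
    image: registry.k8s.io/pause:2.0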
Examples
Using restrict scale-down label:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
        spotinst.io/restrict-scale-down: "true"
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "2Gi"
            cpu: "2"
          limits:
            memory: "4Gi"
            cpu: "4"
Using od nodeSelector:
apiVersion: v1
kind: Pod
metadata:
  name: with-node-selector
spec:
  containers:
  - name: with-node-selector
    image: registry.k8s.io/pause:2.0
    imagePullPolicy: IfNotPresent
  nodeSelector:
    spotinst.io/node-lifecycle: od
Using od nodeAffinity:
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: spotinst.io/node-lifecycle
            operator: In
            values:
            - od
  containers:
  - name: with-node-affinity
    image: registry.k8s.io/pause:2.0
Startup Taints
Cloud service provider relevance: AWS Kubernetes
Startup taints are temporary taints applied to a node during its initialization phase. The Ocean autoscaler recognizes these taints and avoids scaling up additional nodes for pending pods that match this node, anticipating the taint's removal. Once the startup taint is removed, any pod without a matching toleration can be scheduled on the node without requiring new nodes to be launched.
When to Use Startup Taints
Use startup taints to make sure a specific pod deploys to a node before other pods. Once that initialization pod is ready or has completed a defined procedure (such as configuring networking), other pods can be scheduled on the node.
Example - Cilium: Cilium recommends applying a taint such as node.cilium.io/agent-not-ready=true:NoExecute to prevent other pods from starting before Cilium has configured networking on the node.
Only the initialization pod will have a toleration for this taint. Once the node is ready, the application running in the initialization pod removes the taint from the node.
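As a sketch, such an initialization pod (in practice, the Cilium agent) is the only one carrying a toleration for the taint; the pod and container names here are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: network-init
spec:
  tolerations:
  - key: node.cilium.io/agent-not-ready # the startup taint from the Cilium example
    operator: Equal
    value: "true"
    effect: NoExecute
  containers:
  - name: network-init
    image: registry.k8s.io/pause:2.0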
If a startup taint has not been removed from a specific node by the end of the cluster's grace period, a new node is launched for any pending pods. The grace period starts when a node is created; its default is 5 minutes, and you can configure it in the cluster under cluster.strategy.gracePeriod.
Configure Startup Taints in the Spot API
AWS Kubernetes only
Prerequisite: Ocean controller version v2.0.68 or above.
Configure Ocean to consider your startup taints using the startupTaints attribute at the Ocean cluster and virtual node group levels.
- Cluster: under cluster.compute.launchSpecification
- Virtual node group: under launchSpec
You must also set the startup taint as a regular taint in the userData for the cluster or virtual node group, because Ocean does not add or remove configured startup taints itself.
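For reference, a cluster-level configuration could look like the following sketch. It assumes startup taints use the same key/value/effect shape as regular Ocean taints (shown as YAML for readability; the Spot API accepts the equivalent JSON), and it reuses the Cilium taint from the example above:

cluster:
  compute:
    launchSpecification:
      startupTaints:
      # Assumed shape: same key/value/effect fields as regular taints.
      - key: node.cilium.io/agent-not-ready
        value: "true"
        effect: NoExecute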