This is the first of what I hope will be many posts about my learnings with Kubernetes (k8s).
This link can provide some context on the specific application this applies to.
First, a little bit about the simulation service, which I'll be calling AtriarchSimc. Note: this isn't a brand name, but it helps me differentiate between SimulationCraft, RaidBots, and my own homebrewed application.
Each deployed instance of the AtriarchSimc app engine runs SimulationCraft internally. This matters because one of the reasons RaidBots is so helpful is that typical home PCs may not have enough RAM to support very large simulations. I personally wonder whether this is a drawback of SimulationCraft itself; if I get the time, I'd like to see whether batching simulation results to disk could reduce RAM usage. However, a lot of good people contribute to SimC, so I'm not optimistic it will be that easy.

But I digress. Because of this high RAM usage, my service limits simulation size to prevent cluster nodes from being overburdened. This limit is only a half fix, though: RAM usage is not immediate, and utilization builds up over time as the simulation runs. If multiple simulations are started at the same time and the app engine scales up, k8s can place multiple instances of the engine on the same node. If the simulations are sufficiently large, they can run the node out of memory. It's also possible that k8s will evict one of the engines, stopping the simulation, requiring the system to hand the simulation to the next available engine and losing any progress.
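For context, here's a rough sketch of how memory requests and limits could be set on the engine container. The container name, image, and values below are placeholders of mine, not taken from the actual deployment; the point is that requests and limits help the scheduler and cap a single container, but they don't by themselves stop two engines from landing on the same node when requests are set below a simulation's peak usage.

  # Illustrative fragment of the engine pod spec; names and values are placeholders.
  # Requests reserve RAM for scheduling and limits cap a single container,
  # but neither prevents two engines from being co-scheduled if requests sit
  # below the simulation's peak memory usage.
  containers:
  - name: engine
    image: atriarch-simc-engine:latest   # placeholder image
    resources:
      requests:
        memory: "2Gi"
      limits:
        memory: "8Gi"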
Enter the Kubernetes documentation, and as good as it is… it can be a lot to digest for someone who is still learning. So, in this post I'm hoping to answer the narrow question: "How do I ensure that I never have more than one copy of the same pod on the same node?"
One solution I came across was Pod Topology Spread Constraints. This is a pretty useful tool for enforcing relatively even distribution of pods across nodes; however, for my use case it couldn't be used to force no more than one instance of a specific pod onto any node.
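For reference, here's roughly what that configuration looks like on the engine's pod spec (a sketch based on the documented topologySpreadConstraints fields). With maxSkew: 1 the spread stays even, but once there are more replicas than nodes it will still happily place a second engine on a node, which is exactly what I was trying to avoid.

      # Sketch only: spreads engine pods evenly across nodes, but does not
      # hard-cap them at one per node once replicas outnumber nodes.
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: atriarch-simc-engine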
TLDR:
At the end of my searching I found pod affinities and anti-affinities. This was exactly what I was looking for, and I finally settled on the following deployment YAML. (Note: this YAML has irrelevant sections removed for brevity.)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: atriarch-simc-engine
spec:
  selector:
    matchLabels:
      app: atriarch-simc-engine
  template:
    metadata:
      labels:
        app: atriarch-simc-engine
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - atriarch-simc-engine
            topologyKey: "kubernetes.io/hostname"
      # containers and other sections omitted for brevity
Using pod anti-affinity, I was able to tell the default scheduler that if a node already has a pod with the label app: atriarch-simc-engine scheduled on it, it should not schedule another pod with the same label there. The node is identified by the topology key kubernetes.io/hostname, so no single node (hostname) ever ends up hosting two copies of the same pod. This lets my engine pods scale up to the number of worker nodes in the cluster and spread the RAM load while multiple engines run simulations (with the "required" form of the rule, any replicas beyond the node count simply stay Pending until a node frees up).
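A quick way to sanity-check the behavior is to look at which node each engine landed on; the label below matches the deployment above, and the NODE column should show a distinct node per pod.

kubectl get pods -l app=atriarch-simc-engine -o wide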

