k8s stands for Kubernetes. A few Kubernetes-related projects, cautionary tales and discussions are listed below.
talos and k8s
alpine and k8s
k3s, k0s, kind and microk8s
There are certified distributions that are not too resource hungry, which matters if you need to self-host clusters: for example kind (https://kind.sigs.k8s.io/), k3s (https://k3s.io/), k0s (https://k0sproject.io/) and microk8s. Good comparisons can be found here: https://blog.flant.com/small-local-kubernetes-comparison/ and https://blog.radwell.codes/2021/05/best-kubernetes-distribution-for-local-environments/
Kubeflow makes deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. The goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures.
Polyaxon (https://polyaxon.com/) deployed on GKE can be used to queue jobs.
Auto-magical CI/CD to streamline an ML workflow: experiment manager, MLOps and data management.
MLflow (https://mlflow.org/) is an open source platform for the machine learning lifecycle. One part of it is an experiment manager.
One part is experiment tracking similar to MLflow, and one part (trains-agent) helps schedule jobs onto GPU servers.
kubernetes work queue
Use a Kubernetes work queue and have users simply queue jobs.
Redis or RabbitMQ can store the queue, and k8s can then spawn jobs from it.
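The queue-to-jobs idea can be sketched in Python. This is a minimal illustration, not a definitive implementation: an in-memory deque stands in for the Redis or RabbitMQ queue, and the function names (job_manifest, drain), the job name prefix, the image and the train.py command are all hypothetical. The code only builds batch/v1 Job manifests; submitting them to a cluster is left out.

```python
import json
from collections import deque


def job_manifest(name, image, command):
    """Build a batch/v1 Job manifest (as a dict) for one queued work item."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 2,  # retry a failed pod at most twice
            "template": {
                "spec": {
                    # Jobs only allow Never or OnFailure here
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "worker", "image": image, "command": command}
                    ],
                }
            },
        },
    }


def drain(queue, image):
    """Pop items in FIFO order and return one Job manifest per item."""
    manifests = []
    index = 0
    while queue:
        item = queue.popleft()
        # Hypothetical convention: each item is one argument for train.py
        command = ["python", "train.py", item]
        manifests.append(job_manifest(f"train-{index}", image, command))
        index += 1
    return manifests


if __name__ == "__main__":
    # Stand-in for a Redis list or RabbitMQ queue (assumption)
    pending = deque(["--lr=0.1", "--lr=0.01"])
    for manifest in drain(pending, "example.com/trainer:latest"):
        print(json.dumps(manifest))
```

Each printed line is a JSON manifest, which kubectl accepts in place of YAML, so the output can be saved to a file and applied. In a real setup the deque would be replaced by a client for the chosen queue.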
minikube quickly sets up a local Kubernetes cluster on macOS, Linux, and Windows. It needs a container or virtual machine manager such as HyperKit, Hyper-V, KVM or Podman. It runs on Linux arm64 and, with tweaks, on Alpine.
In the case of kind, k3d, and minikube, you can go for one Linux VM (for a basic cluster). As an alternative to minikube, you can run Talos directly in Docker or QEMU with “talosctl”. More importantly, minikube can cache images for offline use: https://minikube.sigs.k8s.io/docs/handbook/offline/
DKube is a commercial end-to-end MLOps platform.
Podman (https://docs.podman.io/en/latest/) can be installed on Alpine and Talos.
portainer and rancher
Rancher (https://rancher.com/) or Portainer (https://www.portainer.io/) provide easier management and/or dashboard functionality. For example, you can create a deployment through the UI by following a wizard that also offers configuration you might want to use (e.g. resource limits), and later retrieve the YAML manifest. They also make working with pre-made Helm charts (packages) easier.
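To give an idea of what such a retrieved manifest contains, here is a rough Python sketch that builds an apps/v1 Deployment with resource limits. It is not output from Rancher or Portainer: the function name deployment_manifest, the app name, the image and all resource values are assumptions chosen for illustration.

```python
import json


def deployment_manifest(name, image, replicas=1,
                        cpu_limit="500m", memory_limit="256Mi"):
    """Build an apps/v1 Deployment manifest (as a dict) with resource limits."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "resources": {
                                # Example values only (assumption)
                                "requests": {"cpu": "250m", "memory": "128Mi"},
                                "limits": {"cpu": cpu_limit,
                                           "memory": memory_limit},
                            },
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = deployment_manifest("web", "nginx:stable", replicas=2)
    print(json.dumps(manifest, indent=2))
```

The output is JSON rather than YAML to keep the sketch free of third-party libraries; kubectl accepts either format.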