@pauloguilhermepp pauloguilhermepp commented Nov 6, 2025

Deploy Codabench Using Kubernetes

This PR aims to make Codabench deployable using Kubernetes.

Main Changes

To deploy Codabench using Kubernetes, we:

  • Added Helm charts for all core components.
  • Updated Dockerfiles to improve compatibility with Kubernetes deployments.

Issues this PR resolves

How to Test it

  1. Prerequisites
  2. Start your cluster (e.g., minikube start)
  3. Fill in values.yaml with the settings for your environment
  4. Deploy the Helm chart:
     helm install <project_name> . -n <namespace> --create-namespace -f <path_to_values>
  5. Check the status of your pods in the same namespace:
     kubectl get pods -n <namespace>
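
Once the pods are running, reaching the Codabench UI still requires an Ingress, which the chart does not create. The manifest below is a minimal sketch of one, not part of this PR. It assumes the ingress-nginx controller is installed, and the Ingress name, namespace, hostname, Service name, and port are all placeholders that need to be matched to your deployment.

```yaml
# Hypothetical Ingress for the Codabench UI; every name below is a placeholder.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: codabench-ui              # placeholder name
  namespace: codabench            # use the namespace passed to `helm install`
spec:
  ingressClassName: nginx         # assumes the ingress-nginx controller
  rules:
    - host: codabench.example.com # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: codabench-django  # assumed Service name; check `kubectl get svc -n <namespace>`
                port:
                  number: 8000          # assumed Service port
```

Saved as, for example, ingress.yaml, it can be applied with `kubectl apply -f ingress.yaml`.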

Observations

  • Testing was done mainly on version 1.19.

  • To expose the Codabench UI, you need to install an ingress controller and create an Ingress.

  • This PR does not include a template for Minio yet. For now, testing requires an external S3 instance. We can also add the Minio chart if needed.

  • The Helm chart currently passes secret values directly as environment variables. This is not best practice, because the values are then visible in deployment tools such as ArgoCD. If you want, we can change this in this PR or in a follow-up one.

  • Inside values.yaml, there is a section (env) with environment variables that are currently passed to multiple pods. This is similar to the .env setup in docker-compose, but as we sometimes had a hard time debugging which variable is used where, we would like to separate those in a later PR.

  • This PR makes Codabench deployable on Kubernetes, but it does not change how the compute-worker runs user submissions. With the current compute-worker, the Kubernetes Pods have to mount a volume from the host node, mount the node's Docker socket, and run privileged Docker containers, which is not best practice: submission containers use the node's storage directly and are created outside the Kubernetes scheduler, which interferes with scheduling. We therefore plan to open a separate PR that lets the compute-worker run submissions through Kubernetes. It will be configurable, with Docker as the default, so that existing deployments do not break.

  • The PR also contains the Dockerfiles that we are using. Ideally, we should see if they are still compatible with deploying Codabench using the original docker-compose setup and keep a single Dockerfile for each component. The images were updated because, ideally, we don't want to mount the code from a PersistentVolume or the host node but rather have the image contain the code.
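
One common alternative to passing secret values through the chart's environment variables is to keep them in a Kubernetes Secret and reference them with `secretKeyRef`. The sketch below illustrates that pattern only and is not what this PR implements. The Secret name `codabench-secrets` and the variable `DB_PASSWORD` are hypothetical.

```yaml
# Hypothetical Secret, created outside the Helm values so the
# plain value never appears in values.yaml or the rendered Deployment.
apiVersion: v1
kind: Secret
metadata:
  name: codabench-secrets   # placeholder name
type: Opaque
stringData:
  DB_PASSWORD: "change-me"  # placeholder value
---
# Fragment of a container spec inside a Deployment template:
# the variable is resolved from the Secret at pod start.
env:
  - name: DB_PASSWORD       # hypothetical variable name
    valueFrom:
      secretKeyRef:
        name: codabench-secrets
        key: DB_PASSWORD
```

With this layout, deployment tools show only the reference to the Secret, not its value.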

Checklist

  • Code review by me
  • Hand tested by me
  • I'm proud of my work
  • Code review by reviewer
  • Hand tested by reviewer
  • CircleCI tests are passing
  • Ready to merge

dependencies:
  - name: rabbitmq
    version: "14.7.0"
    repository: "oci://registry.cern.ch/kubeflow/charts"

We will replace this one with the original upstream chart https://hub.docker.com/r/bitnamicharts/rabbitmq. Same for redis below.
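
As a rough sketch of that change, the dependency entry could point at the Bitnami OCI registry on Docker Hub. This assumes the chart is published under `oci://registry-1.docker.io/bitnamicharts` and that the version pinned in this PR is also available there; both should be checked before merging.

```yaml
# Chart.yaml (sketch): rabbitmq pulled from the upstream Bitnami OCI registry.
dependencies:
  - name: rabbitmq
    version: "14.7.0"   # carried over from this PR; confirm it exists upstream
    repository: "oci://registry-1.docker.io/bitnamicharts"
```

After editing the entry, `helm dependency update` refreshes the vendored charts and the lock file.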

RUN poetry install

# Copy the rest of the application code
COPY . /app
@hahahannes hahahannes Jan 19, 2026

Regarding the discussion about the Dockerfiles: we saw that the image for the django component was missing the code (https://github.com/codalab/codabench/blob/develop/Dockerfile), also on the latest develop branch. Therefore, we added a Node builder image here, as well as the step that copies the code.

As long as the container is self-contained and does not require a volume mount, it will be compatible with this chart.
