Fraud Blocker

Clusterone's Kubernetes Deployment

Case Study

Devopsbay, a custom software development company specializing in Kubernetes applications and machine learning software development, collaborated with Clusterone to deploy a Kubernetes cluster on varied hardware. Utilizing a custom Ansible installer, Devopsbay ensured seamless integration and operational efficiency, enabling effective machine learning execution within the Kubernetes framework.

decor

Key aspects of the project

  • Technical Challenges

    Deploying Kubernetes on heterogeneous hardware with different Nvidia graphics card models and restricted infrastructure access.

  • Cost Savings

    Efficient use of existing hardware and infrastructure, reducing the need for additional investments.

  • Project Results

    Successful deployment of a Kubernetes cluster, enabling automated distribution of ML tasks and maximizing hardware utilization.

Technical challenges

Clusterone faced significant challenges in deploying a Kubernetes cluster on hardware from various manufacturers, each equipped with different Nvidia graphics card models requiring unique driver versions. The complexity was further heightened by restricted infrastructure access, limited to a single SSH port redirection through the firewall. This scenario demanded a sophisticated approach to ensure seamless integration and operational efficiency for running Clusterone's solution.

Project implementation stages

  • 1

    Initial Phase

    Preparation of an elaborate inventory for the custom Ansible installer, detailing specific hardware configurations and ensuring comprehensive access to each server.

  • 2

    Harmonization

    Standardizing operating system versions across all servers and installing essential software and tools.

  • 3

    Containerization and networking

    Setting up containerization and networking support, culminating in the creation of a Kubernetes cluster.

  • 4

    Integration

    Integrating all servers as nodes and establishing a distributed disk storage system using Ceph.

  • 5

    Configuration

    Configuring Persistent Volumes for Ceph within Kubernetes, installing appropriate GPU drivers, and enabling GPUs for ML tasks executed as Pods in the cluster.

  • decor

Problems and solutions

  • Hardware Diversity

    The varied Nvidia graphics card models required unique driver versions. The custom Ansible installer was meticulously prepared to handle these variations.

  • Restricted Access

    Limited to a single SSH port redirection through the firewall, the team ensured secure and efficient access to the infrastructure.

  • Storage Solutions

    While MooseFS and LizardFS were considered, their lack of native support for Persistent Volume functionality in Kubernetes clusters led to the adoption of Ceph, which was optimized for performance.

Team involvement

  • Project Results
  • Machine Learning Engineers
  • QA Engineers
  • Front-end Developers
  • Project Managers

Technological insights

  • Ceph

    Successful deployments of Ceph tailored for both virtualization and Kubernetes cluster environments, optimizing configurations to meet specific performance criteria.

  • Infrastructure Provisioning

    Proficiency with Ansible, complemented by tools like Foreman, Forklift, Terraform, and Packer, played a pivotal role in the project's success.

Conclusion

This case study highlights Devopsbay's technical prowess in navigating complex infrastructure challenges and their commitment to delivering solutions that meet and exceed client expectations. By leveraging advanced technologies and automation strategies, Devopsbay enabled Clusterone to fully utilize their available infrastructure for effective machine learning execution within a Kubernetes framework.

Devopsbay © 2024. All rights reserved. Designed and made by Devopsbay