DataRobot Common Infra - the Fortune 500 company's ongoing journey to improve its infrastructure
DataRobot, a leader in AI, needed a unified infrastructure for its applications. The project covered consolidation of services, migration of infrastructure, and implementation of advanced DevOps solutions so that all of the company's products run on the same infrastructure.
- Cloud
- Kubernetes
- Infrastructure
about the project
DataRobot is an artificial intelligence company that provides advanced solutions for businesses.
It specialises in developing AI platforms and applications that help organisations maximise the impact and minimise the risks associated with implementing artificial intelligence. DataRobot serves a variety of industries, including energy, financial services, healthcare, manufacturing and the public sector, offering solutions tailored to the specific needs of each sector.
The three main solutions offered by DataRobot are:
- AI Platform - a comprehensive tool for creating, deploying and managing AI models, supporting both generative and predictive AI.
- AI Application - ready-to-use AI applications that integrate with key business processes, enabling teams to use AI effectively in day-to-day operations.
- Automated machine learning - a platform that automates the complex steps of the machine learning process, enabling the rapid building and deployment of predictive models.
the challenge
Stability and performance under increasing load without additional costs.
technologies
Kubernetes
Helm
Karpenter
Terraform
Prometheus
Grafana
Vault
Results
- Kubernetes clusters in three major regions
- Single-client clustering processes
- Unification of the platform
- Effective resource management
Kubernetes clusters in three major regions
Deployment and management of 6 production clusters in the US, EU and Japan, ensuring high availability and performance for users worldwide.
Single-client clustering processes
Creation of processes for provisioning dedicated single-customer clusters on Azure and GCP, increasing the flexibility of the offering.
Unification of the platform
Consolidation of all DataRobot services on a common infrastructure, which has simplified management and reduced operating costs.
Effective resource management
Implementation of advanced auto-scaling mechanisms using Karpenter, ensuring optimal use of resources and cost control.
Benefits
- 1
Cost reduction
Significant reduction in operational expenditure through consolidation of services and efficient management of resources through automatic scaling.
- 2
Platform stability and reliability
Management of 6 production clusters in 3 regions of the world has ensured high availability and reliability of DataRobot services.
- 3
Reducing latency for DataRobot products
Strategic deployment of clusters in the US, EU and Japan has minimised latency for end users, providing faster access to services.
- 4
Flexibility in implementation
Expanding the platform to include Azure and GCP has enabled the offering to be tailored to a wider range of customers and their specific infrastructure requirements.
- 5
Simplifying platform management
Consolidation of services on a common infrastructure has significantly simplified operational processes and reduced the complexity of managing the platform.